Agent Probe Lab is the first project slot for this site. The idea is deliberately small: one folder per experiment, one scoring script, and one writeup per result.
The first useful milestone is a public repo with a single repeatable probe and an honest writeup of where the agent fails.
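As a rough illustration of the "one folder per experiment, one scoring script" layout, here is a minimal sketch in Python. The file names `expected.txt` and `output.txt`, the function names, and the exact-match scoring rule are all assumptions made for this example; nothing about the real repo layout has been decided here.

```python
from pathlib import Path

# Hypothetical layout (an assumption for this sketch, not a fixed design):
#   <root>/<experiment>/expected.txt  - the answer the probe should produce
#   <root>/<experiment>/output.txt    - what the agent actually produced


def score_experiment(folder: Path) -> dict:
    """Score one experiment folder by exact match after stripping whitespace."""
    expected = (folder / "expected.txt").read_text(encoding="utf-8").strip()
    output_file = folder / "output.txt"
    # A missing output counts as a failure rather than an error,
    # so runs where the agent produced nothing still show up in the results.
    if output_file.exists():
        actual = output_file.read_text(encoding="utf-8").strip()
    else:
        actual = None
    return {"experiment": folder.name, "passed": actual == expected}


def score_all(root: Path) -> list:
    """Score every subfolder of root that has an expected.txt, in name order."""
    folders = sorted(p for p in root.iterdir() if (p / "expected.txt").is_file())
    return [score_experiment(p) for p in folders]
```

Calling `score_all(Path("experiments"))` would return one result dictionary per experiment folder. Exact string matching is the simplest possible scoring rule and is only a placeholder; a real probe may need something more forgiving. Failures are kept in the results rather than skipped, so the writeup can report where the agent fails.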