2026 - building

Agent Probe Lab

A tiny playground for repeatable agent-capability experiments: prompts, raw logs, and scoring scripts.

Astro, TypeScript, LLMs

Agent Probe Lab is the first project slot for this site. The idea is deliberately small: one folder per experiment, one scoring script, and one writeup per result.
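As a rough illustration of what "one scoring script" per experiment could look like, here is a minimal sketch in TypeScript. Nothing here comes from an existing repo: the `RunLog` and `ProbeScore` shapes, the `scoreProbe` function, and the sample check are all hypothetical names chosen for the example. It assumes each run's raw output has already been loaded from the experiment folder as a string.

```typescript
// Hypothetical shape of one recorded run; field names are assumptions.
interface RunLog {
  runId: string;
  output: string; // raw agent output, kept verbatim
}

// Hypothetical summary a scoring script might emit for the writeup.
interface ProbeScore {
  total: number;
  passed: number;
  passRate: number;
  failedRunIds: string[];
}

// Apply the same pass/fail check to every run so the score is repeatable.
function scoreProbe(
  runs: RunLog[],
  check: (output: string) => boolean
): ProbeScore {
  const failedRunIds = runs
    .filter((run) => !check(run.output))
    .map((run) => run.runId);
  const passed = runs.length - failedRunIds.length;
  return {
    total: runs.length,
    passed,
    // Avoid dividing by zero when no runs were recorded.
    passRate: runs.length === 0 ? 0 : passed / runs.length,
    failedRunIds,
  };
}

// Made-up sample data, only to show the call shape.
const runs: RunLog[] = [
  { runId: "run-1", output: "ANSWER: 42" },
  { runId: "run-2", output: "I am not sure." },
];

const score = scoreProbe(runs, (output) => output.includes("ANSWER: 42"));
console.log(score);
```

Keeping the failed run IDs in the result, rather than only a pass rate, is one way to make the writeup point back at the exact raw logs where the agent failed.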

The first useful milestone is a public repo with a single repeatable probe and an honest writeup of where the agent fails.