Original thinking from building AI that runs in production.
A seven-config benchmark, a true 1M-token needle test (PASS), and a 7/8 agentic eval — local, open, reproducible.
We use privacy-friendly analytics to see how the site is used. Nothing loads until you choose. You can change your mind anytime.