Open-Source AI Coding
Benchmark Suite
I'm open-sourcing the benchmark suite I use to evaluate AI coding performance on real development work. Join the waitlist and I'll send you the repo, setup guide, and launch note as soon as it's ready.
What You'll Get at Launch
Everything you need to inspect the benchmark, run it locally, and use the same framework with your own models.
Real-World Tasks
Benchmark models on actual bug fixes, refactors, and feature builds instead of toy prompts and synthetic scores.
Reproducible Runs
Use the same benchmark flow I use to baseline model performance and compare runs fairly.
Open Repo + Docs
Get the benchmark code, setup notes, and starter guidance so you can run the suite yourself.
Launch Updates
Be the first to know when the repo goes live and when new benchmark cases and results are added.
Get Notified When the Repo Is Ready
Join the waitlist and I'll email you as soon as the benchmark repo is public, documented, and ready to use.