Abundant
Code Evaluation & RL Data
Trusted by 2 of the top 3 AI labs
abundant.ai

We build 500+ coding tasks with a 30% frontier model pass rate, and the training environments to run them.

Domains covered
Web Backend, Data Science, ML Engineering, ML Research, SWE-bench compatible, Terminal Bench compatible, Long-Horizon SWE, Custom on request
Our methodology
01 — SOURCE
Real experts. No AI slop.
Every expert in our network is identified and recruited directly by our founders. Each one is interviewed in person by a calibrated domain expert with real industry experience in their field — no AI screeners, no recent grads setting the bar.
02 — BUILD
Task sample turnaround in under 1 day.
Every task is created by a domain expert in our network. We have 500+ tasks ready to deploy immediately, and we can scope and deliver 1,000 custom RL environments in 4 weeks.
03 — VALIDATE
We reject 20–40% more tasks than standard pipelines.
Most vendors stop at human review. We add deterministic verifier testing and agentic trace validation — running every task in a code sandbox to confirm that failures reflect model capability, not task ambiguity.
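As a rough illustration of what a deterministic verifier check can look like, the sketch below runs a task's verifier several times against a reference ("oracle") solution and against an empty submission. It accepts the task only if the results are stable across runs, the oracle passes, and the empty submission fails. This is a minimal sketch under stated assumptions, not a description of Abundant's actual pipeline: `validate_task`, `accept`, `toy_verifier`, and the toy `add` task are all hypothetical names introduced for the example.

```python
from typing import Callable, Dict

# Hypothetical type: a verifier maps candidate source code to pass/fail.
Verifier = Callable[[str], bool]


def validate_task(verifier: Verifier, oracle: str, noop: str, runs: int = 5) -> Dict[str, bool]:
    """Run the verifier repeatedly on a reference solution and on a no-op submission."""
    oracle_results = [verifier(oracle) for _ in range(runs)]
    noop_results = [verifier(noop) for _ in range(runs)]
    return {
        # Same input must always give the same verdict.
        "deterministic": len(set(oracle_results)) == 1 and len(set(noop_results)) == 1,
        # The task must be solvable: the reference solution passes every run.
        "oracle_passes": all(oracle_results),
        # The task must not be trivially passable: doing nothing fails every run.
        "noop_fails": not any(noop_results),
    }


def accept(report: Dict[str, bool]) -> bool:
    """A task is kept only if every check in the report holds."""
    return all(report.values())


def toy_verifier(candidate_source: str) -> bool:
    """Toy verifier (assumption): the task is to define add(a, b) returning a + b."""
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)
        return namespace["add"](2, 3) == 5
    except Exception:
        return False


ORACLE = "def add(a, b):\n    return a + b\n"
NOOP = ""

report = validate_task(toy_verifier, ORACLE, NOOP)
print(report, accept(report))
```

A task whose verifier passes the empty submission, or flips between runs, would be rejected here before any model is evaluated on it, which is one way to separate task defects from genuine model failures.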
What our research partners say
"Abundant's format was the easiest for us to ingest. When we hit limitations with our training infrastructure, they quickly adapted."
Product Leadership, Google DeepMind
"Turned around high-quality data super fast — sometimes within 24 hours. Other vendors took months to finalize spec. Abundant did it in weeks."
Founder/CEO, Firecrawl
"We showed Abundant's annotation platform to other teams, and everyone immediately wanted access. What started as a pilot for my team became the standard across our research org."
Research Scientist, Adobe Research
Ready to get started?
500 tasks available immediately. Custom builds in weeks.
Request a sample Book a call →
By the numbers
We move faster than anyone else.
24h
Sample turnaround: first sample delivered within 24 hours
500
Tasks available today: SWE-bench + Terminal Bench, immediate delivery
100×
Yield vs. automated collection: ~10 PRs per repo automated → 100s per repo with Abundant
2 wks
To spec finalization: "Other vendors took months to finalize spec. Abundant did so within weeks." (Data Ops Manager)
We break our own tests before you do.
100%
of tasks pass human review, oracle validation, and reward-hack stress testing before delivery. Nothing leaves until it's verified end to end.
30%
Frontier model pass rate calibration: calibrated to be hard but solvable, neither trivial nor impossible
1:8
Reviewer to task creator ratio: dedicated QA on every task batch
Every
task is adversarially tested before delivery
We're the team you want to work with.
Founder
Jesse Hu
CEO
Ex-Waymo ML Engineer
TL for Data Quality & Evals
Co-author on Terminal Bench
Co-founder
Ke Huang
CTO
Google Assistant Eval Lead
GDM Trust & Safety
Brex Compliance & Ops Lead
Rishi Desai
Research Lead
Coding agents since 2023
Creator of SWE-Gen
Contributor to Terminal Bench
Meji Abidoye
Co-founder, Data Quality Lead
Ex-AWS Infrastructure Lead
Early contributor to Terminal Bench and Harbor