An ultra-long-horizon benchmark designed to evaluate coding agents on realistic, high-complexity software engineering tasks.
Join the RALPHBench GitHub repo and review the task guidelines.
Tasks follow the Harbor task framework (from the creators of Terminal-Bench) and include automated evaluation plus a full reference solution.
Submit your task as a PR. One approved task earns co-authorship on the NeurIPS 2026 paper.