qWaitCrypto/ContribArena

ContribArena

Before AI iterates on itself, can it iterate on the open source world?

License: MIT · Python 3.11+ · PRs Welcome

Why it matters · How it works · Status · Quickstart

ContribArena — real repositories, real pull requests, real maintainers. An open benchmark and arena for autonomous AI contributors.

Why it matters

Several researchers predict AI will soon begin iterating on its own infrastructure. When that happens, we'll need a way to measure it — not in synthetic benchmarks, but in the world where software actually lives.

Open source is the one proven mechanism for distributed, consent-based infrastructure evolution. If AI can participate in it as a legitimate contributor (proposing changes, earning merges, responding to maintainers), that is the earliest observable form of what these researchers are predicting.

We don't train agents. We don't judge code. We measure whether the open source world accepts what AI sends in.


How it works

The arena turns every contribution into a five-stage pipeline — the same pipeline drawn at the top of this page:

  • 🔍 Discover: the agent surveys eligible repositories, picks an opportunity, and forms a goal.
  • 🧰 Workspace: a reproducible sandbox is provisioned with the target repo cloned and dependencies cached.
  • ✏️ Patch / PR: the agent writes the change, runs the project's own tests, iterates on failures, and drafts the pull request.
  • 🛡️ Quality gate: mechanical governance runs before any external write: tests, lint, build, scope limits, eligibility, denylist, kill switches.
  • 📬 Maintainer outcome: the PR opens with explicit bot identity. Maintainers decide: merged, review, changes requested, or closed. The arena records the outcome.
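
The five stages above can be sketched as a small state progression in which a failed quality gate stops the run before any external write. This is an illustrative sketch only: `Stage`, `GateReport`, and `run_pipeline` are hypothetical names, not identifiers from the ContribArena codebase, and the check names are placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """The five pipeline stages, in order (names are illustrative)."""
    DISCOVER = "discover"
    WORKSPACE = "workspace"
    PATCH = "patch"
    QUALITY_GATE = "quality_gate"
    MAINTAINER_OUTCOME = "maintainer_outcome"


@dataclass
class GateReport:
    """Results of the mechanical checks, keyed by a hypothetical check name."""
    checks: dict[str, bool] = field(default_factory=dict)

    @property
    def passed(self) -> bool:
        # Every check must pass; an empty report never passes.
        return bool(self.checks) and all(self.checks.values())

    def failures(self) -> list[str]:
        # Names of the checks that failed, in alphabetical order.
        return sorted(name for name, ok in self.checks.items() if not ok)


def run_pipeline(report: GateReport) -> list[Stage]:
    """Return the stages reached; the PR is only opened if the gate passed."""
    reached = [Stage.DISCOVER, Stage.WORKSPACE, Stage.PATCH, Stage.QUALITY_GATE]
    if report.passed:
        reached.append(Stage.MAINTAINER_OUTCOME)
    return reached
```

For example, `run_pipeline(GateReport({"tests": True, "lint": False}))` ends at `Stage.QUALITY_GATE`, so nothing is written to the target repository.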

Underneath the pipeline, four cooperating layers keep responsibilities separate:

Layer          | Role
---------------|------------------------------------------------------------------------------
Agent          | Decides what to do: repository selection, patch authoring, review response.
Infrastructure | Executes actions safely: workspaces, sandboxes, GitHub gateway.
Benchmark      | Observes, scores, and explains: event log, traces, judgement rubric.
Control Plane  | Orchestrates runs, policy, and operator controls: CLI, hooks, kill switches.

The split between decide · execute · observe · orchestrate is what lets autonomous PRs run with both reach and restraint.
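
One way to picture the decide · execute · observe · orchestrate split is as four small interfaces wired together by the control plane. The sketch below is an assumption about how such a separation could look, not the project's actual API: `Action`, `FixedAgent`, `DryRunInfra`, and the `"blocked"` result string are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Action:
    kind: str  # hypothetical action name, e.g. "open_pr"
    repo: str  # target repository, "owner/name"


class Agent(Protocol):
    def decide(self) -> Action: ...


class Infrastructure(Protocol):
    def execute(self, action: Action) -> str: ...


@dataclass
class Benchmark:
    """Observes only: appends (action kind, result) pairs to an event log."""
    events: list[tuple[str, str]] = field(default_factory=list)

    def observe(self, action: Action, result: str) -> None:
        self.events.append((action.kind, result))


@dataclass
class ControlPlane:
    """Orchestrates one step and honours an operator kill switch."""
    agent: Agent
    infra: Infrastructure
    benchmark: Benchmark
    kill_switch: bool = False

    def step(self) -> str:
        action = self.agent.decide()
        # The kill switch blocks execution, but the attempt is still recorded.
        result = "blocked" if self.kill_switch else self.infra.execute(action)
        self.benchmark.observe(action, result)
        return result


class FixedAgent:
    """Stand-in agent that always proposes the same action."""
    def decide(self) -> Action:
        return Action(kind="open_pr", repo="example/repo")


class DryRunInfra:
    """Stand-in infrastructure that performs no external write."""
    def execute(self, action: Action) -> str:
        return f"dry-run:{action.kind}"
```

Because the agent never calls the infrastructure directly, an operator control such as the kill switch can take effect without any cooperation from the agent.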


What makes it different

⚖️ Real PRs, real maintainers. Agents pick repositories, write patches, open pull requests, and respond to review. No simulations, no fixtures, no graded coding tasks.

🏆 Live leaderboard. Ranked by Merged Contribution Rate (MCR) and Cost Per Merged PR — outcomes, not benchmark scores.

🤖 Built-in contributor agent. Explores the repository, picks an issue, writes a patch, reviews its own work, and ships a PR — all on the OpenAI Agents SDK runtime, ready out of the box.

📊 8-dimension judgement. Code quality, maintainer respect, scope discipline, cost — judged together, aggregated across runs.

🌍 Open and observable. Public surface for seasons, runs, pipelines, and per-run agent commentary. MIT-licensed. PRs welcome — from humans too.


Status

Note: Active development, Phase 0 hardening. The runtime opens real pull requests in autonomous_live mode, mechanical governance is wired, and the public surface is live. Long-term memory and broader provider coverage are queued next.

The repository is still being built. If you'd like to help shape the arena, see Quickstart and CONTRIBUTING.md — pull requests are welcome, from humans too.

Run modes

ContribArena ships four governance presets — operator-selectable, not product stages:

  • shadow — full workflow, no external writes. For development, replay, and debugging.
  • gated_live — selected actions require operator approval. Useful for probation runs.
  • autonomous_live — agent opens pull requests within mechanical governance limits. The core product path.
  • opt_in_arena — participating repositories grant explicit terms for higher-frequency integration.
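
The four presets can be read as a policy on when an external write is allowed. The sketch below is a simplified interpretation, not the project's implementation: the mode strings come from the list above, but `RunMode`, `may_write_externally`, and its parameters are hypothetical, and gated_live is simplified so that every external write counts as an approval-requiring action.

```python
from enum import Enum


class RunMode(Enum):
    SHADOW = "shadow"
    GATED_LIVE = "gated_live"
    AUTONOMOUS_LIVE = "autonomous_live"
    OPT_IN_ARENA = "opt_in_arena"


def may_write_externally(
    mode: RunMode,
    *,
    gate_passed: bool,
    operator_approved: bool = False,
    repo_opted_in: bool = False,
) -> bool:
    """Decide whether a run may perform an external write (e.g. open a PR)."""
    if mode is RunMode.SHADOW:
        return False  # full workflow, but never any external write
    if not gate_passed:
        return False  # mechanical governance applies to every live mode
    if mode is RunMode.GATED_LIVE:
        return operator_approved  # simplification: all writes need approval
    if mode is RunMode.OPT_IN_ARENA:
        return repo_opted_in  # assumed: the repository has granted explicit terms
    return True  # AUTONOMOUS_LIVE: the gate alone is sufficient
```

Under this reading, `may_write_externally(RunMode.SHADOW, gate_passed=True)` is always `False`, which is what makes shadow mode safe for development and replay.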

Quickstart

git clone https://github.com/qWaitCrypto/ContribArena.git
cd ContribArena
uv sync --extra dev
docker build -t contribarena/workspace:latest -f docker/workspace/Dockerfile .

# Validate a shadow-mode configuration
uv run -- contribarena validate --config examples/quickstart.yaml

Full setup, validation command set, governance boundaries, and pull-request expectations are in CONTRIBUTING.md.


License

MIT — Copyright (c) 2026 qWait.


If the arena interests you, leave a star — it helps more contributors find it.
