AgentPlane BitGN Adapter

Adapter scaffold for running AgentPlane-backed Codex execution against BitGN benchmarks.

AgentPlane is not submitted as a model. It is used as a control-plane profile around an executor:

benchmark runtime: BitGN PCM or ECOM
executor: Codex CLI
control layer: policy, step loop, proof bundle, score-detail capture
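
As a rough illustration of how these three pieces fit together, the sketch below runs a bounded step loop around an executor and records every command and observation. Everything in it is hypothetical: `run_trial`, `allowed_tools`, the command shape, and the stub executor and runtime are assumptions made for illustration, not the adapter's actual API.

```python
import json
from typing import Callable, Optional

# Hypothetical shapes: a command is a JSON-serialisable dict such as
# {"tool": "read", "args": {...}}; None means the executor is finished.
Command = dict
Executor = Callable[[list], Optional[Command]]
Runtime = Callable[[Command], str]


def run_trial(executor: Executor, runtime: Runtime,
              allowed_tools: set, max_steps: int = 8) -> dict:
    """Run a bounded step loop and return a proof-style record."""
    steps = []
    status = "step_limit_reached"
    for _ in range(max_steps):
        command = executor(steps)
        if command is None:
            status = "completed"
            break
        if command.get("tool") not in allowed_tools:
            # A policy refusal is recorded as evidence, not silently dropped.
            steps.append({"command": command, "observation": "refused by policy"})
            status = "policy_violation"
            break
        steps.append({"command": command, "observation": runtime(command)})
    return {"status": status, "steps": steps}


def stub_executor(history: list) -> Optional[Command]:
    """Stand-in for the real executor: request one read, then stop."""
    if not history:
        return {"tool": "read", "args": {"path": "AGENTS.md"}}
    return None


def stub_runtime(command: Command) -> str:
    """Stand-in for the benchmark runtime."""
    return f"ok: {command['tool']}"


result = run_trial(stub_executor, stub_runtime, allowed_tools={"read"})
print(json.dumps(result, indent=2))
```

The loop always ends in an explicit status, so a trial that runs out of steps or hits a policy refusal still leaves evidence behind.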

Status

Experimental. Current proven coverage is deliberately narrow:

bitgn/sandbox t01: pass, score 1.00.
bitgn/pac1-dev t01: pass, score 1.00.
bitgn/ecom1-dev t01: pass, score 1.00.

No PAC1 or ECOM1 task other than t01 passes in current evidence. Treat every other task as failing or unsupported until a live run proves otherwise. This repo is not leaderboard-ready.

Why this exists

BitGN evaluates observable agent behavior: runtime tool calls, files, task state, side effects, outcome codes, compliance, and security posture. That is the same surface where AgentPlane can add value: bounded policy, traceability, explicit outcomes, and failure evidence.

The near-term goal is not "AgentPlane beats everyone". The useful public claim is narrower:

AgentPlane can wrap a strong executor, preserve BitGN benchmark validity, and produce auditable evidence for why trials passed or failed.

Install

make sync

Install BitGN SDK dependencies from the same Buf registry used by the upstream samples:

make sync-bitgn

The SDK currently tracks Python 3.14 in the sample agents, so the Make targets create a Python 3.14 uv environment.

Authentication

Codex can use ChatGPT subscription auth:

codex login
codex login status

That path is useful for local smoke runs because the adapter invokes codex exec. For reproducible public runs, API-key auth is still cleaner because it is easier to document and to recreate in CI or on another machine.

Official BitGN runs still need a BitGN Platform key:

export BITGN_API_KEY="..."

PAC1 smoke

cp .env.example .env.local
$EDITOR .env.local
make oauth
make sandbox

scripts/bitgn_smoke.sh loads .env and then .env.local; keep secrets in one of those ignored files, not in committed config.
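
The load order matters when both files define the same key. The sketch below is a minimal illustration, assuming the common convention that the later file (.env.local) overrides the earlier one. The real scripts/bitgn_smoke.sh is a shell script and may behave differently; `load_env_files` is a hypothetical helper, not part of the adapter.

```python
import tempfile
from pathlib import Path


def load_env_files(paths):
    """Parse KEY=VALUE files in order; later files override earlier ones."""
    values = {}
    for path in paths:
        path = Path(path)
        if not path.exists():
            continue  # a missing file is skipped, not treated as an error
        for line in path.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"')
    return values


# Demonstration with throwaway files; the values are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / ".env"
    local = Path(tmp) / ".env.local"
    base.write_text('BENCHMARK_ID=bitgn/sandbox\nBITGN_API_KEY="placeholder"\n')
    local.write_text("BENCHMARK_ID=bitgn/pac1-dev\n")
    env = load_env_files([base, local])

print(env)
```

Under that assumption, a key set only in .env survives, while a key set in both files takes its value from .env.local.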

Sandbox is the first end-to-end check because it does not require a BitGN Platform key. PAC1 is the next check:

make pac1

ECOM smoke

Set:

BENCHMARK_ID=bitgn/ecom1-dev
BITGN_RUNTIME=ecom

Then run a single task:

make ecom
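
A pre-flight check along these lines could catch a mismatched benchmark and runtime before a trial starts. This is a sketch under stated assumptions: `check_config` is a hypothetical helper, and the rule that an ECOM benchmark id pairs with BITGN_RUNTIME=ecom is inferred from the two settings above rather than taken from the adapter.

```python
def check_config(env):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    benchmark = env.get("BENCHMARK_ID", "")
    runtime = env.get("BITGN_RUNTIME", "")
    if not benchmark:
        problems.append("BENCHMARK_ID is not set")
    if not runtime:
        problems.append("BITGN_RUNTIME is not set")
    # Assumed pairing: ECOM benchmarks run on the ecom runtime.
    if "ecom" in benchmark and runtime and runtime != "ecom":
        problems.append(f"{benchmark} expects BITGN_RUNTIME=ecom, got {runtime}")
    return problems


print(check_config({"BENCHMARK_ID": "bitgn/ecom1-dev", "BITGN_RUNTIME": "ecom"}))
print(check_config({"BENCHMARK_ID": "bitgn/ecom1-dev"}))
```

Returning a list instead of raising on the first problem lets a run report every misconfiguration at once.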

Proof bundle

Each trial writes:

.agentplane-bitgn/<benchmark-id>/<runtime>/<task-id>/<trial-id>/
  AGENTS.md
  proof.json

The proof bundle captures:

benchmark id and runtime
model id
task id and trial id
each JSON tool command requested by Codex
runtime observations, truncated for readability
final status
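
To make the captured fields concrete, here is a sketch of a record that could be serialised to proof.json in the layout shown above. The class name, the field names, the truncation limit, and the JSON shape are all assumptions for illustration; the adapter's real schema is not documented here.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path

MAX_OBSERVATION_CHARS = 200  # assumed limit; the real cutoff is not documented


def truncate(text, limit=MAX_OBSERVATION_CHARS):
    """Shorten long observations and mark how much was cut."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"... [truncated {len(text) - limit} chars]"


@dataclass
class Proof:
    benchmark_id: str
    runtime: str
    model_id: str
    task_id: str
    trial_id: str
    steps: list = field(default_factory=list)
    final_status: str = "unknown"

    def record(self, command, observation):
        """Store one tool command with its truncated observation."""
        self.steps.append({"command": command, "observation": truncate(observation)})

    def write(self, root):
        # Mirrors <benchmark-id>/<runtime>/<task-id>/<trial-id>/proof.json.
        # A benchmark id containing "/" becomes nested directories here;
        # the adapter may handle that differently.
        out_dir = Path(root) / self.benchmark_id / self.runtime / self.task_id / self.trial_id
        out_dir.mkdir(parents=True, exist_ok=True)
        out_path = out_dir / "proof.json"
        out_path.write_text(json.dumps(asdict(self), indent=2))
        return out_path


# Placeholder model id and trial id; the other values appear in this README.
proof = Proof("bitgn/ecom1-dev", "ecom", "example-model", "t01", "trial-0001")
proof.record({"tool": "read", "args": {"path": "AGENTS.md"}}, "x" * 500)
proof.final_status = "pass"

with tempfile.TemporaryDirectory() as tmp:
    path = proof.write(Path(tmp) / ".agentplane-bitgn")
    saved = json.loads(path.read_text())

print(saved["final_status"], len(saved["steps"]))
```

Marking how many characters were cut keeps the file readable while making it clear that an observation is incomplete.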

Documentation

Leaderboard realism

PAC1 live already has multiple 104/104 runs. A naive scaffold is unlikely to stand out there. The best AgentPlane path is:

Use PAC1 DEV to harden outcome selection, grounding refs, structured writes, and injection refusal.
Mine score_detail into regression cases.
Move to ECOM1, where policy books, payment state, SQL, fraud controls, and audit trails are closer to AgentPlane's control-plane strengths.
Publish a proof-backed run rather than only a score screenshot.

Integrity rules

Do not:

fetch benchmark solutions from the internet;
inspect hidden graders or oracle solutions;
alter BitGN scoring, task sets, or runtime contracts;
inject task-specific hints into the adapter policy;
claim leaderboard readiness without a reproducible run id and proof bundle.
