Commit ac7a7bc

fix: add gh repo for public txrca bench data
1 parent 6ba85ff commit ac7a7bc

1 file changed

Lines changed: 3 additions & 1 deletion

src/content/posts/txrca-bench/index.md

@@ -328,4 +328,6 @@ Several extensions are natural follow-ups:
 
 We are releasing the TxRCA-Bench benchmark data publicly — the 70 annotated exploit transactions with ground-truth root cause labels, the per-case workspaces (raw traces, event logs, contracts, ABIs, Solidity sources), all 490 raw agent outputs with both judges' scores, and the JSON output schema. Anyone should be able to re-score outputs, test their own agent against the same evidence, or extend the benchmark with new cases.
 
-The agent runtime and scoring harness code is not being released at this time. However, because the underlying on-chain data is immutable and the benchmark is defined purely in terms of `(transaction_hash, chain_id)` plus a ground-truth label, the benchmark is trivially reproducible against any new agent: given the inputs, any agent can be run in any runtime, and its output scored against the same rubric.
+The agent runtime and scoring harness code is not being released at this time. However, because the underlying on-chain data is immutable and the benchmark is defined purely in terms of `(transaction_hash, chain_id)` plus a ground-truth label, the benchmark is trivially reproducible against any new agent: given the inputs, any agent can be run in any runtime, and its output scored against the same rubric.
+
+::github{repo="sahuang/txrca-bench"}
