Add AWB to 8.2 Benchmarks > Integrated Benchmarks by xmpuspus · Pull Request #260 · codefuse-ai/Awesome-Code-LLM

xmpuspus · 2026-05-24T01:22:36Z

Adds AWB (AI Workflow Benchmark) to section 8.2 Benchmarks > Integrated Benchmarks.

AWB is an open-source benchmark suite that evaluates AI coding workflows on 100 tasks across 8 categories (bug-fix, feature-addition, refactoring, code-review, debugging, multi-file, legacy-code, workflow) using real OSS repositories pinned at commit SHAs — not synthetic snippets.

Repo: https://github.com/xmpuspus/ai-workflow-benchmark
Methodology: https://github.com/xmpuspus/ai-workflow-benchmark/blob/main/METHODOLOGY.md
9 tool adapters: Claude Code (vanilla + custom), Cursor, Aider, Gemini CLI, Codex CLI, Windsurf, Copilot, Pi
Composite score across 7 capability dimensions plus derived cost_discipline
v1.2.0 adds OpenTelemetry-aligned trace artifacts and a Production Readiness Score
MIT licensed, Python 3.11+, pip install awb
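
As a rough illustration of what "real OSS repositories pinned at commit SHAs" can mean in practice, here is a minimal sketch of a task record that refuses anything but a full 40-character SHA-1 and produces the git commands for a reproducible checkout. This is not AWB's actual schema or API: `PinnedTask`, its fields, and `checkout_commands` are hypothetical names invented for this example. Only the eight category names are taken from the description above.

```python
import re
from dataclasses import dataclass

# Category names are from the PR description; everything else here is hypothetical.
CATEGORIES = frozenset({
    "bug-fix", "feature-addition", "refactoring", "code-review",
    "debugging", "multi-file", "legacy-code", "workflow",
})

# A full SHA-1 object name: 40 lowercase hex characters.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


@dataclass(frozen=True)
class PinnedTask:
    """Hypothetical task record: one repository frozen at one commit."""
    task_id: str
    category: str
    repo_url: str
    commit_sha: str

    def __post_init__(self) -> None:
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        # Reject branch names, tags, and abbreviated SHAs, since those can
        # move or become ambiguous and would break reproducibility.
        if not _FULL_SHA.fullmatch(self.commit_sha):
            raise ValueError(f"not a full commit SHA: {self.commit_sha!r}")

    def checkout_commands(self, workdir: str) -> list[list[str]]:
        """Return argv lists that clone the repo and detach at the pinned commit."""
        return [
            ["git", "clone", self.repo_url, workdir],
            ["git", "-C", workdir, "checkout", "--detach", self.commit_sha],
        ]
```

The commands are returned as argument lists rather than executed, so a caller can pass each one to `subprocess.run(cmd, check=True)` or simply log it. Pinning to a full SHA rather than a branch means every tool adapter sees the same source tree.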

Inserted in chronological order at [2026-04], immediately after OmniCode [2026-02].

AWB (AI Workflow Benchmark) evaluates AI coding workflows on 100 tasks across 8 categories using real OSS repositories pinned at commit SHAs. Scored across 7 capability dimensions; ships 9 adapters (Claude Code, Cursor, Aider, Gemini CLI, Codex CLI, Windsurf, Copilot, Pi).

xmpuspus wants to merge 1 commit into codefuse-ai:main from xmpuspus:add-awb-benchmark
