Commit 3529759
Decouple runner onboarding (#45)
* Decouple new-runner onboarding from shared files
Adding a runner used to require touching at least three shared files —
README.md, meta.schema.json, and collect_env.py — even when the work
was confined to a single accelerator family. This PR rewires those
touch points so contributors normally only edit files inside their
own runner folder.
What changed:
* README platforms matrix is now auto-generated from each runner's
meta.json (new optional suite_support / hardware_label fields).
README.md carries marker comments and tools/generate_platforms_matrix.py
splices the table in; CI can call --check to fail PRs that get out of
sync.
* meta.schema.json no longer hard-codes the set of accelerator
platforms. The platform field is now validated by a lowercase regex,
and the curated catalogue lives in schema/platforms.json — purely for
presentation (display label, sort order). validate_runners.py prints
a non-fatal warning when it meets an uncatalogued platform.
* collect_env.py is split into a thin orchestrator plus one
self-contained plug-in per accelerator family under runners/platforms/
(nvidia, amd, ascend, apple, google, moorethreads). Plug-ins are
auto-discovered; adding a new accelerator only requires dropping a
single file in that directory. env_info.json now carries an
accelerator_platform field identifying the active plug-in.
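A minimal sketch of how plug-in auto-discovery along these lines could work. The protocol shown here (a module-level `PLATFORM` string plus `detect()` and `collect()` functions) and the helper names are assumptions for illustration; the actual protocol is the one documented in runners/README.md and may differ.

```python
import importlib.util
from pathlib import Path


def discover_plugins(plugin_dir):
    """Load every non-private *.py file in plugin_dir as a plug-in module."""
    plugins = {}
    for path in sorted(Path(plugin_dir).glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helpers
        spec = importlib.util.spec_from_file_location(f"_plugin_{path.stem}", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Hypothetical protocol: PLATFORM (str), detect() -> bool, collect() -> dict.
        if all(hasattr(module, name) for name in ("PLATFORM", "detect", "collect")):
            plugins[module.PLATFORM] = module
    return plugins


def collect_env(plugin_dir):
    """Return env info from the first plug-in whose detect() succeeds."""
    for platform, module in discover_plugins(plugin_dir).items():
        if module.detect():
            info = dict(module.collect())
            # Record which plug-in produced the data.
            info["accelerator_platform"] = platform
            return info
    return {"accelerator_platform": None}
```

With a layout like this, adding an accelerator family means dropping one new file into the directory; the orchestrator itself does not change.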
Side effects worth flagging:
* The regenerated README matrix now includes the apple_mlx_lm and
nvidia_sglang_c43a8309 runners that had been missed in the
hand-maintained table.
* All 7 existing runners gained explicit suite_support entries; no
behaviour change, just self-description used by the generator.
* runners/README.md got a new "Adding a new accelerator family"
section that documents the plug-in protocol.
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore: drop nvidia_sglang_6da83845 and tighten .gitignore
* Remove the older SGLang runner (nvidia_sglang_6da83845, sglang 0.4.0
/ torch 2.5.1 / transformers 4.46.3). The newer nvidia_sglang_c43a8309
(sglang 0.5.6 / torch 2.9.1 / EAGLE speculative decoding) supersedes
it in practice. No results in this repo reference the old hash and
there are no external consumers (pre-open-source), so we delete the
folder rather than marking it deprecated_by; this is the last opportunity
to do so before the immutability rule kicks in.
* Expand .gitignore so the dozens of locally generated samples.jsonl
files under results/verified/** stop showing up as untracked, and
add the common IDE / lint / test-cache directories
(.idea/, .vscode/, .pytest_cache/, .mypy_cache/, .ruff_cache/,
.coverage*, .tox/) that contributors typically have.
Co-authored-by: Cursor <cursoragent@cursor.com>
* docs: open-source prep — community files, pyproject metadata, decoupled-flow walk-through
* Add CODE_OF_CONDUCT.md (Contributor Covenant 2.1) with a small
benchmark-specific addendum covering fabricated results and vendor
affiliation disclosure.
* Add SECURITY.md scoping the threat model (code that runs on
contributor machines + validator bypasses for fake leaderboard
entries) and pointing reporters at GitHub private security
advisories instead of public issues.
* Flesh out pyproject.toml with authors, maintainers, keywords,
Trove classifiers (license, audience, Python 3.10–3.12, platforms),
and the full set of project.urls (Homepage, Leaderboard,
Documentation, Repository, Issues, Changelog) so it renders nicely
on PyPI once we cut a release.
* Rewrite the 'Adding support for a new platform' section of
CONTRIBUTING.md to match the decoupled onboarding flow that landed
in the previous commit: a new runner on an existing platform no
longer needs to touch any shared file, and a brand-new accelerator
family only needs a single self-contained plug-in under
runners/platforms/. The section is renamed 'Adding a new runner' to
reflect what most contributors actually do, with a clearly marked
sub-section for the rarer 'new accelerator family' case.
* Repoint two README.md links that pointed at the old
'#adding-support-for-a-new-platform' anchor.
No behavioural changes to the framework or runners.
Co-authored-by: Cursor <cursoragent@cursor.com>
* ci(validate_pr): enforce README matrix sync and full-repo runner validation
* Run runners/validate_runners.py over **every** runner folder in the
repo (not just the ones touched in the current PR). This catches
drift introduced by shared changes — e.g. a meta.schema.json edit
that accidentally breaks an unrelated existing runner.
* Run tools/generate_platforms_matrix.py --check on every triggering
PR. The README 'Supported platforms' matrix is auto-generated from
each runner's meta.json; if a PR changes a runner's suite_support /
hardware_label or adds a new runner without regenerating the table,
the job now fails with a clear instruction to regenerate locally
and commit the result.
* Expand the workflow's paths trigger to cover the README, the
platforms catalogue (schema/platforms.json), and the generator
itself, so the matrix-sync check actually runs when those files
are modified.
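The marker-splice and `--check` idea can be sketched as follows. This is an illustration only: the marker strings, the `id` key, the table columns, and the assumption that `suite_support` is a list of suite names are all hypothetical and not taken from tools/generate_platforms_matrix.py.

```python
START = "<!-- PLATFORMS_MATRIX:START -->"  # hypothetical marker text
END = "<!-- PLATFORMS_MATRIX:END -->"      # hypothetical marker text


def render_matrix(runners):
    """Render a Markdown table from a list of meta.json-like dicts."""
    rows = ["| Runner | Hardware | Suites |", "|---|---|---|"]
    for meta in sorted(runners, key=lambda m: m["id"]):
        suites = ", ".join(meta.get("suite_support", [])) or "n/a"
        rows.append(f"| {meta['id']} | {meta.get('hardware_label', 'n/a')} | {suites} |")
    return "\n".join(rows)


def splice(readme_text, table):
    """Replace whatever sits between the markers with the new table."""
    head, found_start, rest = readme_text.partition(START)
    _, found_end, tail = rest.partition(END)
    if not found_start or not found_end:
        raise ValueError("README is missing the matrix markers")
    return f"{head}{START}\n{table}\n{END}{tail}"


def is_in_sync(readme_text, runners):
    """The --check idea: regenerate in memory and compare, never write."""
    return splice(readme_text, render_matrix(runners)) == readme_text
```

A CI step built on this would exit non-zero when `is_in_sync` returns False, which is what lets the job fail a PR that changed runner metadata without regenerating the table.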
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore: prune dead artefacts (utils/ duplicate, orphan runner_config examples)
Pre-1.0 cleanup before open-sourcing:
* `utils/run_all_{4,8}gpu.sh` — older duplicates of the same-named
scripts under `examples/`. Nothing references them; drop the folder.
* `configs/runner_configs/runner_*_{523da458,605db33a,9f42fabb}.yaml.example`
— stale templates whose runner folders were superseded by the current
hash IDs (`6c18cd8f`, `d4aa9fda`, `c43a8309`). Each surviving runner
has its own up-to-date `*.yaml.example` companion.
No code path or doc references any of these, so this is a pure delete.
Co-authored-by: Cursor <cursoragent@cursor.com>
* docs: open-source polish — logo, PR-first flow, community-facing language
Round of pre-1.0 documentation work driven by the question "what would a
first-time contributor see, and does it look maintained or maintainer-run?".
Visual identity
* New SVG mark + wordmark: a lightning bolt crossing a speedometer arc.
Lives under `docs/assets/` and renders via `<picture>` with separate
light/dark variants, selected by the `prefers-color-scheme` media query
that GitHub READMEs honour.
* The README header now uses the wordmark instead of an emoji + H1.
README slimming
* Drop the full `Repository structure` tree from the top page. Mature
projects (PyTorch, vLLM, llama.cpp) don't ship the tree on the front
door; the trimmed copy in DEVELOPMENT.md is enough for spelunkers.
* Quick-start step 4 is now "open a pull request" with `gh pr create`.
The issue-bot path is kept verbatim as a one-line escape hatch for
people who don't want to touch git.
* New top-level links to Discussions for Q&A and to `openclaw_skill/`
for the optional voice-driven launcher (clearly marked optional).
* Citation now credits "The AccelMark Contributors" alongside the
original author.
CONTRIBUTING rewrite of the submission flow
* The whole "Submitting your results" section was rewritten under a new
`## Submitting a result` anchor (referenced from README). PR path is
primary, with the bot-drafted-PR path as the no-git fallback.
* New paragraph documents the `configs/runner_configs/runner_<id>.yaml`
gitignore policy explicitly — only the `*.yaml.example` companions
ship; the live override file is strictly local.
* Verified-tier definition rephrased: it is hardware reproducibility,
not a maintainer privilege. Anyone with the same chip + runner can
open a reproduction PR and bump a community result to verified.
Community-facing language cleanup
* `results/README.md`, `suites/README.md`, `DEVELOPMENT.md`, and
`CONTRIBUTING.md` no longer describe verification / flagging /
suite-acceptance as maintainer-gated. They read as community
workflows that anyone can drive.
* Time SLAs ("within a day or two") and "maintainer reviews" copy
removed from the contribution path so the doc doesn't make promises
that depend on a single person.
`CODE_OF_CONDUCT.md` and `SECURITY.md` still mention maintainers
intentionally — those documents need a clear enforcement contact and
that's expected of any open-source repo.
Co-authored-by: Cursor <cursoragent@cursor.com>
* docs+leaderboard: tighten logo alignment, brand the site, label runners
Round-trip feedback from rendering the new README header in light mode:
* The icon was visually drifting below the wordmark because the SVG was
packing both "AccelMark" and a tagline into the same image, forcing
the icon to balance two text lines.
* The leaderboard site still used a bare emoji and had no favicon, so
there was no continuity between the README and the public site.
* When two runners share the same `framework` string (e.g. `vLLM` ships
both the stable runner and a future `vllm-0.20` one), result cards
rendered as indistinguishable "Qwen2.5-0.5B-Instruct · vLLM · BF16"
rows even though the `framework_version` field already disambiguates.
Logo + README
* `docs/assets/logo-wordmark{,-dark}.svg`: single-row mark of the form
`[icon] AccelMark`. ViewBox shrunk from 480×96 to 280×72 with the
icon's geometric centre put exactly on the cap-height midline of the
AccelMark glyphs. The "Cross-platform LLM inference benchmark"
tagline previously baked into the SVG is now a separate `<p>` under
the logo in README, so the brand mark stays compact and reusable.
* README rendering knob: `width="360"` (was 420) to fit the new aspect
ratio.
Leaderboard site branding
* New `leaderboard/site/favicon.svg` (copy of the standalone icon).
Registered via `<link rel="icon" type="image/svg+xml" …>` so the tab
picks it up immediately.
* `header h1` swapped the ⚡ emoji for the inline SVG mark, using a
dark-theme palette (#FCD34D bolt + #93C5FD gauge) that pops on the
#0d1117 background. Flex layout for vertical alignment between the
icon and the title.
Runner disambiguation on cards and tables
* Card layout (line 836): the framework field now reads
`${framework}${framework_version}`, e.g. `vLLM 0.5.5`. A `title=` on
the same span exposes `runner: <implementation_id>` on hover when the
user wants the precise hash.
* Table cell formatter (`formatFramework`): same inline version after
the framework name (rendered in a muted colour so the framework name
stays the dominant token), and `implementation_id` is added to the
hover tooltip alongside the existing version / script / notes lines.
Net effect for the open question raised in review: two vLLM runners on
the same hardware are now visually distinct without anyone editing the
runner's `_get_framework_name()` to fake a variant suffix.
Co-authored-by: Cursor <cursoragent@cursor.com>
* ci(generate_leaderboard): also redeploy when site or generator changes
Previously the leaderboard deploy workflow only fired on `results/**`
changes, so PRs that touched `leaderboard/site/index.html`,
`leaderboard/generate.py`, or platform metadata could land on main and
never reach the public site until somebody happened to merge a new
result.
Widen the `paths:` filter so any of these can trigger a redeploy:
* `leaderboard/**` — the static site and generator script
* `tools/generate_platforms_matrix.py` and `schema/platforms.json`
— the README platforms matrix inputs
(the workflow regenerates that too)
* `runners/*/meta.json` — runner metadata that the leaderboard
surfaces (framework, suite support,
hardware labels)
`workflow_dispatch` stays available as the escape hatch for forcing a
redeploy when nothing in the watched paths changed.
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore: drop pre-1.0 backward-compat shims and stale Suite C comments
All three removals were verified to have zero in-repo dependencies: every
suite.json and the rest of the codebase are already on the new format.
suite_C/suite.py — stale runner-backend gating
Eleven lines of commented-out code that gated each quantized format on
whether the runner declared the backend in SUPPORTED_QUANTIZATION_BACKENDS.
The strategy changed long ago: now we always send the format through and
let the inference engine report its own incompatibility (recorded in the
subprocess summary). The accompanying skip-reason `print` was updated to
match what actually causes the skip today (the *other* full-precision
baseline, e.g. FP16 on Ampere where the baseline is BF16).
benchmark_runner._parse_scenarios_config — flat-list legacy
Five lines that accepted suite.json with `"scenarios": ["accuracy", ...]`
instead of the documented `{"default": [...], "extra": [...]}`. All seven
suite.json files are on the dict form; flat-list was never documented for
external authors. Docstring and the DEVELOPMENT.md line referencing the
legacy form updated.
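A sketch of what dict-only parsing might look like once the flat-list branch is gone. The function below is a standalone stand-in, not the real `benchmark_runner._parse_scenarios_config`; treating both keys as optional and the `stress` scenario name in the usage line are assumptions.

```python
def parse_scenarios_config(suite):
    """Return (default, extra) scenario lists from a suite.json-like dict."""
    scenarios = suite.get("scenarios")
    if not isinstance(scenarios, dict):
        # The flat-list form ["accuracy", ...] is no longer accepted.
        raise ValueError(
            'suite.json "scenarios" must be an object of the form '
            '{"default": [...], "extra": [...]}'
        )
    # Assumption: a missing key means "no scenarios of that kind".
    default = list(scenarios.get("default", []))
    extra = list(scenarios.get("extra", []))
    return default, extra
```

For example, `parse_scenarios_config({"scenarios": {"default": ["accuracy"], "extra": ["stress"]}})` yields the two lists separately, while a flat list raises `ValueError`.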
benchmark_runner._resolve_requests_path — per-suite requests.jsonl fallback
Ten lines that fell back to `suites/<id>/requests.jsonl` when a suite had
no `dataset` key. Every suite.json now declares `dataset:` and points at
`datasets/<name>/requests.jsonl`; there is no `suites/*/requests.jsonl`
anywhere in the repo. The function now requires `dataset` and produces a
pointed error message if it's missing.
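The stricter resolution rule could look roughly like this. It is a hedged sketch rather than the real `benchmark_runner._resolve_requests_path`: the signature, the `id` key used in the error message, the exception type, and the assumption that `dataset` holds a bare folder name are illustrative.

```python
from pathlib import Path


def resolve_requests_path(suite, repo_root="."):
    """Map a suite's `dataset` key to datasets/<name>/requests.jsonl."""
    dataset = suite.get("dataset")
    if not dataset:
        # No fallback to suites/<id>/requests.jsonl any more: fail loudly.
        raise KeyError(
            f"suite {suite.get('id', '<unknown>')!r} has no 'dataset' key; "
            "declare one that names a folder under datasets/"
        )
    return Path(repo_root) / "datasets" / dataset / "requests.jsonl"
```

Failing early with a message that names the missing key points a suite author at the fix directly, where a silent fallback would send them hunting for a file that no longer exists.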
Kept on purpose
`/v1/completions` in `serve/server.py` and the README — that is OpenAI's
own legacy endpoint (still widely used by older LangChain/llama.cpp/etc.
clients), not an AccelMark-internal compat shim, so removing it would
narrow the audience of the drop-in OpenAI replacement we advertise.
Net: -28 lines, +13 lines of clearer code paths, no functional change.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent b4098f1 commit 3529759
47 files changed
Lines changed: 2431 additions & 1855 deletions
File tree
- .github/workflows
- configs/runner_configs
- docs/assets
- leaderboard/site
- results
- runners
- amd_vllm_rocm_6c18cd8f
- apple_mlx_lm_9546b8b5
- ascend_vllm_ascend_d4aa9fda
- google_vllm_tpu_68cc9ffa
- nvidia_sglang_6da83845
- nvidia_sglang_c43a8309
- nvidia_vllm_47f5d58e
- platforms
- schema
- suites
- suite_C
- tools
- utils