bench: refresh all benchmark suites at v0.5.908 (2026-05-14) #765
Merged
Full rerun of polyglot, JSON polyglot, honest_bench, and suite/ microbenchmarks on an otherwise-idle machine. Confirms that yesterday's v0.5.891 sweep (#745 follow-up) was dominated by parallel cargo-build contamination: σ on Perry compute cells dropped from 25-57 ms to 0.3-2.2 ms.

Key results:
- Compute polyglot matches v0.5.585 historical numbers within 1-4 ms across all 9 cells (default + --fast-math); fast-math cleanly reproduces 8× / 3.6× / 2.9× speedups on loop_overhead / math_intensive / accumulate.
- honest_bench: Perry slightly faster on all 3 workloads vs v0.5.891 (image_conv 365 → 354 ms; json_full 1155 → 1098 ms); 300/300 output-matched rows.
- #745 partial fix verification: JSON polyglot RSS dropped 254 → 227 MB roundtrip and 411 → 309 MB iterate after v0.5.900's GC trigger-ratchet fix. Residual ~150 MB gap vs v0.5.279 baseline flagged on the issue.
- suite/: method_calls back to 9 ms (yesterday's 25 ms was noise); closure/factorial regressions vs v0.5.173 persist as known follow-ups.

Docs refreshed: top-level README, benchmarks/README, polyglot RESULTS{,_AUTO,_OPT}.md, honest_bench REPORT.md (+ regenerated charts), json_polyglot RESULTS.md (auto), suite/results/RESULTS.md (new). All with 2026-05-14 / v0.5.908 datestamps and historical deltas vs v0.5.891 and v0.5.279.
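To put the before/after figures quoted above on a common scale, the relative changes can be worked out with a few lines of Python. This is only an illustrative sketch: the four number pairs are copied from the text above, while the `PAIRS` table and the `pct_change` helper are names made up for this example and are not part of the repository's scripts.

```python
# Before/after pairs copied from the PR text above (lower is better).
# PAIRS and pct_change are hypothetical names used only for this sketch.
PAIRS = {
    "honest_bench image_conv (ms)": (365, 354),
    "honest_bench json_full (ms)": (1155, 1098),
    "JSON polyglot roundtrip RSS (MB)": (254, 227),
    "JSON polyglot iterate RSS (MB)": (411, 309),
}


def pct_change(before: float, after: float) -> float:
    """Relative change in percent; negative means the metric went down."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for label, (before, after) in PAIRS.items():
        delta = pct_change(before, after)
        print(f"{label}: {before} -> {after} ({delta:+.1f}%)")
```

Under these figures the honest_bench timings move by roughly 3-5%, while the RSS drops are about 11% (roundtrip) and 25% (iterate).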
Summary
Full rerun of all four benchmark suites (compute polyglot, JSON polyglot, honest_bench, suite/) against Perry v0.5.908 on an otherwise-idle machine, plus a full doc refresh.
- A parallel cargo build (`issue-665-resolver-opt-in` worktree) was running through part of the 2026-05-13 v0.5.891 sweep, inflating σ on every Perry compute cell. Today's idle-machine numbers drop σ from 25-57 ms to 0.3-2.2 ms, and Perry compute medians land back within 1-4 ms of the v0.5.585 historical baseline across all 9 cells.
- … `benchmarks/suite/results/RESULTS.md`.

Diff highlights
- `README.md`, `benchmarks/README.md`, `benchmarks/polyglot/RESULTS{,_AUTO,_OPT}.md`, `benchmarks/honest_bench/REPORT.md`, `benchmarks/json_polyglot/RESULTS.md`: refreshed tables + prose at v0.5.908 / 2026-05-14
- `benchmarks/honest_bench/charts/*.png`, `results/results.json`, `results/metadata.json`, `results/summary.txt`: regenerated from this sweep
- `benchmarks/suite/results/RESULTS.md`: new file; the `suite/run_benchmarks.sh` runner doesn't write a permanent results file, so this committed one captures the v0.5.908 snapshot + delta tables vs v0.5.891 and v0.5.173 baselines

Test plan
- `benchmarks/polyglot/RESULTS_AUTO.md` body
- `PERRY_FAST_MATH=1` rerun → `RESULTS_AUTO.md` addendum
- `benchmarks/json_polyglot/RESULTS.md`
- `benchmarks/suite/results/RESULTS.md`
- `python3 scripts/report.py` regenerates `REPORT.md` from `results.json`
- `python3 scripts/plot.py` regenerates `charts/*.png`
- `_TBD_` placeholders from yesterday's partial sweep filled in
- `v0.5.891` / `2026-05-13`: confirmed remaining hits are intentional historical references

No code changes: pure docs + bench results refresh. CI's `cargo-test` / `parity` / `compile-smoke` / `api-docs-drift` / `security-audit` paths shouldn't have anything to verify here.

Maintainer note: per CLAUDE.md flow, no version bump or CHANGELOG entry in this branch; those go on at merge time.
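The last test-plan item (checking that any remaining `v0.5.891` / `2026-05-13` mentions are intentional) could be scripted along the following lines. This is a hypothetical sketch, not the check that was actually run for this PR: the `find_stale` and `scan` helpers are invented for illustration, and the `benchmarks` root and `*.md` glob are assumptions based on the paths listed above.

```python
import re
from pathlib import Path

# Version and date stamps of the previous sweep, as named in the test plan.
STALE = re.compile(r"v0\.5\.891|2026-05-13")


def find_stale(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for every line mentioning a stale stamp."""
    return [
        (n, line)
        for n, line in enumerate(text.splitlines(), start=1)
        if STALE.search(line)
    ]


def scan(root: Path, pattern: str = "*.md") -> dict[str, list[tuple[int, str]]]:
    """Collect stale-stamp hits for every matching file under root."""
    hits: dict[str, list[tuple[int, str]]] = {}
    for path in sorted(root.rglob(pattern)):
        found = find_stale(path.read_text(encoding="utf-8", errors="replace"))
        if found:
            hits[str(path)] = found
    return hits


if __name__ == "__main__":
    # Assumed root directory; adjust to wherever the refreshed docs live.
    for path, found in scan(Path("benchmarks")).items():
        for n, line in found:
            print(f"{path}:{n}: {line.strip()}")
```

Each printed hit still needs a human look, since historical delta tables are expected to keep citing the older version and date.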