-
Notifications
You must be signed in to change notification settings - Fork 217
chore: upgrade performance benchmark #2816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
47f9aa7
bdbb7d3
adfa2bf
c827c6f
3c94349
4d82cb9
96f1c13
cdd37ca
987e854
9599c52
9a61a65
86d0e6d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,6 +8,10 @@ on: | |
| pull_request: | ||
| branches: | ||
| - main | ||
| # `labeled` lets applying the `performance-benchmark` label trigger a rerun | ||
| # without needing an empty commit; the job-level `if` filters out unrelated | ||
| # label changes so we don't burn CI on every label add. | ||
| types: [opened, synchronize, reopened, labeled] | ||
|
|
||
| env: | ||
| CI: true | ||
|
|
@@ -16,6 +20,9 @@ env: | |
| jobs: | ||
| historical-versions: | ||
| runs-on: ubuntu-latest | ||
| # Skip the job when a `labeled` event fires for an unrelated label — | ||
| # opens/pushes/reopens always run; only the label firehose is filtered. | ||
| if: github.event.action != 'labeled' || github.event.label.name == 'performance-benchmark' | ||
| steps: | ||
| - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 | ||
| - name: Install pnpm for Rebilly/api-definitions | ||
|
|
@@ -33,8 +40,9 @@ jobs: | |
| run: npm i -g hyperfine | ||
|
|
||
| - name: Add more versions to test | ||
| # Run only on the release branch (changeset-release/main): | ||
| if: github.head_ref == 'changeset-release/main' | ||
| # Runs on the release PR (changeset-release/main) automatically, or on | ||
| # any PR carrying the `performance-benchmark` label for opt-in deep-dive. | ||
| if: github.head_ref == 'changeset-release/main' || contains(github.event.pull_request.labels.*.name, 'performance-benchmark') | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again, I'm not sure how useful is to run the historical version benchmarks on individual PRs. |
||
| run: | | ||
| cd tests/performance/ | ||
| cat package.json | jq ".dependencies = $(cat package.json | jq ._enhancedDependencies)" > package.json | ||
|
|
@@ -52,12 +60,16 @@ jobs: | |
| run: | | ||
| cd tests/performance/ | ||
| npm run test # This command is generated and injected into package.json in the previous step. | ||
| cat benchmark_check.md | ||
| cat benchmark_bundle.md benchmark_lint.md benchmark_check-config.md | ||
| npm run chart # Creates benchmark_chart.md with the performance bar chart. | ||
|
|
||
| - name: Comment PR | ||
| # If the PR carries the `performance-benchmark` label, each push posts a | ||
| # fresh comment (so variance across reruns can be compared in-thread); | ||
| # otherwise the comment is overwritten in place so normal PRs keep a | ||
| # single result. | ||
| if: ${{ github.event.pull_request.head.repo.full_name == github.repository }} | ||
| uses: thollander/actions-comment-pull-request@24bffb9b452ba05a4f3f77933840a6a841d1b32b # v3.0.1 | ||
| with: | ||
| file-path: tests/performance/benchmark_chart.md | ||
| comment-tag: historical-versions-comparison | ||
| comment-tag: ${{ contains(github.event.pull_request.labels.*.name, 'performance-benchmark') && format('historical-versions-rerun-{0}', github.run_id) || 'historical-versions-comparison' }} | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This must be needless if we remove the labeling approach. Am I right? |
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,39 +1,65 @@ | ||
| import fs from 'node:fs'; | ||
|
|
||
| const content = fs.readFileSync('benchmark_check.json', 'utf8'); | ||
| const json = JSON.parse(content); | ||
| const arr = json.results.map((r) => [ | ||
| r.command.replace(/^node node_modules\/([^/]+)\/.*/, (_, cliVersion) => cliVersion), | ||
| r.mean, | ||
| r.stddev, | ||
| ]); | ||
| const minMean = Math.min(...arr.map(([_, mean]) => mean)); | ||
|
|
||
| const constructBarForChart = (mean, min) => { | ||
| const median = (xs) => { | ||
| const sorted = [...xs].sort((a, b) => a - b); | ||
| const mid = Math.floor(sorted.length / 2); | ||
| return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2; | ||
| }; | ||
|
|
||
| const medianAbsoluteDeviation = (xs, centre) => median(xs.map((x) => Math.abs(x - centre))); | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Doesn't hyperfine calculate the deviation already, does it?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
|
|
||
| const constructBarForChart = (value, min) => { | ||
| if (min <= 0) return 'N/A'; | ||
| const slownessRatio = mean / min; | ||
| const slownessFactor = slownessRatio - 1; | ||
| const slownessFactor = value / min - 1; | ||
| const maxBarLength = 30; | ||
| const visualFactor = Math.min(1, slownessFactor); | ||
| const length = Math.floor(visualFactor * maxBarLength); | ||
| const length = Math.floor(Math.min(1, slownessFactor) * maxBarLength); | ||
| return '▓' + '▓'.repeat(length); | ||
| }; | ||
|
|
||
| const loadResults = (jsonPath) => { | ||
| const json = JSON.parse(fs.readFileSync(jsonPath, 'utf8')); | ||
| return new Map( | ||
| json.results.map((r) => { | ||
| const cliVersion = r.command.replace(/^node node_modules\/([^/]+)\/.*/, (_, v) => v); | ||
| return [cliVersion, { median: r.median, mad: medianAbsoluteDeviation(r.times, r.median) }]; | ||
| }) | ||
| ); | ||
| }; | ||
|
|
||
| const findFastest = (results) => | ||
| [...results.values()].reduce((best, r) => (r.median < best.median ? r : best)); | ||
|
|
||
| const renderCell = (entry, fastest) => { | ||
| const bar = constructBarForChart(entry.median, fastest.median); | ||
| const factor = entry.median / fastest.median; | ||
| if (entry === fastest) { | ||
| return `${bar} ${factor.toFixed(2)}x (Fastest)`; | ||
| } | ||
| const relativeUnc = | ||
| factor * Math.sqrt((entry.mad / entry.median) ** 2 + (fastest.mad / fastest.median) ** 2); | ||
| return `${bar} ${factor.toFixed(2)}x ± ${relativeUnc.toFixed(2)}`; | ||
| }; | ||
|
|
||
| const operations = [ | ||
| { name: 'Bundle', file: 'benchmark_bundle.json' }, | ||
| { name: 'Lint', file: 'benchmark_lint.json' }, | ||
| { name: 'Check Config', file: 'benchmark_check-config.json' }, | ||
| ]; | ||
|
|
||
| const columns = operations.map(({ name, file }) => { | ||
| const data = loadResults(file); | ||
| return { name, data, fastest: findFastest(data) }; | ||
| }); | ||
| const versions = [...columns[0].data.keys()]; | ||
|
|
||
| const output = [ | ||
| '| CLI Version | Mean Time ± Std Dev (s) | Relative Performance (Lower is Faster) |', | ||
| '|---|---|---|', | ||
| ...arr.map(([cliVersion, mean, stddev]) => { | ||
| const bar = constructBarForChart(mean, minMean); | ||
| const meanFormatted = mean.toFixed(3); | ||
| const stddevFormatted = stddev.toFixed(3); | ||
| const relativeSpeedFactor = (mean / minMean).toFixed(2); | ||
| const factorSuffix = mean === minMean ? 'x (Fastest)' : 'x'; | ||
|
|
||
| const timeWithStddev = `${meanFormatted}s ± ${stddevFormatted}s`; | ||
| const performanceDisplay = `${bar} ${relativeSpeedFactor}${factorSuffix}`; | ||
|
|
||
| return `| ${cliVersion} | ${timeWithStddev} | ${performanceDisplay} |`; | ||
| }), | ||
| '## Performance Benchmark', | ||
| '', | ||
| `| CLI Version | ${columns.map((c) => c.name).join(' | ')} |`, | ||
| `|---|${columns.map(() => '---').join('|')}|`, | ||
| ...versions.map( | ||
| (v) => `| ${v} | ${columns.map((c) => renderCell(c.data.get(v), c.fastest)).join(' | ')} |` | ||
| ), | ||
| ].join('\n'); | ||
|
|
||
| process.stdout.write(output); | ||

There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really think we need to rerun benchmarks too often. Most of the time the current flow is sufficient to see if there's a noticeable difference in perf. If you decide to leave the labeling though, please remove the comments as they explain a trivial case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes in workflow was introduced only for testing purposes for this PR, i will remove this after PR will be approved. I left note in the description of PR regarding it.