Skip to content

Commit 1b4fad6

Browse files
committed
Make --timing opt-in rather than opt-out (--no-timing).
1 parent 1142534 commit 1b4fad6

4 files changed

Lines changed: 53 additions & 40 deletions

File tree

perf/README.md

Lines changed: 22 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -23,20 +23,23 @@ at the bottom of this file.
2323
The command we reach for whenever CPU-profiling paged.js:
2424

2525
```
26-
node measure.mjs --no-timing --render-only --cpu-profile --cpu-sampling 100
26+
node measure.mjs --render-only --cpu-profile --cpu-sampling 100
2727
```
2828

29-
(`run.bat` forwards the same args.) The detach-pages handler is now
30-
injected by default, since it's the shipping fix and matching
31-
production is the right baseline for any profiling work. Pass
32-
`--no-detach-pages` if you specifically need the pre-fix O(n²)
33-
baseline for an A/B against the original quadratic.
29+
(`run.bat` forwards the same args.) Two defaults match what most
30+
profiling work needs:
31+
32+
- **detach-pages is on.** It's the shipping fix; matching production
33+
is the right baseline for any profiling work. Pass
34+
`--no-detach-pages` for an A/B against the original O(n²) quadratic.
35+
- **timing is off.** The `timing-handler.js` per-page `console.log`
36+
relay costs ~2 % of render self-time on the 1638-page book and
37+
muddies bottom-up profile tables. Pass `--timing` when you want the
38+
per-page CSV + first/last-quartile summary; otherwise `timing.csv`
39+
is empty and `summary.txt` says so.
3440

3541
Flag rationale:
3642

37-
- `--no-timing` -- skip the per-page `console.log` relay from
38-
`timing-handler.js`. The relay costs ~2 % of render self-time on
39-
the 1638-page book and muddies the bottom-up view.
4043
- `--render-only` -- bail out after `PagedPolyfill.preview()`
4144
returns. Skips meta extraction, `parseOutline`, `page.pdf`, and
4245
the pdf-lib roundtrip / incremental writer. ~47 s saved per run
@@ -66,7 +69,7 @@ The harness and core probes:
6669
| --- | --- |
6770
| `measure.mjs` | Puppeteer harness. Drives the same flow as `docs/render-book.mjs` (loads the vendored paged.js bundle, runs `PagedPolyfill.preview()`, calls `page.pdf()`, then either the pdf-lib roundtrip or the incremental writer), with optional CPU profiling, in-page handler injection, and DOM-accessor instrumentation. Auto-pins to a fixed core mask on Windows via `pin-cpu.mjs` (see below) for stable measurements; pass `--no-affinity` to opt out. |
6871
| `pin-cpu.mjs` | Shared shim used by `measure.mjs`, `profile-load.mjs`, `profile-roundtrip.mjs`, and `ab-css.mjs`. On Windows, auto-relaunches the parent Node process under `start /affinity 0x5500 /high` (cores 4-7 physical, thread 0 each, on an 8C16T AMD Ryzen 7) so puppeteer's Chromium children inherit the mask + priority at spawn time. Reduces single-run CPU sample-time variance from ~15-25 % on a stock dev box to ~3 %. No-op on non-Windows; opt out per-invocation with `--no-affinity` or `PERF_PINNED=1`; override mask with `PERF_AFFINITY=<hex>`. |
69-
| `timing-handler.js` | `Paged.Handler` that records per-page wall time + heap into `window.__pagedTiming` and streams a line per page to the console. Always injected. |
72+
| `timing-handler.js` | `Paged.Handler` that records per-page wall time + heap into `window.__pagedTiming` and streams a line per page to the console. Injected when `--timing` is passed; off by default because the per-page console relay costs ~2 % of render self-time. |
7073
| `detach-pages.js` | `Paged.Handler` that hides each completed page from the layout tree (registered against `finalizePage`). The shipping fix. Injected by default (both by `measure.mjs` and by `docs/book.bat`); pass `--no-detach-pages` to measure the pre-fix baseline. |
7174
| `instrument-flush-ops.js` | Wraps `getComputedStyle`, `getBoundingClientRect`, and the `offsetWidth` / `clientWidth` / `scrollWidth` family with counters + per-call timing. Injected by `--instrument`. |
7275
| `instrument-detach.js` | Counters around `detach-pages.js`'s removeChild / restore cycle. |
@@ -168,6 +171,7 @@ run.bat # defaults to ..\docs\_site-pdf\book.h
168171
run.bat path\to\some-other.html # explicit input
169172
run.bat --out my-run # explicit output directory
170173
run.bat --no-detach-pages # opt out of the detach-pages fix (measure pre-fix O(n²) baseline)
174+
run.bat --timing # collect per-page wall time + heap (writes timing.csv + quartile summary)
171175
run.bat --cpu-profile # CPU-profile the render phase
172176
run.bat --render-only # bail out after render (skip generate + process, ~47s saved)
173177
run.bat --clone-count # report Layout.append clones appended vs survivors per page
@@ -189,17 +193,17 @@ Outputs land in `perf/results/<ISO-timestamp>/`:
189193

190194
- `book.pdf` -- the rendered PDF, byte-equivalent to what
191195
`book.bat` produces.
192-
- `timing.json` -- full record: phase totals, sub-phase breakdowns
193-
(`parseOutline`, `page.pdf`, pdf-lib load / setOutline / save),
194-
and the per-page render entries.
195-
- `timing.csv` -- one row per page,
196-
`page,dur_ms,heap_start_mb,heap_end_mb,elapsed_s`.
197-
- `summary.txt` -- the three phase totals, plus first-quarter vs
198-
last-quarter average per-page render cost and ratio.
196+
- `timing.json` -- phase totals + sub-phase breakdowns
197+
(`parseOutline`, `page.pdf`, pdf-lib load / setOutline / save).
198+
Per-page render entries are populated only when `--timing` is set.
199+
- `timing.csv` -- one row per page, `page,dur_ms,heap_start_mb,
200+
heap_end_mb,elapsed_s`. Empty (header only) without `--timing`.
201+
- `summary.txt` -- the three phase totals; with `--timing` also adds
202+
first-quarter vs last-quarter average per-page render cost + ratio.
199203

200204
## Reading the output
201205

202-
The summary prints something like:
206+
With `--timing`, the summary prints something like:
203207

204208
```
205209
pages : 1638

perf/ab-css.mjs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -165,7 +165,7 @@ if (swappedHtml === BOOK_HTML) {
165165
function runOnce(outDir) {
166166
const r = spawnSync('node', [
167167
'measure.mjs', SWAP_HTML_PATH,
168-
'--no-timing', '--render-only', '--tracing',
168+
'--render-only', '--tracing',
169169
'--out', outDir,
170170
], { stdio: ['ignore', 'pipe', 'pipe'] });
171171
const err = r.stderr?.toString() ?? '';

perf/instrument-clones.js

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -12,9 +12,11 @@
1212
// - At afterRendered, summarise totals + per-page distribution.
1313
//
1414
// Cost: O(1) per append + one tree walk per finalized page. Run with
15-
// --no-timing --additional-script ..\perf\instrument-clones.js
15+
// --additional-script ..\perf\instrument-clones.js
1616
// from a measure.mjs invocation. (detach-pages.js is on by default;
17-
// add --no-detach-pages to compare against the pre-fix baseline.)
17+
// add --no-detach-pages to compare against the pre-fix baseline.
18+
// The timing handler is off by default; if you also pass --timing,
19+
// its per-page console relay will mix with this probe's output.)
1820
// Numbers are reported via console.log which measure.mjs forwards
1921
// to stdout.
2022

perf/measure.mjs

Lines changed: 26 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@
2525
// [--heap-profile] [--heap-sampling <bytes>]
2626
// [--tracing]
2727
// [--no-detach-pages] [--instrument] [--time-hooks]
28-
// [--incremental] [--chrome-outline] [--no-timing]
28+
// [--incremental] [--chrome-outline] [--timing]
2929
// [--clone-count] [--render-only]
3030
//
3131
// --render-only bails out after the render phase. Skips meta extraction,
@@ -34,11 +34,13 @@
3434
// phase matters; trims ~45s off the full ~55s book run. No book.pdf is
3535
// written, and the timing.json / summary.txt omit generate/process.
3636
//
37-
// --no-timing skips the per-page timing-handler.js injection. The handler
38-
// adds a per-page console.log relayed via CDP that costs ~2% of render
39-
// self-time on the 1638-page book. Use when profiling for the cleanest
40-
// possible bottom-up table; loses the per-page CSV and the first/last
41-
// quartile summary in return.
37+
// --timing injects timing-handler.js. The handler records per-page wall
38+
// time + heap to window.__pagedTiming (so the harness can emit
39+
// timing.csv and the first/last-quartile summary) and streams a per-page
40+
// console.log relayed via CDP. The relay costs ~2 % of render self-time
41+
// on the 1638-page book, which is why the handler isn't on by default --
42+
// profile-clean runs and most A/B comparisons don't need it. Pass it
43+
// when you want the per-page CSV.
4244
//
4345
// detach-pages.js is injected by default -- a Paged.Handler that hides
4446
// each completed page from the layout tree. This is the shipping fix
@@ -115,7 +117,7 @@ let instrument = false;
115117
let timeHooks = false;
116118
let incremental = false;
117119
let chromeOutline = false;
118-
let noTiming = false;
120+
let timing = false;
119121
let cloneCount = false;
120122
let renderOnly = false;
121123
let tracing = false;
@@ -133,7 +135,8 @@ for (let i = 0; i < args.length; i++) {
133135
else if (a === '--time-hooks') timeHooks = true;
134136
else if (a === '--incremental') incremental = true;
135137
else if (a === '--chrome-outline') chromeOutline = true;
136-
else if (a === '--no-timing') noTiming = true;
138+
else if (a === '--timing') timing = true;
139+
else if (a === '--no-timing') timing = false; // accepted for backwards compat; default since the relay cost was measured
137140
else if (a === '--clone-count') cloneCount = true;
138141
else if (a === '--render-only') renderOnly = true;
139142
else if (a === '--tracing') tracing = true;
@@ -159,7 +162,7 @@ const instrumentPath = resolve(__dirname, 'instrument-flush-ops.js');
159162
const timeHooksPath = resolve(__dirname, 'time-hooks.js');
160163
const cloneCountPath = resolve(__dirname, 'instrument-clones.js');
161164
const required = [pagedScriptPath];
162-
if (!noTiming) required.push(handlerPath);
165+
if (timing) required.push(handlerPath);
163166
if (detachPages) required.push(detachPagesPath);
164167
if (instrument) required.push(instrumentPath);
165168
if (timeHooks) required.push(timeHooksPath);
@@ -241,7 +244,7 @@ try {
241244
});
242245

243246
await page.addScriptTag({ path: pagedScriptPath });
244-
if (!noTiming) {
247+
if (timing) {
245248
await page.addScriptTag({ path: handlerPath });
246249
}
247250
if (detachPages) {
@@ -471,24 +474,24 @@ try {
471474
console.log(`[harness] total ${fmtMs(totalMs)}`);
472475

473476
// Persist results -------------------------------------------------
474-
const timing = noTiming
475-
? { pages: [], phases: {}, pageCount: null }
476-
: await page.evaluate(() => window.__pagedTiming);
477+
const timingData = timing
478+
? await page.evaluate(() => window.__pagedTiming)
479+
: { pages: [], phases: {}, pageCount: null };
477480
if (finalPdf) {
478481
const pdfPath = join(outDir, 'book.pdf');
479482
writeFileSync(pdfPath, Buffer.from(finalPdf));
480483
}
481484

482485
const record = {
483486
input: inputPath,
484-
pageCount: timing.pageCount,
487+
pageCount: timingData.pageCount,
485488
pdfBytes: finalPdf ? finalPdf.length : null,
486489
cpuProfile: profilePath,
487490
phases: {
488491
render: {
489492
ms: renderMs,
490-
perPage: timing.pages,
491-
phaseMarks: timing.phases,
493+
perPage: timingData.pages,
494+
phaseMarks: timingData.phases,
492495
},
493496
},
494497
totalMs,
@@ -509,7 +512,7 @@ try {
509512
writeFileSync(join(outDir, 'timing.json'), JSON.stringify(record, null, 2));
510513

511514
const csv = ['page,dur_ms,heap_start_mb,heap_end_mb,elapsed_s'];
512-
for (const p of timing.pages) {
515+
for (const p of timingData.pages) {
513516
csv.push([
514517
p.idx,
515518
p.dur.toFixed(2),
@@ -520,10 +523,14 @@ try {
520523
}
521524
writeFileSync(join(outDir, 'timing.csv'), csv.join('\n'));
522525

523-
const pages = timing.pages;
526+
const pages = timingData.pages;
524527
const summary = [];
525528
summary.push(`input : ${inputPath}`);
526-
summary.push(`pages : ${pages.length}`);
529+
if (timing) {
530+
summary.push(`pages : ${pages.length}`);
531+
} else {
532+
summary.push(`pages : (per-page timing not collected; pass --timing for the CSV)`);
533+
}
527534
if (finalPdf) {
528535
summary.push(`pdf size : ${(finalPdf.length / 1024 / 1024).toFixed(1)} MB`);
529536
}

0 commit comments

Comments
 (0)