| 1 | +# AGENTS.md |
| 2 | + |
| 3 | +## JS Runtime Performance Harness (`benches/js-runtime-perf/`) |
| 4 | + |
| 5 | +### Overview |
| 6 | + |
| 7 | +Retrack embeds a Deno/V8 runtime to execute user-supplied extractor and formatter scripts. |
| 8 | +It keeps a single long-lived worker thread that owns one V8 isolate and receives work |
| 9 | +over an `mpsc` channel. This harness measures the latency, throughput, and peak RSS delta |
| 10 | +of that runtime so changes to the architecture (context-per-call, pooling, shared HTTP client, |
| 11 | +startup snapshots, etc.) can be evaluated with real numbers. |
| 12 | + |
| 13 | +The harness is self-contained: it lives inside the `retrack` workspace and links against the |
| 14 | +real `retrack::js_runtime::JsRuntime`. |
| 15 | + |
| 16 | +The harness is **advisory / warn-only**. CI records a new history entry on every push to |
| 17 | +`main` and prints a table with per-metric deltas, but it never fails a build on |
| 18 | +regressions. Thresholds in `.perf/config.json` only control when warnings are emitted. |
| 19 | + |
| 20 | +### Scenario catalogue |
| 21 | + |
| 22 | +All scenarios use a default `JsRuntimeConfig` with a 10 MiB heap and a 10s execution |
| 23 | +budget, matching production settings. |
| 24 | + |
| 25 | +| Scenario | What it measures | |
| 26 | +|----------------------------|------------------------------------------------------------------------------------------------------------------------| |
| 27 | +| `cold_start_trivial` | Full worker-thread startup: `JsRuntime::init()` + fresh V8 isolate + first script execution, trivial script. | |
| 28 | +| `steady_state_trivial` | Serial executions of a trivial script through a single long-lived `JsRuntime`. | |
| 29 | +| `steady_state_extractor` | Realistic extractor: decodes a `Uint8Array` response body, parses JSON, filters/maps, re-encodes the result. | |
| 30 | +| `concurrent_extractors_8x` | `tokio::spawn` burst of `N` extractor calls sharing one `Arc<JsRuntime>`; exposes the single-worker-thread bottleneck. | |
| 31 | + |
| 32 | +The last scenario is deliberately designed to show that Retrack's current mpsc-based |
| 33 | +architecture serialises concurrent work onto one worker thread, which is the exact shape |
| 34 | +of the bottleneck we want any future optimisation to address. |
| 35 | + |
| 36 | +### Running locally |
| 37 | + |
| 38 | +```bash |
| 39 | +# Full run + comparison table + history append (from components/retrack/) |
| 40 | +make perf ANALYZE=1 |
| 41 | + |
| 42 | +# Run only, no history touch (useful when iterating locally and discarding results) |
| 43 | +make perf |
| 44 | + |
| 45 | +# Re-analyze an existing /tmp/perf.json (e.g. downloaded from CI) without rerunning |
| 46 | +make perf-analyze |
| 47 | + |
| 48 | +# Smoke test (fast) |
| 49 | +make perf ANALYZE=1 PERF_ITERATIONS=20 PERF_WARMUP=5 |
| 50 | + |
| 51 | +# Single scenario |
| 52 | +make perf ANALYZE=1 PERF_SCENARIOS=steady_state_extractor |
| 53 | + |
| 54 | +# Custom output path |
| 55 | +make perf PERF_OUTPUT=/tmp/perf-baseline.json |
| 56 | + |
| 57 | +# View HTML report (opens scripts/perf-report.html, then load .perf/history.jsonl) |
| 58 | +make perf-report |
| 59 | +``` |
| 60 | + |
| 61 | +`make perf` produces `/tmp/perf.json` and prints a one-line summary per scenario. When |
| 62 | +`ANALYZE=1` is set it then invokes `scripts/analyze-perf.ts`, which compares the fresh |
| 63 | +report to the last entry in `.perf/history.jsonl`, prints a table with Δp50/Δp99/Δops/Δrss |
| 64 | +columns, and appends to history **only when at least one tracked metric moved by more |
| 65 | +than 0.1 %** (see "History append gating" below). `make perf-analyze` runs only that |
| 66 | +analysis step, so an existing report can be re-analyzed without rerunning the |
| 67 | +harness. |
| 68 | + |
| 69 | +### Interpreting the output |
| 70 | + |
| 71 | +The printed table uses the last recorded history entry as the baseline: |
| 72 | + |
| 73 | +``` |
| 74 | +Scenario p50 p99 throughput rss Δp50 Δp99 Δops Δrss |
| 75 | +steady_state_extractor 1.45ms 1.82ms 688.9/s 512KB -2.1% -3.0% +1.4% 0.0% |
| 76 | +``` |
| 77 | + |
| 78 | +- **Δp50 / Δp99**: percentage change in latency vs the previous run. Warnings fire when |
| 79 | + these exceed the thresholds in `.perf/config.json` (`p50`, `p99`). |
| 80 | +- **Δops**: percentage change in throughput. Warnings fire when throughput _decreases_ by |
| 81 | +  more than `thresholds.throughput`, i.e. Δops falls below `-thresholds.throughput`. |
| 82 | +- **Δrss**: percentage change in peak RSS delta. Warnings fire above |
| 83 | + `thresholds.peakRssDeltaKb`. |
| 84 | + |
| 85 | +A first run prints "First run recorded - no comparison available." and establishes the |
| 86 | +baseline. |
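The warning rules in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/analyze-perf.ts`: the `Deltas` and `Thresholds` shapes and the `warningsFor` helper are hypothetical names, and only the threshold keys (`p50`, `p99`, `throughput`, `peakRssDeltaKb`) come from the text. All values are assumed to be percentages.

```typescript
// Hypothetical shapes; only the threshold key names appear in the document.
interface Thresholds {
  p50: number;            // max tolerated latency increase, in percent
  p99: number;
  throughput: number;     // max tolerated throughput decrease, in percent
  peakRssDeltaKb: number; // max tolerated RSS-delta increase, in percent
}

interface Deltas {
  p50: number; // Δp50 in percent (positive = slower)
  p99: number; // Δp99 in percent
  ops: number; // Δops in percent (negative = lower throughput)
  rss: number; // Δrss in percent
}

// Returns one human-readable warning per tripped threshold.
// Nothing here fails a build: the harness is advisory only.
function warningsFor(d: Deltas, t: Thresholds): string[] {
  const warnings: string[] = [];
  if (d.p50 > t.p50) warnings.push(`p50 regressed by ${d.p50}%`);
  if (d.p99 > t.p99) warnings.push(`p99 regressed by ${d.p99}%`);
  // Throughput warns on a decrease, so compare against the negated threshold.
  if (d.ops < -t.throughput) warnings.push(`throughput dropped by ${-d.ops}%`);
  if (d.rss > t.peakRssDeltaKb) warnings.push(`peak RSS delta grew by ${d.rss}%`);
  return warnings;
}
```

With the sample row above (Δp50 = -2.1, Δp99 = -3.0, Δops = +1.4, Δrss = 0.0), every delta moves in the favourable direction, so no positive threshold would produce a warning.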
| 87 | + |
| 88 | +### History append gating |
| 89 | + |
| 90 | +`scripts/analyze-perf.ts` does not append unconditionally. It diffs the fresh report |
| 91 | +against the last entry in `.perf/history.jsonl` across a whitelist of tracked metrics |
| 92 | +(`p50_us`, `p90_us`, `p99_us`, `max_us`, `throughput_ops_per_sec`, `peak_rss_delta_kb`). |
| 93 | +If every tracked metric on every scenario is within ±0.1 % of the previous entry, the |
| 94 | +file is left untouched and the CLI prints `All tracked metrics within ±0.1% of the |
| 95 | +previous run; history not updated.` When something moves, the append happens and the |
| 96 | +output names the scenario/metric that tripped the threshold. |
| 97 | + |
| 98 | +This matters for the CI commit step: because `history.jsonl` is modified only on |
| 99 | +material movement, the `git diff --cached --quiet || git commit` check becomes an |
| 100 | +effective "commit only if something changed" — pushes with steady-state numbers no |
| 101 | +longer produce noisy chore commits on `main`. |
| 102 | + |
| 103 | +The threshold is hard-coded at `HISTORY_APPEND_THRESHOLD_PCT = 0.1` in |
| 104 | +`scripts/analyze-perf.ts`. Adjust there if it proves too tight or too loose. |
| 105 | +Scenario additions/removals are treated as unconditionally material (always appended). |
| 106 | +Structural zero-valued metrics (e.g. `peak_rss_delta_kb = 0`) are handled explicitly — |
| 107 | +`0 → 0` is unchanged, `0 → anything` or `anything → 0` triggers an append. |
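The gating rules described in this section could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual analyzer: the `Report` shape (scenario name mapped to a flat metrics record) and the helper names `metricMoved` and `findMaterialChange` are hypothetical. Only the tracked metric keys and `HISTORY_APPEND_THRESHOLD_PCT` are taken from the text.

```typescript
const HISTORY_APPEND_THRESHOLD_PCT = 0.1;

const TRACKED_METRICS = [
  "p50_us",
  "p90_us",
  "p99_us",
  "max_us",
  "throughput_ops_per_sec",
  "peak_rss_delta_kb",
] as const;

type Metric = (typeof TRACKED_METRICS)[number];
type ScenarioMetrics = Record<Metric, number>;
// Assumed shape: scenario name -> tracked metrics for that scenario.
type Report = Record<string, ScenarioMetrics>;

// True when a single metric moved by more than the threshold.
function metricMoved(prev: number, next: number): boolean {
  if (prev === 0 && next === 0) return false; // 0 -> 0 is unchanged
  if (prev === 0 || next === 0) return true;  // 0 -> x or x -> 0 is material
  const pct = (Math.abs(next - prev) / Math.abs(prev)) * 100;
  return pct > HISTORY_APPEND_THRESHOLD_PCT;
}

// Returns a description of the first material change, or null when the
// history file should be left untouched.
function findMaterialChange(prev: Report, next: Report): string | null {
  const prevNames = Object.keys(prev).sort();
  const nextNames = Object.keys(next).sort();
  // Scenario additions/removals are always treated as material.
  if (prevNames.join("\n") !== nextNames.join("\n")) {
    return "scenario set changed";
  }
  for (const name of nextNames) {
    for (const metric of TRACKED_METRICS) {
      if (metricMoved(prev[name][metric], next[name][metric])) {
        return `${name}/${metric}`;
      }
    }
  }
  return null;
}
```

A caller would append to `.perf/history.jsonl` only when `findMaterialChange` returns a non-null value, and print the returned scenario/metric name as the reason.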
| 108 | + |
| 109 | +### CI contract |
| 110 | + |
| 111 | +- `.github/workflows/ci.yml` has a `ci-perf` job that runs on every push to `main`. |
| 112 | +- It builds the harness in release mode, runs `make perf ANALYZE=1` (which produces |
| 113 | + the report, prints the delta table, and appends to history only on material |
| 114 | + movement), uploads `/tmp/perf.json` as an artefact, and commits the updated |
| 115 | + `.perf/history.jsonl` back to `main` with `[skip ci]` in the commit message. |
| 116 | +- The commit step is a no-op when nothing moved: `history.jsonl` is unmodified, so |
| 117 | +  `git diff --cached --quiet` exits with status 0 and the commit is skipped. |
| 118 | +- The job **never fails on regressions**. Warnings are visible in the job log; acting on |
| 119 | + them is a human decision. |
| 120 | + |
| 121 | +### File locations |
| 122 | + |
| 123 | +``` |
| 124 | +benches/js-runtime-perf/Cargo.toml # Workspace member, depends on `retrack` + `retrack-types` |
| 125 | +benches/js-runtime-perf/src/main.rs # CLI driver |
| 126 | +benches/js-runtime-perf/src/measure.rs # hdrhistogram recorder, peak RSS probe |
| 127 | +benches/js-runtime-perf/src/report.rs # JSON output shape (camelCase top-level) |
| 128 | +benches/js-runtime-perf/src/scenarios/*.rs # One scenario per file |
| 129 | +benches/js-runtime-perf/scripts/*.js # JS fixtures loaded via `include_str!` |
| 130 | +src/lib.rs # Minimal library target exposing `js_runtime` + `config` |
| 131 | +.perf/config.json # Scenario list + warning thresholds |
| 132 | +.perf/history.jsonl # Append-only history (one JSON per run) |
| 133 | +scripts/analyze-perf.ts # Node 22 analyzer (reads /tmp/perf.json) |
| 134 | +scripts/perf-report.html # Standalone HTML viewer for history.jsonl |
| 135 | +``` |
| 136 | + |
| 137 | +### Tuning |
| 138 | + |
| 139 | +- To relax or tighten warnings, edit `.perf/config.json`. Values are percentages. |
| 140 | +- To add a scenario: create a module under `benches/js-runtime-perf/src/scenarios/`, |
| 141 | + register it in `scenarios.rs` (both the `ALL` slice and the `run` dispatcher), and add |
| 142 | + its name to `.perf/config.json`. |
| 143 | +- Benchmark results are platform-sensitive. History entries include `env.os`, `env.arch`, |
| 144 | + and `env.cpuModel` for this reason; absolute numbers from a laptop are not directly |
| 145 | + comparable to those from a CI runner. |