Browser-based benchmark for @data-client/react measuring mount/update scenarios. Includes TanStack Query, SWR, and a plain-React baseline for reference. Built with Webpack via @anansi/webpack-config. Results are reported to CI via rhysd/github-action-benchmark.
The repo has two benchmark suites:
- `examples/benchmark` (Node) measures the JS engine only: `normalize`/`denormalize`, `Controller.setResponse`/`getResponse`, reducer throughput. No browser, no React. Use it to validate core and normalizr changes.
- `examples/benchmark-react` (this app) measures the full React rendering pipeline: the same operations driven in a real browser, with layout and paint. Use it to validate `@data-client/react` changes; other libraries are included for reference.
- What we measure: Wall-clock time from triggering an action (e.g. `init(100)` or `updateUser('user0')`) until a MutationObserver detects the expected DOM change in the benchmark container. Optionally we also record React Profiler commit duration and, with `BENCH_TRACE=true`, Chrome trace duration.
- Why: Scenarios are chosen to exercise areas where caching strategies differ: shared-entity updates, referential stability, and derived-view memoization. See js-framework-benchmark "How the duration is measured" for a similar timeline-based approach.
- Statistical: Warmup runs are discarded; we report median and 95% CI (as percentage of median). Timing scenarios (navigation and mutation) use convergent mode: a single page load per scenario, with warmup iterations followed by adaptive measurement iterations where each iteration produces one sample and convergence is checked inline. This eliminates page-reload overhead between samples for faster, lower-variance results. Deterministic scenarios (ref-stability) run once. Memory scenarios use a separate outer loop with a fresh page per round.
- No CPU throttling: Runs at native speed with more samples for statistical significance rather than artificial slowdown. Convergent timing scenarios use 5 warmup + up to 50 measurement iterations (small) or 3 warmup + up to 40 (large). Early stopping triggers when 95% CI margin drops below the target percentage.
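
The statistical procedure above can be sketched roughly as follows. This is an illustration only, not the code in `stats.ts`: the function names, the normal-approximation confidence interval, and the example target of 2% are assumptions made for this sketch. Only the warmup and iteration counts come from the text.

```typescript
// Sketch only: names and the CI formula are assumptions, not the runner's code.
interface Stats {
  median: number;
  /** Half-width of the 95% CI, expressed as a percentage of the median. */
  ciPct: number;
}

function medianOf(sorted: number[]): number {
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

function computeStats(samples: number[]): Stats {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const n = sorted.length;
  const med = medianOf(sorted);
  const mean = sorted.reduce((sum, x) => sum + x, 0) / n;
  // Sample variance (n - 1 denominator); zero when there is a single sample.
  const variance =
    n > 1 ? sorted.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n - 1) : 0;
  // Assumption: normal approximation, margin = 1.96 * standard error.
  const margin = 1.96 * Math.sqrt(variance / n);
  return { median: med, ciPct: med === 0 ? 0 : (margin / med) * 100 };
}

interface ConvergenceOptions {
  warmup: number; // e.g. 5 (small) or 3 (large)
  minSamples: number; // e.g. 5
  maxSamples: number; // e.g. 50 (small) or 40 (large)
  targetPct: number; // hypothetical target, e.g. 2
}

function measureUntilConverged(
  takeSample: () => number,
  opts: ConvergenceOptions,
): Stats & { iterations: number } {
  // Warmup results are discarded.
  for (let i = 0; i < opts.warmup; i++) takeSample();
  const samples: number[] = [];
  let stats: Stats = { median: 0, ciPct: Infinity };
  while (samples.length < opts.maxSamples) {
    samples.push(takeSample());
    stats = computeStats(samples);
    // Early stop once enough samples exist and the margin is tight enough.
    if (samples.length >= opts.minSamples && stats.ciPct < opts.targetPct) break;
  }
  return { ...stats, iterations: samples.length };
}
```

Checking convergence inline after every sample is what lets a quiet scenario stop at the minimum iteration count while a noisy one runs to the cap.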
The primary purpose is to track data-client's own performance — catch regressions and validate improvements. Other libraries are included for context; CI runs data-client only.
Scenarios are designed to isolate the data framework layer: fetching, caching, update propagation, and rendering in response to data changes. Real-world applications will have additional performance considerations (routing, animation, third-party scripts, etc.) beyond what is measured here.
All implementations share presentational components, fixture data, fetch functions, and the useBenchState harness. They only diverge where each library's data layer requires it, using idiomatic patterns from that library's documentation. No implementation builds custom state management on top of its library.
- Hot path (in CI, data-client only): JS-only scenarios covering init (fetch + render), update propagation, ref-stability, and sorted-view, with no simulated network. CI runs only data-client scenarios to track regressions; other libraries are benchmarked locally.
- With network (local): Same shared-author update but with simulated network delay (consistent ms per "request"). Normalized caches propagate via a single store update; query-keyed caches invalidate and refetch affected queries. Not run in CI; run `yarn bench` locally (no `CI` env) to include these.
- Memory (local only): Heap delta after repeated mount/unmount cycles.
- Startup (local only): FCP and task duration via CDP `Performance.getMetrics`.
Hot path (CI)
- Get list (`getlist-100`, `getlist-500`): Time to show a ListView component that auto-fetches 100 or 500 issues from the list endpoint, then renders (unit: ops/s). Exercises the full fetch + normalization + render pipeline.
- Get list sorted (`getlist-500-sorted`): Mount 500 issues through a sorted/derived view. data-client uses `useQuery(sortedIssuesQuery)` with `Query` schema memoization; other libraries use `useMemo` + sort.
- Update entity (`update-entity`): Time to update one issue and propagate to the UI (unit: ops/s).
- Update entity sorted (`update-entity-sorted`): After mounting a sorted view, update one entity. data-client's `Query` memoization avoids re-sorting when sort keys are unchanged.
- Update entity multi-view (`update-entity-multi-view`): Update one issue that appears simultaneously in a list, a detail panel, and a pinned-cards strip. Normalized caches propagate via a single store write; query-keyed caches invalidate and refetch each query.
- Update user (scaling) (`update-user`, `update-user-10000`): Update one shared user with 1,000 or 10,000 mounted issues to test subscriber scaling.
- Ref-stability (`ref-stability-issue-changed`, `ref-stability-user-changed`): Count of components that received a new object reference after an update (unit: count; smaller is better).
- Invalidate and resolve (`invalidate-and-resolve`): data-client only; invalidates a cached endpoint and immediately re-resolves. Measures Suspense boundary round-trip.
With network (local comparison)
- Update shared user with network (`update-shared-user-with-network`): Same as the shared-user update above, with a simulated delay (e.g. 50 ms) per "request."
Memory (local only)
- Memory mount/unmount cycle (`memory-mount-unmount-cycle`): Mount 500 issues, unmount, repeat 10 times; report JS heap delta (bytes) via CDP. Surfaces leaks or unbounded growth.
Startup (local only)
- Startup FCP (`startup-fcp`): First Contentful Paint time via CDP `Performance.getMetrics`.
- Startup task duration (`startup-task-duration`): Total main-thread task duration via CDP (proxy for TBT).
Illustrative relative results with baseline = 100% (plain React useState/useEffect, no data library). For throughput rows, each value is (library ops/s ÷ baseline ops/s) × 100 — higher is faster. For ref-stability rows, the ratio uses the “refs changed” count — lower is fewer components that saw a new object reference. Figures are rounded from the Latest measured results table below (network simulation on); absolute ops/s will vary by machine, but library-to-library ratios are usually similar.
| Category | Scenarios (representative) | data-client | tanstack-query | swr | baseline |
|---|---|---|---|---|---|
| Navigation | getlist-100, getlist-500, getlist-500-sorted | ~98% | ~99% | ~99% | 100% |
| Navigation | list-detail-switch-10 | ~2381% | ~225% | ~218% | 100% |
| Mutations | update-entity, update-user, update-entity-sorted, update-entity-multi-view, unshift-item, delete-item, move-item | ~8672% | ~97% | ~99% | 100% |
| Scaling (10k items) | update-user-10000 | ~9290% | ~96% | ~100% | 100% |
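
As a worked example of the ratio described above, the single-scenario rows can be reproduced from the measured medians. The helper name `relativePercent` is made up for this sketch; the input values are the data-client and baseline medians reported for `list-detail-switch-10` and `update-user-10000` in the latest measured results.

```typescript
// Hypothetical helper: (library ops/s / baseline ops/s) * 100, rounded.
function relativePercent(
  libraryOpsPerSec: number,
  baselineOpsPerSec: number,
): number {
  if (baselineOpsPerSec <= 0) throw new Error('baseline must be positive');
  return Math.round((libraryOpsPerSec / baselineOpsPerSec) * 100);
}

// list-detail-switch-10: data-client 17.38 ops/s vs baseline 0.73 ops/s
const listDetailSwitch = relativePercent(17.38, 0.73); // 2381

// update-user-10000: data-client 144.93 ops/s vs baseline 1.56 ops/s
const updateUser10000 = relativePercent(144.93, 1.56); // 9290
```

Rows that group several scenarios combine more than one ratio; how they are aggregated is not specified here, so this sketch covers only the single-scenario rows.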
Median ops/s per scenario; range is approximate 95% CI margin from the runner (stats.ts). Network simulation uses response-size-based delays (NETWORK_SIM_CONFIG in bench/scenarios.ts: 40 ms base latency + 1 ms per 20 records) so list refetches after an author update pay extra latency compared to normalized propagation.
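
To make the response-size-based delay concrete, here is a minimal sketch of the stated formula (40 ms base latency + 1 ms per 20 records). The field names and the round-up behaviour are assumptions for illustration; the actual shape of `NETWORK_SIM_CONFIG` is not shown in this README.

```typescript
// Hypothetical config shape; the real NETWORK_SIM_CONFIG fields may differ.
interface NetworkSimConfig {
  baseLatencyMs: number;
  recordsPerMs: number;
}

const exampleConfig: NetworkSimConfig = { baseLatencyMs: 40, recordsPerMs: 20 };

function simulatedDelayMs(
  recordCount: number,
  config: NetworkSimConfig = exampleConfig,
): number {
  // Assumption: a partial group of records rounds up to a whole millisecond.
  return config.baseLatencyMs + Math.ceil(recordCount / config.recordsPerMs);
}

const singleEntityDelay = simulatedDelayMs(1); // 41
const list100Delay = simulatedDelayMs(100); // 45
const list500Delay = simulatedDelayMs(500); // 65
```

Under these assumptions a 500-record list refetch pays 65 ms against 41 ms for a single-entity response, which is the extra latency the paragraph above refers to.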
Run: 2026-03-22, Linux (WSL2), yarn build:benchmark-react, static preview + env -u CI npx tsx bench/runner.ts --network-sim true (all libraries; memory scenarios not included). Numbers are machine-specific; use them for relative comparison between libraries, not as absolutes.
| Scenario | data-client | tanstack-query | swr | baseline |
|---|---|---|---|---|
| Navigation | | | | |
| getlist-100 | 20.45 ± 2.3% | 20.62 ± 0.8% | 20.73 ± 0.2% | 20.73 ± 0.5% |
| getlist-500 | 12.53 ± 2.8% | 12.80 ± 0.2% | 12.71 ± 0.3% | 12.84 ± 0.2% |
| getlist-500-sorted | 12.92 ± 5.1% | 12.93 ± 1.1% | 12.90 ± 0.7% | 13.16 ± 3.6% |
| list-detail-switch-10 | 17.38 ± 8.7% | 1.64 ± 1.7% | 1.59 ± 1.4% | 0.73 ± 0.1% |
| Mutations | | | | |
| update-entity | 666.67 ± 9.0% | 6.98 ± 0.4% | 7.09 ± 0.4% | 7.23 ± 0.8% |
| update-user | 801.28 ± 9.4% | 7.04 ± 0.5% | 7.18 ± 0.1% | 7.24 ± 1.3% |
| update-entity-sorted | 625.00 ± 10.8% | 7.10 ± 0.0% | 7.10 ± 1.2% | 7.29 ± 0.9% |
| update-entity-multi-view | 645.83 ± 7.6% | 7.14 ± 0.2% | 7.16 ± 0.1% | 7.29 ± 0.3% |
| update-user-10000 | 144.93 ± 1.7% | 1.49 ± 0.6% | 1.56 ± 1.7% | 1.56 ± 1.5% |
| unshift-item | 465.37 ± 3.6% | 6.90 ± 0.4% | 7.18 ± 0.2% | 7.21 ± 0.3% |
| delete-item | 833.33 ± 6.0% | 6.93 ± 0.1% | 7.17 ± 0.7% | 7.19 ± 0.7% |
| move-item | 333.33 ± 8.9% | 6.76 ± 0.6% | 6.99 ± 0.3% | 6.97 ± 0.2% |
[Measured on a Ryzen 9 7950X; 64 GB RAM; Ubuntu (WSL2); Node 24.12.0; Chromium (Playwright)]
| Category | Scenarios | Typical run-to-run spread |
|---|---|---|
| Stable | getlist-*, update-entity, update-entity-sorted, ref-stability-* | 2-5% |
| Moderate | update-user-*, update-entity-multi-view, list-detail-switch-10 | 5-10% |
| Volatile | memory-mount-unmount-cycle, startup-*, (react commit) suffixes | 10-25% |
Regressions >5% on stable scenarios or >15% on volatile scenarios are worth investigating.
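
One way to apply these thresholds when comparing two runs is sketched below. The function and type names are made up for illustration. The README gives no threshold for the moderate category, so the sketch covers only the stable and volatile cases, and it takes a direction because throughput is higher-is-better while heap delta is lower-is-better.

```typescript
type Stability = 'stable' | 'volatile';
type Direction = 'higher-is-better' | 'lower-is-better';

// Thresholds taken from the sentence above.
const THRESHOLD_PCT: Record<Stability, number> = { stable: 5, volatile: 15 };

// Positive result means the metric got worse by that percentage.
function regressionPct(
  previous: number,
  current: number,
  direction: Direction,
): number {
  const changePct = ((current - previous) / previous) * 100;
  return direction === 'higher-is-better' ? -changePct : changePct;
}

function isWorthInvestigating(
  previous: number,
  current: number,
  stability: Stability,
  direction: Direction = 'higher-is-better',
): boolean {
  return regressionPct(previous, current, direction) > THRESHOLD_PCT[stability];
}
```

For example, a stable throughput scenario dropping from 100 to 94 ops/s is a 6% regression and would be flagged, while a drop to 96 ops/s would not.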
- Higher is better for throughput (ops/s). Lower is better for ref-stability counts and heap delta (bytes).
- Ref-stability: `issueRefChanged` and `userRefChanged` count how many components received a new object reference. Normalized caches preserve referential equality for unchanged entities; query-keyed caches typically create new references on each cache write.
- React commit: Reported as `(react commit)` suffix entries. These measure React Profiler `actualDuration` and isolate React reconciliation cost from layout/paint.
- Report viewer: Toggle the "Base metrics", "React commit", and "Trace" checkboxes to filter the comparison table. Use "Load history" to compare multiple runs over time.
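
The ref-stability idea can be illustrated with a small sketch: record the object each component rendered before an update, then count how many components see a different reference afterwards. The `countChangedRefs` helper and the `Map` snapshot shape are assumptions for this example, not the harness's actual `captureRefSnapshot` / `getRefStabilityReport` implementation.

```typescript
// Hypothetical helper: keys identify components, values are the objects they rendered.
function countChangedRefs<T extends object>(
  before: Map<string, T>,
  after: Map<string, T>,
): number {
  let changed = 0;
  after.forEach((next, key) => {
    const prev = before.get(key);
    // Identity comparison: a structurally equal copy still counts as changed.
    if (prev !== undefined && !Object.is(prev, next)) changed++;
  });
  return changed;
}

const issueA = { id: 'a', title: 'First' };
const issueB = { id: 'b', title: 'Second' };
const snapshotBefore = new Map([
  ['row-a', issueA],
  ['row-b', issueB],
]);

// Only issue "a" is updated; issue "b" keeps its original reference.
const preservedRefs = new Map([
  ['row-a', { ...issueA, title: 'Edited' }],
  ['row-b', issueB],
]);

// Every entry is recreated on write, even the unchanged one.
const recreatedRefs = new Map([
  ['row-a', { ...issueA, title: 'Edited' }],
  ['row-b', { ...issueB }],
]);

const changedWhenPreserved = countChangedRefs(snapshotBefore, preservedRefs); // 1
const changedWhenRecreated = countChangedRefs(snapshotBefore, recreatedRefs); // 2
```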
- Add a new app under `src/<lib>/index.tsx` (e.g. `src/urql/index.tsx`).
- Implement the `BenchAPI` interface on `window.__BENCH__`: `init`, `updateEntity`, `updateUser`, `unmountAll`, `getRenderedCount`, `captureRefSnapshot`, `getRefStabilityReport`, and optionally `mountUnmountCycle`, `mountSortedView`. Use the shared presentational `IssuesRow` from `@shared/components` and fixtures from `@shared/data`. The harness (`useBenchState`) provides default `init`, `unmountAll`, `mountUnmountCycle`, `getRenderedCount`, and ref-stability methods; libraries only need to supply `updateEntity`, `updateUser`, and any overrides.
- Add the library to `LIBRARIES` in `bench/scenarios.ts`.
- Add a webpack entry in `webpack.config.cjs` for the new app and an `HtmlWebpackPlugin` entry so the app is served at `/<lib>/`.
- Add the dependency to `package.json` and run `yarn install`.
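
As a rough guide to the second step, the interface might look something like the sketch below. Only the method names come from the list above; every parameter and return type is a guess made for illustration, and the `missingBenchMethods` helper is hypothetical. Check the shared types in the repo for the real signatures.

```typescript
// Hypothetical signatures: only the method names are taken from this README.
interface BenchAPI {
  init(count: number): void;
  updateEntity(id: string): void;
  updateUser(id: string): void;
  unmountAll(): void;
  getRenderedCount(): number;
  captureRefSnapshot(): void;
  getRefStabilityReport(): unknown;
  // Optional methods.
  mountUnmountCycle?(count: number, cycles: number): void | Promise<void>;
  mountSortedView?(count: number): void;
}

const REQUIRED_METHODS = [
  'init',
  'updateEntity',
  'updateUser',
  'unmountAll',
  'getRenderedCount',
  'captureRefSnapshot',
  'getRefStabilityReport',
] as const;

// Hypothetical sanity check for a new app's window.__BENCH__ object.
function missingBenchMethods(api: Record<string, unknown>): string[] {
  return REQUIRED_METHODS.filter((name) => typeof api[name] !== 'function');
}
```

A check like this could run once after the app mounts, so a missing method fails fast instead of surfacing as a scenario timeout.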
- Install system dependencies (Linux / WSL)

  Playwright needs system libraries to run Chromium. If you see "Host system is missing dependencies to run browsers":

      sudo env PATH="$PATH" npx playwright install-deps chromium

  The `env PATH="$PATH"` is needed because `sudo` doesn't inherit your shell's PATH (where nvm-managed node/npx live).

- Build and run

      yarn build:benchmark-react
      yarn workspace example-benchmark-react preview &
      sleep 5
      cd examples/benchmark-react && yarn bench

  Or from repo root after a build: start preview in one terminal, then in another run `yarn workspace example-benchmark-react bench`.

- Without React Compiler

  The default build includes React Compiler. To measure impact without it:

      cd examples/benchmark-react
      yarn build:no-compiler     # builds without babel-plugin-react-compiler
      yarn preview &
      sleep 5
      yarn bench:no-compiler     # labels results with [no-compiler] suffix

  Or as a single command: `yarn bench:run:no-compiler`.

  Results are labelled `[no-compiler]` so you can compare side-by-side with the default run by loading both JSON files into the report viewer's history feature.

  Env vars for custom combinations:

  - `REACT_COMPILER=false`: disables the Babel plugin at build time
  - `BENCH_LABEL=<tag>`: appends `[<tag>]` to all result names at bench time
  - `BENCH_PORT=<port>`: port for `preview` server and bench runner (default `5173`)
  - `BENCH_BASE_URL=<url>`: full base URL override (takes precedence over `BENCH_PORT`)

- Filtering scenarios

  The runner supports CLI flags (with env var fallbacks) to select a subset of scenarios:

  | CLI flag | Env var | Description |
  |---|---|---|
  | `--lib <names>` | `BENCH_LIB` | Comma-separated library names (e.g. `data-client,swr`) |
  | `--size <small\|large>` | `BENCH_SIZE` | Run only `small` (cheap, full rigor) or `large` (expensive, reduced runs) scenarios |
  | `--action <group\|action>` | `BENCH_ACTION` | Filter by action group (`mount`, `update`, `mutation`, `memory`) or exact action name. Memory is not run by default; use `--action memory` to include. |
  | `--scenario <pattern>` | `BENCH_SCENARIO` | Substring filter on scenario name |

  CLI flags take precedence over env vars. Examples:

      yarn bench --lib data-client          # only data-client
      yarn bench --size small               # only cheap scenarios (full warmup/measurement)
      yarn bench --action mount             # init, mountSortedView
      yarn bench --action memory            # memory-mount-unmount-cycle (heap delta; opt-in category)
      yarn bench --action update --lib swr  # update scenarios for swr only
      yarn bench --scenario sorted-view     # only sorted-view scenarios

  Convenience scripts:

      yarn bench:small   # --size small
      yarn bench:large   # --size large
      yarn bench:dc      # --lib data-client

- Scenario sizes

  Scenarios are classified as `small` or `large` based on their cost:

  - Small (convergent: 5 warmup + 5–50 measurement iterations): `getlist-100`, `update-entity`, `invalidate-and-resolve`, `unshift-item`, `delete-item`
  - Small (deterministic, single run): `ref-stability-*`
  - Large (convergent: 3 warmup + 5–40 measurement iterations): `getlist-500`, `getlist-500-sorted`, `update-user`, `update-user-10000`, `update-entity-sorted`, `update-entity-multi-view`, `list-detail-switch-10`
  - Memory (opt-in, 1 warmup + 3 measurement rounds): `memory-mount-unmount-cycle`; run with `--action memory`

  Timing scenarios use convergent mode (single page load, inline convergence per scenario). Each group uses its own warmup/measurement config. Use `--size` to run only one group.
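
The "CLI flag first, env var as fallback" rule and the substring scenario filter described above could be implemented along these lines. This is a sketch under assumptions, not the runner's actual argument parsing: `resolveOption`, `parseLibs`, and `filterScenarios` are hypothetical names, and flags are assumed to take the form `--flag value`.

```typescript
type Env = Record<string, string | undefined>;

// Returns the value following `flag` in argv, else the env var, else undefined.
function resolveOption(
  flag: string,
  envVar: string,
  argv: string[],
  env: Env,
): string | undefined {
  const index = argv.indexOf(flag);
  if (index !== -1 && index + 1 < argv.length) return argv[index + 1];
  return env[envVar];
}

// Splits a comma-separated library list such as "data-client,swr".
function parseLibs(value: string | undefined): string[] | undefined {
  if (!value) return undefined;
  return value
    .split(',')
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

// Substring filter on scenario name; no pattern keeps everything.
function filterScenarios(names: string[], pattern: string | undefined): string[] {
  return pattern ? names.filter((name) => name.includes(pattern)) : names;
}

const argvExample = ['--lib', 'swr'];
const envExample: Env = { BENCH_LIB: 'data-client', BENCH_SCENARIO: 'sorted' };

// The CLI flag wins over BENCH_LIB.
const libs = parseLibs(resolveOption('--lib', 'BENCH_LIB', argvExample, envExample));
// No --scenario flag was given, so BENCH_SCENARIO is used.
const pattern = resolveOption('--scenario', 'BENCH_SCENARIO', argvExample, envExample);
const selected = filterScenarios(
  ['getlist-100', 'getlist-500-sorted', 'update-entity', 'update-entity-sorted'],
  pattern,
);
```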
The runner prints a JSON array in customBiggerIsBetter format (name, unit, value, range) to stdout. In CI this is written to react-bench-output.json and sent to the benchmark action.
To view results locally, open bench/report-viewer.html in a browser and paste the JSON (or upload react-bench-output.json) to see a comparison table and bar chart.
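
For reference, an entry in that output has roughly the shape sketched below. The `name`, `unit`, `value`, and `range` fields are the ones named above; the `toBenchEntry` helper, the rounding, and the exact `range` string format are assumptions for illustration.

```typescript
// Fields as listed above for the customBiggerIsBetter format.
interface BenchEntry {
  name: string;
  unit: string;
  value: number;
  range?: string;
}

// Hypothetical helper turning a median and CI percentage into one entry.
function toBenchEntry(
  name: string,
  medianOpsPerSec: number,
  ciPct: number,
): BenchEntry {
  return {
    name,
    unit: 'ops/s',
    value: Number(medianOpsPerSec.toFixed(2)),
    // Assumption: the range is rendered as a percentage of the median.
    range: `± ${ciPct.toFixed(1)}%`,
  };
}

const exampleOutput: string = JSON.stringify(
  [toBenchEntry('update-entity', 666.6666, 9.04)],
  null,
  2,
);
```

Pasting a JSON array of this shape into the report viewer should be enough to try it out, assuming the viewer accepts the same fields.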
Set BENCH_TRACE=true when running the bench to enable Chrome tracing for duration scenarios. Trace files are written to disk; parsing and reporting trace duration is best-effort and may require additional tooling for the trace zip format.
For the same granularity of V8 optimization investigation available in examples/benchmark (Node), two modes pass --js-flags to Chromium via Playwright:

    yarn bench:trace   # --trace-opt --trace-deopt → v8-trace.log
    yarn bench:deopt   # --prof → v8-logs/v8-<pid>.log

Both default to --lib data-client --size small for focused, fast investigation. Add other flags as needed (e.g. --scenario update-entity).
bench:trace (BENCH_V8_TRACE=true) launches Chromium with --js-flags="--trace-opt --trace-deopt". The runner uses Playwright launchServer (not launch) so the root browser process stdout/stderr can be piped to v8-trace.log — Browser from launch() does not expose process(). This is the browser equivalent of examples/benchmark's start:trace — look for optimization and deoptimization lines for functions of interest.
bench:deopt (BENCH_V8_DEOPT=true) launches Chromium with --js-flags="--prof". V8 writes per-process profiling logs to v8-logs/v8-<pid>.log. Chromium is multi-process, so several files are created; the renderer log (typically the largest) contains the benchmark's hot path. Process it with:

    node --prof-process v8-logs/v8-<pid>.log > processed.txt

Both env vars can be combined for simultaneous trace output and profiling logs. The convenience scripts can be overridden:

    BENCH_V8_TRACE=true yarn bench --lib data-client --scenario update-entity
    BENCH_V8_DEOPT=true yarn bench --lib data-client --size large
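
To show how the two env vars might translate into Chromium launch arguments, here is a minimal sketch. The helper names are hypothetical and the way the real runner merges the flags is an assumption; only the env var names and the V8 flags themselves come from the text above.

```typescript
type Env = Record<string, string | undefined>;

// Collects the V8 flags requested through the two env vars.
function v8JsFlags(env: Env): string[] {
  const flags: string[] = [];
  if (env.BENCH_V8_TRACE === 'true') flags.push('--trace-opt', '--trace-deopt');
  if (env.BENCH_V8_DEOPT === 'true') flags.push('--prof');
  return flags;
}

// Builds the Chromium args array. When passed as an array element (not
// through a shell), the flag list needs no surrounding quotes.
function chromiumArgs(env: Env): string[] {
  const flags = v8JsFlags(env);
  return flags.length > 0 ? [`--js-flags=${flags.join(' ')}`] : [];
}

const traceOnly = chromiumArgs({ BENCH_V8_TRACE: 'true' });
const combined = chromiumArgs({ BENCH_V8_TRACE: 'true', BENCH_V8_DEOPT: 'true' });
const neither = chromiumArgs({});
```

An array like this would be passed as the `args` launch option to Playwright; with neither variable set, no `--js-flags` argument is added.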