Build NJS-GROWTH-07 benchmark harness, measured result charts, and AI-rule-safety carousel #15

Merged: SebaSOFT merged 2 commits into main from growth/njs-growth-07-benchmark-harness-results on Jun 23, 2026
Conversation

@SebaSOFT

Summary

Unblocks the NJS-GROWTH-07 proof assets by building a real benchmark harness, running it to produce measured actual_benchmark data, and publishing a results showcase generated from that data — plus a qualitative AI-rule-safety carousel. Grounded in the internal NJS-GROWTH-07 proof-assets brief: no src/ changes, no release, and the live playground / README GIF stay deferred until the playground is stable.

Benchmark harness (benchmarks/, consumes built dist/)

  • 5 engine adapters — @sebasoft/neuron-js, json-rules-engine, json-logic-js, hand-coded-typescript, rule-engine-js — over 3 example-backed scenarios (pricing-discount, eligibility-approval, workflow-routing) × 3 sizes (smoke/small/medium).
  • Fairness gate: every engine must reproduce the scenario's canonical decision before timing, so all engines are measured doing equivalent work.
  • Metrics: batched throughput / p50 / p95, fresh-process cold start, esbuild minified bundle size, and Neuron-JS validateScript / explainExecution overhead deltas (competitors have no equivalent step → 0 with a row note).
  • run.ts emits benchmarks/results/latest.actual.json (result_kind=actual_benchmark, is_placeholder=false, claims_allowed=true).
  • Scripts: yarn benchmark, yarn benchmark:charts. Added esbuild devDependency; biome now lints benchmarks/.
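The fairness gate in the list above can be sketched roughly as follows. This is a minimal illustration, not the harness's actual code: the `Adapter` and `Scenario` shapes and the `sameDecision` and `assertFair` helpers are hypothetical names, and JSON comparison is an assumed stand-in for whatever equality check the harness really uses.

```typescript
// Hypothetical shapes: the real harness types are not shown in this PR.
interface Adapter {
  name: string;
  decide(input: Record<string, unknown>): unknown;
}

interface Scenario {
  name: string;
  input: Record<string, unknown>;
  canonical: unknown; // the decision every engine must reproduce
}

// Assumed equality check: compares plain-data decisions by their JSON form.
// This is sensitive to key order, so adapters would need to emit keys consistently.
function sameDecision(a: unknown, b: unknown): boolean {
  return JSON.stringify(a) === JSON.stringify(b);
}

// Throws before any timing happens if the engine does not do equivalent work.
function assertFair(adapter: Adapter, scenario: Scenario): void {
  const actual = adapter.decide(scenario.input);
  if (!sameDecision(actual, scenario.canonical)) {
    throw new Error(
      `${adapter.name} did not reproduce the canonical decision for ${scenario.name}`
    );
  }
}
```

Running the gate once per engine and scenario before the timed loop keeps a wrong-but-fast adapter out of the results.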

Proof assets

  • benchmarks/charts/generate.ts renders the 5 benchmark-chart-*.svg and the generated showcase page docs/benchmarks/results.md from the same source file, so charts and table can never drift from measured data. Every visible number exists verbatim in the result file.
  • 5-slide AI-rule-safety carousel SVGs (qualitative, no numbers) + asset metadata; added the missing methodology-card metadata.
  • README, VitePress nav/sidebar, and the benchmarks index link the results; methodology now documents how each metric is measured.
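The "every visible number exists verbatim in the result file" property could be checked along these lines. This is a sketch under stated assumptions: `collectNumbers` and `unbackedNumbers` are hypothetical helpers, the check is applied to rendered text labels only (not SVG coordinates), and numbers are assumed to be printed exactly as `String(value)` would print them, with no thousands separators or rounding.

```typescript
// Collect every numeric value found anywhere in a parsed JSON document, as strings.
function collectNumbers(value: unknown, out: Set<string> = new Set()): Set<string> {
  if (typeof value === "number") {
    out.add(String(value));
  } else if (Array.isArray(value)) {
    for (const item of value) collectNumbers(item, out);
  } else if (value !== null && typeof value === "object") {
    for (const item of Object.values(value)) collectNumbers(item, out);
  }
  return out;
}

// Return numeric tokens in the rendered text that have no verbatim match in the result.
// The word-boundary regex deliberately skips digits glued to letters, such as "p50".
function unbackedNumbers(rendered: string, result: unknown): string[] {
  const known = collectNumbers(result);
  const tokens: string[] = rendered.match(/\b\d+(?:\.\d+)?\b/g) ?? [];
  return tokens.filter((token) => !known.has(token));
}
```

An empty array means every number shown to the reader can be traced back to the measured data.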

Privacy

Scrubbed private chaos-vault/... paths and internal agent notes from the 7 published docs/assets that were leaking them onto the live site (these rendered as visible page text, not just source comments).
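A check for this kind of leak might look like the sketch below. It is illustrative only: `findLeaks` and the `Leak` shape are hypothetical, and the function works on file contents already loaded into memory rather than walking the built site on disk.

```typescript
// Hypothetical report entry: where a forbidden string was found.
interface Leak {
  file: string;
  line: number; // 1-based line number within the file
  needle: string;
}

// Scan in-memory file contents for any forbidden substring.
function findLeaks(files: Record<string, string>, needles: string[]): Leak[] {
  const leaks: Leak[] = [];
  for (const [file, content] of Object.entries(files)) {
    content.split("\n").forEach((text, index) => {
      for (const needle of needles) {
        if (text.includes(needle)) {
          leaks.push({ file, line: index + 1, needle });
        }
      }
    });
  }
  return leaks;
}
```

Called with `["chaos-vault"]` over the built pages, an empty result would correspond to the "no chaos-vault strings" check listed under Validation.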

Validation

  • git diff --check
  • yarn lint ✅ (49 files, now including benchmarks/)
  • yarn test ✅ (13 files / 68 tests, incl. new tests/contracts/benchmark-results.test.ts)
  • yarn examples
  • yarn build
  • yarn docs:build ✅ — charts + carousel ship in the build; no chaos-vault strings in docs/.vitepress/dist

Guardrails

  • No npm release. No src/ changes.
  • Charts generated only from measured actual_benchmark output; placeholder fixture untouched.
  • Playground app + README GIF/proof-strip remain deferred per the brief (blocked until the playground is stable).

🤖 Generated with Claude Code

…y carousel

Unblocks NJS-GROWTH-07 proof assets by producing measured benchmark data and
publishing it, plus a qualitative AI-rule-safety carousel.

Harness (benchmarks/, consumes built dist/, no src changes):
- 5 engine adapters (@sebasoft/neuron-js, json-rules-engine, json-logic-js,
  hand-coded-typescript, rule-engine-js) over 3 example-backed scenarios
  (pricing-discount, eligibility-approval, workflow-routing) x 3 sizes.
- Fairness gate: every engine must reproduce the scenario's canonical decision
  before timing.
- Metrics: batched throughput/p50/p95, fresh-process cold start, esbuild
  minified bundle size, and Neuron-JS validateScript/explainExecution overhead
  deltas (competitors have no equivalent step -> 0 with a note).
- run.ts emits benchmarks/results/latest.actual.json
  (result_kind=actual_benchmark, is_placeholder=false, claims_allowed=true).
- Scripts: yarn benchmark, yarn benchmark:charts. esbuild devDependency added;
  biome now lints benchmarks/.
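For readers unfamiliar with the throughput/p50/p95 metrics, here is one common way to derive them from per-batch timings. This is a sketch, not the harness's method: `percentile` uses the nearest-rank definition (the PR does not say which definition the harness uses), and `summarize` with its field names is hypothetical.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((x, y) => x - y);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Hypothetical summary of a batched run: each entry in batchMs is the wall time of one batch.
function summarize(batchMs: number[], opsPerBatch: number) {
  const totalMs = batchMs.reduce((sum, ms) => sum + ms, 0);
  if (totalMs <= 0) throw new Error("total time must be positive");
  return {
    throughputOpsPerSec: (opsPerBatch * batchMs.length * 1000) / totalMs,
    p50Ms: percentile(batchMs, 50),
    p95Ms: percentile(batchMs, 95),
  };
}
```

Timing whole batches rather than single calls reduces the influence of timer resolution on very fast engines.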

Proof assets:
- benchmarks/charts/generate.ts renders the 5 benchmark-chart-*.svg and the
  generated docs/benchmarks/results.md showcase from the same source file, so
  charts/table can never drift from measured data. Every visible number exists
  verbatim in the result file.
- 5-slide AI-rule-safety carousel SVGs (qualitative, no numbers) + asset
  metadata; added methodology-card metadata.
- README + VitePress nav/sidebar + benchmarks index link the results; methodology
  documents how each metric is measured.

Tests: tests/contracts/benchmark-results.test.ts validates the committed result
file against the schema, the full matrix, and the overhead-attribution rule.
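The kind of contract described here can be sketched as below. Only the engine, scenario, and size names and the three envelope flags come from this PR; the `Row` shape, the `rows` field, `overheadDelta`, `note`, and `contractProblems` are hypothetical stand-ins for the real schema.

```typescript
const ENGINES = [
  "@sebasoft/neuron-js",
  "json-rules-engine",
  "json-logic-js",
  "hand-coded-typescript",
  "rule-engine-js",
] as const;
const SCENARIOS = ["pricing-discount", "eligibility-approval", "workflow-routing"] as const;
const SIZES = ["smoke", "small", "medium"] as const;

// Hypothetical row and file shapes; the committed schema is not shown in this PR.
interface Row {
  engine: string;
  scenario: string;
  size: string;
  overheadDelta: number;
  note?: string;
}
interface ResultFile {
  result_kind: string;
  is_placeholder: boolean;
  claims_allowed: boolean;
  rows: Row[];
}

// Returns a list of human-readable problems; an empty list means the contract holds.
function contractProblems(file: ResultFile): string[] {
  const problems: string[] = [];
  if (file.result_kind !== "actual_benchmark" || file.is_placeholder || !file.claims_allowed) {
    problems.push("result file is not a publishable actual_benchmark");
  }
  // Full matrix: every engine x scenario x size combination must be present.
  const seen = new Set(file.rows.map((r) => `${r.engine}|${r.scenario}|${r.size}`));
  for (const engine of ENGINES) {
    for (const scenario of SCENARIOS) {
      for (const size of SIZES) {
        const key = `${engine}|${scenario}|${size}`;
        if (!seen.has(key)) problems.push(`missing row ${key}`);
      }
    }
  }
  // Overhead attribution: only Neuron-JS may report a non-zero overhead delta,
  // and competitor rows must carry a note explaining the zero.
  for (const row of file.rows) {
    if (row.engine !== "@sebasoft/neuron-js" && (row.overheadDelta !== 0 || !row.note)) {
      problems.push(`bad overhead attribution for ${row.engine}`);
    }
  }
  return problems;
}
```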

Privacy: scrubbed private chaos-vault paths and internal agent notes from the
seven published docs/assets that leaked them onto the site.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a comprehensive benchmark harness for the neuron-js rules engine, comparing it against competitors across multiple scenarios and profiles, and integrates the generated charts and results into the documentation. The review feedback highlights several opportunities to make the harness more robust and maintainable:

  • resolve a constructor argument mismatch in neuron-plugins.ts;
  • add a default error-handling case to the scenario switch statement in the neuron-js adapter;
  • replace fragile dynamic export indexing with named exports in cold-start-child.ts;
  • extract duplicated state-reading logic into a shared utility;
  • refactor scenario decision functions to avoid hardcoded values.

Comment thread benchmarks/harness/neuron-plugins.ts
Comment thread benchmarks/harness/adapters/neuron-js.ts
Comment thread benchmarks/harness/cold-start-child.ts Outdated
Comment thread benchmarks/harness/neuron-plugins.ts
Comment thread benchmarks/harness/scenarios.ts
- neuron-plugins: drop extraneous 6th constructor arg in resolveParam
  (StateNumberParameter takes 5).
- neuron-js adapter: add default case throwing on unhandled scenario.
- cold-start: adapters now expose a named `adapter` export; cold-start-child
  imports `{ adapter }` instead of Object.values(...)[0].
- extract shared readPath into harness/read-path.ts; reuse in neuron-plugins
  and the hand-coded adapter (removes duplication).
- scenarios: pricing decide/canonical derive from shared PRICING_* constants
  instead of a hardcoded subtotal duplicated in data.

Re-ran yarn benchmark + benchmark:charts; lint/test/examples/build/docs:build
all green; no chaos-vault strings in dist.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@SebaSOFT SebaSOFT merged commit 953b9a4 into main Jun 23, 2026
2 checks passed
@SebaSOFT SebaSOFT deleted the growth/njs-growth-07-benchmark-harness-results branch June 23, 2026 13:42