Merged
9 changes: 9 additions & 0 deletions benchmarks/README.md
@@ -0,0 +1,9 @@
# Benchmark fixtures

This directory contains benchmark data artifacts for NJS-GROWTH-07.

- `sample-results.placeholder.json` is deterministic placeholder data for chart and playground wiring.
- It is not benchmark output and must not be used for public performance claims.
- Real benchmark output must match `docs/public/benchmarks/results.schema.json` and set `result_kind` to `actual_benchmark`, `is_placeholder` to `false`, and `claims_allowed` to `true`.

See `docs/benchmarks/methodology.md` for the competitor, scenario, input-size, and metric contract.
115 changes: 115 additions & 0 deletions benchmarks/sample-results.placeholder.json
@@ -0,0 +1,115 @@
{
"schema_version": "1.0.0",
"result_kind": "placeholder_sample",
"is_placeholder": true,
"claims_allowed": false,
"generated_at": "2026-01-01T00:00:00.000Z",
"disclaimer": "This file contains deterministic placeholder sample data for visual-agent wiring only. These are not benchmark results and must not be used as performance claims.",
"methodology": "docs/benchmarks/methodology.md",
"competitors": [
"@sebasoft/neuron-js",
"json-rules-engine",
"json-logic-js",
"hand-coded-typescript",
"rule-engine-js"
],
"scenarios": ["pricing-discount", "eligibility-approval", "workflow-routing"],
"input_sizes": ["smoke", "small", "medium", "large"],
"results": [
{
"engine": "@sebasoft/neuron-js",
"scenario": "pricing-discount",
"input_size": "smoke",
"warmup_iterations": 10,
"measured_iterations": 100,
"throughput_decisions_per_second": 1000,
"p50_ms": 1,
"p95_ms": 2,
"cold_start_ms": 25,
"bundle_size_minified_bytes": 5000,
"validation_overhead_ms": 0.5,
"explanation_overhead_ms": 0.75,
"node_version": "v0.0.0-placeholder",
"package_version": "0.0.0-placeholder",
"commit_sha": "0000000000000000000000000000000000000000",
"result_kind": "placeholder_sample",
"notes": "synthetic placeholder row for chart wiring only; not benchmark results and not a performance claim."
},
{
"engine": "json-rules-engine",
"scenario": "eligibility-approval",
"input_size": "smoke",
"warmup_iterations": 10,
"measured_iterations": 100,
"throughput_decisions_per_second": 1001,
"p50_ms": 1.1,
"p95_ms": 2.1,
"cold_start_ms": 26,
"bundle_size_minified_bytes": 5001,
"validation_overhead_ms": 0.6,
"explanation_overhead_ms": 0.85,
"node_version": "v0.0.0-placeholder",
"package_version": "0.0.0-placeholder",
"commit_sha": "0000000000000000000000000000000000000000",
"result_kind": "placeholder_sample",
"notes": "synthetic placeholder row for chart wiring only; not benchmark results and not a performance claim."
},
{
"engine": "json-logic-js",
"scenario": "workflow-routing",
"input_size": "smoke",
"warmup_iterations": 10,
"measured_iterations": 100,
"throughput_decisions_per_second": 1002,
"p50_ms": 1.2,
"p95_ms": 2.2,
"cold_start_ms": 27,
"bundle_size_minified_bytes": 5002,
"validation_overhead_ms": 0.7,
"explanation_overhead_ms": 0.95,
"node_version": "v0.0.0-placeholder",
"package_version": "0.0.0-placeholder",
"commit_sha": "0000000000000000000000000000000000000000",
"result_kind": "placeholder_sample",
"notes": "synthetic placeholder row for chart wiring only; not benchmark results and not a performance claim."
},
{
"engine": "hand-coded-typescript",
"scenario": "pricing-discount",
"input_size": "smoke",
"warmup_iterations": 10,
"measured_iterations": 100,
"throughput_decisions_per_second": 1003,
"p50_ms": 1.3,
"p95_ms": 2.3,
"cold_start_ms": 28,
"bundle_size_minified_bytes": 5003,
"validation_overhead_ms": 0.8,
"explanation_overhead_ms": 1.05,
"node_version": "v0.0.0-placeholder",
"package_version": "0.0.0-placeholder",
"commit_sha": "0000000000000000000000000000000000000000",
"result_kind": "placeholder_sample",
"notes": "synthetic placeholder row for chart wiring only; not benchmark results and not a performance claim."
},
{
"engine": "rule-engine-js",
"scenario": "eligibility-approval",
"input_size": "smoke",
"warmup_iterations": 10,
"measured_iterations": 100,
"throughput_decisions_per_second": 1004,
"p50_ms": 1.4,
"p95_ms": 2.4,
"cold_start_ms": 29,
"bundle_size_minified_bytes": 5004,
"validation_overhead_ms": 0.9,
"explanation_overhead_ms": 1.15,
"node_version": "v0.0.0-placeholder",
"package_version": "0.0.0-placeholder",
"commit_sha": "0000000000000000000000000000000000000000",
"result_kind": "placeholder_sample",
"notes": "synthetic placeholder row for chart wiring only; not benchmark results and not a performance claim."
}
]
}
11 changes: 11 additions & 0 deletions docs/.vitepress/config.ts
@@ -12,6 +12,7 @@ export default defineConfig({
{ text: 'Concepts', link: '/overview' },
{ text: 'Examples', link: '/use-cases/runnable-examples' },
{ text: 'Schemas', link: '/schemas-validation-explainability' },
{ text: 'Proof Assets', link: '/benchmarks/methodology' },
{ text: 'Comparisons', link: '/comparisons/' },
{ text: 'Integrations', link: '/integrations/' },
{ text: 'AI Docs', link: '/ai-coding-assistants' },
@@ -42,6 +43,16 @@ export default defineConfig({
{ text: 'Dynamic Routing', link: '/use-cases/dynamic-routing' }
]
},
{
text: 'Proof Assets',
items: [
{ text: 'Benchmarks & Visual Proof', link: '/benchmarks/' },
{ text: 'Benchmarks', link: '/benchmarks/methodology' },
{ text: 'Visual Proof System', link: '/benchmarks/visual-proof-system' },
{ text: 'Prompt Kit', link: '/benchmarks/prompt-kit' },
{ text: 'Asset Folder', link: '/benchmarks/assets/' }
]
},
{
text: 'Comparisons',
items: [
58 changes: 58 additions & 0 deletions docs/benchmarks/assets/README.md
@@ -0,0 +1,58 @@
# NJS-GROWTH-07 benchmark asset folder

Recommended folder for visual proof assets:

```text
docs/benchmarks/assets/
README.md
source-data/
benchmark-results.example.json
benchmark-results.schema.json
prompts/
benchmark-infographic.md
explainability-trace-diagram.md
playground-readme-gif-storyboard.md
ai-rule-safety-carousel.md
readme-proof-strip.md
storyboard/
playground-readme-gif.md
generated/
benchmark-chart-throughput.svg
benchmark-chart-cold-start.svg
benchmark-chart-bundle-size.svg
benchmark-chart-validation-overhead.svg
benchmark-chart-explanation-overhead.svg
explainability-trace-diagram.svg
readme-proof-strip.svg
```

## Folder rules

- `source-data/` stores benchmark input data and schemas only. Generated assets must cite these files when they include measurements.
- `prompts/` stores frozen prompts copied or split from `docs/benchmarks/prompt-kit.md` for repeatable generation.
- `storyboard/` stores GIF/clip scripts, frame lists, and capture notes.
- `generated/` stores exported SVG/PNG/GIF/MP4 assets.

## Data integrity rules

- Do not place fabricated benchmark results in `source-data/`.
- Example files must carry an `.example.` infix in the filename (for instance `benchmark-results.example.json`) and must use obviously non-claiming values or schema-only structures.
- Real benchmark files must include `node_version`, `package_version`, `commit_sha`, `warmup_iterations`, and `measured_iterations`.
- Any generated benchmark chart must name its source data file in adjacent metadata or in the SVG metadata block.
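
A minimal sketch of the provenance rule above, assuming the five fields live on each result row as they do in `benchmarks/sample-results.placeholder.json`. The names `REQUIRED_PROVENANCE_FIELDS` and `missingProvenanceFields` are hypothetical and not part of the repository.

```typescript
// Hypothetical helper: the field names come from the rule above,
// everything else is an illustrative assumption.
const REQUIRED_PROVENANCE_FIELDS = [
  "node_version",
  "package_version",
  "commit_sha",
  "warmup_iterations",
  "measured_iterations",
] as const;

// Returns the provenance fields that are absent, null, or empty on a row.
function missingProvenanceFields(row: Record<string, unknown>): string[] {
  return REQUIRED_PROVENANCE_FIELDS.filter((field) => {
    const value = row[field];
    return value === undefined || value === null || value === "";
  });
}
```

A file could be admitted to `source-data/` only when every row returns an empty list. This check covers presence only; it does not confirm that the values describe a real run.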

## Accessibility rules

Every final asset needs:

- Alt text.
- Source data link when metrics appear.
- Color contrast check notes.
- Export dimensions.
- Creation date and generator/tool used.

## README integration rule

The README proof strip should not be merged into the README until either:

1. it contains no benchmark numbers and is clearly a proof-structure asset; or
2. it uses real benchmark output from the harness and cites the exact source data.
26 changes: 26 additions & 0 deletions docs/benchmarks/index.md
@@ -0,0 +1,26 @@
# Benchmarks and visual proof

NJS-GROWTH-07 defines the public proof system for Neuron-JS benchmarks, playground captures, explanation diagrams, README media, and social proof assets.

This section is a design and generation foundation. It does not publish benchmark claims. Benchmark numbers must come from measured harness output only.

## Files

- [Benchmark methodology and result contract](./methodology.md): competitor set, scenario matrix, input-size matrix, metric definitions, and placeholder-data policy.
- [Result JSON schema](/benchmarks/results.schema.json): machine-readable contract for benchmark output and downstream chart data.
- [Visual proof system](./visual-proof-system.md): palette, typography, composition, diagram style, chart rules, social-card constraints, and README proof strip guidance.
- [Visual proof prompt kit](./prompt-kit.md): reusable prompts for benchmark infographics, explainability trace diagrams, playground README GIF storyboards, AI-rule safety carousels, and README proof strips.
- [Asset folder recommendation](./assets/): recommended `docs/benchmarks/assets/` structure and data-integrity rules.

## Proof promise

Neuron-JS visual proof assets should make four claims visible:

1. Rules are serializable JSON data.
2. Schema validation happens before runtime.
3. Execution is deterministic through a developer-owned registry and Synapse.
4. Explanation traces show why a rule matched or failed.

## Data policy

Do not fabricate benchmark results. When data is unavailable, use non-numeric structure, empty states, or labels such as `pending measured data`.
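
One way a chart or card renderer might honor this policy is sketched below. The function name `metricLabel` and its signature are assumptions for illustration; only the `pending measured data` label comes from the policy above.

```typescript
// Hypothetical helper: renders a metric label without inventing a number.
// When no measured value exists, it falls back to the policy's pending label.
function metricLabel(value: number | null | undefined, unit: string): string {
  if (value === null || value === undefined) {
    return "pending measured data";
  }
  return `${value} ${unit}`;
}
```

Keeping the fallback in one place makes it harder for a layout to silently substitute a default such as `0` where no measurement exists.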
93 changes: 93 additions & 0 deletions docs/benchmarks/methodology.md
@@ -0,0 +1,93 @@
# Neuron-JS benchmark methodology and result contract

Source of record: `chaos-vault/50-research/neuron-js-growth-plan.md` lines 294-318, supported by `neuron-js-marketing-assets-benchmark.md` lines 87-100 and `neuron-js-social-demand-gap.md` lines 160-169. Hindsight recall was queried for this task and returned no relevant stored memories; the vault notes above are the governing source.

This page defines the NJS-GROWTH-07 benchmark harness contract. It is intentionally conservative: no benchmark claim is valid until the executable harness emits `result_kind: "actual_benchmark"` with `is_placeholder: false` and `claims_allowed: true`.

## Competitor set

The harness/result contract covers exactly these engines for the first public proof bundle:

| Engine | Adapter key | Role |
| --- | --- | --- |
| Neuron-JS | `@sebasoft/neuron-js` | First-party rules engine under test. |
| json-rules-engine | `json-rules-engine` | Closest default Node.js JSON rules-engine competitor. |
| JsonLogic | `json-logic-js` | Portable JSON predicate format competitor. |
| Hand-coded TypeScript | `hand-coded-typescript` | Baseline for direct conditional logic without engine overhead. |
| rule-engine-js | `rule-engine-js` | Smaller modern competitor selected because it installs/builds in this repository. |

`rulepilot` is a fallback candidate, to be used only if `rule-engine-js` later becomes infeasible. Do not include both in the first chart set without updating `docs/public/benchmarks/results.schema.json`.

## Scenario matrix

| Scenario | Inputs represented | Why it exists |
| --- | --- | --- |
| `pricing-discount` | tier, region, coupon, cart total, account age | Shows business-rule pricing decisions and validation/explanation overhead. |
| `eligibility-approval` | age, country, verification status, risk score, account flags | Shows policy/approval style decisions with clear pass/fail outcomes. |
| `workflow-routing` | channel, urgency, customer segment, confidence score, escalation flags | Shows deterministic workflow routing and trace usefulness. |

## Input-size matrix

| Profile | Decisions | Usage |
| --- | ---: | --- |
| `smoke` | 100 | Correctness and trace sanity. |
| `small` | 1,000 | Local development feedback. |
| `medium` | 10,000 | Chartable throughput. |
| `large` | 100,000 | Optional; run only if runtime remains practical on CI and local machines. |

## Stable result fields

Every result row must contain these fields. Units and source definitions are duplicated in `docs/public/benchmarks/results.schema.json` for machine consumers.

| Field | Unit | Source |
| --- | --- | --- |
| `engine` | identifier | Harness adapter name for the implementation under test; fixed enum for cross-run joins. |
| `scenario` | identifier | Scenario slug produced by the benchmark scenario matrix. |
| `input_size` | profile | Named workload profile, not raw row count, so visual assets can group runs consistently. |
| `warmup_iterations` | decisions | Number of unmeasured warmup decisions completed before timing. |
| `measured_iterations` | decisions | Number of measured decisions included in timing statistics. |
| `throughput_decisions_per_second` | decisions/second | Measured decisions divided by elapsed measured wall-clock seconds. |
| `p50_ms` | milliseconds | Median per-decision latency from measured iterations. |
| `p95_ms` | milliseconds | 95th percentile per-decision latency from measured iterations. |
| `cold_start_ms` | milliseconds | Wall-clock time to load/import the engine adapter and execute the first decision in a fresh process or isolated worker. |
| `bundle_size_minified_bytes` | bytes | Minified adapter+engine bundle byte count from the configured bundler output. |
| `validation_overhead_ms` | milliseconds | Additional median latency for validation-enabled execution versus validation-disabled execution on the same scenario/input profile. |
| `explanation_overhead_ms` | milliseconds | Additional median latency for trace/explanation-enabled execution versus trace-disabled execution on the same scenario/input profile. |
| `node_version` | semver/runtime | Node.js version reported by `process.version` for the benchmark run. |
| `package_version` | semver or source | Package version or source label for the engine adapter under test. |
| `commit_sha` | git sha | Repository commit SHA for the Neuron-JS benchmark harness and local implementation. |
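
The timing fields above could be derived from raw per-decision samples roughly as follows. This is a sketch, not the harness implementation: `summarize`, `percentile`, and `overheadMs` are hypothetical names, and the nearest-rank percentile is one of several valid conventions that the methodology does not pin down.

```typescript
// Field names match the result contract; helper names are illustrative.
interface TimingSummary {
  throughput_decisions_per_second: number;
  p50_ms: number;
  p95_ms: number;
}

// Nearest-rank percentile over samples already sorted in ascending order.
function percentile(sortedAsc: number[], p: number): number {
  const rank = Math.ceil((p * sortedAsc.length) / 100);
  const index = Math.min(sortedAsc.length - 1, Math.max(0, rank - 1));
  const value = sortedAsc[index];
  if (value === undefined) throw new Error("no samples to summarize");
  return value;
}

// latenciesMs: one entry per measured decision.
// elapsedMs: wall-clock time for the whole measured phase.
function summarize(latenciesMs: number[], elapsedMs: number): TimingSummary {
  if (elapsedMs <= 0) throw new Error("elapsedMs must be positive");
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  return {
    throughput_decisions_per_second: (latenciesMs.length * 1000) / elapsedMs,
    p50_ms: percentile(sorted, 50),
    p95_ms: percentile(sorted, 95),
  };
}

// Overhead is the median with the feature enabled minus the median with it
// disabled, both taken on the same scenario and input profile.
function overheadMs(enabledP50Ms: number, disabledP50Ms: number): number {
  return enabledP50Ms - disabledP50Ms;
}
```

The same `overheadMs` delta would apply to both `validation_overhead_ms` and `explanation_overhead_ms`; only the feature being toggled differs.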

## Placeholder data policy

`benchmarks/sample-results.placeholder.json` is synthetic wiring data only. It exists so chart, carousel, README-strip, and playground work can consume stable fields before real measurements exist.

Visual/publication agents must reject files where any of these are true:

- `result_kind` is `placeholder_sample`.
- `is_placeholder` is `true`.
- `claims_allowed` is `false`.
- row `notes` contains `synthetic placeholder`.

Synthetic sample values must never be used in README copy, social posts, npm copy, or public performance claims.
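
The four rejection conditions could be checked with a small guard like the one below. This is a sketch under stated assumptions: `ResultFile` models only the fields the conditions mention, and `rejectionReasons` is a hypothetical name, not an existing harness API.

```typescript
// Only the fields needed by the rejection rules are modeled here.
interface ResultFile {
  result_kind: string;
  is_placeholder: boolean;
  claims_allowed: boolean;
  results: Array<{ notes?: string }>;
}

// Returns one reason per violated rule; an empty list means the file
// passed these four checks (it says nothing about schema validity).
function rejectionReasons(file: ResultFile): string[] {
  const reasons: string[] = [];
  if (file.result_kind === "placeholder_sample") {
    reasons.push("result_kind is placeholder_sample");
  }
  if (file.is_placeholder === true) {
    reasons.push("is_placeholder is true");
  }
  if (file.claims_allowed === false) {
    reasons.push("claims_allowed is false");
  }
  const hasPlaceholderNote = file.results.some((row) =>
    (row.notes ?? "").includes("synthetic placeholder")
  );
  if (hasPlaceholderNote) {
    reasons.push("a row note contains 'synthetic placeholder'");
  }
  return reasons;
}
```

Run against `benchmarks/sample-results.placeholder.json`, this guard would report all four reasons, which is the intended outcome for that file.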

## Initial harness contract

A future executable harness should:

1. Build adapters for `@sebasoft/neuron-js`, `json-rules-engine`, `json-logic-js`, `hand-coded-typescript`, and `rule-engine-js`.
2. Run each adapter against `pricing-discount`, `eligibility-approval`, and `workflow-routing` fixtures.
3. Execute `smoke`, `small`, and `medium` profiles by default; gate `large` behind an explicit flag.
4. Measure cold start separately from warm throughput.
5. Measure validation and explanation overhead as delta timings against the same scenario/input profile.
6. Emit JSON matching `docs/public/benchmarks/results.schema.json`.
7. Set `result_kind: "actual_benchmark"`, `is_placeholder: false`, and `claims_allowed: true` only for real measured output.
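
Steps 1 to 3 amount to enumerating an engine, scenario, and profile matrix with `large` gated. A minimal sketch of that enumeration follows; `planRuns`, `PlannedRun`, and the `includeLarge` parameter are hypothetical names, and how the real harness exposes the flag is not specified here.

```typescript
const ENGINES = [
  "@sebasoft/neuron-js",
  "json-rules-engine",
  "json-logic-js",
  "hand-coded-typescript",
  "rule-engine-js",
] as const;

const SCENARIOS = [
  "pricing-discount",
  "eligibility-approval",
  "workflow-routing",
] as const;

type Profile = "smoke" | "small" | "medium" | "large";

const DEFAULT_PROFILES: readonly Profile[] = ["smoke", "small", "medium"];

interface PlannedRun {
  engine: string;
  scenario: string;
  input_size: Profile;
}

// Builds the full run matrix. `large` is included only when explicitly
// requested, mirroring the "explicit flag" requirement in step 3.
function planRuns(includeLarge: boolean = false): PlannedRun[] {
  const profiles: Profile[] = includeLarge
    ? [...DEFAULT_PROFILES, "large"]
    : [...DEFAULT_PROFILES];
  const runs: PlannedRun[] = [];
  for (const engine of ENGINES) {
    for (const scenario of SCENARIOS) {
      for (const input_size of profiles) {
        runs.push({ engine, scenario, input_size });
      }
    }
  }
  return runs;
}
```

With the default profiles this yields 5 × 3 × 3 = 45 planned runs, and 60 when `large` is enabled. Each planned run would then produce one result row in the emitted JSON.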

## Visual asset consumers

The first visual asset bundle can safely bind to these paths:

- Schema: `docs/public/benchmarks/results.schema.json`
- Placeholder sample: `benchmarks/sample-results.placeholder.json`
- Methodology: `docs/benchmarks/methodology.md`

Use the placeholder file for layout only. Replace it with real harness output before making any chart label, README proof strip, or social asset that implies measured performance.