Phase 13 — Benchmarks & Chaos

Goal

Measure Brain against the spec's latency and throughput targets, then break it on purpose to validate the recovery story. The output is a set of reproducible baselines plus a chaos harness that drives the Phase 14 acceptance gates.

Prerequisites

  • Phase 12 complete (phase-12-complete tag). Benchmarks read the metric counters that Phase 12 wires; chaos asserts use the same counters to verify recovery.

Reading list

  1. spec/19_benchmarks/02_performance_targets.md
  2. spec/19_benchmarks/04_benchmark_methodology.md
  3. spec/18_failure_recovery/07_chaos_testing.md

Outputs

  • Per-operation criterion benches in each runtime crate.
  • benches/load_generator.rs — sustained-rate end-to-end load harness.
  • tests/chaos/ — kill-at-point, I/O fault, network failure, corruption injection scenarios.
  • tests/soak/ — 48 h continuous-load rig (run on dedicated infra; not CI).
  • Performance report committed to docs/performance/baselines-<date>.md.
  • Tag: phase-13-complete.

Sub-tasks

Task 13.1 — Per-operation criterion benches

Reads: spec §02/02, §02/03, §02/07. Writes: benches/*.rs in brain-storage, brain-index, brain-ops, brain-planner, brain-server (one bench harness per crate; one benchmark per spec'd operation). Done when: every cognitive operation has a criterion baseline; the results table is committed to docs/performance/baselines-<date>.md; and the spec §14 latency targets are met on reference hardware.

Task 13.2 — End-to-end load generator

Writes: benches/load_generator.rs (binary), which sustains a configurable rate of mixed encode / recall / link traffic over the SDK and reports p50/p95/p99 latency and per-op error rates. Done when: the generator hits the spec §02/03 throughput targets without saturating the CPU and emits a CSV summary suitable for diffing across runs.
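
The reporting half of the generator can be sketched with the standard library alone. The block below is a minimal illustration, not the shipped harness: it assumes nearest-rank percentiles, and the names `OpSummary`, `summarize`, `csv_row`, and the column layout in `CSV_HEADER` are hypothetical choices made for this example.

```rust
use std::time::Duration;

/// Latency summary for one operation type (encode, recall, or link).
/// Hypothetical shape; the real report format is defined by the harness.
#[derive(Debug, PartialEq)]
pub struct OpSummary {
    pub op: String,
    pub requests: usize,
    pub errors: usize,
    pub p50: Duration,
    pub p95: Duration,
    pub p99: Duration,
}

/// Nearest-rank percentile over an ascending-sorted slice.
/// `pct` is a whole percentage in 1..=100; returns None for an empty slice.
pub fn percentile(sorted: &[Duration], pct: usize) -> Option<Duration> {
    if sorted.is_empty() {
        return None;
    }
    // rank = ceil(pct * n / 100), computed in integers to avoid float rounding.
    let rank = (pct * sorted.len() + 99) / 100;
    let idx = rank.clamp(1, sorted.len()) - 1;
    Some(sorted[idx])
}

/// Summarize the successful-request latencies plus an error count for one op.
/// Returns None when there are no successful requests to take percentiles of.
pub fn summarize(op: &str, mut latencies: Vec<Duration>, errors: usize) -> Option<OpSummary> {
    latencies.sort_unstable();
    Some(OpSummary {
        op: op.to_string(),
        requests: latencies.len() + errors,
        errors,
        p50: percentile(&latencies, 50)?,
        p95: percentile(&latencies, 95)?,
        p99: percentile(&latencies, 99)?,
    })
}

/// Assumed column layout: one row per op, fixed precision so runs diff cleanly.
pub const CSV_HEADER: &str = "op,requests,errors,error_rate,p50_us,p95_us,p99_us";

pub fn csv_row(s: &OpSummary) -> String {
    let error_rate = s.errors as f64 / s.requests as f64;
    format!(
        "{},{},{},{:.4},{},{},{}",
        s.op,
        s.requests,
        s.errors,
        error_rate,
        s.p50.as_micros(),
        s.p95.as_micros(),
        s.p99.as_micros()
    )
}
```

Printing `CSV_HEADER` followed by one `csv_row` per operation gives a file that diffs line by line across runs. Integer microseconds and a fixed-precision error rate keep formatting noise out of the diff.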

Task 13.3 — Chaos harness

Reads: spec §02/07. Writes: tests/chaos/{kill_during_wal_write,io_fault,network_partition,bit_flip,resource_exhaustion}.rs. Done when: each scenario reproduces the spec'd failure mode and asserts the spec'd recovery behaviour (no data loss, no silent corruption, fail-stop where mandated), and Loom tests cover the select concurrency-critical paths flagged in §02/07.
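
To make the "no silent corruption" assertion concrete, here is a minimal sketch of what a bit_flip scenario checks. Everything in it is an assumption for illustration: the record framing (length, payload, trailing checksum), the names `encode_record`, `decode_record`, and `flip_bit`, and the use of FNV-1a as a stand-in for whatever checksum Brain's storage layer actually uses.

```rust
/// FNV-1a 64-bit. A stand-in checksum for this sketch only; a real record
/// format would more likely use a CRC.
fn fnv1a64(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in data {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

#[derive(Debug, PartialEq)]
pub enum RecordError {
    Truncated,
    ChecksumMismatch,
}

/// Hypothetical framing: [len: u32 LE][payload][checksum: u64 LE],
/// where the checksum covers the length prefix and the payload.
pub fn encode_record(payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(4 + payload.len() + 8);
    buf.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    buf.extend_from_slice(payload);
    let checksum = fnv1a64(&buf);
    buf.extend_from_slice(&checksum.to_le_bytes());
    buf
}

/// Returns the payload, or an error if the frame is damaged. It never
/// returns Ok with bytes that fail verification (fail-stop on corruption).
pub fn decode_record(buf: &[u8]) -> Result<Vec<u8>, RecordError> {
    if buf.len() < 12 {
        return Err(RecordError::Truncated);
    }
    let mut len_bytes = [0u8; 4];
    len_bytes.copy_from_slice(&buf[..4]);
    let len = u32::from_le_bytes(len_bytes) as usize;
    if len.checked_add(12) != Some(buf.len()) {
        return Err(RecordError::Truncated);
    }
    let (body, tail) = buf.split_at(4 + len);
    let mut sum_bytes = [0u8; 8];
    sum_bytes.copy_from_slice(tail);
    if fnv1a64(body) != u64::from_le_bytes(sum_bytes) {
        return Err(RecordError::ChecksumMismatch);
    }
    Ok(body[4..].to_vec())
}

/// Corruption injector: flip exactly one bit, addressed by bit index.
pub fn flip_bit(buf: &mut [u8], bit: usize) {
    buf[bit / 8] ^= 1u8 << (bit % 8);
}
```

A scenario built on this would encode a record, flip each bit in turn on a copy, and assert that `decode_record` returns an error every time. Sweeping every bit position rather than sampling a few is cheap for small records and catches framing fields that are not covered by the checksum.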

Task 13.4 — Soak rig

Writes: tests/soak/{driver,asserts}.rs, which drive sustained mixed traffic for 48 h, sample memory, fd count, and latency every 60 s, and fail the run if memory growth, latency drift, or error rate exceeds the spec §02/04 thresholds. Done when: the soak completes one 48 h run on dedicated infra with no failures and the results land in docs/performance/soak-<date>.md.

Status: scaffolding shipped as crates/brain-sdk-rust/examples/soak.rs. The driver, sampler, and drift checker are CLI-driven; the 48 h reference run is operator-side (spec §02/15 puts soak at a "weekly" cadence, never in CI). The final exit code reflects pass/fail, and the CSV plus the SOAK_RESULT pass=true|false ... summary line are suitable for committing to docs/performance/soak-<date>.md.
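
One way a drift check over the 60 s samples could work is sketched below. It is not a description of what soak.rs does. It assumes a least-squares slope over resident memory to flag leaks and a first-quarter versus last-quarter comparison of p99 latency to flag drift. The types `Sample`, `Thresholds`, and `Verdict`, and the threshold fields, are hypothetical; the real limits come from spec §02/04.

```rust
/// One periodic sample. Hypothetical shape for this sketch.
#[derive(Debug, Clone, Copy)]
pub struct Sample {
    pub elapsed_secs: f64,
    pub rss_bytes: f64,
    pub p99_us: f64,
}

/// Placeholder limits; real values belong to the spec, not this sketch.
pub struct Thresholds {
    pub max_rss_growth_bytes_per_hour: f64,
    pub max_p99_ratio: f64,
}

#[derive(Debug, PartialEq)]
pub enum Verdict {
    Pass,
    MemoryLeak,
    LatencyDrift,
    NotEnoughData,
}

/// Ordinary least-squares slope of RSS against time, in bytes per second.
fn rss_slope_per_sec(samples: &[Sample]) -> f64 {
    let n = samples.len() as f64;
    let mean_x = samples.iter().map(|s| s.elapsed_secs).sum::<f64>() / n;
    let mean_y = samples.iter().map(|s| s.rss_bytes).sum::<f64>() / n;
    let mut sxy = 0.0;
    let mut sxx = 0.0;
    for s in samples {
        let dx = s.elapsed_secs - mean_x;
        sxy += dx * (s.rss_bytes - mean_y);
        sxx += dx * dx;
    }
    if sxx == 0.0 { 0.0 } else { sxy / sxx }
}

fn mean_p99(samples: &[Sample]) -> f64 {
    samples.iter().map(|s| s.p99_us).sum::<f64>() / samples.len() as f64
}

pub fn check_drift(samples: &[Sample], t: &Thresholds) -> Verdict {
    // Too few points makes both the slope and the window means meaningless.
    if samples.len() < 8 {
        return Verdict::NotEnoughData;
    }
    let growth_per_hour = rss_slope_per_sec(samples) * 3600.0;
    if growth_per_hour > t.max_rss_growth_bytes_per_hour {
        return Verdict::MemoryLeak;
    }
    // Compare the last quarter of the run against the first quarter.
    let window = samples.len() / 4;
    let head = mean_p99(&samples[..window]);
    let tail = mean_p99(&samples[samples.len() - window..]);
    if head > 0.0 && tail / head > t.max_p99_ratio {
        return Verdict::LatencyDrift;
    }
    Verdict::Pass
}
```

A fitted slope is less sensitive to a single allocation spike than a comparison of first and last samples, which matters over a 48 h window. The same pattern would extend to fd count by adding a field and a second slope check.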

Phase exit checklist

  • Sub-tasks 13.1–13.4 scaffolded.
  • Performance baselines doc + per-crate criterion benches shipped.
  • Storage-layer chaos (random_kill, bit_flip, io_fault) green in CI.
  • Operator-run 48 h soak result recorded in docs/performance/soak-<date>.md.
  • Network-partition / resource-exhaustion / time-anomaly chaos (operator-infra dependent; tracked as Phase 14 acceptance scenarios).
  • Tag phase-13-complete once the soak result lands.

Notes

Benchmarks need a quiet machine: no other tenants, a fixed CPU governor, no thermal throttling. The methodology doc (spec/19_benchmarks/04_benchmark_methodology.md in the reading list) covers this; follow it precisely or the numbers are worthless.
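
A bench run can refuse to start when the governor is not pinned. The sketch below assumes a Linux host that exposes the cpufreq sysfs interface, and it assumes that "fixed" means the `performance` governor; the methodology doc may require something different. The function names are hypothetical.

```rust
use std::fs;
use std::io;

/// Path of the cpufreq governor file for one logical CPU (Linux sysfs).
pub fn governor_path(cpu: usize) -> String {
    format!("/sys/devices/system/cpu/cpu{}/cpufreq/scaling_governor", cpu)
}

/// Assumption for this sketch: only `performance` counts as a fixed governor.
pub fn is_pinned_governor(raw: &str) -> bool {
    raw.trim() == "performance"
}

/// Reads the governor for `cpu`. Returns Err if the file is missing, for
/// example on a non-Linux host or a VM without cpufreq support.
pub fn cpu_is_pinned(cpu: usize) -> io::Result<bool> {
    let raw = fs::read_to_string(governor_path(cpu))?;
    Ok(is_pinned_governor(&raw))
}
```

Treating a missing sysfs file as an error rather than as "pinned" keeps the check conservative: a host where the governor cannot be read is not a host whose numbers should land in a baseline.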

Chaos tests intentionally bring the process down. Run them in a sandbox; never against a real corpus.