+→ cascade L1..L4 (L1-L3 only if A12b slipped) → B-Compose (alpha) →
+emit
+- **Client**: stock `<video>` element over HLS, no special player
+- **Metrics**: median FPS, p95 frame time, stutter events (count of
+inter-frame gaps > 33 ms), exposed on a Prom endpoint
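The three metrics in the bullet above can be derived from a list of frame timestamps. The following is a minimal sketch, not the plan's implementation: `FrameStats`, `frame_stats`, and `percentile` are hypothetical names, median FPS is assumed to mean 1000 divided by the median inter-frame gap in milliseconds, and the percentile uses the nearest-rank method. Only the 33 ms stutter threshold comes from the text.

```rust
/// Summary of one smoke run. Field names are illustrative, not from the plan.
#[derive(Debug, Clone, PartialEq)]
pub struct FrameStats {
    pub median_fps: f64,
    pub p95_frame_ms: f64,
    pub stutter_events: usize,
}

/// Stutter threshold stated in the plan: an inter-frame gap above 33 ms.
pub const STUTTER_MS: f64 = 33.0;

/// Nearest-rank percentile over a non-empty, ascending-sorted slice.
/// `q` is a fraction in 0.0..=1.0.
fn percentile(sorted: &[f64], q: f64) -> f64 {
    let rank = (q * sorted.len() as f64).ceil() as usize;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Derive the three metrics from frame timestamps given in milliseconds.
/// Returns `None` when fewer than two frames were observed.
pub fn frame_stats(timestamps_ms: &[f64]) -> Option<FrameStats> {
    if timestamps_ms.len() < 2 {
        return None;
    }
    // Inter-frame gaps between consecutive timestamps.
    let mut gaps: Vec<f64> = timestamps_ms.windows(2).map(|w| w[1] - w[0]).collect();
    // Count stutters before sorting; the count does not depend on order.
    let stutter_events = gaps.iter().filter(|&&g| g > STUTTER_MS).count();
    gaps.sort_by(|a, b| a.total_cmp(b));
    // Assumption: median FPS is the reciprocal of the median gap.
    let median_gap = percentile(&gaps, 0.5);
    Some(FrameStats {
        median_fps: 1000.0 / median_gap,
        p95_frame_ms: percentile(&gaps, 0.95),
        stutter_events,
    })
}
```

Calling `frame_stats` on the timestamps collected during a run yields the values a metrics endpoint could export; how they are exposed is outside this sketch.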
+
+### Gates
+
+| Gate | Target | What it proves |
+|---|---|---|
+| **SG1: median FPS** | ≥ 60 fps for a 1080p H.264 input, 10-minute Big Buck Bunny | Steady-state throughput; the cascade isn't dropping behind |
+| **SG2: p95 frame time** | ≤ 20 ms | No bundle is silently decomposing into its constituent ops |
+| **SG3: stutter count** | 0 events > 33 ms over 10 minutes | Every bundle honors its latency contract under sustained load |
+| **SG4: closure swap** | Same SG1-SG3 envelope with `splat4d-nars-compose` feature on | NARS-revision path has the same latency class as alpha, as designed |
+
+SG4 is conditional on G3 (the NARS truth-revision kernel) being
+ULP-correct against the scalar reference; an SG4 failure with G3
+passing is evidence that the NARS B-Compose kernel has a worse latency
+class than alpha and must be re-staged before the W7 closure swap.
 ## Forbidden constraints

 Five invariants the sprint MUST NOT violate:
@@ -353,11 +418,29 @@ Five invariants the sprint MUST NOT violate:
 worker stubs the L4 path (returns `Err(NotReadyL4)`) and ships L1-L3
 addressing only.

-2. **No `crate::simd::*` extension from inside PR-X4**. Any new SIMD
-primitive (e.g., a missing lane width for G2 INT4×32) must be
-proposed against `vertical-simd-consumer-contract.md` and land in
-ndarray's `src/simd_*.rs` before PR-X4 consumes it. PR-X4 must not
-reach for raw `std::arch::*` intrinsics.
+2. **PR-X4 consumes — and must not extend — the following SIMD bundles
+from `ndarray::simd`.** Each bundle is a fused multi-op transaction
+with its own latency budget; reaching past a bundle into raw
+`std::arch::*` intrinsics, or proposing new lane primitives without
+going through `vertical-simd-consumer-contract.md`, breaks the
+contract and re-introduces the bespoke-binner pathology v1 is
+leaving behind.
+
+| Bundle | Composition | Cognitive role |
+|---|---|---|
+| **B-Splat** | `splat_f32x16`, `splat_i32x16` | Broadcast a Gaussian center / NARS truth-value across the 16 tile lanes of a single L_k cell. The identity of a single belief across its support. |
+| **B-Gather-FMA** | `gather_idx_f32x16` ∘ `fmadd_f32x16` | Pick up the 16 neighbouring Gaussians of a tile and fuse-multiply-add their contributions in one shot. Evidence-aggregation across siblings. |
+| **B-Pack-Dot** | `pack_int4x32` ∘ `dot_i4x32_to_i32` ∘ `dequant_f32` | The INT4×32 packed dot of A4. SH-coefficient evaluation, NARS confidence × frequency products. Three backends (AVX-512 VNNI, NEON UDOT, scalar) with parity tests. |
+| **B-Cascade-Permute** | `shuffle_lanes_4x4` ∘ `transpose_16x16` | Cross-tier rotation L_k → L_{k+1}. The 4×4 stride identity made executable — without this bundle the cascade is just a hierarchy of independent grids. |
+| **B-Compose** | `hreduce_sum_f32x16` for alpha; `revise_truth_f32x16` for NARS | Closure-swappable horizontal reduction. The `splat4d-nars-compose` feature gate selects which kernel binds; same lane width, same latency class, different algebra. |
+| **B-Interleave-Transpose** | `interleave_f32x16` ∘ `transpose_inplace` | Row-major splat3d ↔ lane-major splat4d. Boundary primitive between v1 binner and v2 cascade. |
+
+The forbidden thing is reaching past a bundle into its internal lane
+primitives — that breaks the latency contract that the A6 Railway
+smoke gates (SG2 p95 ≤ 20 ms, SG3 zero stutter) are designed to
+falsify. Missing bundles get proposed against
+`vertical-simd-consumer-contract.md` and land in `src/simd_*.rs`
+before PR-X4 consumes them; PR-X4 itself never adds a primitive.
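The B-Compose row describes a feature gate that picks one of two kernels with the same lane width. A minimal scalar sketch of that binding pattern follows. Everything in it is an assumption except the feature name `splat4d-nars-compose`: the real signatures of `hreduce_sum_f32x16` and `revise_truth_f32x16` are not given here, so `compose_alpha`, `compose_nars_confidence`, `ComposeKernel`, and `B_COMPOSE` are hypothetical stand-ins. The NARS stand-in covers only the confidence half of the revision rule, with evidential horizon k = 1.

```rust
/// Shared kernel shape: 16 lanes in, one scalar out (assumed signature).
pub type ComposeKernel = fn(&[f32; 16]) -> f32;

/// Scalar stand-in for the alpha path: a horizontal sum of the 16 lanes.
pub fn compose_alpha(lanes: &[f32; 16]) -> f32 {
    lanes.iter().sum()
}

/// Scalar stand-in for the NARS path, confidence only.
/// Each lane is read as a confidence c in [0, 1), converted to evidence
/// w = c / (1 - c), pooled by addition, then mapped back via w / (w + 1).
pub fn compose_nars_confidence(lanes: &[f32; 16]) -> f32 {
    let w: f32 = lanes
        .iter()
        // Clamp below 1.0 so a saturated lane cannot divide by zero.
        .map(|&c| {
            let c = c.clamp(0.0, 0.999_999);
            c / (1.0 - c)
        })
        .sum();
    w / (w + 1.0)
}

/// The feature gate decides at compile time which kernel binds.
/// Call sites use `B_COMPOSE` and never name a kernel directly.
#[cfg(not(feature = "splat4d-nars-compose"))]
pub const B_COMPOSE: ComposeKernel = compose_alpha;

#[cfg(feature = "splat4d-nars-compose")]
pub const B_COMPOSE: ComposeKernel = compose_nars_confidence;
```

Because both kernels share one function-pointer type, swapping the feature changes the algebra without touching any caller, which is the property SG4 is meant to test for latency.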

 3. **No write to lance-graph upstream**. PR-X4 lives entirely in
 ndarray (`src/hpc/splat3d_v2/`, `src/hpc/splat4d/`). It consumes
@@ -427,12 +510,16 @@ Five invariants the sprint MUST NOT violate:
 PR-X4 promotes splat3d from "bespoke 16×16 tile binner" to "typed
 multi-resolution cognitive evolution operator" with the
 (4×4)×(4×4)×(4×4)×(4×4) tier scheme as its load-bearing structural
-identity. Slots at W4-W5 (5 workers). Consumes GridLake +
+identity. Slots at W4-W5 (6 workers). Consumes GridLake +