v0.8.1 — substrate-native tape primitives + broadcast-backward fix
What's new
Two new tape autograd primitives, plus a fix for a latent backward-broadcast bug. The fix unblocks S-MOD + substrate-K end-to-end training in OMC.
tape_phi_log(x, scale=10.0) — substrate-native fused op
Computes ln(|x · scale| + 1) / (π · ln φ) in one tape node. It replaces the four-op composition (tape_abs → tape_mul_scalar → tape_log → tape_div_scalar) with a single op whose backward derives directly from the substrate basis. Unlike the boring tape_log, which returns −∞ at 0, the fused op is defined at zero (the +1 makes the result 0 there). It also exposes π·ln φ at the AST level rather than hiding it in a scalar constant.
This is the precedent-setting substrate-native primitive. The protocol — composed reference + fused alternative + unit-level equivalence proof + end-to-end training A/B — can now be applied to other substrate-native fused ops (substrate_resample, attractor_snap, attractor-modulated-backward variants).
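As a rough illustration of the fused op and the unit-level equivalence check, here is a minimal Python sketch of the forward formula and its analytic derivative. The function names are hypothetical and are not the OMC or interpreter API. The position of the +1 inside the composed chain and the choice of a zero gradient at x = 0 are both assumptions.

```python
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0   # golden ratio
NORM = math.pi * math.log(PHI)       # the pi * ln(phi) normaliser from the formula


def phi_log_composed(x, scale=10.0):
    """Step-by-step reference: abs, scale, log, divide.

    Assumption: the +1 is applied just before the log.
    """
    a = abs(x)
    m = a * scale
    l = math.log(m + 1.0)
    return l / NORM


def phi_log_fused(x, scale=10.0):
    """Single-expression forward: ln(|x * scale| + 1) / (pi * ln(phi))."""
    return math.log(abs(x * scale) + 1.0) / NORM


def phi_log_backward(x, dy, scale=10.0):
    """Analytic gradient of phi_log_fused with respect to x.

    d/dx ln(|s*x| + 1) = s * sign(s*x) / (|s*x| + 1).
    Assumption: the gradient at x == 0 is taken to be 0.
    """
    u = x * scale
    sign = (u > 0) - (u < 0)
    return dy * scale * sign / ((abs(u) + 1.0) * NORM)


def numeric_grad(f, x, h=1e-6):
    """Central finite difference, used only to sanity-check the backward."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for x in (-0.7, 0.3, 2.0):
        print(x, phi_log_fused(x), phi_log_backward(x, 1.0), numeric_grad(phi_log_fused, x))
```

The composed and fused forwards should agree to floating-point precision for a positive scale, and the analytic backward should match the finite-difference estimate away from x = 0.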
tape_abs(x) — boring PyTorch parity
Element-wise |x|. Filled the obvious hole — the autograd tape had tape_log, tape_exp, tape_sin, etc., but no absolute value.
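For reference, the backward of an element-wise absolute value is the incoming gradient multiplied by sign(x). The sketch below uses hypothetical helper names on plain Python lists. Using 0 as the gradient at x = 0 is an assumption (it is a valid subgradient), not a statement about what the OMC tape does.

```python
def abs_forward(xs):
    """Element-wise |x| over a flat list of floats."""
    return [abs(x) for x in xs]


def abs_backward(xs, dys):
    """Gradient of |x|: dy * sign(x), with sign(0) taken as 0 (assumption)."""
    return [dy * ((x > 0) - (x < 0)) for x, dy in zip(xs, dys)]


if __name__ == "__main__":
    xs = [-2.0, 0.0, 3.0]
    print(abs_forward(xs))
    print(abs_backward(xs, [1.0, 1.0, 1.0]))
```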
Pre-existing broadcast-backward bug, fixed in the same chapter
The tape_div and tape_mul backwards panicked on column-broadcast operands, such as a [N, 1] denominator. The prom_substrate_softmax α>0 path ends in tape_div(attn_unnorm[N, N], row_sums[N, 1]), and the backward indexed bv.at(i, j) for j up to N−1 in an [N, 1] matrix, which is out of bounds. As a result, S-MOD + substrate-K had never actually trained end-to-end in OMC; it would panic at the first backward pass.
Both backwards now iterate the dy shape, reduce indices against each operand's actual extent, and accumulate gradient sums across broadcast axes. L1-MH + S-MOD α=1.0 can finally cross-validate in pure-OMC Prometheus.
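The approach described above (iterate the dy shape, clamp indices to each operand's extent, sum over broadcast axes) can be sketched in Python for the division case. This is an illustrative reconstruction on nested lists, not the code in interpreter.rs, and the names div_backward and _idx are hypothetical.

```python
def _idx(k, extent):
    """Map an output index onto an operand axis; a size-1 axis always reads index 0."""
    return k if extent > 1 else 0


def div_backward(a, b, dy):
    """Gradients of y = a / b for 2-D nested lists with row/column broadcasting.

    Loops over the shape of dy, reads each operand at its clamped index,
    and accumulates into the operand-shaped gradient so that broadcast
    axes are summed rather than indexed out of bounds.
    """
    a_rows, a_cols = len(a), len(a[0])
    b_rows, b_cols = len(b), len(b[0])
    ga = [[0.0] * a_cols for _ in range(a_rows)]
    gb = [[0.0] * b_cols for _ in range(b_rows)]
    for i in range(len(dy)):
        for j in range(len(dy[0])):
            ai, aj = _idx(i, a_rows), _idx(j, a_cols)
            bi, bj = _idx(i, b_rows), _idx(j, b_cols)
            av, bv = a[ai][aj], b[bi][bj]
            ga[ai][aj] += dy[i][j] / bv                # d(a/b)/da = 1/b
            gb[bi][bj] -= dy[i][j] * av / (bv * bv)    # d(a/b)/db = -a/b^2
    return ga, gb


if __name__ == "__main__":
    attn_unnorm = [[1.0, 2.0], [3.0, 4.0]]   # [N, N]
    row_sums = [[2.0], [4.0]]                # [N, 1], broadcast across columns
    dy = [[1.0, 1.0], [1.0, 1.0]]
    ga, gb = div_backward(attn_unnorm, row_sums, dy)
    print(ga)
    print(gb)
```

The gradient for the [N, 1] operand keeps the [N, 1] shape, with each entry holding the sum over the N columns it was broadcast across.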
A/B in pure-OMC Prometheus
examples/prometheus_q6_ab.omc, substrate-K transformer, seq_len=6, d_model=8, ff_dim=16, 80 AdamW steps, 3 seeds:
| | mean val | Δ vs off | composed − fused |
|---|---|---|---|
| off (no Q6) | 2.5692 | — | — |
| composed Q6 | 2.5530 | −0.0162 (−0.63%) | — |
| fused Q6 | 2.5530 | −0.0162 (−0.63%) | 1.2 × 10⁻⁷ |
Composed and fused agree to ~1e-7 after 80 forward+backward AdamW steps, which is the floating-point accumulation-noise floor. Under training, the substrate-native primitive matches the boring composed reference to within that noise, confirming that the fused abstraction costs nothing in accuracy.
Q6 itself wins 2/3 seeds at this tiny scale — first OMC-side cross-validation of the PyTorch Q6 finding (−12.15% 6/6 seeds at TinyShakespeare L1-MH).
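The Δ column in the table above follows directly from the reported means. A small Python check, using only the rounded values printed in the table:

```python
# Mean validation losses copied from the A/B table (rounded as published).
val_off = 2.5692
val_q6 = 2.5530   # composed and fused report the same rounded mean

delta = val_q6 - val_off              # absolute change vs. the "off" baseline
rel_pct = 100.0 * delta / val_off     # relative change in percent

print(f"delta = {delta:+.4f}, relative = {rel_pct:+.2f}%")
```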
Tests
- examples/tests/test_tape_abs_phi_log.omc: 12 primitive unit tests (forward, backward, edge cases, composed-vs-fused equivalence)
- examples/tests/test_q6_modulate.omc: 4 modulation-dispatch tests
Full suite: 1103/1103 OMC tests pass.
Files
- omnimcode-core/src/interpreter.rs: TapeOp::Abs, TapeOp::PhiLog(usize, f64), broadcast-aware Mul/Div backward
- examples/lib/prometheus.omc: prom_q6_modulate(q, scale, gamma, mode) + q6_mode field on substrate-K layer
- examples/prometheus_q6_ab.omc: A/B harness
- experiments/prometheus_parity/TAPE_PRIMITIVES_AB.md: full writeup