
v0.8.1 — substrate-native tape primitives + broadcast-backward fix


RandomCoder-lab released this 17 May 20:30
· 289 commits to master since this release

What's new

Two new tape autograd primitives and a latent backward-broadcast bug fix that unblocks S-MOD + substrate-K end-to-end training in OMC.

tape_phi_log(x, scale=10.0) — substrate-native fused op

Computes ln(|x · scale| + 1) / (π · ln φ) in one tape node. It replaces the four-op composition (tape_abs → tape_mul_scalar → tape_log → tape_div_scalar) with a single op whose backward derives directly from the substrate basis. The fused op is defined at zero (boring tape_log(0) returns −∞) and exposes π · ln φ at the AST level rather than hiding it in a scalar constant.
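As a rough illustration of the formula above, here is a scalar Python sketch of the forward value and its analytic gradient. It is not the code in interpreter.rs: the function names are hypothetical, and the choice of a zero gradient at x = 0 is an assumption made for this sketch.

```python
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0   # golden ratio
DENOM = math.pi * math.log(PHI)      # the pi * ln(phi) normaliser


def phi_log_forward(x, scale=10.0):
    """ln(|x * scale| + 1) / (pi * ln(phi)); finite at x = 0."""
    return math.log(abs(x * scale) + 1.0) / DENOM


def phi_log_backward(x, dy, scale=10.0):
    """Chain rule through |u| with u = x * scale.

    d/dx ln(|u| + 1) = scale * sign(u) / (|u| + 1).
    Assumption for this sketch: sign(0) = 0, so the gradient at 0 is 0.
    """
    u = x * scale
    sign = (u > 0) - (u < 0)
    return dy * scale * sign / ((abs(u) + 1.0) * DENOM)


def numeric_grad(x, scale=10.0, h=1e-6):
    """Central finite difference, used only to sanity-check the backward."""
    return (phi_log_forward(x + h, scale) - phi_log_forward(x - h, scale)) / (2 * h)


if __name__ == "__main__":
    print(phi_log_forward(0.0))  # 0.0, where a plain log would diverge
    for x in (0.3, -1.7):
        print(x, phi_log_backward(x, 1.0), numeric_grad(x))
```

The finite-difference comparison mirrors, at toy scale, the unit-level equivalence check mentioned in the release notes.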

This is the precedent-setting substrate-native primitive. The protocol — composed reference + fused alternative + unit-level equivalence proof + end-to-end training A/B — can now be applied to other substrate-native fused ops (substrate_resample, attractor_snap, attractor-modulated-backward variants).

tape_abs(x) — boring PyTorch parity

Element-wise |x|. Filled the obvious hole — the autograd tape had tape_log, tape_exp, tape_sin, etc., but no absolute value.
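For readers unfamiliar with how an absolute-value node differentiates, the following Python sketch shows the usual forward and backward rule on flat lists. The function names are hypothetical, and using sign(0) = 0 at the kink follows PyTorch's convention; whether OMC's tape_abs makes the same choice is an assumption.

```python
def abs_forward(xs):
    """Element-wise |x| over a flat list of floats."""
    return [abs(v) for v in xs]


def abs_backward(xs, dys):
    """Upstream gradient times sign(x); sign(0) is taken to be 0 (assumed)."""
    return [dy * ((v > 0) - (v < 0)) for v, dy in zip(xs, dys)]


if __name__ == "__main__":
    xs = [-2.0, 0.0, 3.0]
    print(abs_forward(xs))                    # [2.0, 0.0, 3.0]
    print(abs_backward(xs, [1.0, 1.0, 1.0]))  # [-1.0, 0.0, 1.0]
```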

Pre-existing broadcast-backward bug, fixed in the same chapter

tape_div and tape_mul backwards panicked on col-broadcast operands such as an [N, 1] denominator. The prom_substrate_softmax α>0 path ends in tape_div(attn_unnorm[N, N], row_sums[N, 1]), and the backward indexed bv.at(i, j) for j up to N−1 in an [N, 1] matrix, which is out of bounds. This means S-MOD + substrate-K had never actually trained end-to-end in OMC: it would panic at the first backward pass.

Both backwards now iterate the dy shape, reduce indices against each operand's actual extent, and accumulate gradient sums across broadcast axes. L1-MH + S-MOD α=1.0 can finally cross-validate in pure-OMC Prometheus.
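The fix described above can be sketched in Python on nested lists. This is an illustration of the general technique (iterate the dy shape, clamp each index to the operand's real extent, sum over broadcast axes), not a transcription of the Rust code; the helper names are hypothetical.

```python
def shape(m):
    return len(m), len(m[0])


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


def div_backward(a, b, dy):
    """Gradients of y = a / b (element-wise, broadcasting size-1 axes)."""
    (ar, ac), (br, bc) = shape(a), shape(b)
    da, db = zeros(ar, ac), zeros(br, bc)
    rows, cols = shape(dy)
    for i in range(rows):            # iterate the output (dy) shape
        for j in range(cols):
            # Reduce indices against each operand's actual extent:
            # a size-1 axis always maps to index 0.
            ai, aj = (i if ar > 1 else 0), (j if ac > 1 else 0)
            bi, bj = (i if br > 1 else 0), (j if bc > 1 else 0)
            av, bv = a[ai][aj], b[bi][bj]
            # "+=" accumulates contributions across broadcast axes.
            da[ai][aj] += dy[i][j] / bv
            db[bi][bj] -= dy[i][j] * av / (bv * bv)
    return da, db


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]   # shape [N, N]
    b = [[2.0], [4.0]]             # shape [N, 1], like row_sums
    dy = [[1.0, 1.0], [1.0, 1.0]]
    da, db = div_backward(a, b, dy)
    print(da)  # [[0.5, 0.5], [0.25, 0.25]]
    print(db)  # [[-0.75], [-0.4375]]
```

Note that db keeps the [N, 1] shape of b: each entry is the sum of the per-column contributions from its row, which is the accumulation step the unfixed backward lacked.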

A/B in pure-OMC Prometheus

examples/prometheus_q6_ab.omc, substrate-K transformer, seq_len=6, d_model=8, ff_dim=16, 80 AdamW steps, 3 seeds:

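| config      | mean val | Δ vs off          | composed − fused |
|-------------|----------|-------------------|------------------|
| off (no Q6) | 2.5692   |                   |                  |
| composed Q6 | 2.5530   | −0.0162 (−0.63%)  |                  |
| fused Q6    | 2.5530   | −0.0162 (−0.63%)  | 1.2 × 10⁻⁷       |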

Composed and fused agree to ~1e-7 after 80 forward+backward AdamW steps, which is the floating-point accumulation-noise floor. The substrate-native primitive matches the boring composed reference to within that noise under training, confirming the abstraction is free in terms of accuracy.
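The Δ vs off figures follow directly from the reported mean validation losses. A small Python check, using only the numbers quoted in these notes (the function name is hypothetical):

```python
def delta_vs_off(off, variant):
    """Absolute and relative (percent) change of a variant against the baseline."""
    delta = variant - off
    return delta, 100.0 * delta / off


if __name__ == "__main__":
    delta, pct = delta_vs_off(2.5692, 2.5530)
    print(f"{delta:+.4f} ({pct:+.2f}%)")  # -0.0162 (-0.63%)
```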

Q6 itself wins 2/3 seeds at this tiny scale — first OMC-side cross-validation of the PyTorch Q6 finding (−12.15% 6/6 seeds at TinyShakespeare L1-MH).

Tests

  • examples/tests/test_tape_abs_phi_log.omc — 12 primitive unit tests (forward, backward, edge cases, composed-vs-fused equivalence)
  • examples/tests/test_q6_modulate.omc — 4 modulation-dispatch tests

Full suite: 1103/1103 OMC tests pass.

Files

  • omnimcode-core/src/interpreter.rs — TapeOp::Abs, TapeOp::PhiLog(usize, f64), broadcast-aware Mul/Div backward
  • examples/lib/prometheus.omc — prom_q6_modulate(q, scale, gamma, mode) + q6_mode field on substrate-K layer
  • examples/prometheus_q6_ab.omc — A/B harness
  • experiments/prometheus_parity/TAPE_PRIMITIVES_AB.md — full writeup