Commit 8d8c214
🥂 v0.8.4 substrate-builtins: fused AdamW Rust builtin → 40× CPU / 96× GPU end-to-end
Three Rust builtins replace the OMC-side inner-loop helpers that were
the v0.8.2 wall-clock bottleneck:
  substrate_smod_matrix(scores, alpha)
  substrate_resample_matrix(v, scale)
  substrate_adamw_update(cur, grad, m, v, lr, b1, b2, eps, wd, step)
The first two (modulator matrix construction) did NOT move end-to-end
wall-clock when shipped alone. Profiling-by-fixing found the real
bottleneck: prom_adamw_step. It ran ~15 OMC-side element-wise loops
per parameter per step, of the form
_prom_zip(_prom_scale(...), _prom_scale(...), "add"), chained through
several stages. At d_model=256 with 6 params, that comes to ~6M OMC
ops per training step.
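To illustrate why fusing helps, here is a minimal sketch of a single-pass AdamW update in Rust. It is not the code from interpreter.rs: the function name `adamw_update_fused`, the `AdamWConfig` struct, the flat `f64` slices, and the exact placement of `eps` and weight decay are assumptions, using the standard decoupled-weight-decay form of AdamW. Only the argument names (cur, grad, m, v, lr, b1, b2, eps, wd, step) come from the builtin's signature above.

```rust
/// Hyperparameters for one AdamW step. Field names mirror the builtin's
/// argument list; the struct itself is hypothetical.
#[derive(Clone, Copy)]
pub struct AdamWConfig {
    pub lr: f64,
    pub b1: f64,
    pub b2: f64,
    pub eps: f64,
    pub wd: f64,
}

/// Hypothetical fused AdamW update: every stage of the optimizer runs in
/// one pass over the flat buffers instead of one interpreted loop per stage.
/// `step` is 1-based. `cur`, `m`, and `v` are updated in place.
pub fn adamw_update_fused(
    cur: &mut [f64],
    grad: &[f64],
    m: &mut [f64],
    v: &mut [f64],
    cfg: AdamWConfig,
    step: u32,
) {
    assert!(step >= 1, "step is 1-based");
    assert!(
        cur.len() == grad.len() && cur.len() == m.len() && cur.len() == v.len(),
        "all buffers must have the same length"
    );

    // Bias-correction denominators depend only on the step, so compute once.
    let bc1 = 1.0 - cfg.b1.powi(step as i32);
    let bc2 = 1.0 - cfg.b2.powi(step as i32);

    for i in 0..cur.len() {
        let g = grad[i];
        // First and second moment estimates.
        m[i] = cfg.b1 * m[i] + (1.0 - cfg.b1) * g;
        v[i] = cfg.b2 * v[i] + (1.0 - cfg.b2) * g * g;
        let m_hat = m[i] / bc1;
        let v_hat = v[i] / bc2;
        // Adam step plus decoupled weight decay, applied together.
        cur[i] -= cfg.lr * (m_hat / (v_hat.sqrt() + cfg.eps) + cfg.wd * cur[i]);
    }
}
```

In this form each element is read and written once per step, whereas the chained _prom_scale / _prom_zip version walks the whole parameter once per stage and allocates intermediates along the way.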
Replacing the AdamW inner block with one Rust builtin:
                       CPU             GPU
  v0.8.2 baseline      25.81 s/step    25.88 s/step
  v0.8.4 modulators    26.38 s/step    26.28 s/step   ← no change
  v0.8.4 + AdamW        0.65 s/step     0.27 s/step   ← 40× / 96×
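The headline figures can be reproduced from the numbers already given. The sketch below is only arithmetic on the reported step times, plus a rough op-count estimate. The helper names are hypothetical, and the op count assumes each of the 6 parameters is a square d_model × d_model matrix, which the commit message does not state.

```rust
/// Speedup of `after` relative to `before` (both in seconds per step).
pub fn speedup(before: f64, after: f64) -> f64 {
    before / after
}

/// Rough count of element-wise interpreter ops per training step.
/// ASSUMPTION: every parameter is a d_model x d_model matrix.
pub fn estimated_ops_per_step(d_model: u64, n_params: u64, loops_per_param: u64) -> u64 {
    d_model * d_model * n_params * loops_per_param
}

/// Reported step times in seconds, copied from the table.
pub const BASELINE_CPU: f64 = 25.81;
pub const BASELINE_GPU: f64 = 25.88;
pub const FUSED_CPU: f64 = 0.65;
pub const FUSED_GPU: f64 = 0.27;
```

With these, `speedup(BASELINE_CPU, FUSED_CPU)` is about 39.7 and `speedup(BASELINE_GPU, FUSED_GPU)` is about 95.9, which round to the 40× and 96× in the title. `estimated_ops_per_step(256, 6, 15)` gives 5,898,240, consistent with the ~6M figure under the square-matrix assumption.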
The three chapters now compound:
  v0.8.2  wired the GPU in (no end-to-end win; OMC overhead dominated)
  v0.8.3  found the substrate-shaped 8×32 tile (114 GFLOPS; no end-to-end change)
  v0.8.4  removes the OMC overhead, so both prior wins finally pay out
The GPU/CPU gap at v0.8.4 is 2.4× (0.65 s vs 0.27 s per step), which is
what we'd expect from the matmul speedup at d_model=256. Future scale-ups
(d_model=512+, multi-block, longer sequences) get BOTH benefits
compositionally.
Loss agrees with v0.8.2 to within 5e-5 (f32 GPU roundtrip noise); the
training trajectory is otherwise identical.
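A parity check of this kind can be expressed as a maximum absolute difference between two per-step loss traces. The sketch below is an assumption about how such a check might be written, not the harness used for this commit. The function names are hypothetical, and the tolerance is passed in by the caller.

```rust
/// Largest absolute per-step difference between two loss traces.
/// Returns None if the traces are empty or differ in length.
pub fn max_abs_diff(a: &[f64], b: &[f64]) -> Option<f64> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    Some(
        a.iter()
            .zip(b)
            .map(|(x, y)| (x - y).abs())
            .fold(0.0, f64::max),
    )
}

/// True when both traces have the same length and never differ by more
/// than `tol` at any step (e.g. tol = 5e-5 for the agreement quoted above).
pub fn trajectories_agree(a: &[f64], b: &[f64], tol: f64) -> bool {
    matches!(max_abs_diff(a, b), Some(d) if d <= tol)
}
```

Comparing step by step rather than only the final loss catches a run that diverges mid-training and happens to converge to a similar value.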
What this unlocks immediately:
- L1-MH + S-MOD α=1.0 OMC cross-validation (task #264)
- Larger-scale substrate-attention (task #265)
- Q6 OMC cross-validation at real training length (v0.8.1 was 80 steps)
Files:
  omnimcode-core/src/interpreter.rs                          three builtins + flatten helpers
  examples/lib/prometheus.omc                                wrappers + adamw uses builtin
  examples/tests/test_substrate_modulator_builtins.omc       8 tests
  experiments/prometheus_parity/SUBSTRATE_BUILTINS_WIN.md
1111/1111 OMC tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent d1fa0a2 commit 8d8c214
5 files changed
Lines changed: 591 additions & 59 deletions