JIT lowering for binary/ternary/array-input harmonic primitives
Extends the harmonic-intrinsic JIT (commit 621a80a) with three new
arity classes and adds a hot-path measurement showing the real-world
impact.
NEWLY JIT'D (8 primitives):
  Binary i64,i64 -> i64:
    gcd(a, b)         - Euclidean GCD
    lcm(a, b)         - least common multiple
    safe_mod(a, b)    - substrate-folded modulus
  Ternary i64,i64,i64 -> i64:
    mod_pow(b, e, m)  - fast modular exponentiation
  Array-input i64(ptr) -> i64 (uses L1.6 buffer layout):
    arr_sum_int(arr)  - wrapping sum
    arr_product(arr)  - wrapping product
    arr_min_int(arr)  - min element
    arr_max_int(arr)  - max element
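For orientation, the binary and ternary shims could look roughly like the sketch below. The `shim_*` names, the handling of zero, negative, and overflowing inputs, and the use of `i128` intermediates are assumptions made for illustration; the actual shims in lib.rs may differ. `safe_mod` is left out because its substrate-folding rule is not described here.

```rust
/// Hypothetical shim: Euclidean GCD on magnitudes (gcd(0, n) == |n|).
pub extern "C" fn shim_gcd(a: i64, b: i64) -> i64 {
    let (mut x, mut y) = (a.unsigned_abs(), b.unsigned_abs());
    while y != 0 {
        let r = x % y;
        x = y;
        y = r;
    }
    // Wraps only when the result is 2^63 (both inputs i64::MIN or one is 0).
    x as i64
}

/// Hypothetical shim: least common multiple, wrapping on overflow.
/// Assumes lcm(0, n) == 0.
pub extern "C" fn shim_lcm(a: i64, b: i64) -> i64 {
    if a == 0 || b == 0 {
        return 0;
    }
    let g = shim_gcd(a, b);
    // Divide before multiplying to keep the intermediate small.
    (a / g).wrapping_mul(b).wrapping_abs()
}

/// Hypothetical shim: square-and-multiply modular exponentiation.
/// Assumes a non-positive modulus or negative exponent yields 0.
pub extern "C" fn shim_mod_pow(base: i64, exp: i64, modulus: i64) -> i64 {
    if modulus <= 0 || exp < 0 {
        return 0;
    }
    let m = modulus as i128;
    let mut result: i128 = 1 % m; // 0 when m == 1
    let mut b = (base as i128).rem_euclid(m);
    let mut e = exp as u64;
    while e > 0 {
        if e & 1 == 1 {
            result = result * b % m; // operands < 2^63, product fits in i128
        }
        b = b * b % m;
        e >>= 1;
    }
    result as i64
}
```

All three take and return plain `i64` values, which is what would let a JIT call them directly with no boxing.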
The array-input pattern is what completes the integration story.
The intrinsic shim reads `*p` as the length and walks slots 1..=len,
exactly as the L1.6 input bridge does: same layout, same extern fn
signature. This lets a JIT'd fn build an array inline (or receive one
as a parameter) and aggregate it without bouncing back to tree-walk
for the reduction.
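A minimal sketch of that array-input pattern, assuming the buffer is a contiguous run of `i64` slots with the length in slot 0 and the elements in slots 1..=len. The `shim_*` names, the `elements` helper, and the empty-array results are hypothetical and not taken from lib.rs.

```rust
/// Hypothetical helper: view slots 1..=len of a length-prefixed buffer.
///
/// Safety: `p` must point to at least `1 + len` readable i64 slots,
/// where `len` is the value stored in slot 0.
unsafe fn elements<'a>(p: *const i64) -> &'a [i64] {
    unsafe {
        // Slot 0 holds the length; treat a negative value as empty (assumption).
        let len = usize::try_from(*p).unwrap_or(0);
        std::slice::from_raw_parts(p.add(1), len)
    }
}

/// Wrapping sum over the elements.
pub unsafe extern "C" fn shim_arr_sum_int(p: *const i64) -> i64 {
    let xs = unsafe { elements(p) };
    xs.iter().fold(0i64, |acc, &x| acc.wrapping_add(x))
}

/// Wrapping product over the elements.
pub unsafe extern "C" fn shim_arr_product(p: *const i64) -> i64 {
    let xs = unsafe { elements(p) };
    xs.iter().fold(1i64, |acc, &x| acc.wrapping_mul(x))
}

/// Minimum element; returns 0 for an empty array (assumption).
pub unsafe extern "C" fn shim_arr_min_int(p: *const i64) -> i64 {
    let xs = unsafe { elements(p) };
    xs.iter().copied().min().unwrap_or(0)
}

/// Maximum element; returns 0 for an empty array (assumption).
pub unsafe extern "C" fn shim_arr_max_int(p: *const i64) -> i64 {
    let xs = unsafe { elements(p) };
    xs.iter().copied().max().unwrap_or(0)
}
```

Because every reduction shares the same `i64(ptr) -> i64` shape, a single call signature on the JIT side could cover all four.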
ARCHITECTURAL NOTE — float-returning intrinsics deferred:
harmony_value(n) and value_danger(n) return f64 (as i64 bit-pattern
in the JIT). The cli-side dispatch closure currently wraps every
i64 return as Value::HInt, which would corrupt the float
interpretation when these are returned at the top level. The shims
exist in lib.rs but are NOT in the dual_band intercept table. To
enable, mirror the L1.6 returns_array_int plumbing for a
returns_float flag (small additional work; deferred because nothing
in current hot paths needs it).
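To illustrate the corruption described above and what a returns_float flag would change, here is a small sketch. Only `Value::HInt` is named in this commit; the `HFloat` variant, the `wrap_jit_return` function, and its signature are hypothetical stand-ins for the cli-side dispatch closure.

```rust
/// Hypothetical stand-in for the interpreter's value type.
#[derive(Debug, PartialEq)]
pub enum Value {
    HInt(i64),
    HFloat(f64), // assumed variant name
}

/// Wrap a raw i64 coming back from JIT'd code.
/// With `returns_float` set, the i64 is reinterpreted as an f64 bit pattern
/// instead of being treated as an integer.
pub fn wrap_jit_return(raw: i64, returns_float: bool) -> Value {
    if returns_float {
        Value::HFloat(f64::from_bits(raw as u64))
    } else {
        Value::HInt(raw)
    }
}
```

Without the flag, a shim returning 0.5 as a bit pattern would surface as a very large integer rather than 0.5, which is the failure mode the note warns about.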
HOT-PATH IMPACT — NSL-KDD harmonic_anomaly fit:
  Tree-walk:                              363 ms
  JIT (pre-L1.6, arrays in dispatch):     363 ms (no JIT actually used)
  JIT (post-L1.6 input bridge):           191 ms (1.9x)
  JIT (+ harmonic-primitive intrinsics):  107 ms (3.4x total)
The same 15 of 53 user fns are JIT'd (ha.score etc.); the additional
speedup comes from their INTERNAL calls to attractor_distance /
is_attractor / nth_fibonacci / etc. now being native instead of
bouncing back to tree-walk for each invocation.
TESTS: 24 in jit_harmonic_intrinsics.rs (16 existing + 8 new: gcd,
lcm, safe_mod, mod_pow, arr_sum_int internal, arr_product internal,
arr_min/max_int internal, plus a combined substrate workload that
exercises NewArray + ArrayIndex + harmonic_unalign + ArrayLen
without tree-walk fallback). All pass. 77 codegen tests total
(69 existing + 8 new). 161 OMC tests still green.
docs/jit_real_world.md updated with the 3.4x measurement.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
docs/jit_real_world.md (3 additions, 2 deletions)
@@ -113,9 +113,10 @@ Same workload (`examples/datascience/nsl_kdd_validation.omc`, 5000 rows):
 |---|--:|--:|
 | Tree-walk | 363 ms | n/a |
 | JIT (pre-L1.6) | 363 ms | 1 of 4 user fns |
-|**JIT (post-L1.6)**|**191 ms**|**15 of 53 user fns** (incl. `ha.score`) |
+| JIT (post-L1.6) | 191 ms | 15 of 53 user fns (incl. `ha.score`) |
+|**JIT (+ harmonic-primitive intrinsics)**|**107 ms**| 15 of 53 user fns (same fns, but inner harmonic calls now native) |
 
-**1.9× wall-clock speedup on the real harmonic_anomaly workload.** The hot-loop fn `ha.score` now actually runs through the JIT instead of falling back to tree-walk.
+**3.4× wall-clock speedup on the real harmonic_anomaly workload.** The hot-loop fn `ha.score` now runs through the JIT, and its inner calls to `attractor_distance` / `is_attractor` / `nth_fibonacci` / etc. are also native code instead of bouncing back to the tree-walk builtin dispatch per call.
 
 Synthetic microbench (sum over arr_range(0, 1000), 1000 iterations):