Commit d24d5c3
perf(inline): depth-1 recursive inlining pass
Adds `inline::run_module_recursive`, a sibling to the standard
inliner that unrolls one level of direct self-recursive calls.
It operates on a snapshot of the function body taken before any
modification. The snapshot's own self-calls still reference the
function id, so inlining the snapshot at one of the live function's
call sites leaves those self-calls behind as real recursive calls,
one level deeper. Because the snapshot is fixed, there is no
infinite expansion.
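As a source-level illustration only (the pass itself works on the compiler's IR, which this commit message does not show), one level of unrolling on naive fib might look roughly like the sketch below. `fib_unrolled` is a hypothetical hand-written analogue, not output of the pass.

```rust
// Naive Fibonacci: two direct self-recursive calls per invocation.
fn fib(n: u64) -> u64 {
    if n < 2 { n } else { fib(n - 1) + fib(n - 2) }
}

// Hand-written analogue of one level of unrolling. Each call site in the
// live body is replaced by a copy of the original (snapshot) body. The
// calls inside each copy still target the function itself, so they stay
// behind as real recursive calls.
fn fib_unrolled(n: u64) -> u64 {
    if n < 2 {
        return n;
    }
    // Inlined copy of the snapshot body for the call with argument n - 1.
    let a = {
        let m = n - 1;
        if m < 2 { m } else { fib_unrolled(m - 1) + fib_unrolled(m - 2) }
    };
    // Inlined copy of the snapshot body for the call with argument n - 2.
    let b = {
        let m = n - 2;
        if m < 2 { m } else { fib_unrolled(m - 1) + fib_unrolled(m - 2) }
    };
    a + b
}
```

Both functions compute the same values; the unrolled form simply does more work per real call, which is where any saving in call overhead would come from.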
A relaxed classifier permits direct self-`Call`s inside the body
(those become the residual); the standard classifier rejects every
internal `Call` as `Unsupported`. Otherwise the same shape rules
apply: small total instruction count, no Atomic / Fence / Indirect,
single-value Return.
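A minimal sketch of how the strict and relaxed classifiers could differ, assuming a toy instruction set. `Inst`, `Verdict`, `classify`, and the `allow_self_calls` flag are hypothetical names for illustration; only the rules themselves come from the description above.

```rust
// Toy instruction set; the real IR is not shown in this commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Inst {
    Add,
    Call { callee: u32 },
    Indirect,
    Atomic,
    Fence,
    Return { values: usize },
}

#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Inlinable,
    Unsupported,
}

// With `allow_self_calls == false` this mimics the standard classifier
// (every internal call is rejected). With `true` it mimics the relaxed
// one: direct calls to `self_id` are tolerated, everything else is as before.
fn classify(self_id: u32, body: &[Inst], max_insts: usize, allow_self_calls: bool) -> Verdict {
    if body.len() > max_insts {
        return Verdict::Unsupported;
    }
    for inst in body {
        let ok = match *inst {
            Inst::Add => true,
            Inst::Call { callee } => allow_self_calls && callee == self_id,
            Inst::Indirect | Inst::Atomic | Inst::Fence => false,
            Inst::Return { values } => values == 1,
        };
        if !ok {
            return Verdict::Unsupported;
        }
    }
    Verdict::Inlinable
}
```

A body containing a direct self-call is `Unsupported` under the strict setting and `Inlinable` under the relaxed one; a call to any other function is rejected either way.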
Runs once, after the optimisation fixed-point and before tco /
drop_insert. Body cap is 96 instructions (~3x growth per inline keeps
the worst case under 300); per-function site cap is 4. Naive fib,
mandel_count, and tight tail-recursive accumulators all fit. Anything
outside these limits or shape rules is skipped.
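The budget arithmetic can be checked with a small sketch. The constant and function names are hypothetical, and both the inclusive comparison and the reading of "~3x" as a plain multiplier on the body cap are assumptions.

```rust
// Caps quoted in the commit message; the names are illustrative only.
const BODY_CAP: usize = 96;
const SITE_CAP: usize = 4;
// Approximate growth factor per inline, as stated ("~3x").
const GROWTH_FACTOR: usize = 3;

// Assumption: the cap is inclusive, so a 96-instruction body still qualifies.
fn within_body_cap(body_len: usize) -> bool {
    body_len <= BODY_CAP
}

// Worst-case size under the stated growth factor: 96 * 3 = 288.
fn worst_case_size() -> usize {
    BODY_CAP * GROWTH_FACTOR
}
```

Under that reading, the largest eligible body grows to 288 instructions, which stays below the 300 mentioned above. `SITE_CAP` is recorded here only for completeness; how the pass treats functions with more than 4 sites is not stated.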
Bench impact (macOS aarch64, `--no-cache`):
  fib(40):                     490 ms → 343 ms (30% faster)
  mandelbrot:                  410 ms → 325 ms (21% faster; mandel_count inlines)
  nbody/_ref:                  unchanged within noise
  inlined/free_function_call:  unchanged within noise
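The percentages above are reductions in wall time relative to the old figure. A small helper (hypothetical, not part of the commit) reproduces them:

```rust
// Percentage reduction in running time: (before - after) / before * 100.
fn percent_faster(before_ms: f64, after_ms: f64) -> f64 {
    (before_ms - after_ms) / before_ms * 100.0
}
```

For fib(40), `percent_faster(490.0, 343.0)` gives about 30; for mandelbrot, `percent_faster(410.0, 325.0)` gives about 20.7, which rounds to the quoted 21.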
Three unit tests cover the unroll shape, the non-recursive
no-op, and the size budget.

1 parent 41c5f09 commit d24d5c3
2 files changed
Lines changed: 434 additions & 0 deletions