Commit a1ab9fb
perf(inline): allow inline-safe intrinsics in callee bodies

`classify` rejected any callee carrying an internal `Call`, even
direct calls into an inline-safe `Intrinsic` (`Sqrt`, `Fabs`,
`Fma`, `Sin`, `Cos`, `Pow`, `Log`, `Exp`, …). The infrastructure
to detect these was already in place — `is_inline_safe_intrinsic`
lists them and `callee_contains_inline_safe_intrinsic` checks for
them — but the helpers were only wired into stats attribution,
never into the classifier itself.
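
A minimal sketch of the kind of wiring described above, assuming a much smaller HIR than the real one. `classify`, `is_inline_safe_intrinsic`, `Intrinsic`, `Call`, and `Unsupported` are names from this message; the `Expr` shape, `Verdict::Inlinable`, the `IntrinsicCall` variant, and the `Print` intrinsic (a stand-in for anything that is not inline-safe) are hypothetical and exist only for illustration.

```rust
/// Hypothetical intrinsic set; `Print` stands in for a non-inline-safe one.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Intrinsic { Sqrt, Fabs, Fma, Sin, Cos, Pow, Log, Exp, Print }

/// Hypothetical, heavily simplified callee body.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Const(f64),
    Param(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    /// Direct call into a compiler intrinsic.
    IntrinsicCall(Intrinsic, Vec<Expr>),
    /// Call to a user-defined function, by name.
    Call(String, Vec<Expr>),
}

#[derive(Debug, PartialEq)]
enum Verdict { Inlinable, Unsupported }

/// Assumed shape of the allow-list helper: everything except `Print`.
fn is_inline_safe_intrinsic(i: Intrinsic) -> bool {
    !matches!(i, Intrinsic::Print)
}

/// Admits inline-safe intrinsic calls; still rejects every other call.
fn classify(body: &Expr) -> Verdict {
    fn ok(e: &Expr) -> bool {
        classify(e) == Verdict::Inlinable
    }
    match body {
        Expr::Const(_) | Expr::Param(_) => Verdict::Inlinable,
        Expr::Add(a, b) | Expr::Mul(a, b) if ok(a) && ok(b) => Verdict::Inlinable,
        // The relaxation: an intrinsic call passes if the intrinsic is on
        // the allow-list and all of its arguments classify as inlinable.
        Expr::IntrinsicCall(i, args)
            if is_inline_safe_intrinsic(*i) && args.iter().all(ok) =>
        {
            Verdict::Inlinable
        }
        // User calls, unsafe intrinsics, and anything nested under them.
        _ => Verdict::Unsupported,
    }
}
```

In this sketch a body such as `IntrinsicCall(Sqrt, [Param(0)])` classifies as `Inlinable`, while any body containing a `Call` or a `Print` intrinsic still comes back `Unsupported`.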

The effect compounds on `bench_nbody_ref`: `advance(...)` calls
`sqrt(d2)` inside the j-loop. The classifier saw the sqrt call,
returned `Unsupported`, and `main`'s 10 M-iteration `advance(...)`
call site was never inlined. The j-loop body stayed opaque to LICM,
auto-vectorization, and the rest of the HIR fixed point; LLVM saw
only a hot call boundary.
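
For orientation, a Rust analogue of the loop shape in question. This is not the benchmark source: the `Body` struct, field names, and update formula follow the common n-body formulation and are assumptions. The only detail taken from this message is that `advance` computes `sqrt(d2)` inside the j-loop.

```rust
/// Hypothetical body layout for the sketch.
#[derive(Debug, Clone, Copy)]
struct Body {
    pos: [f64; 3],
    vel: [f64; 3],
    mass: f64,
}

/// One simulation step. The `sqrt` in the inner loop is the single call
/// that made the old classifier reject the whole function.
fn advance(bodies: &mut [Body], dt: f64) {
    let n = bodies.len();
    for i in 0..n {
        for j in (i + 1)..n {
            // Pairwise displacement and squared distance.
            let mut d = [0.0_f64; 3];
            for k in 0..3 {
                d[k] = bodies[i].pos[k] - bodies[j].pos[k];
            }
            let d2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2];
            // The intrinsic call inside the j-loop.
            let mag = dt / (d2 * d2.sqrt());
            let (mi, mj) = (bodies[i].mass, bodies[j].mass);
            for k in 0..3 {
                bodies[i].vel[k] -= d[k] * mj * mag;
                bodies[j].vel[k] += d[k] * mi * mag;
            }
        }
    }
    // Integrate positions with the updated velocities.
    for b in bodies.iter_mut() {
        for k in 0..3 {
            b.pos[k] += dt * b.vel[k];
        }
    }
}
```

Once a call like `advance(&mut bodies, dt)` is inlined into the caller's loop, the i/j loops and the `sqrt` become visible to loop-level passes instead of sitting behind a call boundary.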

Bench impact (macOS aarch64, `--no-cache`):

  mandelbrot: ~410 ms → 315 ms (-23 %)
  nbody:      ~1720 ms → 1581 ms (-8 %)
  nbody_ref:  ~575 ms → 566 ms (within noise; LLVM was already
              inlining `advance` via its own IPA)
  fib / inlined / free_function_call: unchanged within noise

Mirrored in `classify_recursive` (the relaxed classifier used by
the depth-1 recursive inliner) so callees admitting sqrt-style
intrinsics also become recursive-inline candidates.

1 parent d3e32a1 commit a1ab9fb
1 file changed
Lines changed: 23 additions & 0 deletions
Diff body not preserved; the surviving line numbers show two hunks:
@@ -451,6 +451,13 @@ 7 lines added (new lines 454-460)
@@ -748,12 +755,28 @@ 16 lines added (new lines 758-769 and 776-779)