Commit 2cbfd84
bench: add op_overload kernel — operator overloading at native speed

Adds a microbenchmark that exercises the operator-overload path: a
value-type `Vec3` struct with `impl Add<Vec3>`, invoked 20M times
in a tight loop. It checks whether three pieces combine to give
native arithmetic speed for user-defined operator overloads: the
SSA lowering's `try_operator_trait_dispatch`, the inliner's
intrinsic-Call admission, and LLVM's mem2reg.
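
For readers without the benchmark source at hand, here is a rough Rust analog of the kernel shape described above. It is not the file added by this commit: the `f64` field type, the `run` helper, and the use of `std::hint::black_box` are assumptions made for illustration, and the compiler pipeline named in this message is not involved.

```rust
use std::hint::black_box;
use std::ops::Add;
use std::time::Instant;

// Assumed layout: three f64 fields (the commit only says "value-type Vec3").
#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec3 {
    x: f64,
    y: f64,
    z: f64,
}

// The user-defined overload that `a + b` resolves to.
impl Add<Vec3> for Vec3 {
    type Output = Vec3;

    #[inline]
    fn add(self, rhs: Vec3) -> Vec3 {
        Vec3 {
            x: self.x + rhs.x,
            y: self.y + rhs.y,
            z: self.z + rhs.z,
        }
    }
}

// Hypothetical helper: accumulate `step` into a running sum `iters` times.
fn run(iters: u64) -> Vec3 {
    let step = Vec3 { x: 1.0, y: 2.0, z: 3.0 };
    let mut acc = Vec3 { x: 0.0, y: 0.0, z: 0.0 };
    for _ in 0..iters {
        // black_box keeps the optimizer from folding the whole loop away.
        acc = acc + black_box(step);
    }
    acc
}

fn main() {
    let iters: u64 = 20_000_000;
    let start = Instant::now();
    let acc = black_box(run(iters));
    let elapsed = start.elapsed();
    let ns_per_call = elapsed.as_nanos() as f64 / iters as f64;
    println!("{:?} in {:?} ({:.2} ns/call)", acc, elapsed, ns_per_call);
}
```

Built with optimizations, a loop like this is a reasonable way to see whether the overload costs anything beyond the three floating-point additions it performs.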

Linux/Mac result: 20M overloaded `+` calls in ~20-70ms (~1-3.5ns
per call). The post-opt LLVM IR shows the entire 10M-iteration
loop reduced to `extractvalue`/`fadd`/`insertvalue` chains with
zero residual function calls: `Vec3.add` is fully inlined into
main, then mem2reg promotes the struct field accesses to SSA
registers.
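
As a sketch of what "fully inlined" means here, the Rust analog below compares a loop that goes through the overloaded `+` with a hand-inlined loop that keeps the three fields in local scalars, which is roughly the shape the post-opt IR is described as having. It also reproduces the per-call arithmetic from the timing range. All names (`sum_with_operator`, `sum_hand_inlined`, `ns_per_call`) and the `f64` field type are assumptions for illustration, not code from this commit.

```rust
use std::ops::Add;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec3 {
    x: f64,
    y: f64,
    z: f64,
}

impl Add for Vec3 {
    type Output = Vec3;

    fn add(self, rhs: Vec3) -> Vec3 {
        Vec3 {
            x: self.x + rhs.x,
            y: self.y + rhs.y,
            z: self.z + rhs.z,
        }
    }
}

// Loop written against the overloaded operator.
fn sum_with_operator(step: Vec3, iters: u32) -> Vec3 {
    let mut acc = Vec3 { x: 0.0, y: 0.0, z: 0.0 };
    for _ in 0..iters {
        acc = acc + step;
    }
    acc
}

// The same loop after manual inlining: no call, no struct in memory,
// just three scalar additions per iteration.
fn sum_hand_inlined(step: Vec3, iters: u32) -> Vec3 {
    let (mut x, mut y, mut z) = (0.0, 0.0, 0.0);
    for _ in 0..iters {
        x += step.x;
        y += step.y;
        z += step.z;
    }
    Vec3 { x, y, z }
}

// Convert a total wall time in milliseconds to nanoseconds per call.
fn ns_per_call(total_ms: f64, calls: f64) -> f64 {
    total_ms * 1.0e6 / calls
}

fn main() {
    let step = Vec3 { x: 0.5, y: 0.25, z: 1.0 };
    // Both forms perform the same additions in the same order.
    assert_eq!(sum_with_operator(step, 1000), sum_hand_inlined(step, 1000));
    println!(
        "{} to {} ns per call",
        ns_per_call(20.0, 20.0e6),
        ns_per_call(70.0, 20.0e6)
    );
}
```

If the optimizer does its job, the two functions should compile to essentially the same machine code, which is the property the benchmark is probing.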

This benchmark is the closest thing in the suite to a test of
whether a devirtualization pass would help us. The answer is no:
operator overloads on concrete types are already resolved at SSA
lowering and fully inlined.

1 parent a1ab9fb commit 2cbfd84
2 files changed
Lines changed: 31 additions & 0 deletions
File 1 of 2: 26 lines added (new lines 1-26); diff content not captured.
File 2 of 2: 5 lines added (new lines 193-197, after existing line 192); diff content not captured.