Commit e9b4b4c
[NFC] cache repeated tree walks to avoid O(N^2) in optimizeTerminatingTails in CodeFolding (#8602)
Cache the result of getBranchTargets(getFunction()->body) in
optimizeTerminatingTails so that recursive calls share the same computed
set rather than each re-walking the entire function body. This avoids
O(N²) behavior where N is the size of the function body, since the
recursive calls previously each performed an O(N) tree walk. The cached
targets are computed lazily on first need and passed through to the
canMove overload that accepts pre-computed branch targets.
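
The caching pattern described above can be sketched roughly as follows. This is an illustrative toy, not the actual CodeFolding code: `Node`, `collectBranchTargets`, `countMovable`, `runExample`, and `walkCount` are hypothetical names, and the tree is a stand-in for Binaryen's expression IR. It only shows the idea of computing the target set lazily once and sharing it across recursive calls.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <string>
#include <unordered_set>
#include <vector>

// Toy stand-in for an expression tree (hypothetical; NOT Binaryen's IR).
struct Node {
  std::string branchTarget; // empty if this node does not branch anywhere
  std::vector<std::unique_ptr<Node>> children;
};

using Targets = std::unordered_set<std::string>;

// Counts full-body walks so the effect of caching is observable.
static int walkCount = 0;

static void collect(const Node& node, Targets& out) {
  if (!node.branchTarget.empty()) {
    out.insert(node.branchTarget);
  }
  for (const auto& child : node.children) {
    collect(*child, out);
  }
}

// O(N) walk over the whole body, similar in spirit to
// getBranchTargets(getFunction()->body).
Targets collectBranchTargets(const Node& body) {
  ++walkCount;
  Targets out;
  collect(body, out);
  return out;
}

// Hypothetical recursive pass. Each level needs the target set, but `cache`
// is shared by reference, so the body is walked at most once overall.
int countMovable(const Node& body,
                 const std::vector<std::string>& names,
                 std::size_t depth,
                 std::optional<Targets>& cache) {
  if (depth == names.size()) {
    return 0;
  }
  if (!cache) {
    cache = collectBranchTargets(body); // computed lazily on first need
  }
  // A name counts as "movable" here if nothing in the body branches to it.
  int movable = cache->count(names[depth]) ? 0 : 1;
  return movable + countMovable(body, names, depth + 1, cache);
}

// Builds a two-node body and runs the pass over three candidate names.
int runExample() {
  Node body;
  body.children.push_back(std::make_unique<Node>());
  body.children.back()->branchTarget = "loop";
  body.children.push_back(std::make_unique<Node>());
  body.children.back()->branchTarget = "exit";

  std::optional<Targets> cache;
  walkCount = 0;
  int movable = countMovable(body, {"loop", "exit", "other"}, 0, cache);
  assert(walkCount == 1); // three recursive levels, one tree walk
  return movable;
}
```

Calling `runExample()` from a `main` exercises the sketch. Without the shared `std::optional` cache, each of the three recursive levels would trigger its own full walk, which is the quadratic pattern this change removes.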
## Benchmark data
For the test case in
#7319 (comment)
Main head:
```shell
time ./build/bin/wasm-opt --code-folding --enable-bulk-memory --enable-multivalue --enable-reference-types --enable-gc --enable-tail-call --enable-exception-handling -o /dev/null ./test3.wasm
real 5m45.996s
user 6m6.267s
sys 0m3.798s
```
This PR:
```shell
time ./build/bin/wasm-opt --code-folding --enable-bulk-memory --enable-multivalue --enable-reference-types --enable-gc --enable-tail-call --enable-exception-handling -o /dev/null ./test3.wasm
real 2m2.380s
user 2m25.700s
sys 0m2.449s
```
## Benchmark regression test
Test case:
https://jetbrains.github.io/kotlinconf-app/73cbe24d7cf5a54d37ad.wasm
On main:
```shell
Performance counter stats for 'build/bin/wasm-opt 73cbe24d7cf5a54d37ad.wasm -all --code-folding -o /dev/null' (10 runs):
4837936912 task-clock # 1.445 CPUs utilized ( +- 0.51% )
114 context-switches # 23.564 /sec ( +- 7.58% )
7 cpu-migrations # 1.447 /sec ( +- 16.88% )
46271 page-faults # 9.564 K/sec ( +- 0.00% )
13431328103 instructions # 1.21 insn per cycle ( +- 0.01% )
11125222873 cycles # 2.300 GHz ( +- 0.51% )
64641504 branch-misses ( +- 1.26% )
3.3484 +- 0.0221 seconds time elapsed ( +- 0.66% )
```
On this PR:
```shell
Performance counter stats for 'build/bin/wasm-opt 73cbe24d7cf5a54d37ad.wasm -all --code-folding -o /dev/null' (10 runs):
4802304211 task-clock # 1.437 CPUs utilized ( +- 0.47% )
125 context-switches # 26.029 /sec ( +- 6.50% )
8 cpu-migrations # 1.666 /sec ( +- 14.20% )
46272 page-faults # 9.635 K/sec ( +- 0.00% )
13391520427 instructions # 1.21 insn per cycle ( +- 0.01% )
11043221889 cycles # 2.300 GHz ( +- 0.47% )
59021679 branch-misses ( +- 1.24% )
3.3427 +- 0.0207 seconds time elapsed ( +- 0.62% )
```

1 parent 3180c6f commit e9b4b4c
1 file changed
Lines changed: 29 additions & 6 deletions