Optimizations for stack checks when no over/under flow is present by ehildenb · Pull Request #2709 · runtimeverification/evm-semantics

ehildenb · 2025-02-24T15:23:07Z

This is pulled out of: #2676

Specifically, we are trying to optimize for the cases where the stack checks on under/over-flow do not fail, by directly going to false on those cases when possible rather than computing the size of the workstack (using pattern-matching and faster checks). In the cases where it's potentially failing that check, we fall back to the slow cases which use the size of the wordstack to compute whether there is a stack under/over-flow.

The number of functional tests run in parallel is also increased, as that is causing timeouts (and has on other PRs too).

ehildenb · 2025-03-08T22:35:10Z

I've tested this change on the KEVM testsuite, and th eresults are:

Old KEVM performance (~ October 1, 2024):

| 0  3700.71s  29909.88s  1194.00s  4856364KB  make  test-prove-rules
| 0  150.26s   4463.26s   156.98s   4351856KB  make  test-prove-rules
| 0  1148.06s  13852.91s  557.06s   4319956KB  make  test-prove-functional
| 0  1943.14s  5832.27s   277.76s   6320464KB  make  test-prove-dss

Recent commit performance (#2708):

| 0  263.72s   959.42s    145.14s   3550608KB  make  test-conformance
| 0  4533.65s  26726.91s  1572.02s  4888348KB  make  test-prove-rules
| 0  148.51s   3504.65s   188.85s   5414092KB  make  test-prove-rules
| 0  1096.45s  13932.87s  864.27s   4485892KB  make  test-prove-functional
| 0  2082.26s  6067.04s   492.18s   7511896KB  make  test-prove-dss
| 0  374.59s   624.64s    43.66s    4751756KB  make  test-prove-optimizations

Change to how stack size is calculated (#2676):

| 0  272.93s   994.45s    143.15s   3550144KB  make  test-conformance           
| 0  4978.87s  28938.35s  1697.08s  6320588KB  make  test-prove-rules           
| 0  188.91s   3804.77s   200.43s   5511620KB  make  test-prove-rules           
| 0  1289.10s  16608.18s  970.64s   5386320KB  make  test-prove-functional      
| 0  2222.29s  6286.28s   528.47s   6907620KB  make  test-prove-dss             
| 0  371.96s   625.33s    42.14s    4292280KB  make  test-prove-optimizations

This PR:

| 0  316.40s   976.87s    141.21s   3551052KB  make  test-conformance
| 0  4508.56s  26117.61s  2227.72s  6082708KB  make  test-prove-rules
| 0  186.36s   3813.14s   196.55s   5548900KB  make  test-prove-rules
| 0  1274.51s  16359.70s  969.84s   5159188KB  make  test-prove-functional
| 0  2113.92s  6202.39s   480.33s   6641140KB  make  test-prove-dss
| 0  376.39s   627.61s    42.54s    4587436KB  make  test-prove-optimizations

So the concrete execution time went up a bit, but symbolic execution came back down. I think it's because in #2676, an optimization is made to try and avoid computing workstack sizes using lazy evaluation (something I believe the LLVM backend does, because it's safe for concrete execution, but the Haskell backend doesn't): https://github.com/runtimeverification/evm-semantics/pull/2676/files#diff-f2f12e7ffac586c70477ba2c731d8d36e08ee0b8c294a5ef86d31337212b0264L384. So this PR instead introduces it as pattern-matching directly on the rules for checking stack-over/under flow which should be fast on both backends. It does seem to make the LLVM backend run a bit longer, I will measure again and see if that's an artifact or a real slowdown.

Ping @Stevengre

ehildenb · 2025-03-09T01:48:02Z

A second run of the tests on this PR demonstrates that performance is just flaky it appears:

| 0  271.35s   975.51s    136.87s   3551292KB  make  test-conformance
| 0  4977.40s  28368.89s  1822.61s  5886528KB  make  test-prove-rules
| 0  190.56s   3841.91s   198.26s   6167356KB  make  test-prove-rules
| 0  1280.60s  16542.60s  969.63s   5381176KB  make  test-prove-functional
| 0  2139.70s  6162.45s   483.59s   7090388KB  make  test-prove-dss

ehildenb · 2025-04-09T14:48:43Z

I've checked through the commit history, and it seems that the performance regression I measured was flakiness on my machine. Re-measuring shows that we are at similar performanec as we were in October of last year.

ehildenb · 2025-04-11T17:22:21Z

I re-measured, and it seems this does not appreciably change any performance metrics:

4c6d363 (this commit):

0  260.34s   961.21s    148.01s   3550336KB  make  test-conformance
0  4320.84s  26591.10s  1590.71s  4880424KB  make  test-prove-rules
0  182.21s   4016.21s   218.30s   4817136KB  make  test-prove-rules
0  1072.70s  13834.73s  873.19s   4730224KB  make  test-prove-functional
0  1940.01s  5739.74s   455.03s   6460600KB  make  test-prove-dss
0  363.20s   615.14s    42.37s    4407692KB  make  test-prove-optimizations

db7d25e (current master):

0  260.05s   965.22s    147.81s   3550496KB  make  test-conformance
0  4285.44s  26589.53s  1594.99s  4657916KB  make  test-prove-rules
0  179.79s   3956.92s   230.66s   5090760KB  make  test-prove-rules
0  1068.31s  13783.91s  912.02s   4774488KB  make  test-prove-functional
0  1988.47s  5787.01s   470.42s   6686868KB  make  test-prove-dss
0  355.21s   602.21s    41.81s    3833360KB  make  test-prove-optimizations

Looking at specific concrete execution specs, it appears that there are some tests that benefit, mostly ones that hammer the wordstack size, but most do not.

ehildenb self-assigned this Feb 24, 2025

ehildenb force-pushed the stack-check-optim branch 2 times, most recently from afd1734 to c0852fc Compare February 27, 2025 15:24

ehildenb added 3 commits March 5, 2025 17:59

kevm-pyk/evm: use function, not macro, for stack under/over-flow

3ba1eb5

kevm-pyk/evm: optimize for non-overflow cases on stack checks

386c5d1

kevm-pyk/evm.md: optimize for non-underflow stack check

51deb06

ehildenb force-pushed the stack-check-optim branch from c0852fc to 51deb06 Compare March 8, 2025 22:39

Merge remote-tracking branch 'upstream/master' into stack-check-optim

7117dde

ehildenb added 2 commits April 9, 2025 21:52

.github/test-pr: increase functional proof parallelism

69bfd3a

Merge remote-tracking branch 'upstream/master' into stack-check-optim

4c6d363

ehildenb closed this Apr 11, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimizations for stack checks when no over/under flow is present#2709

Optimizations for stack checks when no over/under flow is present#2709
ehildenb wants to merge 6 commits into
masterfrom
stack-check-optim

ehildenb commented Feb 24, 2025 •

edited

Loading

Uh oh!

ehildenb commented Mar 8, 2025 •

edited

Loading

Uh oh!

ehildenb commented Mar 9, 2025

Uh oh!

ehildenb commented Apr 9, 2025

Uh oh!

ehildenb commented Apr 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ehildenb commented Feb 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ehildenb commented Mar 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ehildenb commented Mar 9, 2025

Uh oh!

ehildenb commented Apr 9, 2025

Uh oh!

ehildenb commented Apr 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ehildenb commented Feb 24, 2025 •

edited

Loading

ehildenb commented Mar 8, 2025 •

edited

Loading