Skip to content

Optimizations for stack checks when no over/under flow is present#2709

Closed
ehildenb wants to merge 6 commits into
masterfrom
stack-check-optim
Closed

Optimizations for stack checks when no over/under flow is present#2709
ehildenb wants to merge 6 commits into
masterfrom
stack-check-optim

Conversation

@ehildenb
Copy link
Copy Markdown
Member

@ehildenb ehildenb commented Feb 24, 2025

This is pulled out of: #2676

Specifically, we are trying to optimize for the cases where the stack checks on under/over-flow do not fail, by directly going to false on those cases when possible rather than computing the size of the workstack (using pattern-matching and faster checks). In the cases where it's potentially failing that check, we fall back to the slow cases which use the size of the wordstack to compute whether there is a stack under/over-flow.

The number of functional tests run in parallel is also increased, as that is causing timeouts (and has on other PRs too).

@ehildenb ehildenb self-assigned this Feb 24, 2025
@ehildenb ehildenb force-pushed the stack-check-optim branch 2 times, most recently from afd1734 to c0852fc Compare February 27, 2025 15:24
@ehildenb
Copy link
Copy Markdown
Member Author

ehildenb commented Mar 8, 2025

I've tested this change on the KEVM testsuite, and th eresults are:

Old KEVM performance (~ October 1, 2024):

| 0  3700.71s  29909.88s  1194.00s  4856364KB  make  test-prove-rules
| 0  150.26s   4463.26s   156.98s   4351856KB  make  test-prove-rules
| 0  1148.06s  13852.91s  557.06s   4319956KB  make  test-prove-functional
| 0  1943.14s  5832.27s   277.76s   6320464KB  make  test-prove-dss

Recent commit performance (#2708):

| 0  263.72s   959.42s    145.14s   3550608KB  make  test-conformance
| 0  4533.65s  26726.91s  1572.02s  4888348KB  make  test-prove-rules
| 0  148.51s   3504.65s   188.85s   5414092KB  make  test-prove-rules
| 0  1096.45s  13932.87s  864.27s   4485892KB  make  test-prove-functional
| 0  2082.26s  6067.04s   492.18s   7511896KB  make  test-prove-dss
| 0  374.59s   624.64s    43.66s    4751756KB  make  test-prove-optimizations

Change to how stack size is calculated (#2676):

| 0  272.93s   994.45s    143.15s   3550144KB  make  test-conformance           
| 0  4978.87s  28938.35s  1697.08s  6320588KB  make  test-prove-rules           
| 0  188.91s   3804.77s   200.43s   5511620KB  make  test-prove-rules           
| 0  1289.10s  16608.18s  970.64s   5386320KB  make  test-prove-functional      
| 0  2222.29s  6286.28s   528.47s   6907620KB  make  test-prove-dss             
| 0  371.96s   625.33s    42.14s    4292280KB  make  test-prove-optimizations   

This PR:

| 0  316.40s   976.87s    141.21s   3551052KB  make  test-conformance
| 0  4508.56s  26117.61s  2227.72s  6082708KB  make  test-prove-rules
| 0  186.36s   3813.14s   196.55s   5548900KB  make  test-prove-rules
| 0  1274.51s  16359.70s  969.84s   5159188KB  make  test-prove-functional
| 0  2113.92s  6202.39s   480.33s   6641140KB  make  test-prove-dss
| 0  376.39s   627.61s    42.54s    4587436KB  make  test-prove-optimizations

So the concrete execution time went up a bit, but symbolic execution came back down. I think it's because in #2676, an optimization is made to try and avoid computing workstack sizes using lazy evaluation (something I believe the LLVM backend does, because it's safe for concrete execution, but the Haskell backend doesn't): https://github.com/runtimeverification/evm-semantics/pull/2676/files#diff-f2f12e7ffac586c70477ba2c731d8d36e08ee0b8c294a5ef86d31337212b0264L384. So this PR instead introduces it as pattern-matching directly on the rules for checking stack-over/under flow which should be fast on both backends. It does seem to make the LLVM backend run a bit longer, I will measure again and see if that's an artifact or a real slowdown.

Ping @Stevengre

@ehildenb ehildenb force-pushed the stack-check-optim branch from c0852fc to 51deb06 Compare March 8, 2025 22:39
@ehildenb
Copy link
Copy Markdown
Member Author

ehildenb commented Mar 9, 2025

A second run of the tests on this PR demonstrates that performance is just flaky it appears:

| 0  271.35s   975.51s    136.87s   3551292KB  make  test-conformance
| 0  4977.40s  28368.89s  1822.61s  5886528KB  make  test-prove-rules
| 0  190.56s   3841.91s   198.26s   6167356KB  make  test-prove-rules
| 0  1280.60s  16542.60s  969.63s   5381176KB  make  test-prove-functional
| 0  2139.70s  6162.45s   483.59s   7090388KB  make  test-prove-dss

@ehildenb
Copy link
Copy Markdown
Member Author

ehildenb commented Apr 9, 2025

I've checked through the commit history, and it seems that the performance regression I measured was flakiness on my machine. Re-measuring shows that we are at similar performanec as we were in October of last year.

@ehildenb
Copy link
Copy Markdown
Member Author

I re-measured, and it seems this does not appreciably change any performance metrics:

4c6d363 (this commit):

0  260.34s   961.21s    148.01s   3550336KB  make  test-conformance
0  4320.84s  26591.10s  1590.71s  4880424KB  make  test-prove-rules
0  182.21s   4016.21s   218.30s   4817136KB  make  test-prove-rules
0  1072.70s  13834.73s  873.19s   4730224KB  make  test-prove-functional
0  1940.01s  5739.74s   455.03s   6460600KB  make  test-prove-dss
0  363.20s   615.14s    42.37s    4407692KB  make  test-prove-optimizations

db7d25e (current master):

0  260.05s   965.22s    147.81s   3550496KB  make  test-conformance
0  4285.44s  26589.53s  1594.99s  4657916KB  make  test-prove-rules
0  179.79s   3956.92s   230.66s   5090760KB  make  test-prove-rules
0  1068.31s  13783.91s  912.02s   4774488KB  make  test-prove-functional
0  1988.47s  5787.01s   470.42s   6686868KB  make  test-prove-dss
0  355.21s   602.21s    41.81s    3833360KB  make  test-prove-optimizations

Looking at specific concrete execution specs, it appears that there are some tests that benefit, mostly ones that hammer the wordstack size, but most do not.

@ehildenb ehildenb closed this Apr 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant