Skip to content

unexpand: save stack size - perf +6.98%#11787

Closed
oech3 wants to merge 2 commits into
uutils:mainfrom
oech3:unexpand-stack
Closed

unexpand: save stack size - perf +6.98%#11787
oech3 wants to merge 2 commits into
uutils:mainfrom
oech3:unexpand-stack

Conversation

@oech3

@oech3 oech3 commented Apr 13, 2026

Copy link
Copy Markdown
Contributor

Replace a stack array with vec to avoid stack spill seen at rust-lang/rust#148670 .

Probally fixes #10654

@codspeed-hq

codspeed-hq Bot commented Apr 13, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 2 untouched benchmarks
⏩ 354 skipped benchmarks1


Comparing oech3:unexpand-stack (266ef23) with main (b2619df)2

Open in CodSpeed

Footnotes

  1. 354 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on main (717140a) during the generation of this report, so b2619df was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@oech3

This comment was marked as outdated.

@oech3 oech3 force-pushed the unexpand-stack branch 2 times, most recently from 628fe47 to e6c1c8f Compare April 13, 2026 06:11
@github-actions

Copy link
Copy Markdown

GNU testsuite comparison:

GNU test failed: tests/misc/io-errors. tests/misc/io-errors is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/tty/tty-eof (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/seq/seq-epipe is now being skipped but was previously passing.
Congrats! The gnu test tests/tail/tail-n0f is now passing!

@oech3 oech3 changed the title unexpand: save stack size unexpand: save stack size - perf +6.98% Apr 13, 2026
@oech3 oech3 marked this pull request as ready for review April 13, 2026 06:36
@sylvestre

Copy link
Copy Markdown
Contributor

i wasn't expecting such wins for this kind of changes

@xtqqczze

Copy link
Copy Markdown
Contributor

I don't understand why changing stack allocation to heap allocation is a performance improvement. Additionally, it may be possible that this isn't an improvement on targets other than x86_64-unknown-linux-gnu.

@oech3

oech3 commented Apr 13, 2026

Copy link
Copy Markdown
Contributor Author

I'm OK to have cfg block.

@xtqqczze

Copy link
Copy Markdown
Contributor

It would be good to understand why there is a performance difference; if necessary this could be reported upstream.

@oech3

oech3 commented Apr 13, 2026

Copy link
Copy Markdown
Contributor Author

done at rust-lang/rust#148670

@xtqqczze

Copy link
Copy Markdown
Contributor

done at rust-lang/rust#148670

Please add a comment linking to this issue; it would also have been helpful to include it in the PR description.

@xtqqczze

Copy link
Copy Markdown
Contributor

I guess it would be good to explain in the comment that the regression is specifically with codegen-units=1 + LTO.

@github-actions

Copy link
Copy Markdown

GNU testsuite comparison:

GNU test failed: tests/timeout/timeout-group. tests/timeout/timeout-group is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/tty/tty-eof (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/expand/bounded-memory is now passing!

@xtqqczze

Copy link
Copy Markdown
Contributor

I'm OK to have cfg block.

Like efd0f0c, or just x86_64-unknown-linux-gnu?

@oech3

oech3 commented Apr 14, 2026

Copy link
Copy Markdown
Contributor Author

I think cfg is overkill considering Linux-x64 is most common target (until targets significally decreasing perf was found).

@xtqqczze

Copy link
Copy Markdown
Contributor

This change is a workaround for an upstream issue and should not be applied to targets where the issue has not been demonstrated.

@oech3 oech3 marked this pull request as draft April 14, 2026 12:19
@oech3

This comment was marked as resolved.

@oech3

oech3 commented Apr 15, 2026

Copy link
Copy Markdown
Contributor Author

I guess this is no longer needed as #11806 fixed stack spill. But I'll check CodSpeed few times before closing.

@oech3 oech3 closed this Apr 15, 2026
@oech3 oech3 deleted the unexpand-stack branch April 15, 2026 08:24
@xtqqczze

Copy link
Copy Markdown
Contributor

@oech3 Thanks for looking into this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

unexpand: Performance regression with codegen-units=1

3 participants