Commit 7d06433
Rollup merge of #153936 - danielzgtg:perf/immediateAbortAvoidPthreadGetattrNp, r=Mark-Simulacrum
Skip stack_start_aligned for immediate-abort
This improves startup performance by 16%, as measured with an optimized hello-world program. glibc's `pthread_getattr_np` performs expensive syscalls to read `/proc/self/maps`. All of that work is wasted when `panic = "immediate-abort"` is active, because `init()` immediately discards the return value of `install_main_guard()`. A similar improvement can be seen in environments that don't have `/proc`. This change is safe because the comment immediately following it says that we rely on Linux's "own stack-guard mechanism".
Tracking issue: rust-lang/rust#147286
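The idea behind the change can be sketched in a few lines. This is a simplified model, not the actual `std` code: `query_stack_start`, `expensive_calls`, the `immediate_abort` parameter, the addresses, and the page size are all hypothetical stand-ins. In particular, the runtime `bool` stands in for what would be a compile-time decision in the real change, and the call counter replaces the real `pthread_getattr_np` cost.

```rust
use std::cell::Cell;
use std::ops::Range;

// Assumed page size, for illustration only.
const PAGE_SIZE: usize = 4096;

thread_local! {
    // Counts how often the simulated expensive stack query runs on this thread.
    static EXPENSIVE_CALLS: Cell<u32> = Cell::new(0);
}

/// Hypothetical stand-in for the stack lookup that ends up in glibc's
/// `pthread_getattr_np`. Here it only bumps a counter and returns a fake address.
fn query_stack_start() -> usize {
    EXPENSIVE_CALLS.with(|c| c.set(c.get() + 1));
    0x7000_0000
}

/// Number of simulated expensive queries made so far on this thread.
fn expensive_calls() -> u32 {
    EXPENSIVE_CALLS.with(|c| c.get())
}

/// Simplified model of `install_main_guard()`. When the caller would discard
/// the result anyway (the immediate-abort case), return early and never pay
/// for the stack query.
fn install_main_guard(immediate_abort: bool) -> Option<Range<usize>> {
    if immediate_abort {
        return None;
    }
    let start = query_stack_start();
    Some(start..start + PAGE_SIZE)
}
```

Calling `install_main_guard(true)` returns `None` and leaves `expensive_calls()` unchanged, while `install_main_guard(false)` performs exactly one simulated query.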
# Benchmark
Set it up with `cargo new hello-world2`, and replace these files:
```toml
# Cargo.toml
cargo-features = ["panic-immediate-abort"]
[package]
name = "hello-world2"
version = "0.1.0"
edition = "2024"
[profile.release]
lto = true
panic = "immediate-abort"
codegen-units = 1
opt-level = "z"
strip = true
# .cargo/config.toml
[unstable]
build-std = ["std"]
```
## Before
```console
home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
Time (mean ± σ): 524.8 µs ± 65.1 µs [User: 276.1 µs, System: 187.0 µs]
Range (min … max): 446.4 µs … 975.5 µs 3996 runs
home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
Time (mean ± σ): 519.4 µs ± 65.8 µs [User: 282.1 µs, System: 177.7 µs]
Range (min … max): 443.2 µs … 830.5 µs 3612 runs
home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
Time (mean ± σ): 520.0 µs ± 64.3 µs [User: 277.1 µs, System: 182.1 µs]
Range (min … max): 447.1 µs … 1001.3 µs 3804 runs
```
For a visualization of the problem, run `cargo +stage1 build --release && perf record --call-graph dwarf -F max ./target/release/hello-world2 && perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg`:
<img width="3832" height="1216" alt="flamegraph with 17.41% __pthread_getattr_np" src="https://github.com/user-attachments/assets/acc2286e-1582-4772-9e3b-68b5c35e3e70" />
## After
```console
home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
Time (mean ± σ): 444.7 µs ± 57.3 µs [User: 257.4 µs, System: 130.2 µs]
Range (min … max): 379.4 µs … 1289.3 µs 3893 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
Time (mean ± σ): 452.3 µs ± 60.7 µs [User: 261.5 µs, System: 133.5 µs]
Range (min … max): 374.9 µs … 1512.4 µs 4177 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
Time (mean ± σ): 441.2 µs ± 56.1 µs [User: 256.2 µs, System: 128.8 µs]
Range (min … max): 375.0 µs … 760.4 µs 4032 runs
```
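To put the two sets of runs side by side, the reported means can be averaged and compared. The sketch below only does arithmetic on the mean times copied from the hyperfine output above. The helper names are made up for illustration, and averaging the three per-run means is a rough summary, not a statistical test.

```rust
/// Mean wall-clock times in microseconds, copied from the "Before" runs.
const BEFORE_US: [f64; 3] = [524.8, 519.4, 520.0];
/// Mean wall-clock times in microseconds, copied from the "After" runs.
const AFTER_US: [f64; 3] = [444.7, 452.3, 441.2];

/// Arithmetic mean of a slice of samples.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Fraction of the original run time that was removed.
fn time_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before
}

/// How many times faster the new binary starts.
fn speedup(before: f64, after: f64) -> f64 {
    before / after
}
```

With these numbers, `time_reduction(mean(&BEFORE_US), mean(&AFTER_US))` comes out at roughly 0.14 and `speedup(...)` at roughly 1.17, so the size of the improvement depends slightly on which of the two ratios is quoted.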