Commit 9aa54a3
Cut dynamic-linker startup syscalls
The dynamic-linker bring-up storm was the largest remaining startup band
after pull request #34. Adding a per-syscall histogram pointed at the
sidecar walker as the dominant openat cost (61% of getent startup) and
the per-call path_translation_t memset as the second source.
src/debug/syscall-hist.[ch]: opt-in histogram via
ELFUSE_STARTUP_TRACE=syscalls (or =all alongside the existing step
trace). Lock-free atomic counters per Linux syscall number, sorted
total-ns descending in the dump. Records freeze on the first successful
execve so steady-state traffic does not pollute the startup picture.
Fork children disable the histogram explicitly because they resume from
a parent snapshot, not a fresh bring-up.
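A minimal sketch of how such a lock-free histogram could look, assuming C11 atomics. Every name here (hist_record, hist_freeze, HIST_MAX_SYSCALL) and the table size are hypothetical and are not taken from src/debug/syscall-hist.[ch]:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper bound on Linux syscall numbers tracked. */
#define HIST_MAX_SYSCALL 512

typedef struct {
    _Atomic uint64_t calls;    /* number of invocations */
    _Atomic uint64_t total_ns; /* accumulated time in nanoseconds */
} hist_slot_t;

static hist_slot_t hist_slots[HIST_MAX_SYSCALL];
static atomic_bool hist_frozen; /* true once recording has stopped */

/* Record one syscall. The counters are independent, so relaxed
 * ordering is enough and no lock is taken. */
static void hist_record(unsigned nr, uint64_t elapsed_ns)
{
    if (nr >= HIST_MAX_SYSCALL ||
        atomic_load_explicit(&hist_frozen, memory_order_relaxed))
        return;
    atomic_fetch_add_explicit(&hist_slots[nr].calls, 1,
                              memory_order_relaxed);
    atomic_fetch_add_explicit(&hist_slots[nr].total_ns, elapsed_ns,
                              memory_order_relaxed);
}

/* Stop recording, for example after the first successful execve. */
static void hist_freeze(void)
{
    atomic_store_explicit(&hist_frozen, true, memory_order_relaxed);
}

/* Read back the accumulated time for one syscall number. */
static uint64_t hist_total_ns(unsigned nr)
{
    if (nr >= HIST_MAX_SYSCALL)
        return 0;
    return atomic_load_explicit(&hist_slots[nr].total_ns,
                                memory_order_relaxed);
}
```

A dump routine would then copy the slots and sort them by total_ns descending. A fork child could plausibly reuse the same freeze flag to disable recording, though the commit does not say how it does so.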
src/syscall/sidecar.c: two caches. First, a per-directory absence cache
keyed by (st_dev, st_ino, mtime, ctime) lets the walker skip the openat
for .elfuse-sidecar-index when a recent fstat on the same dirfd matches
an entry that already saw ENOENT. The mtime/ctime in the key closes ABA
naturally and makes a cross-process index publish observable without
explicit invalidation. Second, a cached sysroot dirfd is handed out as
fcntl(F_DUPFD_CLOEXEC, 0), so each translated absolute path saves the
~30 us open(sysroot) round-trip and the dup carries CLOEXEC across any
racing posix_spawn.
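The two sidecar caches might be sketched roughly as follows. The type and function names are hypothetical, and this sketch compares whole-second timestamps for portability; a real key would presumably use the nanosecond fields:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for F_DUPFD_CLOEXEC and O_DIRECTORY */
#endif
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical cache entry: identifies one directory state in which
 * the sidecar index was found to be absent. */
typedef struct {
    dev_t  dev;
    ino_t  ino;
    time_t mtime_sec; /* assumption: seconds only, for brevity */
    time_t ctime_sec;
    bool   valid;
} absence_key_t;

/* Remember the directory state observed when the openat saw ENOENT. */
static void absence_key_from_stat(absence_key_t *k, const struct stat *st)
{
    k->dev = st->st_dev;
    k->ino = st->st_ino;
    k->mtime_sec = st->st_mtime;
    k->ctime_sec = st->st_ctime;
    k->valid = true;
}

/* True if the directory is unchanged since the recorded miss. Any
 * publish of the index updates mtime/ctime, so the key stops matching
 * and the walker falls back to a real openat. */
static bool absence_key_matches(const absence_key_t *k,
                                const struct stat *st)
{
    return k->valid && k->dev == st->st_dev && k->ino == st->st_ino &&
           k->mtime_sec == st->st_mtime && k->ctime_sec == st->st_ctime;
}

/* Hand out a private duplicate of the cached sysroot descriptor. The
 * duplicate has FD_CLOEXEC set atomically, so it cannot leak into a
 * process spawned concurrently. Returns -1 on error. */
static int sysroot_fd_dup(int cached_fd)
{
    return fcntl(cached_fd, F_DUPFD_CLOEXEC, 0);
}
```

The caller owns the duplicate and closes it after use, while the cached descriptor stays open for the life of the process.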
src/syscall/path.c: drop the per-call zero-init of path_translation_t.
The struct is ~12 KiB (24 metadata bytes plus three LINUX_PATH_MAX
buffers), and each buffer is written by its resolver before it is read.
The memset of all three was the dominant remaining cost after the
sidecar caches.
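To illustrate the idea, here is a sketch that clears only the metadata header and leaves the large buffers untouched. The field names, the helper functions, and LINUX_PATH_MAX = 4096 are assumptions for illustration, not the real path_translation_t layout:

```c
#include <stddef.h>
#include <string.h>

#define LINUX_PATH_MAX 4096 /* assumption: same as Linux PATH_MAX */

/* Hypothetical stand-in for path_translation_t: a small metadata
 * header followed by three path buffers (about 12 KiB in total). */
typedef struct {
    int    dirfd;
    int    flags;
    size_t guest_len;
    size_t host_len;
    char   guest[LINUX_PATH_MAX];
    char   host[LINUX_PATH_MAX];
    char   scratch[LINUX_PATH_MAX];
} path_translation_sketch_t;

/* Reset only the metadata that precedes the first buffer. The three
 * buffers keep whatever bytes they held before. */
static void path_translation_begin(path_translation_sketch_t *t)
{
    memset(t, 0, offsetof(path_translation_sketch_t, guest));
}

/* A resolver fills its buffer and records the length before anyone
 * reads it, so stale bytes past the terminator are never observed.
 * Returns 0 on success, -1 if the path does not fit. */
static int resolve_guest(path_translation_sketch_t *t, const char *path)
{
    size_t n = strlen(path);
    if (n >= sizeof t->guest)
        return -1;
    memcpy(t->guest, path, n + 1); /* includes the NUL terminator */
    t->guest_len = n;
    return 0;
}
```

With these assumed fields the header is 24 bytes on a typical 64-bit target, so the per-call clear shrinks from roughly 12 KiB to 24 bytes.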
src/core/elf.c: skip the redundant memset of the file-data range in
elf_map_segments. The loader previously zeroed the full page-aligned
segment extent before issuing fread; now only the BSS portion plus page
padding (filesz to zero_len) is zeroed.
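The narrower zeroing can be sketched as below, assuming the destination is a plain buffer of at least zero_len bytes. load_segment is a hypothetical helper and not the actual body of elf_map_segments:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy filesz bytes of segment data from the
 * file into dst, then zero only the tail from filesz up to zero_len
 * (BSS plus page padding). The file-backed range is overwritten by
 * fread anyway, so clearing it first would be wasted work.
 * Returns 0 on success, -1 on error. */
static int load_segment(FILE *f, long offset, unsigned char *dst,
                        size_t filesz, size_t zero_len)
{
    if (filesz > zero_len)
        return -1;
    if (fseek(f, offset, SEEK_SET) != 0)
        return -1;
    if (fread(dst, 1, filesz, f) != filesz)
        return -1;
    memset(dst + filesz, 0, zero_len - filesz);
    return 0;
}
```

A short read returns an error here, so the caller never sees a segment whose file-backed range is only partly initialized.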
src/core/startup-trace.h: env parsing extended to comma-separated tokens
(steps, syscalls, all); legacy =1 keeps enabling steps only so existing
scripts keep working.
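The token handling described above could look roughly like this. The flag names and trace_parse are hypothetical, and silently ignoring unknown tokens is an assumption rather than documented behavior:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bit flags for the two trace kinds. */
enum { TRACE_STEPS = 1, TRACE_SYSCALLS = 2 };

/* True if the n-byte token at p equals the NUL-terminated name. */
static int tok_eq(const char *p, size_t n, const char *name)
{
    return strlen(name) == n && memcmp(p, name, n) == 0;
}

/* Parse a value such as "steps,syscalls" into a flag mask.
 * The legacy value "1" maps to steps only. */
static unsigned trace_parse(const char *value)
{
    unsigned mask = 0;
    if (value == NULL)
        return 0;
    for (const char *p = value; *p != '\0';) {
        size_t n = strcspn(p, ","); /* length of the current token */
        if (tok_eq(p, n, "1") || tok_eq(p, n, "steps"))
            mask |= TRACE_STEPS;
        else if (tok_eq(p, n, "syscalls"))
            mask |= TRACE_SYSCALLS;
        else if (tok_eq(p, n, "all"))
            mask |= TRACE_STEPS | TRACE_SYSCALLS;
        /* assumption: unknown tokens are ignored */
        p += n;
        if (*p == ',')
            p++; /* skip the separator */
    }
    return mask;
}
```

A caller would pass in getenv("ELFUSE_STARTUP_TRACE"), and an unset variable yields an empty mask.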
Measurement: 30-run distributions under ELFUSE_STARTUP_TRACE=syscalls,
warm cache:

  bench-hot-guard-glibc startup syscalls:
    5.225 ms baseline (single sample) -> 1.33 ms p50
    (p25 1.21, p75 1.55, stdev 0.45, n=30)    3.9x
  bench openat per-call:
    135 us baseline -> 33.4 us p50
    (p25 32.4, p75 35.8, stdev 7.1, n=30)     4.0x
  getent passwd root startup syscalls:
    7.478 ms baseline -> 2.22 ms p50
    (p25 2.10, p75 2.28, stdev 0.27, n=30)    3.4x
  getent openat per-call:
    230 us baseline -> 52.9 us p50
    (p25 51.5, p75 55.1, stdev 2.2, n=30)     4.3x

End-to-end wall-clock for getent: 14.6 ms p50 (p25 14.3, p75 15.1, stdev
1.18, n=30). Bench guardrail steady-state: static getpid 74 ns,
clock_gettime 6.7 ns, urandom1 153 ns; dynamic-glibc getpid 53 ns,
clock_gettime 6.4 ns, urandom1 142 ns. All under ceilings.
The original baselines were single first-run samples; their variance
band was not measured, so the speedup ratios are best-effort relative
to the cited starting point.

1 parent 0d0e6d1 commit 9aa54a3
10 files changed
Lines changed: 724 additions & 15 deletions
File tree
- src
- core
- debug
- runtime
- syscall