Commit 036a707
llama-mmap: hint THP on mmap'd weights (Linux)
Issue madvise(MADV_HUGEPAGE) on the read-only file mapping used for
model weights on Linux. For a 1 GiB model this drops the potential
page count from ~262K 4 KiB pages to 512 2 MiB pages, reducing TLB
pressure and (more importantly) reducing the number of re-faults
when pages get evicted under memory pressure.
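
The page counts above follow from simple division. A minimal sketch of the arithmetic, assuming a 1 GiB mapping, 4 KiB base pages and 2 MiB huge pages (the usual x86-64 sizes); `pages_needed` is a hypothetical helper for illustration and is not part of the commit:

```cpp
#include <cstdint>

// Hypothetical helper (not from the commit): number of pages of size
// `page_size` needed to cover `bytes`, rounded up.
constexpr std::uint64_t pages_needed(std::uint64_t bytes, std::uint64_t page_size) {
    return (bytes + page_size - 1) / page_size;
}

constexpr std::uint64_t KiB = 1024;
constexpr std::uint64_t MiB = 1024 * KiB;
constexpr std::uint64_t GiB = 1024 * MiB;

// 1 GiB / 4 KiB = 2^18 base pages; 1 GiB / 2 MiB = 2^9 huge pages.
static_assert(pages_needed(1 * GiB, 4 * KiB) == 262144, "base page count");
static_assert(pages_needed(1 * GiB, 2 * MiB) == 512,    "huge page count");
```

Under these assumptions the ratio is a fixed factor of 512 regardless of model size, so larger models see the same relative reduction in page count.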
No-op on kernels where THP is disabled. In 'madvise' mode (the
common modern default for desktop distros), THP is opt-in: the kernel
only considers huge pages for regions the caller has marked with
madvise(). Guarded by defined(MADV_HUGEPAGE) so it
compiles cleanly on non-Linux.
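
The guard described above could look roughly like the following. This is a sketch under stated assumptions, not the code from this commit: the helper name `hint_hugepages`, its return convention, and the warning message are all hypothetical. Only `madvise()` and `MADV_HUGEPAGE` are real Linux interfaces.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

#if defined(__linux__)
#include <sys/mman.h>
#endif

// Hypothetical helper (not from the commit): ask the kernel to consider
// transparent huge pages for [addr, addr + len). Returns true only if the
// hint was accepted. The hint is advisory, so a failure is not fatal.
static bool hint_hugepages(void * addr, size_t len) {
#if defined(__linux__) && defined(MADV_HUGEPAGE)
    if (addr == nullptr || len == 0) {
        return false; // nothing to advise
    }
    // addr must be page-aligned; pointers returned by mmap() always are.
    if (madvise(addr, len, MADV_HUGEPAGE) != 0) {
        // e.g. EINVAL when the kernel was built without THP support
        std::fprintf(stderr, "warning: madvise(MADV_HUGEPAGE) failed: %s\n",
                     std::strerror(errno));
        return false;
    }
    return true;
#else
    // MADV_HUGEPAGE is not defined here, so the call compiles away.
    (void) addr;
    (void) len;
    return false;
#endif
}
```

A caller would invoke this once, right after a successful `mmap()` of the weights file, and ignore the return value apart from logging. Whether the kernel actually backs the mapping with huge pages still depends on its THP configuration.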
Benchmark on a Skylake-SP VM, Bonsai-8B Q1_0, -fa on -ctk q8_0
-ctv q8_0 -t 12 -ub 128: neutral on this machine (~9.5 t/s tg128
both before and after) because the VM isn't memory-constrained.
The change is intended for systems where the mapping does get
evicted and re-faulted under pressure.

1 parent e29cd48 commit 036a707
1 file changed: +14 -0 lines changed

[Diff body not recoverable: 14 lines added as new lines 454-467, after original line 453.]