Commit 036a707

llama-mmap: hint THP on mmap'd weights (Linux)
Issue madvise(MADV_HUGEPAGE) on the read-only file mapping used for model weights on Linux. For a 1 GB model this drops the potential page count from ~262K 4 KB pages to 512 2 MB pages, reducing TLB pressure and (more importantly) reducing the number of re-faults when pages are evicted under memory pressure.

The call is a no-op on kernels where THP is disabled. In 'madvise' mode (the common modern default for desktop distros), THP is opt-in, so the caller has to request it explicitly. The hint is skipped when numa is set, and it is guarded by #ifdef __linux__ so the file compiles cleanly on non-Linux platforms.

Benchmark on a Skylake-SP VM, Bonsai-8B Q1_0, -fa on -ctk q8_0 -ctv q8_0 -t 12 -ub 128: neutral (~9.5 t/s tg128 both before and after), because the VM is not memory-constrained. The change is intended for systems where the mapping does get evicted and re-faulted under pressure.
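The page counts quoted in the message can be checked with a small worked example. This is only a sketch: `pages_needed` and the unit constants are hypothetical names introduced here, and it assumes "1 GB", "4 KB" and "2 MB" mean the binary sizes (1 GiB, 4 KiB, 2 MiB), which is what the quoted figures imply.

```cpp
#include <cstdint>

// Hypothetical helper (not part of llama.cpp): number of pages needed to
// cover `bytes` at a given page size, rounding up for a partial last page.
constexpr std::uint64_t pages_needed(std::uint64_t bytes, std::uint64_t page_size) {
    return (bytes + page_size - 1) / page_size;
}

// Assumption: binary units, matching the figures in the commit message.
constexpr std::uint64_t KiB = 1024;
constexpr std::uint64_t MiB = 1024 * KiB;
constexpr std::uint64_t GiB = 1024 * MiB;

// 1 GiB of weights mapped with 4 KiB base pages: 262144 pages (~262K).
static_assert(pages_needed(1 * GiB, 4 * KiB) == 262144, "base page count");

// The same mapping backed entirely by 2 MiB huge pages: 512 pages.
static_assert(pages_needed(1 * GiB, 2 * MiB) == 512, "huge page count");
```

The 512 figure is a best case: it assumes the kernel manages to back the whole mapping with huge pages, which the hint requests but does not guarantee.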
1 parent e29cd48 commit 036a707

1 file changed: +14 -0 lines changed

src/llama-mmap.cpp

Lines changed: 14 additions & 0 deletions
@@ -451,6 +451,20 @@ struct llama_mmap::impl {
             throw std::runtime_error(format("mmap failed: %s", strerror(errno)));
         }
 
+#ifdef __linux__
+        // Hint the kernel to back this region with 2MB huge pages where possible.
+        // For a 1 GB model weights map this can drop the number of pages from ~262K
+        // 4KB pages to ~512 2MB pages, reducing TLB pressure and (critically)
+        // reducing the number of re-faults when pages get evicted under memory
+        // pressure. No-op if THP is not enabled / supported.
+        if (!numa) {
+            if (madvise(addr, file->size(), MADV_HUGEPAGE)) {
+                LLAMA_LOG_DEBUG("note: madvise(.., MADV_HUGEPAGE) not applied: %s\n",
+                        strerror(errno));
+            }
+        }
+#endif
+
         if (prefetch > 0) {
             if (posix_madvise(addr, std::min(file->size(), prefetch), POSIX_MADV_WILLNEED)) {
                 LLAMA_LOG_WARN("warning: posix_madvise(.., POSIX_MADV_WILLNEED) failed: %s\n",
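Whether the hint can take effect depends on the system-wide THP policy, which Linux reports in `/sys/kernel/mm/transparent_hugepage/enabled` as a line such as `always [madvise] never`, with the active mode in brackets. The sketch below reads that file so a caller can tell the three cases apart. `ThpMode`, `has_prefix`, `parse_thp_mode` and `current_thp_mode` are hypothetical names introduced for illustration and are not part of llama.cpp.

```cpp
#include <fstream>
#include <string>

// Hypothetical enum: the THP policy modes Linux exposes via sysfs.
enum class ThpMode { Always, Madvise, Never, Unknown };

// True if `s` begins with `prefix` (both NUL-terminated).
constexpr bool has_prefix(const char * s, const char * prefix) {
    while (*prefix != '\0') {
        if (*s != *prefix) { return false; }
        ++s;
        ++prefix;
    }
    return true;
}

// Extract the bracketed (active) mode from a line such as
// "always [madvise] never". Returns Unknown if no known mode is bracketed.
constexpr ThpMode parse_thp_mode(const char * line) {
    while (*line != '\0' && *line != '[') { ++line; }
    if (has_prefix(line, "[always]"))  { return ThpMode::Always; }
    if (has_prefix(line, "[madvise]")) { return ThpMode::Madvise; }
    if (has_prefix(line, "[never]"))   { return ThpMode::Never; }
    return ThpMode::Unknown;
}

// Read the current policy. Returns Unknown when the file is missing,
// e.g. on non-Linux systems or kernels built without THP support.
inline ThpMode current_thp_mode() {
    std::ifstream f("/sys/kernel/mm/transparent_hugepage/enabled");
    std::string line;
    if (!f || !std::getline(f, line)) { return ThpMode::Unknown; }
    return parse_thp_mode(line.c_str());
}
```

In `Madvise` mode the explicit madvise(MADV_HUGEPAGE) call in the hunk is what opts the mapping in. In `Never` mode the call is expected to have no effect, and as far as I know it can still return success, so the debug note in the hunk would not necessarily fire in that case.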
