Commit 1a2cc6d

Michel Belleau and claude authored and committed
Add warning when LRU eviction cannot succeed due to all models being pinned
When models_max limit is reached but all active models are pinned, log a warning message to clarify that automatic unload cannot succeed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent bfdf7d2 commit 1a2cc6d

1 file changed

Lines changed: 4 additions & 0 deletions

tools/server/server-models.cpp

@@ -391,6 +391,8 @@ void server_models::unload_lru() {
         for (const auto & m : mapping) {
             if (m.second.meta.is_active()) {
                 count_active++;
+                // If all active models are pinned, this condition never holds and no LRU eviction will occur.
+                // The server will keep all pinned models in memory, potentially exceeding models_max.
                 if (!m.second.meta.pinned && m.second.meta.last_used < lru_last_used) {
                     lru_model_name = m.first;
                     lru_last_used = m.second.meta.last_used;
@@ -408,6 +410,8 @@ void server_models::unload_lru() {
                 return mapping[lru_model_name].meta.status == SERVER_MODEL_STATUS_UNLOADED;
             });
         }
+    } else if (count_active >= (size_t)base_params.models_max) {
+        SRV_WRN("models_max limit reached, but no unpinned models available for LRU eviction - automatic unload cannot succeed\n");
     }
 }
 
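The decision this commit makes visible can be sketched in isolation. The snippet below is a minimal, self-contained illustration and not the server's actual code: `model_meta`, `lru_outcome`, `lru_decision`, and `decide_lru` are hypothetical names invented for this example, and the initial "oldest so far" timestamp is simplified to the maximum `int64_t` value. It assumes a positive `models_max`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <optional>
#include <string>

// Hypothetical, simplified stand-in for the per-model metadata seen in the diff.
struct model_meta {
    bool    active    = false;
    bool    pinned    = false;
    int64_t last_used = 0;
};

enum class lru_outcome {
    under_limit, // fewer active models than models_max: nothing to do
    evict,       // limit reached and an unpinned LRU candidate exists
    all_pinned,  // limit reached, but every active model is pinned
};

struct lru_decision {
    lru_outcome                outcome;
    std::optional<std::string> victim; // set only when outcome == evict
};

// Counts active models and tracks the least recently used one among those
// that are not pinned, then classifies the situation.
inline lru_decision decide_lru(const std::map<std::string, model_meta> & mapping,
                               size_t models_max) {
    assert(models_max > 0); // this sketch assumes a positive limit

    std::string lru_name;
    int64_t     lru_last_used = std::numeric_limits<int64_t>::max();
    size_t      count_active  = 0;

    for (const auto & m : mapping) {
        if (!m.second.active) {
            continue;
        }
        count_active++;
        // Pinned models are never eviction candidates.
        if (!m.second.pinned && m.second.last_used < lru_last_used) {
            lru_name      = m.first;
            lru_last_used = m.second.last_used;
        }
    }

    if (count_active < models_max) {
        return {lru_outcome::under_limit, std::nullopt};
    }
    if (!lru_name.empty()) {
        return {lru_outcome::evict, lru_name};
    }
    // This is the case the new SRV_WRN branch reports.
    return {lru_outcome::all_pinned, std::nullopt};
}
```

A caller following the commit's logic would unload `victim` on `evict` and log a warning on `all_pinned`. Before this change, the `all_pinned` case passed silently.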
