Commit a010122

common: do not fit to unknown device memory (ggml-org#22614)
* common: do not fit to unknown device memory

Signed-off-by: Florian Reinle <f.reinle@otec.de>

* common: preserve host fallback for non-GPU fit devices

Signed-off-by: Florian Reinle <f.reinle@otec.de>

* common: keep unknown GPU fit memory at zero

Signed-off-by: Florian Reinle <f.reinle@otec.de>

---------

Signed-off-by: Florian Reinle <f.reinle@otec.de>
1 parent a290ce6 commit a010122

1 file changed

Lines changed: 14 additions & 6 deletions

common/fit.cpp

@@ -109,16 +109,24 @@ static std::vector<llama_device_memory_data> common_get_device_memory_data(
         ret.back().total = total;
     }
     for (size_t i = 0; i < nd; i++) {
+        ggml_backend_dev_t dev = llama_model_get_device(model, i);
+
         size_t free;
         size_t total;
-        ggml_backend_dev_memory(llama_model_get_device(model, i), &free, &total);
+        ggml_backend_dev_memory(dev, &free, &total);
 
-        // devices can return 0 bytes for free and total memory if they do not
-        // have any to report. in this case, we will use the host memory as a fallback
-        // fixes: https://github.com/ggml-org/llama.cpp/issues/18577
+        // Some non-GPU accelerator backends, such as BLAS, report 0/0 and rely on
+        // the host-memory fallback. For GPU-like backends, keep 0/0 so --fit does
+        // not assign anything to a device with an unknown memory budget.
         if (free == 0 && total == 0) {
-            free = ret.back().free;
-            total = ret.back().total;
+            const enum ggml_backend_dev_type type = ggml_backend_dev_type(dev);
+            if (type == GGML_BACKEND_DEVICE_TYPE_GPU || type == GGML_BACKEND_DEVICE_TYPE_IGPU) {
+                LOG_WRN("%s: device %s did not report memory; --fit will not use it\n",
+                    __func__, ggml_backend_dev_name(dev));
+            } else {
+                free = ret.back().free;
+                total = ret.back().total;
+            }
         }
         ret[i].free = free;
         ret[i].total = total;
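
The decision this hunk introduces can be sketched in isolation, roughly as follows. This is a minimal illustration and not the code in common/fit.cpp: the names dev_kind, mem_info, resolve_fit_memory and check_examples are hypothetical stand-ins, and the real change works on ggml_backend_dev_t and the GGML_BACKEND_DEVICE_TYPE_* values shown in the diff above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the ggml device type enum (illustration only).
enum class dev_kind { cpu, gpu, igpu, accel };

// Hypothetical stand-in for a free/total memory pair, in bytes.
struct mem_info {
    std::size_t free;
    std::size_t total;
};

// Returns the memory budget the fitting step would see for one device.
// - A device that reports memory keeps its reported values.
// - A GPU or iGPU that reports 0/0 stays at 0/0, so nothing is assigned to it.
// - Any other device that reports 0/0 falls back to the host values.
mem_info resolve_fit_memory(dev_kind kind, mem_info reported, mem_info host) {
    const bool unknown = reported.free == 0 && reported.total == 0;
    if (!unknown) {
        return reported;
    }
    if (kind == dev_kind::gpu || kind == dev_kind::igpu) {
        return reported; // unknown GPU budget: keep 0/0
    }
    return host; // e.g. a BLAS-style accelerator that shares host memory
}

// Small usage example covering the three cases.
void check_examples() {
    const mem_info host = {8000, 16000};

    const mem_info gpu_unknown = resolve_fit_memory(dev_kind::gpu, {0, 0}, host);
    assert(gpu_unknown.free == 0 && gpu_unknown.total == 0);

    const mem_info accel_unknown = resolve_fit_memory(dev_kind::accel, {0, 0}, host);
    assert(accel_unknown.free == 8000 && accel_unknown.total == 16000);

    const mem_info gpu_known = resolve_fit_memory(dev_kind::gpu, {4000, 6000}, host);
    assert(gpu_known.free == 4000 && gpu_known.total == 6000);
}
```

Before this commit, every device reporting 0/0 took the host fallback, so in this sketch the first case would also have returned the host values.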
