This repository was archived by the owner on Jul 4, 2025. It is now read-only.
Description
At the moment we fail silently with a generic "model failed to load" message, and users have to send us logs before we can diagnose anything.
Can we get a handle on all the potential reasons why their model failed to load, and discuss how to handle each issue?
Goal:
Graceful failures
Predefined errors
Though the list of possible errors is endless, let's adopt the Pareto principle: roughly 80% of our bugs come from the 20% of model loading problems that are most common.
Examples
Model won't fit in RAM/VRAM
Another model is already running, plus other edge cases and race conditions
Wrong model format (i.e. unsupported runtime)
Version conflicts (in the trt-llm engine scenario)
Missing model.yaml, template, key input/configs
Corrupted or missing model binaries
Incompatible hardware. See
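To make the discussion concrete, here is a minimal sketch of what "predefined errors" could look like for the examples above. Everything in it is an assumption for illustration: the `LoadError` names, the `LoadFailure` type, the `preflight` function, the `.gguf` suffix default, and the use of file size as a crude stand-in for memory requirements are all hypothetical, not how any engine behaves today.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Optional, Tuple


class LoadError(Enum):
    """Hypothetical error codes, one per failure mode in the examples list."""
    INSUFFICIENT_MEMORY = "insufficient_memory"
    MODEL_ALREADY_RUNNING = "model_already_running"
    UNSUPPORTED_FORMAT = "unsupported_format"
    VERSION_CONFLICT = "version_conflict"
    MISSING_CONFIG = "missing_config"
    MISSING_OR_CORRUPT_BINARY = "missing_or_corrupt_binary"
    INCOMPATIBLE_HARDWARE = "incompatible_hardware"


@dataclass
class LoadFailure:
    code: LoadError
    detail: str  # human-readable context shown to the user instead of raw logs


def preflight(
    model_dir: Path,
    binary_name: str,
    free_bytes: int,
    supported_suffixes: Tuple[str, ...] = (".gguf",),  # assumed default
) -> Optional[LoadFailure]:
    """Run cheap checks before loading; return the first failure, or None."""
    config = model_dir / "model.yaml"
    if not config.is_file():
        return LoadFailure(LoadError.MISSING_CONFIG, f"{config} not found")

    binary = model_dir / binary_name
    if not binary.is_file():
        return LoadFailure(LoadError.MISSING_OR_CORRUPT_BINARY, f"{binary} not found")

    if binary.suffix not in supported_suffixes:
        return LoadFailure(
            LoadError.UNSUPPORTED_FORMAT, f"unsupported suffix {binary.suffix!r}"
        )

    # Crude assumption: the model needs at least its file size in free memory.
    size = binary.stat().st_size
    if size > free_bytes:
        return LoadFailure(
            LoadError.INSUFFICIENT_MEMORY, f"needs {size} bytes, {free_bytes} free"
        )
    return None
```

A caller would run `preflight(...)` before handing the model to an engine and surface `failure.code.value` plus `failure.detail` to the user. Checks that need engine or hardware knowledge (version conflicts, incompatible hardware, a model already running) are left as codes only, since how to detect them is still an open question here.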
Questions:
What are the other common issues?
We support various engines, but should we standardize failure modes across them? Doing so would let us offer better DX/UX down the road.
How do llama.cpp, TensorRT-LLM, and DirectML currently handle errors? Do they have a neat, predefined list we can adopt?
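One possible shape for standardized failure modes, sketched under heavy assumptions: each engine keeps reporting errors its own way, and a thin layer maps them onto shared codes. The engine names (`engine_a`, `engine_b`), the message substrings, and the code strings below are placeholders invented for illustration; they are not real output from any of the engines mentioned.

```python
from dataclasses import dataclass

# Hypothetical rules: engine name -> list of (substring of raw message, shared code).
# Both the engine names and the substrings are placeholders, not real engine output.
RULES = {
    "engine_a": [
        ("out of memory", "insufficient_memory"),
        ("unknown file format", "unsupported_format"),
    ],
    "engine_b": [
        ("version mismatch", "version_conflict"),
    ],
}


@dataclass(frozen=True)
class StandardError:
    engine: str  # which engine reported the failure
    code: str    # shared, engine-independent code
    raw: str     # original message, kept for logs


def normalize(engine: str, raw: str) -> StandardError:
    """Translate an engine-specific message into a shared error code."""
    lowered = raw.lower()
    for needle, code in RULES.get(engine, []):
        if needle in lowered:
            return StandardError(engine, code, raw)
    # Anything unrecognized still fails loudly with a code, never silently.
    return StandardError(engine, "unknown", raw)
```

Substring matching is fragile. If an engine exposes structured error codes, mapping those directly would be preferable, which is part of what the question above needs to establish.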
Related issues:
Something's AmissError) jan#3517