This repository was archived by the owner on Oct 15, 2025. It is now read-only.
Component
Helm Chart
Desired use case or feature
Right now there is no way to choose how the model cache is handled. If you choose Hugging Face as the model source, the model is stored in memory by default:
llm-d-deployer/charts/llm-d/templates/modelservice/presets/basic-gpu-with-nixl-preset.yaml
Lines 234 to 237 in c9e16e9
The only other option is to use a PVC.
I believe we could also let the user store the model on the host's NVMe disks when it is downloaded from Hugging Face.
Proposed solution
Allow the user to specify a hostPath or a volume type when the model is downloaded from Hugging Face.
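To illustrate the difference, here is a rough sketch of the two volume styles in a pod spec. It assumes the current in-memory default corresponds to an `emptyDir` with `medium: Memory`, which I have not confirmed against the preset. The volume names and the host path are hypothetical and are not taken from the chart.

```yaml
# Sketch only: volume names and the host path are illustrative assumptions.
volumes:
  # In-memory option: an emptyDir backed by tmpfs, so the model consumes RAM.
  - name: model-cache-memory
    emptyDir:
      medium: Memory
  # Requested option: a directory on the node's local disk (e.g. an NVMe mount).
  - name: model-cache-host
    hostPath:
      path: /mnt/nvme/model-cache   # hypothetical path on the node
      type: DirectoryOrCreate       # create the directory if it does not exist
```

With a hostPath volume the downloaded model would survive pod restarts on the same node, at the cost of tying the cache to that node.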
Alternatives
No response
Additional context or screenshots
No response