Use modelID instead of model tag #98

Merged
ilopezluna merged 11 commits into main from use-model-id on Jul 14, 2025
@ilopezluna ilopezluna commented Jul 2, 2025

For the following model actions:

  • chat
  • record request/response
  • load
  • evict

Use the model ID (digest) instead of the tag.
This has several benefits. For example, we no longer need to unload and reload the same model when the user specifies a different tag for it.
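
To illustrate the unload/load benefit, here is a minimal sketch, not the code from this PR. The resolver and loader types, the fallback for unknown references, and the tags and digest are all made up for the example; only the name ResolveModelID comes from the diff.

```go
package main

import "fmt"

// resolver maps a model reference (tag) to a model ID (digest).
// Hypothetical stand-in for the model manager, not the project's actual type.
type resolver map[string]string

// ResolveModelID returns the ID for ref. As an assumption of this sketch,
// an unknown reference is returned unchanged.
func (r resolver) ResolveModelID(ref string) string {
	if id, ok := r[ref]; ok {
		return id
	}
	return ref
}

// loader tracks loaded models keyed by model ID instead of by tag.
type loader struct {
	res    resolver
	loaded map[string]bool
	loads  int // number of real load operations performed
}

func newLoader(res resolver) *loader {
	return &loader{res: res, loaded: make(map[string]bool)}
}

// Load loads the model behind ref unless a model with the same ID is
// already loaded. It reports whether a real load happened.
func (l *loader) Load(ref string) bool {
	id := l.res.ResolveModelID(ref)
	if l.loaded[id] {
		return false
	}
	l.loaded[id] = true
	l.loads++
	return true
}

func main() {
	// Two made-up tags pointing at the same made-up digest.
	l := newLoader(resolver{
		"ai/gemma3:latest": "sha256:1111",
		"ai/gemma3:alias":  "sha256:1111",
	})
	fmt.Println(l.Load("ai/gemma3:latest")) // first reference triggers a load
	fmt.Println(l.Load("ai/gemma3:alias"))  // same ID, so no reload
	fmt.Println(l.loads)                    // total real loads
}
```

Because the map key is the digest, a second tag for the same model is a cache hit rather than a reason to evict and reload.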

This PR also records the actual reference used when running a model, so that docker model ps can show it. This mimics the behavior of docker ps: if you run an image that has multiple tags, docker ps shows the tag that was used.
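
A rough sketch of the idea behind the recorded reference, under stated assumptions: runnerInfo, runners, start and ps are hypothetical names, the "first reference wins" rule is a guess, and the tag and digest are made up. It only shows how a listing can print the reference the user typed even though the runner is keyed by model ID.

```go
package main

import (
	"fmt"
	"sort"
)

// runnerInfo holds what is remembered about a running model.
// The field names are hypothetical.
type runnerInfo struct {
	slot     int    // slot the runner occupies
	modelRef string // reference the user typed when starting the model
}

// runners is keyed by model ID (digest).
type runners map[string]runnerInfo

// start records a runner for modelID. If that model is already running,
// the reference recorded first is kept (an assumption of this sketch).
func (rs runners) start(modelID, ref string, slot int) {
	if _, ok := rs[modelID]; ok {
		return
	}
	rs[modelID] = runnerInfo{slot: slot, modelRef: ref}
}

// ps lists the recorded references, sorted for stable output.
func (rs runners) ps() []string {
	out := make([]string, 0, len(rs))
	for _, info := range rs {
		out = append(out, info.modelRef)
	}
	sort.Strings(out)
	return out
}

func main() {
	rs := runners{}
	// The digest is the key, but the listing shows the tag that was used.
	rs.start("sha256:1111", "ai/gemma3:alias", 0)
	for _, ref := range rs.ps() {
		fmt.Println(ref)
	}
}
```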

Comment thread pkg/inference/models/manager.go Fixed
Comment thread pkg/inference/models/manager.go Fixed
Comment thread pkg/inference/scheduling/scheduler.go Fixed
Comment thread pkg/metrics/openai_recorder.go Fixed
Comment thread pkg/metrics/openai_recorder.go Fixed
Comment thread pkg/metrics/openai_recorder.go Fixed
Comment thread pkg/metrics/openai_recorder.go Fixed
Comment thread pkg/metrics/openai_recorder.go Fixed
ilopezluna and others added 6 commits July 2, 2025 15:27
…m user input

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…m user input

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…m user input

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
… of the runner. Currently the slot and the model reference used.
@ilopezluna ilopezluna requested a review from a team July 2, 2025 14:38
@ilopezluna ilopezluna marked this pull request as ready for review July 2, 2025 14:38
}

func (r *OpenAIRecorder) RecordRequest(model string, req *http.Request, body []byte) string {
	modelID := r.modelManager.ResolveModelID(model)
Contributor

Does using the model ID in the OpenAIRecorder mean that the GUI that displays these requests will also display the SHA256? Or do we resolve it back to the "friendly" name? (Or something like ai/gemma3@sha256:a484b...?)

Contributor Author

Currently the UI does not use or show tags when using or inspecting a model. Behind the scenes it uses the model ID, but the UI does not show it.
@doringeman and I discussed that we will eventually need a ResolveModelTags(model reference) to also show all the available tags of a local model.
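
For discussion, one possible shape for such a ResolveModelTags, purely hypothetical since nothing like it exists in this PR: the store type, its tagToID map, and the signature are assumptions, and the tags and digest are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// store is a hypothetical local model index mapping tags to model IDs.
type store struct {
	tagToID map[string]string
}

// ResolveModelID returns the ID for ref, or ref itself when ref is not a
// known tag (for example when it already is an ID).
func (s store) ResolveModelID(ref string) string {
	if id, ok := s.tagToID[ref]; ok {
		return id
	}
	return ref
}

// ResolveModelTags returns every local tag that points at the same model
// as ref, sorted for stable output. Hypothetical signature.
func (s store) ResolveModelTags(ref string) []string {
	id := s.ResolveModelID(ref)
	var tags []string
	for tag, tagID := range s.tagToID {
		if tagID == id {
			tags = append(tags, tag)
		}
	}
	sort.Strings(tags)
	return tags
}

func main() {
	s := store{tagToID: map[string]string{
		"ai/gemma3:latest": "sha256:1111",
		"ai/gemma3:alias":  "sha256:1111",
	}}
	// Works with either a tag or the ID as input.
	fmt.Println(s.ResolveModelTags("sha256:1111"))
	fmt.Println(s.ResolveModelTags("ai/gemma3:latest"))
}
```

A linear scan is enough for a handful of local models; a reverse index from ID to tags would avoid it if the list grows.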

Comment thread pkg/inference/scheduling/loader.go Outdated
@ilopezluna ilopezluna requested a review from xenoscopic July 9, 2025 08:42
Comment thread pkg/inference/scheduling/loader.go Dismissed
Comment thread pkg/inference/scheduling/loader.go Dismissed
Comment thread pkg/inference/scheduling/scheduler.go Dismissed
@ilopezluna ilopezluna merged commit a8437d3 into main Jul 14, 2025
4 checks passed
@ilopezluna ilopezluna deleted the use-model-id branch July 14, 2025 13:34
xenoscopic added a commit that referenced this pull request Jul 23, 2025
Follow-up to #98.

Signed-off-by: Jacob Howard <jacob.howard@docker.com>
doringeman pushed a commit to doringeman/model-runner that referenced this pull request Sep 23, 2025
doringeman pushed a commit to doringeman/model-runner that referenced this pull request Sep 24, 2025

4 participants