Summary
When importing mistralai/Mistral-7B-Instruct-v0.3 or mistralai/Mistral-Nemo-Instruct-2407 from HuggingFace, OME classifies both as EMBEDDING instead of TEXT_GENERATION. Chat/completions requests against the resulting endpoint return:
400 Bad Request: importedModel does not support any of: [TextToText, ImageTextToText]
Both models have "architectures": ["MistralForCausalLM"] in their HF config.json and should be TEXT_GENERATION.
Root Cause
Two code locations interact to produce the bug:
1. pkg/hfutil/modelconfig/mistral.go — GetArchitecture() fallback
    func (c *MistralConfig) GetArchitecture() string {
        if len(c.Architectures) > 0 {
            return c.Architectures[0]
        }
        return "MistralModel" // ← dangerous fallback
    }
If Architectures is empty (e.g. JSON parsing fails, field missing, or struct mismatch), the method silently returns "MistralModel".
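The silent fallback can be reproduced in isolation. The following is a minimal sketch, not OME's actual code: `mistralConfig`, `architectureWithFallback`, and `parseArchitecture` are hypothetical stand-ins, and the JSON tags are assumed to match the HF `config.json` keys. It shows that `encoding/json` reports no error when `architectures` is absent, so the caller cannot tell a fallback value from a declared one.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mistralConfig is a simplified, hypothetical stand-in for MistralConfig.
type mistralConfig struct {
	ModelType     string   `json:"model_type"`
	Architectures []string `json:"architectures"`
}

// architectureWithFallback mirrors the fallback behaviour described above.
func (c *mistralConfig) architectureWithFallback() string {
	if len(c.Architectures) > 0 {
		return c.Architectures[0]
	}
	return "MistralModel"
}

// parseArchitecture decodes raw config.json bytes and returns the
// architecture that the fallback-style getter would report.
func parseArchitecture(raw []byte) (string, error) {
	var cfg mistralConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return "", err
	}
	return cfg.architectureWithFallback(), nil
}

func main() {
	inputs := []string{
		`{"model_type":"mistral","architectures":["MistralForCausalLM"]}`,
		`{"model_type":"mistral"}`, // field missing: Unmarshal still succeeds
	}
	for _, in := range inputs {
		arch, err := parseArchitecture([]byte(in))
		fmt.Printf("arch=%q err=%v\n", arch, err)
	}
}
```

Because a missing key is not a decode error, any struct-tag mismatch or stripped field would take the same path without being logged.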
2. pkg/modelagent/config_parser.go — determineModelCapabilitiesFromHF()
    if strings.Contains(strings.ToLower(architecture), "embedding") ||
        strings.Contains(strings.ToLower(architecture), "sentence") ||
        strings.Contains(strings.ToLower(modelType), "bert") ||
        // Special case for known embedding models
        (strings.Contains(strings.ToLower(modelType), "mistral") &&
            strings.Contains(strings.ToLower(architecture), "mistralmodel")) {
        return append(capabilities, string(v1beta1.ModelCapabilityEmbedding))
    }
When the fallback fires, modelType = "mistral" and architecture = "MistralModel" satisfy the special-case condition, and the model is classified as EMBEDDING.
The special case exists for intfloat/e5-mistral-7b-instruct, a genuine embedding model: its HF config either leaves architectures empty or lists the base MistralModel architecture, so it is labelled EMBEDDING as intended. The problem is that any causal-LM model whose Architectures field fails to populate receives the same label.
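The interaction can be checked with the condition alone. This is a sketch under assumptions: `isEmbeddingSketch` is a hypothetical function that copies only the boolean expression quoted from `config_parser.go`, with none of the surrounding capability logic.

```go
package main

import (
	"fmt"
	"strings"
)

// isEmbeddingSketch reproduces only the quoted condition; the name and
// signature are hypothetical and not part of OME.
func isEmbeddingSketch(modelType, architecture string) bool {
	mt := strings.ToLower(modelType)
	arch := strings.ToLower(architecture)
	return strings.Contains(arch, "embedding") ||
		strings.Contains(arch, "sentence") ||
		strings.Contains(mt, "bert") ||
		// Special case for known embedding models.
		(strings.Contains(mt, "mistral") && strings.Contains(arch, "mistralmodel"))
}

func main() {
	// Architecture values as they might reach the check for model_type "mistral".
	for _, arch := range []string{
		"MistralForCausalLM", // Architectures populated correctly
		"MistralModel",       // declared by the config, or produced by the fallback
		"",                   // what an empty-string fallback would pass instead
	} {
		fmt.Printf("architecture=%q embedding=%t\n", arch, isEmbeddingSketch("mistral", arch))
	}
}
```

The check has no way to distinguish a declared "MistralModel" from a fallback one, which is why the two locations have to be considered together.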
Repro
Import each of these models via the OME model-agent and check the resulting ClusterBaseModel.spec.modelCapabilities:
- mistralai/Mistral-7B-Instruct-v0.3 (architectures: ["MistralForCausalLM"]): classified as EMBEDDING ❌
- mistralai/Mistral-Nemo-Instruct-2407 (architectures: ["MistralForCausalLM"]): classified as EMBEDDING ❌
- intfloat/e5-mistral-7b-instruct (embedding model): classified as EMBEDDING ✅
Expected Behaviour
| Model | Architecture (HF) | Expected capability |
| --- | --- | --- |
| mistralai/Mistral-7B-Instruct-v0.3 | MistralForCausalLM | TEXT_GENERATION |
| mistralai/Mistral-Nemo-Instruct-2407 | MistralForCausalLM | TEXT_GENERATION |
| intfloat/e5-mistral-7b-instruct | MistralModel | EMBEDDING |
Proposed Fix
Change the GetArchitecture() fallback from "MistralModel" to "" so a missing/unparsed Architectures field does not accidentally satisfy the embedding special-case:
    func (c *MistralConfig) GetArchitecture() string {
        if len(c.Architectures) > 0 {
            return c.Architectures[0]
        }
        return "" // don't assume MistralModel; let caller treat as unknown
    }
Alternatively, tighten the special-case check in config_parser.go to require the architecture to be exactly "MistralModel" (case-insensitive) rather than a substring match, and only when Architectures was explicitly set (not via fallback).
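The alternative could look roughly like the sketch below. `isMistralEmbedding` is a hypothetical helper, not an existing OME function. It takes the raw `architectures` list rather than the getter's result, so a fallback value can never reach it, and it compares with `strings.EqualFold` for an exact case-insensitive match. It assumes the e5-mistral `config.json` declares `MistralModel` explicitly, as the Expected Behaviour table indicates; if that config left the list empty, this check would stop labelling it EMBEDDING.

```go
package main

import (
	"fmt"
	"strings"
)

// isMistralEmbedding reports whether a Mistral-family config explicitly
// declares the base MistralModel architecture. Hypothetical helper.
func isMistralEmbedding(modelType string, architectures []string) bool {
	if len(architectures) == 0 {
		return false // nothing declared: treat as unknown rather than guess
	}
	return strings.Contains(strings.ToLower(modelType), "mistral") &&
		strings.EqualFold(architectures[0], "MistralModel")
}

func main() {
	cases := [][]string{
		{"MistralForCausalLM"},
		{"MistralModel"},
		nil, // Architectures failed to populate
	}
	for _, archs := range cases {
		fmt.Printf("architectures=%v embedding=%t\n", archs, isMistralEmbedding("mistral", archs))
	}
}
```

Passing the slice instead of a single string keeps the "was it explicitly set" information at the call site without adding a separate flag.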
Additional Context
- autoSelect is false on the vllm-e5-mistral-7b-instruct runtime, and the two runtimes use distinct modelArchitecture values (MistralModel vs MistralForCausalLM), so runtime auto-selection is not affected: the runtimes cannot be confused with each other.
- The misclassification only affects capability gating at the endpoint level (chat vs embedding API routing).