MODEL_PROPERTIES support for all VLM model roles #3781
Pull request overview
Adds support for per-submodel OpenVINO property overrides for VLM pipelines via the MODEL_PROPERTIES key (with intended precedence over DEVICE_PROPERTIES and global properties), plus validation of supported VLM role names and Python binding support.
Changes:
- Extend `utils::get_model_properties` to merge global + `DEVICE_PROPERTIES[device]` + `MODEL_PROPERTIES[role]` with defined precedence.
- Apply per-role property resolution across VLM sub-model compilation sites (vision/text encoders, projectors, resamplers, LM, etc.) and Continuous Batching language model compilation.
- Add validation helpers (`get_known_vlm_model_roles`, `validate_vlm_model_properties`) and expand C++ tests; update Python property conversion to treat `MODEL_PROPERTIES`/`DEVICE_PROPERTIES` as `AnyMap`.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/cpp/utils.cpp | Adds unit tests for three-layer property resolution and role validation. |
| src/python/py_utils.cpp | Ensures MODEL_PROPERTIES and DEVICE_PROPERTIES are converted to ov::AnyMap in Python bindings. |
| src/cpp/src/utils.hpp | Updates API docs/signature for property resolution and declares new VLM role helpers. |
| src/cpp/src/utils.cpp | Implements three-layer merge + role validation helpers and known-role list. |
| src/cpp/src/visual_language/pipeline.cpp | Validates role names and applies per-role properties when reading/compiling the language model. |
| src/cpp/src/continuous_batching/pipeline.cpp | Validates role names and reads LM with MODEL_PROPERTIES-filtered props; adjusts InputsEmbedder creation. |
| src/cpp/src/continuous_batching/pipeline_impl.cpp | Resolves per-role props for LM before adapter extraction/compilation. |
| src/cpp/src/visual_language/vision_encoder.cpp | Uses per-role properties for vision embeddings compilation. |
| src/cpp/src/visual_language/videochat_flash/classes.cpp | Uses per-role properties for vision embeddings / vision projection compilation. |
| src/cpp/src/visual_language/qwen3_vl/classes.cpp | Uses per-role properties for vision embeddings position model compilation. |
| src/cpp/src/visual_language/qwen2vl/classes.cpp | Uses per-role properties for vision embeddings + merger compilation. |
| src/cpp/src/visual_language/phi4mm/classes.cpp | Uses per-role properties for vision projection compilation. |
| src/cpp/src/visual_language/phi3_vision/classes.cpp | Uses per-role properties for vision embeddings + vision projection compilation. |
| src/cpp/src/visual_language/minicpm/classes.cpp | Uses per-role properties for resampler compilation. |
| src/cpp/src/visual_language/llava_next_video/classes.cpp | Uses per-role properties for multi-modal projector + resampler + vision embeddings compilation. |
| src/cpp/src/visual_language/embedding_model.cpp | Uses per-role properties for text embeddings read/compile. |
Pull request overview
This PR introduces per-submodel property overrides for VLM and Continuous Batching pipelines via the MODEL_PROPERTIES meta key, with defined precedence across global, DEVICE_PROPERTIES, and role-specific overrides.
Changes:
- Extend `utils::get_model_properties` to resolve properties using a 3-layer precedence (globals → `DEVICE_PROPERTIES[device]` → `MODEL_PROPERTIES[role]`) and strip meta keys before they reach OpenVINO plugins.
- Apply per-role property resolution across VLM sub-model compile sites and ContinuousBatching language-model initialization.
- Add role validation (`validate_vlm_model_properties` / `get_known_vlm_model_roles`), Python binding support for dict→`ov::AnyMap` conversion, and comprehensive C++ unit tests.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/cpp/utils.cpp | Adds unit tests covering DEVICE_PROPERTIES + MODEL_PROPERTIES precedence, stripping behavior, and role validation. |
| src/python/py_utils.cpp | Treats MODEL_PROPERTIES/DEVICE_PROPERTIES as ov::AnyMap in Python bindings to support nested dicts. |
| src/cpp/src/utils.hpp | Updates API/docs for get_model_properties; declares known-role list and validation helper. |
| src/cpp/src/utils.cpp | Implements 3-layer property resolution plus known-role list and validation. |
| src/cpp/src/visual_language/pipeline.cpp | Validates roles and applies per-role properties to language-model compile paths (NPU + non-NPU). |
| src/cpp/src/continuous_batching/pipeline.cpp | Validates VLM role names and strips MODEL_PROPERTIES before model read/initialization. |
| src/cpp/src/continuous_batching/pipeline_impl.cpp | Resolves/strips per-role props for language-model compilation before adapter extraction. |
| src/cpp/src/visual_language/vision_encoder.cpp | Routes vision-encoder compilation through get_model_properties for role overrides. |
| src/cpp/src/visual_language/videochat_flash/classes.cpp | Applies per-role properties to vision/projection model compilation. |
| src/cpp/src/visual_language/qwen2vl/classes.cpp | Applies per-role properties to vision encoder + merger compilation. |
| src/cpp/src/visual_language/qwen3_vl/classes.cpp | Applies per-role properties to positional-embeddings model compilation. |
| src/cpp/src/visual_language/phi4mm/classes.cpp | Applies per-role properties to vision-projection model compilation. |
| src/cpp/src/visual_language/phi3_vision/classes.cpp | Applies per-role properties to vision/projection model compilation. |
| src/cpp/src/visual_language/minicpm/classes.cpp | Applies per-role properties to resampler compilation. |
| src/cpp/src/visual_language/llava_next_video/classes.cpp | Applies per-role properties to multi-modal projector + resampler compilation. |
| src/cpp/src/visual_language/embedding_model.cpp | Applies per-role properties consistently for text-embeddings read/compile steps. |
/// @brief Resolve properties for @p model_role by merging two layers (priority low to high):
/// 1. global (top-level keys, excluding meta keys PER_MODEL_PROPERTIES)
/// 2. PER_MODEL_PROPERTIES[model_role]
/// 1. global (top-level keys, excluding meta keys PER_MODEL_PROPERTIES
/// and DEVICE_PROPERTIES if device is specified)
/// 2. DEVICE_PROPERTIES[device] (only when @p device is non-empty)
/// 3. PER_MODEL_PROPERTIES[model_role]
/// MODEL_PROPERTIES wins over DEVICE_PROPERTIES wins
/// over globals.
/// @param properties The main properties map. Not modified.
/// @param model_role Sub-model role (e.g. "vision_embeddings").
/// @param device Target device for the compile site. When empty,
/// DEVICE_PROPERTIES is forwarded as-is (used at read_model sites
/// which are not bound to a specific device).
/// @return A new ov::AnyMap with the merged result. The input map is left
/// untouched so callers may continue using the meta keys.
ov::AnyMap get_model_properties(ov::AnyMap& properties, const std::string& model_role);
ov::AnyMap get_model_properties(const ov::AnyMap& properties, const std::string& model_role, const std::string& device = "");
Will do in next PR
const ov::AnyMap& properties) {
ov::Core core = utils::singleton_core();
std::shared_ptr<ov::Model> m_model = core.read_model(model_dir / "openvino_text_embeddings_model.xml", {}, properties);
const auto plugin_props = utils::get_model_properties(properties, "text_embeddings", device);
Would `const auto text_embeddings_props = ...` be a more fitting name here?
const std::vector<std::string>& get_known_vlm_model_roles();

/// @brief Throws if `properties[MODEL_PROPERTIES]` contains a role name
/// not in `known_roles`. No-op if the key is absent.
void validate_vlm_model_properties(const ov::AnyMap& properties,
const std::vector<std::string>& known_roles);
Do we really need the `known_roles` parameter, and do we need to expose the `get_known_vlm_model_roles()` function?
As far as I can see, every usage of `validate_vlm_model_properties()` passes `get_known_vlm_model_roles()`. Only the tests pass a different vector of known roles, but I think a test case with an invalid properties map would be sufficient.
Description
This change enables per-model config for VLMPipeline and ContinuousBatchingPipeline with VLM adapter via `MODEL_PROPERTIES` key.

Property resolution (low -> high):
- `DEVICE_PROPERTIES` - properties propagated to given device
- `MODEL_PROPERTIES` - properties propagated to model with exact role in VLMPipeline

Available roles: vision_embeddings, text_embeddings, resampler, vision_embeddings_merger, vision_embeddings_pos, vision_projection, multi_modal_projector, language_model
Example:
CVS-162621
Following VLM API redesign document:
https://github.com/dkalinowski/openvino.genai/blob/vlm-design/design.md
Checklist: