[OpenVINO] Support openbmb/MiniCPM-o-4_5 for image-text-to-text task#1797

Open: mlukasze wants to merge 4 commits into huggingface:main from mlukasze:enable/openbmb-MiniCPM-o-4_5

mlukasze (Contributor) commented Jun 17, 2026

What does this PR do?

Bumps MiniCPMOOpenVINOConfig.MAX_TRANSFORMERS_VERSION from 4.51.3 to 4.57.6 to extend support to openbmb/MiniCPM-o-4_5, which requires transformers>=4.49. No architecture changes are needed, because MiniCPM-o-4_5 uses the same model_type (minicpmo) as MiniCPM-o-2_6.
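To illustrate what raising the upper bound changes, here is a minimal sketch of a maximum-version gate. The helper names `parse_version` and `is_within_max` are hypothetical and are not taken from optimum-intel, whose actual check may be implemented differently. Only the two version strings come from this PR.

```python
# Hypothetical sketch of an upper-bound version gate; not optimum-intel's code.
OLD_MAX = "4.51.3"  # previous MAX_TRANSFORMERS_VERSION (from the PR description)
NEW_MAX = "4.57.6"  # new MAX_TRANSFORMERS_VERSION (from the PR description)


def parse_version(version: str) -> tuple:
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison.

    Assumption: no pre-release or dev suffixes; real code would more likely
    rely on packaging.version for those.
    """
    return tuple(int(part) for part in version.split("."))


def is_within_max(installed: str, max_version: str) -> bool:
    """Return True when the installed version does not exceed the maximum."""
    return parse_version(installed) <= parse_version(max_version)


for candidate in ("4.49.0", "4.51.3", "4.55.0", "4.57.6", "4.58.0"):
    print(candidate, is_within_max(candidate, OLD_MAX), is_within_max(candidate, NEW_MAX))
```

Under these assumptions, a release such as 4.55.0 fails the old bound and passes the new one, while 4.58.0 fails both.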

Installation instructions

pip install git+https://github.com/mlukasze/optimum-intel.git@enable/openbmb-MiniCPM-o-4_5
pip install -U openvino openvino-tokenizers nncf
pip install "transformers>=4.49,<5.0"

Export command line

optimum-cli export openvino -m openbmb/MiniCPM-o-4_5 ov_minicpmo_4_5_fp16 \
    --task image-text-to-text --trust-remote-code

Inference script

from PIL import Image
import requests
from optimum.intel import OVModelForVisualCausalLM
from transformers import AutoProcessor

# Output directory of the export command above
model_dir = "ov_minicpmo_4_5_fp16"

# trust_remote_code is needed because the model ships custom code
processor = AutoProcessor.from_pretrained(model_dir, trust_remote_code=True)
model = OVModelForVisualCausalLM.from_pretrained(model_dir, trust_remote_code=True)

# Download a sample image and normalize it to RGB
image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg"
image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")

# One user turn with an image placeholder followed by the text prompt
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image."}
    ]}
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt",
    return_dict=True, images=[image]
)

# Greedy decoding, capped at 200 new tokens
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
# Drop the prompt tokens so that only the generated answer is decoded
response = processor.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
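The slice in the decode step of the script relies on the generated sequence starting with the prompt tokens, so cutting at the prompt length leaves only the newly generated tokens. A small sketch with plain Python lists standing in for tensors shows the idea. The token ids and the helper name `strip_prompt` are made up for illustration.

```python
# Illustrative sketch: plain lists stand in for the tensors used in the script.
def strip_prompt(sequence, prompt_length):
    """Return only the tokens that follow the first prompt_length tokens."""
    return sequence[prompt_length:]


# Made-up token ids; a batch containing a single prompt
input_ids = [[101, 7, 8, 9]]
# Assumption: generation output echoes the prompt and then appends new tokens
outputs = [[101, 7, 8, 9, 42, 43, 2]]

prompt_length = len(input_ids[0])  # plays the role of input_ids.shape[1]
new_tokens = strip_prompt(outputs[0], prompt_length)
print(new_tokens)  # [42, 43, 2]
```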

Before submitting

  • [N/A] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

- Update MiniCPMOOpenVINOConfig MAX_TRANSFORMERS_VERSION to 4.57.6 to support MiniCPM-o-4_5, which requires transformers 4.49-4.57
- Add tiny model creation script for CI testing
- MiniCPM-o-4_5 uses the same architecture as 2_6 (minicpmo model_type, SigLIP vision encoder, same sub-module layout)
- Audio modality is NOT exported (out of scope per existing design)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
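The claim that both checkpoints share the minicpmo model_type can be checked from a checkpoint's config.json. The following is a minimal sketch, assuming a local directory that contains a config.json with a `model_type` field. The helper names `read_model_type` and `uses_minicpmo_export` are hypothetical, and the temporary directory only stands in for a real checkpoint.

```python
import json
import tempfile
from pathlib import Path


def read_model_type(model_dir):
    """Read the model_type field from <model_dir>/config.json (None if absent)."""
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as handle:
        return json.load(handle).get("model_type")


def uses_minicpmo_export(model_type):
    """Hypothetical check: does this model_type match the minicpmo export config?"""
    return model_type == "minicpmo"


# Stand-in for a downloaded checkpoint: a temp dir holding a minimal config.json
with tempfile.TemporaryDirectory() as tmp_dir:
    config = {"model_type": "minicpmo"}
    (Path(tmp_dir) / "config.json").write_text(json.dumps(config), encoding="utf-8")
    detected = read_model_type(tmp_dir)

print(detected, uses_minicpmo_export(detected))
```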
omega-intel force-pushed the enable/openbmb-MiniCPM-o-4_5 branch from 1df94f0 to 28ee507 (June 17, 2026 06:43)
Comment thread on tests/openvino/create_tiny_minicpmo_4_5.py (outdated)
mlukasze requested a review from rkazants (June 17, 2026 15:06)
rkazants marked this pull request as ready for review (June 17, 2026 16:51)
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

…2-4.57.6

MAX_TRANSFORMERS_VERSION for minicpmo was bumped to 4.57.6, so minicpmo
is now supported on transformers 4.43-4.57.6 (not skipped). Update the
expected skipped-model set to only include minicpmo when transformers >
4.57.6.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
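The test change described in this commit message can be sketched as follows. This is an illustration only: the function name `expected_skipped_models` is hypothetical, other model types that the real test may skip are left out, and the version parsing assumes plain X.Y.Z strings.

```python
# Hypothetical sketch of the updated expectation; not the actual test code.
MINICPMO_MAX_TRANSFORMERS = "4.57.6"  # value taken from the commit message


def parse_version(version):
    """Parse a plain 'X.Y.Z' string into a tuple of ints (no dev/rc suffixes)."""
    return tuple(int(part) for part in version.split("."))


def expected_skipped_models(transformers_version):
    """Return the model types expected to be skipped for a transformers version.

    Only minicpmo is modelled here: it is skipped solely when the installed
    transformers version is above the supported maximum.
    """
    skipped = set()
    if parse_version(transformers_version) > parse_version(MINICPMO_MAX_TRANSFORMERS):
        skipped.add("minicpmo")
    return skipped


print(expected_skipped_models("4.57.6"))  # set()
print(expected_skipped_models("4.58.0"))  # {'minicpmo'}
```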
rkazants (Collaborator) commented Jul 2, 2026

@mlukasze, please rebase the branch to make CI green

mlukasze (Contributor, Author) commented Jul 2, 2026

> @mlukasze, please rebase the branch to make CI green

done
