Gemma 4 export to OpenVINO not supported #1693

@Tanay2104

Gemma 4 (gemma4) export to OpenVINO not supported

Environment

  • optimum-intel: 1.27.0.dev0+71e0584 (from main branch)
  • optimum: 2.1.0.dev0
  • transformers: 5.5.4
  • openvino: 2026.1.0
  • openvino-tokenizers: 2026.1.0.0
  • Python: 3.11.9

Description

Exporting google/gemma-4-E2B-it to OpenVINO fails on both paths I tried: optimum-cli and a direct ov.convert_model() conversion.

1. Via optimum-cli (after resolving the MambaCache import error in modeling_decoder.py):

ValueError: Trying to export a gemma4 model, that is a custom or unsupported architecture,
but no custom export configuration was passed as `custom_export_configs`.

Gemma 4 (model_type: gemma4) is not registered in the OpenVINO exporter's supported architecture map.
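
The failure mode can be pictured as a plain lookup keyed by `model_type`. The sketch below is a toy model of that behaviour, not optimum-intel's actual code: `SUPPORTED_EXPORT_CONFIGS`, `resolve_export_config`, and the config values are hypothetical names chosen for illustration. Only the model type strings and the `custom_export_configs` parameter name come from the error above, and the real shape of that argument is not modelled here.

```python
# Toy registry (hypothetical); the real exporter's map is larger and
# structured differently.
SUPPORTED_EXPORT_CONFIGS = {
    "gemma": "GemmaExportConfig",    # hypothetical value
    "gemma3": "Gemma3ExportConfig",  # hypothetical value
}


def resolve_export_config(model_type, custom_export_configs=None):
    """Return an export config for model_type.

    A caller-supplied custom_export_configs takes precedence; otherwise the
    registry is consulted and an unregistered type raises ValueError.
    """
    if custom_export_configs is not None:
        return custom_export_configs
    try:
        return SUPPORTED_EXPORT_CONFIGS[model_type]
    except KeyError:
        raise ValueError(
            f"Trying to export a {model_type} model, that is a custom or "
            "unsupported architecture, but no custom export configuration "
            "was passed as `custom_export_configs`."
        ) from None
```

Under this picture, native support amounts to adding a `gemma4` entry to the registry, while `custom_export_configs` is the escape hatch for unregistered types.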

2. Via ov.convert_model() with torch.jit.trace — fails deep inside transformers' Gemma 4 masking code:

File ".../transformers/masking_utils.py", line 492, in sdpa_mask
    q_length, q_offset = q_length.shape[0], q_length[0].to(device)
                         ~~~~~~~~~~~~~~^^^
IndexError: tuple index out of range

This occurs in both create_causal_mask and create_sliding_window_causal_mask paths during tracing.
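
The message `tuple index out of range` on `q_length.shape[0]` indicates that the shape was an empty tuple at that point, which is the shape of a 0-dimensional (scalar) value. Why tracing yields a scalar there is not established in this report. The sketch below only reproduces the indexing behaviour with plain tuples; `first_dim` is a hypothetical helper, not part of transformers.

```python
def first_dim(shape):
    """Mimic `x.shape[0]`: return the first entry of a shape tuple."""
    return shape[0]


print(first_dim((1, 5)))  # shape of a 2-D value
print(first_dim((5,)))    # shape of a 1-D value

try:
    first_dim(())         # shape of a 0-D (scalar) value
except IndexError as exc:
    # Same exception type and message as in the traceback above.
    print(f"IndexError: {exc}")
```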

Steps to reproduce

optimum-cli export openvino \
  --model google/gemma-4-E2B-it \
  --task vision-language-conditional-generation \
  --weight-format int4 \
  --trust-remote-code \
  output_dir

Or via Python:

import torch
import openvino as ov
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-4-E2B-it"

# Load the model in float32 and switch to inference mode.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, dtype=torch.float32)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Minimal dummy input used as the example for tracing.
dummy = tokenizer("Hello", return_tensors="pt")

# Tracing fails here with the IndexError shown above.
traced = torch.jit.trace(model, (dummy["input_ids"], dummy["attention_mask"]), strict=False)
ov_model = ov.convert_model(traced)

Expected behavior

Gemma 4 should be exportable to OpenVINO IR format via optimum-cli, similar to other Gemma variants.

Additional notes

  • There is also a secondary issue: modeling_decoder.py imports MambaCache from transformers.models.mamba.modeling_mamba, but that name no longer exists in transformers 5.x, so optimum-intel raises an ImportError on startup. This needs to be fixed independently of Gemma 4 support. As a workaround, replacing `from transformers.models.mamba.modeling_mamba import MambaCache` with `from transformers.models.mamba.modeling_mamba import Cache as MambaCache` resolves the import error.

  • openvino-genai works correctly for inference (tested with gemma3 models); the current blocker is the export/conversion pipeline.
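
The MambaCache workaround in the first note could be made tolerant of both old and new transformers versions with a name fallback. This is a minimal sketch under assumptions: `import_first_available` is a hypothetical helper, and whether `Cache` is a full drop-in replacement for `MambaCache` is taken from the workaround above rather than verified here.

```python
import importlib


def import_first_available(module_name, candidate_names):
    """Return the first attribute of module_name found among candidate_names."""
    module = importlib.import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        f"none of {list(candidate_names)} found in {module_name!r}"
    )


# Demonstration on a standard-library module so the sketch runs anywhere:
# the first name is missing, so the second one is returned.
OrderedDict = import_first_available("collections", ["NoSuchName", "OrderedDict"])
```

Applied to this issue, the call would be `import_first_available("transformers.models.mamba.modeling_mamba", ["MambaCache", "Cache"])`, which keeps the old name where it still exists and falls back to the new one otherwise.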

Please add native Gemma 4 support to the OpenVINO exporter.
