fix(gemma3, gemma4): default token_type_ids to zeros for text-only training #45222

Open
jashshah999 wants to merge 1 commit into huggingface:main from jashshah999:fix/gemma-text-only-training
Conversation

@jashshah999
Contributor

Summary

When using Gemma 3 or Gemma 4 for text-only supervised fine-tuning (no images), the forward pass raises a ValueError because token_type_ids / mm_token_type_ids is not provided. This happens because AutoTokenizer does not produce these fields — only the multimodal Processor does.

The fix defaults these fields to all-zeros when token_type_ids / mm_token_type_ids is None during training, instead of raising. With all zeros, is_vision is False everywhere, so the bidirectional vision-mask branch is skipped and a standard causal mask is produced — exactly what text-only input requires.
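The behavior above can be sketched as follows. Note this is an illustrative reconstruction, not the actual Transformers source: the helper name `resolve_token_type_ids` and the assumption that nonzero token types mark image tokens are mine.

```python
import torch

def resolve_token_type_ids(input_ids, token_type_ids=None):
    """Hypothetical sketch of the fix: default a missing
    token_type_ids to all-zeros instead of raising ValueError."""
    if token_type_ids is None:
        # Text-only batch from AutoTokenizer: treat every position as text.
        token_type_ids = torch.zeros_like(input_ids)
    # Nonzero token types mark image tokens (assumption for this sketch).
    # All zeros means no vision tokens, so the bidirectional vision-mask
    # branch is skipped and only the causal mask applies.
    is_vision = token_type_ids != 0
    return token_type_ids, is_vision

input_ids = torch.tensor([[2, 17, 309, 4]])
tt, is_vision = resolve_token_type_ids(input_ids)
print(is_vision.any().item())  # False: causal-mask path only
```

A usage note: with this guard in place, a plain `AutoTokenizer` batch (which never carries `token_type_ids` for these models) flows through training unchanged, while Processor-produced multimodal batches still take the vision branch.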

Changes

  • modeling_gemma4.py / modular_gemma4.py: default mm_token_type_ids to torch.zeros(...) instead of raising ValueError
  • modeling_gemma3.py / modular_gemma3.py: same fix for token_type_ids (same root cause)

Fixes #45200

@github-actions

github-actions bot commented Apr 3, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: gemma3, gemma4



Development

Successfully merging this pull request may close these issues.

[Gemma 4] mm_token_type_ids required for text-only fine-tuning - should default to zeros

1 participant