fix gemma2 checkpoint conversion validation check #3514

Merged
copybara-service[bot] merged 1 commit into main from agagik-gemma2 on Mar 28, 2026

@gagika (Collaborator) commented Mar 27, 2026

Description

Fix Gemma 2 architecture validation during Hugging Face conversion

This updates the _validate_or_update_architecture logic in to_huggingface.py to correctly handle two Gemma 2-specific discrepancies:

  1. num_hidden_layers: MaxText bundles local and global attention layers into a single Gemma2DecoderLayer, halving the layer count relative to Hugging Face. The validation now multiplies the MaxText layer count by 2 for Gemma 2 before comparison.
  2. vocab_size: MaxText pads vocabulary sizes up to multiples of 128 or 256 for TPU efficiency (e.g., 256128 instead of 256000). The validation now accepts a padded vocab size that differs from the Hugging Face value by up to 256 tokens.
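
For reviewers who want to see the two rules in isolation, here is a minimal sketch of the comparison described above. It is not the code in to_huggingface.py: the helper names, the dict-based configs, the "num_decoder_layers" key, and the model-name prefix check are all assumptions made for illustration, and it assumes padding only ever increases the vocab size.

```python
# Hypothetical sketch of the two validation rules; not the actual
# _validate_or_update_architecture implementation.

MAX_VOCAB_PADDING = 256  # largest accepted difference, per the PR description


def expected_hf_num_layers(maxtext_num_layers: int, model_name: str) -> int:
    """Return the layer count to compare against Hugging Face's num_hidden_layers."""
    # Assumption: Gemma 2 models are identified by a "gemma2" name prefix.
    # One MaxText Gemma2DecoderLayer bundles a local and a global attention layer.
    if model_name.startswith("gemma2"):
        return maxtext_num_layers * 2
    return maxtext_num_layers


def vocab_size_is_compatible(maxtext_vocab: int, hf_vocab: int) -> bool:
    """Accept an exact match or a MaxText vocab padded by at most MAX_VOCAB_PADDING."""
    return 0 <= maxtext_vocab - hf_vocab <= MAX_VOCAB_PADDING


def validate_architecture(model_name: str, maxtext_cfg: dict, hf_cfg: dict) -> list:
    """Return a list of mismatch messages; an empty list means the configs agree."""
    errors = []
    layers = expected_hf_num_layers(maxtext_cfg["num_decoder_layers"], model_name)
    if layers != hf_cfg["num_hidden_layers"]:
        errors.append(
            f"num_hidden_layers: expected {hf_cfg['num_hidden_layers']}, got {layers}"
        )
    if not vocab_size_is_compatible(maxtext_cfg["vocab_size"], hf_cfg["vocab_size"]):
        errors.append(
            f"vocab_size: {maxtext_cfg['vocab_size']} is not within "
            f"{MAX_VOCAB_PADDING} tokens above {hf_cfg['vocab_size']}"
        )
    return errors


# Illustrative values only: 13 bundled layers against 26 Hugging Face layers,
# and the padded vocab size quoted in the description.
maxtext_cfg = {"num_decoder_layers": 13, "vocab_size": 256128}
hf_cfg = {"num_hidden_layers": 26, "vocab_size": 256000}
print(validate_architecture("gemma2-2b", maxtext_cfg, hf_cfg))
```

Returning a list of messages rather than raising on the first mismatch is a design choice of this sketch, so that both discrepancies can be reported in one pass.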

Tests

Tested locally

python3 -m maxtext.checkpoint_conversion.to_huggingface src/maxtext/configs//base.yml model_name=gemma2-2b ...

I0328 01:24:02.204004 140209930927680 to_huggingface.py:318] ✅ MaxText model successfully saved in HuggingFace format at ./tmp/hf/gemma2-2b/2026-03-27-12-47
I0328 01:24:02.204053 140209930927680 to_huggingface.py:319] Elapse for save: 0.18 min
I0328 01:24:02.204087 140209930927680 to_huggingface.py:320] Overall Elapse: 0.63 min
I0328 01:24:02.204361 140209930927680 utils.py:750] Peak Memory: 26.80 GB

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov Bot commented Mar 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@github-actions

🤖 Hi @gagika, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

@github-actions

🤖 I'm sorry @gagika, but I was unable to process your request. Please see the logs for more details.

@copybara-service copybara-service Bot merged commit c30ada0 into main Mar 28, 2026
42 of 43 checks passed
@copybara-service copybara-service Bot deleted the agagik-gemma2 branch March 28, 2026 01:34