Skip to content

Add conversion overrides for Llama, Qwen3, and Gemma 4 models so they roundtrip properly#677

Open
finbarrtimbers wants to merge 7 commits into
mainfrom
finbarr/fix-conversion
Open

Add conversion overrides for Llama, Qwen3, and Gemma 4 models so they roundtrip properly#677
finbarrtimbers wants to merge 7 commits into
mainfrom
finbarr/fix-conversion

Conversation

@finbarrtimbers
Copy link
Copy Markdown
Collaborator

@finbarrtimbers finbarrtimbers commented May 13, 2026

Previously, we got errors.

@finbarrtimbers finbarrtimbers changed the title Fix conversion so it roundtrips properly Add conversion overrides for Llama, Qwen3, and Gemma 4 models so they roundtrip properly May 13, 2026
@finbarrtimbers finbarrtimbers requested a review from AkshitaB May 13, 2026 16:07
@finbarrtimbers finbarrtimbers marked this pull request as ready for review May 13, 2026 17:17
@finbarrtimbers finbarrtimbers enabled auto-merge (squash) May 13, 2026 17:17
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f1b69d796a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/test/nn/hf/convert_test.py Outdated
Comment on lines +232 to +233
hf_config = AutoConfig.from_pretrained(model_id)
hf_model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid unconditional Hub downloads in unit tests

These helpers are called by the new Qwen3/Gemma tests without any HF_TOKEN/network skip, so a normal pytest src/test/nn/hf/convert_test.py now tries to download Qwen/Qwen3-0.6B and google/gemma-3-270m during unit tests. In offline CI or developer environments this fails before testing the converter, and even with network it pulls hundreds of MB; the existing golden tests gate the same Hub-backed coverage behind HF_TOKEN, so these should use a local/tiny config or be similarly skipped/marked.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b7c58c38c6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
BEAKER_TOKEN: ${{ secrets.BEAKER_TOKEN }}
GOOGLE_CREDENTIALS: ${{ secrets.GOOGLE_CREDENTIALS }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep Hub token out of the default test job

When the Test matrix runs on same-repo PRs or pushes where secrets are available, setting HF_TOKEN globally makes src/test/nn/hf/golden_tests.py's @skipif(not HF_TOKEN) false, so the ordinary 15-minute CPU test job now downloads and runs the Qwen3-0.6B and Gemma3-270m generation golden tests. That couples normal unit CI to Hub availability and large-model runtime; scope this token to a dedicated/gated job or keep those tests skipped unless explicitly requested.

Useful? React with 👍 / 👎.

Implements tied LM head & word embeddings for Qwen3. The three sizes
that Qwen ships tied (0.6B, 1.7B, 4B) now default to tying; 8B/14B/32B
stay untied. The HF import path is tie-aware.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants