
[Qwen3.5] Add moe weight sync script for 35b model #4041

Open
Rohan-Bierneni wants to merge 1 commit into main from rbierneni-q35-weight-sync

Rohan-Bierneni (Collaborator) commented Jun 2, 2026

Description

Enables weight sync between MaxText and vLLM for RL with Qwen3.5-35B.

If the change fixes a bug or a Github issue, please include a link, e.g.,:
FIXES: b/123456

Tests

Command to run validator script: https://paste.googleplex.com/5425192473067520#l=24
Result: https://paste.googleplex.com/5285887440191488

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov Bot commented Jun 2, 2026

Codecov Report

❌ Patch coverage is 6.58683% with 156 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...t/integration/vllm/torchax_converter/qwen35_moe.py | 6.28% | 149 Missing ⚠️ |
| ...ation/vllm/torchax_converter/validate_converter.py | 0.00% | 5 Missing ⚠️ |
| ...c/maxtext/integration/vllm/maxtext_vllm_rollout.py | 33.33% | 2 Missing ⚠️ |


Rohan-Bierneni force-pushed the rbierneni-q35-weight-sync branch 2 times, most recently from b05b211 to eed71e0, on June 2, 2026 21:15
Rohan-Bierneni changed the title from "[Qwen3.5] Add moe weight sync script" to "[Qwen3.5] Add moe weight sync script for 35b model" on Jun 2, 2026
Rohan-Bierneni force-pushed the rbierneni-q35-weight-sync branch from eed71e0 to 0101424 on June 2, 2026 21:35
github-actions Bot commented Jun 3, 2026

🤖 Hi @aireenmei, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

github-actions Bot left a comment

## 📋 Review Summary

This Pull Request successfully introduces the Qwen35MaxTextToVLLMConverter, enabling weight synchronization for the Qwen 3.5 35B MoE model between MaxText and vLLM. The implementation correctly handles the hybrid architecture (GDN + Attention) and includes necessary TPU-specific optimizations like 128-byte alignment for MoE experts.
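For readers unfamiliar with alignment padding, here is a minimal sketch of rounding a dimension up to a multiple of 128 and zero-padding to that size. It is illustrative only: the helper names `round_up` and `pad_rows` are hypothetical, and the review does not say which expert dimension the converter pads or whether 128 counts elements or bytes, so this is not the PR's actual code.

```python
def round_up(size: int, multiple: int = 128) -> int:
    """Return the smallest multiple of `multiple` that is >= size."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    # Ceiling division using integer arithmetic only.
    return -(-size // multiple) * multiple


def pad_rows(rows: list[list[float]], multiple: int = 128) -> list[list[float]]:
    """Zero-pad each row on the right so its length is a multiple of `multiple`.

    Hypothetical helper: a real converter would pad an array along one axis.
    """
    return [row + [0.0] * (round_up(len(row), multiple) - len(row)) for row in rows]
```

Under this sketch a dimension that is already aligned is left unchanged, so padding only costs memory when the original size is not a multiple of the alignment.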

🔍 General Feedback

  • The converter logic is well-structured and follows the established patterns in the torchax_converter module.
  • Proactive memory management using gc.collect() in conversion loops is a good practice for handling large model weights.
  • The use of dynamic hyperparameter extraction via getattr provides good flexibility for different model variants.
  • The addition of TPU-specific GMM alignment (padding to 128) ensures compatibility with vLLM's optimized TPU kernels.
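Two of the patterns mentioned above, reading hyperparameters with `getattr` fallbacks and calling `gc.collect()` inside a conversion loop, can be sketched roughly as follows. This is a generic illustration, not the converter's code: `read_hparams`, `convert_layers`, and the field names `num_experts` and `head_dim` are hypothetical.

```python
import gc
from types import SimpleNamespace


def read_hparams(config, defaults):
    """Read each hyperparameter from `config`, falling back to `defaults`."""
    return {name: getattr(config, name, default) for name, default in defaults.items()}


def convert_layers(layers, convert_fn):
    """Convert layers one at a time, collecting garbage between iterations."""
    converted = {}
    for name, weights in layers.items():
        converted[name] = convert_fn(weights)
        gc.collect()  # reclaim unreachable temporaries before the next layer
    return converted


# Hypothetical config that defines only one of the two fields.
config = SimpleNamespace(num_experts=8)
hparams = read_hparams(config, {"num_experts": 1, "head_dim": 128})
doubled = convert_layers({"layer_0": [1, 2], "layer_1": [3]}, lambda w: [2 * x for x in w])
```

Here `hparams` takes `num_experts` from the config and falls back to the default for `head_dim`, which is the flexibility across model variants that the review refers to.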

Comment thread src/maxtext/integration/vllm/torchax_converter/qwen35_moe.py
khatwanimohit (Collaborator) left a comment

LGTM

Rohan-Bierneni force-pushed the rbierneni-q35-weight-sync branch from 0101424 to 95a1fd9 on June 3, 2026 20:00
fix weight sync script

Onboard model to validator script and rollout file

fix param mappings and converter logic to get weight conversion working

Working script for weight sync

Ran linter

Ran linter

Add max_num_batched_tokens only for qwen3.5

Ran linter

Remove hardcoded configs and use model specific ones

Ran linter

Resolve pr comments
4 participants