Commit 9c4c497
feat(engine): add Qwen3-VL dense and MoE support to Megatron path (#1301)
* feat(engine): add Qwen3-VL dense support to Megatron path
Extend the Megatron engine to train Qwen3-VL dense models end-to-end:
mcore→HF weight conversion for update_weights and HF→mcore loading that
handles Qwen3-VL's nested HF config layout. Without this, GRPO/PPO of
any Qwen3-VL model on the Megatron backend is blocked.
1 parent 2755661 commit 9c4c497
9 files changed
Lines changed: 1842 additions & 202 deletions
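The commit message refers to Qwen3-VL's nested HF config layout. As a rough illustration only (this is not code from the commit), the sketch below assumes the language-model fields sit under a `text_config` attribute next to a `vision_config`, and that text-only configs keep the same fields at the top level. The helper name `resolve_text_attr` and all numeric values are hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def resolve_text_attr(hf_config: Any, name: str) -> Any:
    """Return `name` from the nested text config if present, else from the top level.

    Hypothetical helper: assumes nested configs expose a `text_config` attribute.
    """
    text_config = getattr(hf_config, "text_config", None)
    if text_config is not None and hasattr(text_config, name):
        return getattr(text_config, name)
    # Flat (text-only) layout: the field lives directly on the config.
    return getattr(hf_config, name)


# Stand-in for a nested vision-language config (values are made up).
nested = SimpleNamespace(
    text_config=SimpleNamespace(hidden_size=2048, num_hidden_layers=28),
    vision_config=SimpleNamespace(hidden_size=1024),
)
# Stand-in for a flat text-only config (values are made up).
flat = SimpleNamespace(hidden_size=4096, num_hidden_layers=32)

print(resolve_text_attr(nested, "hidden_size"))  # 2048, read from text_config
print(resolve_text_attr(flat, "hidden_size"))    # 4096, read from the top level
```

A lookup of this shape would let the same HF→mcore loading path serve both layouts. How the commit itself handles the nesting is not visible in this page.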
File tree
- areal
- engine
- core
- megatron_utils
- models/mcore
- utils
- tests
- torchrun