Commit 86b8b2f
[None][feat] Add AD custom model for MiniMax-M2 family (#243)
* [None][feat] Add AD custom model for MiniMax-M2 family
Replace the existing MiniMax-M2 MoE patch with a full custom model
implementation using AD canonical ops. Covers both MiniMaxAI/MiniMax-M2
and MiniMaxAI/MiniMax-M2.5 (same architecture, model_type: minimax_m2).
Key architecture features:
- MoE with 256 experts, top-8, sigmoid routing + e_score_correction_bias
- GQA (48 Q heads, 8 KV heads, head_dim=128)
- Partial RoPE (rotary_dim=64 out of head_dim=128)
- Per-layer QK normalization (RMSNorm on full Q/K before reshape)
- FP8 block-wise quantized checkpoint
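
For readers unfamiliar with the routing scheme named in the first bullet, here is a minimal pure-Python sketch for a single token. It is not the code added by this commit. It assumes the DeepSeek-V3-style convention in which e_score_correction_bias affects only which experts are selected, while the mixing weights come from the unbiased sigmoid scores and are renormalized to sum to 1. The function name route_token and the toy sizes are hypothetical.

```python
import math


def route_token(logits, correction_bias, top_k):
    """Pick top_k experts for one token (illustrative sketch only).

    Assumption: the correction bias is added for expert *selection* only;
    the returned weights use the unbiased sigmoid scores, renormalized.
    """
    # Sigmoid routing: each expert is scored independently (no softmax).
    scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    # The bias shifts the scores used for ranking the experts.
    biased = [s + b for s, b in zip(scores, correction_bias)]
    chosen = sorted(range(len(logits)), key=lambda i: biased[i], reverse=True)[:top_k]
    # Mixing weights come from the unbiased scores of the chosen experts.
    total = sum(scores[i] for i in chosen)
    weights = [scores[i] / total for i in chosen]
    return chosen, weights


# Toy example with 4 experts and top-2 (the model itself uses 256 and top-8).
experts, weights = route_token([0.0, 2.0, -1.0, 1.0], [0.0, 0.0, 5.0, 0.0], top_k=2)
```

In the toy call, the large bias on expert 2 forces it into the selection even though its raw score is the lowest, but its mixing weight stays small because the weights ignore the bias.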
Canonical ops used: torch_rmsnorm, torch_rope_with_explicit_cos_sin,
torch_attention (GQA-native, no repeat_kv), torch_moe.
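
The partial RoPE feature listed above (rotary_dim=64 out of head_dim=128) can be illustrated with a small pure-Python sketch for one head vector. This is an assumption-laden illustration, not the signature or behavior of torch_rope_with_explicit_cos_sin: it assumes the rotate-half (NeoX-style) pairing, and the base of 10000.0 and the name partial_rope are hypothetical.

```python
import math


def partial_rope(x, position, rotary_dim, base=10000.0):
    """Rotate the first rotary_dim entries of x; pass the rest through.

    Assumptions: rotate-half pairing (entry i pairs with i + rotary_dim // 2)
    and a hypothetical frequency base of 10000.0.
    """
    half = rotary_dim // 2
    inv_freq = [base ** (-2.0 * i / rotary_dim) for i in range(half)]
    angles = [position * f for f in inv_freq]
    # cos/sin are repeated so both halves of each pair share one angle.
    cos = [math.cos(a) for a in angles] * 2
    sin = [math.sin(a) for a in angles] * 2
    x_rot, x_pass = x[:rotary_dim], x[rotary_dim:]
    # rotate_half: (x1, x2) -> (-x2, x1)
    rotated_half = [-v for v in x_rot[half:]] + x_rot[:half]
    out = [v * c + r * s for v, c, r, s in zip(x_rot, cos, rotated_half, sin)]
    # The remaining head_dim - rotary_dim entries carry no positional signal.
    return out + x_pass


head = [float(i) for i in range(128)]      # one head vector, head_dim=128
rotated = partial_rope(head, position=3, rotary_dim=64)
```

Only the first 64 entries change with position; the last 64 are returned untouched, which is what distinguishes partial RoPE from applying the rotation to the full head dimension.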
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
* [None][fix] Disable fuse_finegrained_fp8_moe for MiniMax-M2 in registry config
The trtllm fused MoE kernel fails with an NVRTC compilation error for
MiniMax-M2's MoE configuration (256 experts, block-wise FP8). Disable
the fuse_finegrained_fp8_moe transform and select the torch-simple
compile backend in the model registry config so that --use-registry
works out of the box.
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
---------
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
1 parent aa76fe5 commit 86b8b2f
6 files changed
Lines changed: 1191 additions & 79 deletions
File tree
- examples/auto_deploy/model_registry
  - configs
- tensorrt_llm/_torch/auto_deploy/models
  - custom
  - patches
- tests/unittest/auto_deploy/singlegpu/models
Lines changed: 8 additions & 0 deletions