Commit fe05ce3
fix(inference): restore NemotronH mixer.D + e_score_correction_bias after vLLM reload
vLLM 0.22's layerwise reload mis-loads exactly two NemotronH per-layer param
families through the online-reload path -- mixer.D (Mamba SSD skip) and the MoE
router's gate.e_score_correction_bias -- while loading all other weights
correctly. mixer.D ends up holding non-deterministic garbage or inf, which
yields NaN logits, and the gate bias takes a wrong value, which breaks routing,
so generations go to NaN after a weight update. Restore both from the received
broadcast (correct by definition) via each param's own weight_loader.
Also drop monkey_patch_vllm_layerwise_reload_alias_buffers: it crashes on vLLM
0.22 (AttributeError on the delattr'd conv_weights) and conv_weights is handled
correctly by vLLM's native reload finalize. Supersedes #2701.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent f619e36 commit fe05ce3
2 files changed
Lines changed: 51 additions & 34 deletions
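The restore step described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: `restore_misloaded_params`, `RESTORE_SUFFIXES`, `FakeParam`, and `copy_loader` are hypothetical names, the parameter names in the demo are made up, and plain Python lists stand in for tensors so the block runs without torch or vLLM. It assumes the received broadcast is available as `(name, value)` pairs whose names match the model's parameter names, and that a parameter may carry its own `weight_loader(param, loaded_weight)` callable.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Suffixes of the two parameter families named in the commit message.
RESTORE_SUFFIXES = ("mixer.D", "gate.e_score_correction_bias")


def restore_misloaded_params(
    params: Dict[str, Any],
    received: Iterable[Tuple[str, Any]],
    fallback_loader: Callable[[Any, Any], None],
) -> List[str]:
    """Re-apply broadcast values for the affected families; return restored names."""
    restored = []
    for name, value in received:
        if not name.endswith(RESTORE_SUFFIXES):
            continue  # all other weights are assumed to have loaded correctly
        param = params.get(name)
        if param is None:
            continue  # this name is not present on the local model
        # Prefer the parameter's own loader; fall back to a plain copy otherwise.
        loader = getattr(param, "weight_loader", None) or fallback_loader
        loader(param, value)
        restored.append(name)
    return restored


class FakeParam:
    """Hypothetical stand-in for a model parameter, used only in this demo."""

    def __init__(self, data, weight_loader=None):
        self.data = data
        if weight_loader is not None:
            self.weight_loader = weight_loader


def copy_loader(param, value):
    # Overwrite the parameter's contents with the received value.
    param.data = list(value)


params = {
    "layers.0.mixer.D": FakeParam([float("inf")], weight_loader=copy_loader),
    "layers.1.mixer.gate.e_score_correction_bias": FakeParam([9.0]),
    "layers.0.mixer.in_proj.weight": FakeParam([1.0]),
}
received = [
    ("layers.0.mixer.D", [0.5]),
    ("layers.1.mixer.gate.e_score_correction_bias", [0.1]),
    ("layers.0.mixer.in_proj.weight", [2.0]),
]
restored = restore_misloaded_params(params, received, copy_loader)
```

In the demo, only the two affected entries are overwritten from the received values; `in_proj.weight` is left as the reload produced it. Going through each parameter's own loader, rather than copying directly, is meant to reuse whatever sharding or dtype handling that loader already performs.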
File 1 of 2: 33 deletions, 0 additions (old line 21 and old lines 70-101 removed).
File 2 of 2: 51 additions, 1 deletion (new lines 27-65, 190-199, 202, and 206 added; old line 153 removed).