Commit 41fc91d

add comment
Signed-off-by: Yuki Huang <yukih@nvidia.com>
1 parent 689b107 commit 41fc91d

1 file changed

Lines changed: 8 additions & 0 deletions

nemo_rl/models/generation/vllm/vllm_backend.py

@@ -38,6 +38,14 @@
 
 
 def fix_gpt_oss_export_transpose(key: str, weight: torch.Tensor) -> torch.Tensor:
+    """Apply GPT-OSS down_proj transpose fix to the weight.
+
+    This is a workaround for the issue that the down_proj layout is not the same across different frameworks.
+    - HF needs [in, out] layout.
+    - Megatron needs [in, out] layout.
+    - vLLM needs [out, in] layout.
+    See https://github.com/NVIDIA-NeMo/Megatron-Bridge/pull/3271 for more details.
+    """
     if key.endswith("mlp.experts.down_proj"):
         weight = weight.transpose(-2, -1).contiguous()
     return weight
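
The effect of the function in this hunk can be checked on a small tensor. The sketch below copies the function body from the diff and applies it to a made-up expert weight. The parameter key names and the shapes are hypothetical, chosen only for illustration, and are not taken from the NeMo RL or vLLM code.

```python
import torch


def fix_gpt_oss_export_transpose(key: str, weight: torch.Tensor) -> torch.Tensor:
    # Same body as the function in the diff: swap the last two dims
    # only for keys ending in "mlp.experts.down_proj".
    if key.endswith("mlp.experts.down_proj"):
        weight = weight.transpose(-2, -1).contiguous()
    return weight


# Hypothetical shapes: [num_experts, in, out]. Real GPT-OSS shapes are much larger.
num_experts, in_features, out_features = 2, 3, 5
down_proj = torch.arange(
    num_experts * in_features * out_features, dtype=torch.float32
).reshape(num_experts, in_features, out_features)

# Hypothetical key names; only the suffix matters to the function.
fixed = fix_gpt_oss_export_transpose("model.layers.0.mlp.experts.down_proj", down_proj)
untouched = fix_gpt_oss_export_transpose("model.layers.0.mlp.experts.gate_up_proj", down_proj)

print(tuple(fixed.shape))      # (2, 5, 3): last two dims swapped, i.e. [out, in]
print(fixed.is_contiguous())   # True: .contiguous() materializes the new layout
print(untouched is down_proj)  # True: non-matching keys pass through unchanged
```

The `.contiguous()` call matters because `transpose` alone returns a strided view of the same storage, so a consumer that reads the raw buffer would still see the original layout.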
