Commit 357f414

fix(qwen3_moe): correct return type annotation on Qwen3MoeSparseMoeBlock.forward (#45352)
* fix(qwen3_moe): correct return type annotation on Qwen3MoeSparseMoeBlock.forward

* fix: propagate Qwen3MoeSparseMoeBlock forward return type fix to generated vl_moe and omni_moe files

Built by Rudrendu Paul, developed with Claude Code

---------

Co-authored-by: Rudrendu <RudrenduPaul@users.noreply.github.com>
1 parent 0b5dbfc commit 357f414

4 files changed: 4 additions & 4 deletions

src/transformers/models/qwen3_moe/modeling_qwen3_moe.py

Lines changed: 1 addition & 1 deletion
@@ -278,7 +278,7 @@ def __init__(self, config: Qwen3MoeConfig):
         self.experts = Qwen3MoeExperts(config)
         self.gate = Qwen3MoeTopKRouter(config)
 
-    def forward(self, hidden_states: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
+    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
         batch_size, sequence_length, hidden_dim = hidden_states.shape
         hidden_states_reshaped = hidden_states.view(-1, hidden_dim)
         _, routing_weights, selected_experts = self.gate(hidden_states_reshaped)
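The change above only touches the annotation, not runtime behavior: the method already returned a single tensor, and the old `tuple[torch.Tensor, torch.Tensor]` hint was stale. A stdlib-only sketch (using a stand-in `Tensor` class rather than `torch.Tensor`, and stub classes that merely mirror the before/after signatures) shows how the two annotations differ when inspected with `typing.get_type_hints`:

```python
from typing import get_type_hints


class Tensor:
    """Stand-in for torch.Tensor in this stdlib-only sketch."""


class SparseMoeBlockBefore:
    # Stale annotation: claims the method returns a 2-tuple.
    def forward(self, hidden_states: Tensor) -> tuple[Tensor, Tensor]: ...


class SparseMoeBlockAfter:
    # Corrected annotation: the method returns a single tensor.
    def forward(self, hidden_states: Tensor) -> Tensor: ...


before = get_type_hints(SparseMoeBlockBefore.forward)["return"]
after = get_type_hints(SparseMoeBlockAfter.forward)["return"]

print(before)  # tuple[Tensor, Tensor]
print(after is Tensor)  # True
```

Static checkers and IDEs read these hints, so the stale tuple annotation would have suggested callers could unpack two values from `forward` when only one is returned.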

src/transformers/models/qwen3_moe/modular_qwen3_moe.py

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ def __init__(self, config: Qwen3MoeConfig):
         self.experts = Qwen3MoeExperts(config)
         self.gate = Qwen3MoeTopKRouter(config)
 
-    def forward(self, hidden_states: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
+    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
         batch_size, sequence_length, hidden_dim = hidden_states.shape
         hidden_states_reshaped = hidden_states.view(-1, hidden_dim)
         _, routing_weights, selected_experts = self.gate(hidden_states_reshaped)

src/transformers/models/qwen3_omni_moe/modeling_qwen3_omni_moe.py

Lines changed: 1 addition & 1 deletion
@@ -1415,7 +1415,7 @@ def __init__(self, config: Qwen3OmniMoeThinkerConfig):
         self.experts = Qwen3OmniMoeThinkerTextExperts(config)
         self.gate = Qwen3OmniMoeThinkerTextTopKRouter(config)
 
-    def forward(self, hidden_states: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
+    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
         batch_size, sequence_length, hidden_dim = hidden_states.shape
         hidden_states_reshaped = hidden_states.view(-1, hidden_dim)
         _, routing_weights, selected_experts = self.gate(hidden_states_reshaped)

src/transformers/models/qwen3_vl_moe/modeling_qwen3_vl_moe.py

Lines changed: 1 addition & 1 deletion
@@ -136,7 +136,7 @@ def __init__(self, config: Qwen3VLMoeTextConfig):
         self.experts = Qwen3VLMoeTextExperts(config)
         self.gate = Qwen3VLMoeTextTopKRouter(config)
 
-    def forward(self, hidden_states: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
+    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
         batch_size, sequence_length, hidden_dim = hidden_states.shape
         hidden_states_reshaped = hidden_states.view(-1, hidden_dim)
         _, routing_weights, selected_experts = self.gate(hidden_states_reshaped)
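As the hunks above suggest, the gate does return routing information (`routing_weights`, `selected_experts`), but that information is consumed inside the block when combining expert outputs; only the mixed hidden states leave `forward`, which is why a single-tensor return annotation is the correct one. A stdlib-only toy (made-up top-1 router and expert functions, plain floats instead of tensors) illustrates the flow:

```python
def gate(x):
    """Toy top-1 router: positive inputs go to expert 0, others to expert 1."""
    selected = 0 if x >= 0 else 1
    return 1.0, selected  # (routing_weight, selected_expert)


EXPERTS = [lambda x: 2 * x, lambda x: -x]  # two made-up experts


def sparse_moe_forward(hidden_states):
    out = []
    for x in hidden_states:
        weight, idx = gate(x)            # routing info is used here...
        out.append(weight * EXPERTS[idx](x))
    return out                           # ...but only hidden states are returned


print(sparse_moe_forward([1.0, -3.0]))  # [2.0, 3.0]
```

The same shape applies to all four MoE variants in this commit, which is why the one-line fix had to be propagated to the generated `qwen3_vl_moe` and `qwen3_omni_moe` modeling files as well.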
