fix: stop default collater from adding padding_mask #1652
zeel2104 wants to merge 2 commits into NVIDIA-NeMo:main
Signed-off-by: Zeel <desaizeel2128@gmail.com>
hemildesai left a comment
Hi, thanks a lot for your contribution.
We need the padding mask in the batch to correctly calculate aux_loss for MoE models. Right now, the padding mask has to be passed explicitly to handle some corner cases. As a result, I think an appropriate fix would be to add padding_mask to the batch if the model supports it, and skip it otherwise.
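One way to read the suggestion above is as a signature check done before collation. The following is only a minimal sketch under stated assumptions, not the code in this PR: `forward_accepts`, `collate`, `moe_forward`, and `dense_forward` are hypothetical names, batches are plain Python lists rather than tensors, and the mask convention (True marks a padded position) is assumed.

```python
import inspect


def forward_accepts(forward_fn, arg_name):
    """Return True if forward_fn can take arg_name as a keyword argument."""
    params = inspect.signature(forward_fn).parameters
    if arg_name in params:
        return True
    # A **kwargs catch-all also accepts the argument (design choice, see note).
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def collate(sequences, pad_id, add_padding_mask):
    """Right-pad token-id lists to equal length; optionally emit padding_mask."""
    max_len = max(len(seq) for seq in sequences)
    batch = {
        "input_ids": [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    }
    if add_padding_mask:
        # Assumed convention: True marks a padded position.
        batch["padding_mask"] = [
            [False] * len(seq) + [True] * (max_len - len(seq)) for seq in sequences
        ]
    return batch


# Hypothetical stand-ins: one forward that takes padding_mask, one that does not.
def moe_forward(input_ids, padding_mask=None):
    return input_ids


def dense_forward(input_ids):
    return input_ids


sequences = [[5, 6, 7], [8]]
moe_batch = collate(
    sequences, pad_id=0,
    add_padding_mask=forward_accepts(moe_forward, "padding_mask"),
)
dense_batch = collate(
    sequences, pad_id=0,
    add_padding_mask=forward_accepts(dense_forward, "padding_mask"),
)

# Both calls succeed because each batch only carries keys its model accepts.
moe_forward(**moe_batch)
dense_forward(**dense_batch)
```

Treating a `**kwargs` catch-all as support is a judgment call: such a forward will not raise on the extra key, but it may also silently ignore the mask, so a stricter check may be preferable where the aux-loss path depends on it.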
Signed-off-by: Zeel <desaizeel2128@gmail.com>
@hemildesai
This keeps the MoE aux-loss path intact while avoiding crashes for models that cannot consume the argument. I also added targeted coverage for:
@adil-a + @hemildesai can you provide guidance? Thank you.
What does this PR do ?
Stop `default_collater` from creating a `padding_mask` for generic padded batches, so models that do not accept that argument no longer need downstream filtering and do not crash.
Changelog
- `padding_mask` creation from `nemo_automodel.components.datasets.utils.default_collater`
Before your PR is "Ready for review"
Pre checks:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Additional Information