[PyTorch] Refactor grouped linear and grouped MLP tests by timmoon10 · Pull Request #3122 · NVIDIA/TransformerEngine

timmoon10 · 2026-06-12T03:09:15Z

Description

test_fusible_ops.py was becoming a dumping ground for random grouped MLP tests, including tests that didn't involve fusible ops at all. This PR reorganizes the tests so that test_fusible_ops.py holds basic tests for te.ops.GroupedLinear, while test_grouped_mlp.py holds the exhaustive tests for all the various grouped MLP fused ops. I've also tried trimming down excessive test parametrization to bring down the test time from ~20 min to ~5 min.

#3111 was an earlier attempt at this refactor.

Type of change

Documentation change (change only to the documentation, either a fix or a new content)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Infra/Build change
Code refactoring
Testing

Changes

Copy grouped linear and grouped MLP tests in test_fusible_ops.py into test_grouped_mlp.py
Remove redundant test cases
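
To give a rough sense of why trimming the parametrization helps, here is a minimal sketch that compares a full cross product of test parameters against a trimmed set where the fp16 and main-grad cases get their own small tests. The parameter axes and values below are hypothetical placeholders chosen for illustration, not the actual lists used in test_grouped_mlp.py.

```python
import itertools

# Hypothetical parameter axes for illustration only; the real lists in
# test_grouped_mlp.py may differ.
DTYPES = ["float32", "bfloat16", "float16"]
QUANTIZATIONS = [None, "fp8", "mxfp8"]
BIAS = [False, True]
GLU_INTERLEAVE_SIZES = [None, 32]
OVERWRITE_MAIN_GRAD = [False, True]


def full_grid():
    """Every combination of every axis (the overparametrized approach)."""
    return list(
        itertools.product(
            DTYPES, QUANTIZATIONS, BIAS, GLU_INTERLEAVE_SIZES, OVERWRITE_MAIN_GRAD
        )
    )


def trimmed_cases():
    """Main sweep plus two small dedicated tests for the rarer axes."""
    # Main test: skip float16 and main-grad handling, pin glu_interleave_size.
    main = [
        (dtype, quant, bias, 32, False)
        for dtype, quant, bias in itertools.product(
            ["float32", "bfloat16"], QUANTIZATIONS, BIAS
        )
    ]
    # Dedicated fp16 test: one dtype, no quantization.
    fp16 = [("float16", None, bias, 32, False) for bias in BIAS]
    # Dedicated main-grad test: one dtype, no bias.
    main_grad = [("bfloat16", quant, False, 32, True) for quant in QUANTIZATIONS]
    return main + fp16 + main_grad


print(len(full_grid()), "cases in the full grid")
print(len(trimmed_cases()), "cases after trimming")
```

Every trimmed case is still a member of the full grid, so each axis keeps some coverage while the case count drops sharply. Which combinations are safe to drop is a judgment call that depends on how independent the axes are in practice.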

Checklist:

I have read and followed the contributing guidelines
The functionality is complete
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Signed-off-by: Tim Moon <tmoon@nvidia.com>


greptile-apps · 2026-06-12T03:12:31Z

Greptile Summary

This PR refactors the grouped linear and grouped MLP tests by splitting them across two files: test_fusible_ops.py retains basic te.ops.GroupedLinear smoke tests while the new test_grouped_mlp.py holds exhaustive parametrized tests for the fused grouped MLP ops. test.sh is updated accordingly to run test_grouped_mlp.py under the required NVTE_GROUPED_LINEAR_SINGLE_PARAM=1 NVTE_CUTEDSL_FUSED_GROUPED_MLP=1 environment.

test_fusible_ops.py: Grouped MLP and exhaustive grouped-linear parametrization (single_grouped_weight/bias, delay_wgrad_compute) removed; basic test_grouped_linear retained with a simpler parameter set.
tests/pytorch/test_grouped_mlp.py (new, ~1807 lines): Houses TestGroupedLinearOp, TestGroupedMLPFusedOp, and a standalone test_grouped_gemm_quant_cute_matches_mxfp8_quantized test; mirrors the utility setup from test_fusible_ops.py.
qa/L0_pytorch_unittest/test.sh: Adds a new test_grouped_mlp.py invocation with the necessary env vars at the end of the suite; moves test_grouped_linear.py to run alongside the other grouped tests.
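
The per-file environment setup described above can be sketched in Python as follows. This is only an illustration of scoping env vars to a single test file: the helper name `build_pytest_invocation` and the pytest flags are assumptions, and the real test.sh is a shell script whose exact command line is not shown here.

```python
import os
import sys

# Env vars named in the summary above for test_grouped_mlp.py.
GROUPED_MLP_ENV = {
    "NVTE_GROUPED_LINEAR_SINGLE_PARAM": "1",
    "NVTE_CUTEDSL_FUSED_GROUPED_MLP": "1",
}


def build_pytest_invocation(test_file, extra_env=None, base_env=None):
    """Return (cmd, env) for running one test file with its own env vars.

    Hypothetical helper: the extra variables are applied to a copy of the
    base environment, so they never leak into other test files.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update(extra_env or {})
    cmd = [sys.executable, "-m", "pytest", "-v", test_file]
    return cmd, env


# base_env={} keeps the example deterministic; omit it to inherit os.environ.
mlp_cmd, mlp_env = build_pytest_invocation(
    "tests/pytorch/test_grouped_mlp.py", GROUPED_MLP_ENV, base_env={}
)
ops_cmd, ops_env = build_pytest_invocation(
    "tests/pytorch/test_fusible_ops.py", base_env={}
)
```

Each `(cmd, env)` pair can then be passed to `subprocess.run(cmd, env=env)`. Building a fresh environment per file mirrors the intent that test_fusible_ops.py no longer runs with the grouped MLP variables set.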

Confidence Score: 5/5

Safe to merge — this is a pure test reorganization with no changes to production code paths.

All changed files are tests or CI scripts. The new test_grouped_mlp.py faithfully reproduces the parametrized coverage that was removed from test_fusible_ops.py, and test.sh correctly applies the required environment variables only to the files that need them.

No files require special attention beyond the already-flagged quantization list initialization issue in test_grouped_mlp.py (covered in a previous review thread).

Important Files Changed

Filename	Overview
tests/pytorch/test_grouped_mlp.py	New file with exhaustive grouped linear and grouped MLP tests; contains a minor typo in a skip-reason string (line 1078).
tests/pytorch/test_fusible_ops.py	Grouped MLP and exhaustive grouped-linear parametrizations removed; remaining test_grouped_linear is a clean, trimmed version.
qa/L0_pytorch_unittest/test.sh	Correctly wires up the new test_grouped_mlp.py invocation with NVTE_GROUPED_LINEAR_SINGLE_PARAM=1 and NVTE_CUTEDSL_FUSED_GROUPED_MLP=1; test_fusible_ops.py no longer carries those env vars.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[test.sh] --> B[test_fusible_ops.py\nbasic GroupedLinear smoke tests]
    A --> C[test_grouped_mlp.py\nNVTE_GROUPED_LINEAR_SINGLE_PARAM=1\nNVTE_CUTEDSL_FUSED_GROUPED_MLP=1]
    A --> D[test_grouped_linear.py\nPYTORCH_JIT=0 NVTE_TORCH_COMPILE=0]
    C --> E[TestGroupedLinearOp\ntest_grouped_linear\ntest_grouped_linear_cuda_graph_safe]
    C --> F[TestGroupedMLPFusedOp\ntest_grouped_mlp\ntest_grouped_mlp_fp16\ntest_grouped_mlp_mcore_integrations\ntest_grouped_mlp_single_weight_numerics\ntest_grouped_mlp_overwrite_main_grad\ntest_grouped_mlp_cuda_graph_safe_mxfp8]
    C --> G[test_grouped_gemm_quant_cute_matches_mxfp8_quantized]
    B --> H[TestBasicOps.test_grouped_linear\nsimplified params - no single_grouped_weight/bias]

Reviews (2): Last reviewed commit: "Review suggestion from @greptile-apps"

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

timmoon10 · 2026-06-12T03:18:30Z

/te-ci pytorch

vthumbe1503

LGTM. I reviewed the changes commit by commit, since that was easier to read. Testing the mcore-based main grad changes and the fp16 case separately was a good idea to reduce parametrization. Fixing glu_interleave_size to 32 in the grouped MLP tests also makes sense, since we are already testing None in other tests.

timmoon10 and others added 4 commits June 12, 2026 00:55

Copy grouped MLP tests from TE ops tests

afdbedc

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Reduce TE ops test cases

cba8ed1

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Reduce overparametrized grouped MLP tests

7287e98

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Clean up grouped MLP test leftovers

7a61810

Remove unused imports and helpers left after splitting grouped MLP tests out of the fusible ops suite. Co-authored-by: Codex <codex@openai.com> Signed-off-by: Tim Moon <tmoon@nvidia.com>

timmoon10 requested review from ksivaman and vthumbe1503 June 12, 2026 03:09

timmoon10 added the testing, refactor, and MoE labels Jun 12, 2026

[pre-commit.ci] auto fixes from pre-commit.com hooks

b51a484

for more information, see https://pre-commit.ci

greptile-apps Bot reviewed Jun 12, 2026

Comment thread tests/pytorch/test_grouped_mlp.py

timmoon10 commented Jun 12, 2026

Comment thread tests/pytorch/test_grouped_mlp.py Outdated

Review suggestion from @greptile-apps

cf54120

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

vthumbe1503 approved these changes Jun 12, 2026

timmoon10 merged commit f95573f into NVIDIA:main Jun 12, 2026
22 of 25 checks passed

timmoon10 deleted the tmoon/refactor-grouped-mlp-tests2 branch June 12, 2026 20:23

timmoon10 merged 6 commits into NVIDIA:main from timmoon10:tmoon/refactor-grouped-mlp-tests2

timmoon10 commented Jun 12, 2026

Uh oh!

greptile-apps Bot commented Jun 12, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

timmoon10 commented Jun 12, 2026

Uh oh!

vthumbe1503 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

timmoon10 commented Jun 12, 2026

Description

Type of change

Changes

Checklist:

Uh oh!

greptile-apps Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Uh oh!

timmoon10 commented Jun 12, 2026

Uh oh!

vthumbe1503 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps Bot commented Jun 12, 2026 •

edited

Loading