Commit dd445ba
fix(te-plugin): handle TE 2.15+ tuple return from _Linear / _GroupedLinear (#1481)
### What does this PR do?
Type of change: Bug fix
TE 2.15+ changed `_Linear.forward` and `_GroupedLinear.forward` to
return `(out, new_workspace)` tuples instead of a single tensor.
ModelOpt's patched `te_quantized_linear_fn` /
`te_grouped_quantized_linear_fn` still piped the whole tuple into
`self.output_quantizer`, crashing inside `TensorQuantizer.forward` on
`tuple.numel()`:
```
  File ".../modelopt/torch/quantization/plugins/transformer_engine.py", line 184, in te_grouped_quantized_linear_fn
    return self.output_quantizer(output)
  File ".../tensor_quantizer.py", line 1037, in forward
    if inputs.numel() == 0:
AttributeError: 'tuple' object has no attribute 'numel'
```
Mirror the existing pattern from `_QuantTELayerNormLinear.forward`: when
the underlying TE call returns a tuple, quantize only `output[0]` (the
activation tensor) and pass auxiliary workspace metadata through
unchanged. TE <= 2.14 returns a single tensor and falls through the
`isinstance` branch identically to before this change.
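As a rough illustration of the pattern described above (not the actual ModelOpt code), the sketch below uses plain Python lists in place of tensors. `quantize_te_output` and `fake_quantizer` are hypothetical names introduced only for this example; the real change lives inside `te_quantized_linear_fn` / `te_grouped_quantized_linear_fn` and calls `self.output_quantizer`.

```python
# Hypothetical stand-ins for illustration; not the ModelOpt or TE implementation.


def quantize_te_output(output, output_quantizer):
    """Quantize only the activation when the wrapped call returns a tuple."""
    if isinstance(output, tuple):
        # TE 2.15+ shape: (out, new_workspace). Quantize out, pass the rest through.
        return (output_quantizer(output[0]), *output[1:])
    # TE <= 2.14 shape: a single tensor, handled as before.
    return output_quantizer(output)


def fake_quantizer(values):
    """Assumed stand-in for self.output_quantizer: rounds each element."""
    return [round(v) for v in values]


# Single-tensor return (TE <= 2.14 style).
single = quantize_te_output([0.4, 1.6], fake_quantizer)

# Tuple return (TE 2.15+ style); "workspace" stands in for the auxiliary value.
paired = quantize_te_output(([0.4, 1.6], "workspace"), fake_quantizer)
```

In this sketch the second tuple element is returned untouched, so a caller that unpacks `(out, new_workspace)` would see the same structure it received from the underlying call.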
Already landed on `release/0.44.0` as commit `c897fbeaaf`; this brings
`main` in sync. Follow-up to
#1473 (signature
introspection + `_forward` cache lookup), which fixed an earlier symptom
of the same TE 2.15 signature change but not this tuple-return path.
### Usage
No public API change. PTQ continues to work transparently across TE 2.x:
```python
import modelopt.torch.quantization as mtq
mtq.quantize(model, mtq.NVFP4_DEFAULT_CFG, forward_loop) # now works on TE 2.15.x
```
### Testing
Verified locally against **both TE 2.12** and **TE 2.15.0** using:
```bash
pytest tests/gpu_megatron/torch/quantization/plugins/test_transformer_engine.py
```
Without this fix, the same test fails immediately on TE 2.15 with
`AttributeError: 'tuple' object has no attribute 'numel'`. With this
fix, the test passes on both versions: TE 2.15 takes the
`isinstance(output, tuple)` branch, while TE <= 2.14 skips it and
behaves exactly as before.
### Before your PR is "*Ready for review*"
Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)
and your commits are signed (`git commit -s -S`).
Make sure you read and follow the [Security Best
Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors)
(e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(...,
weights_only=False)`, `pickle`, etc.).
- Is this change backward compatible?: ✅ <!--- Public API unchanged; TE
<= 2.14 path is identical (isinstance branch is false). -->
- If you copied code from any other sources or added a new PIP
dependency, did you follow guidance in `CONTRIBUTING.md`: N/A
- Did you write any new necessary tests?: N/A <!--- Existing
`test_transformer_engine.py` already exercises both paths; it would have
caught this on TE 2.15 had CI been running against that version. A
TE-version matrix is the right follow-up but is out of scope here. -->
- Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?:
N/A
### Additional Information
Triggered by Megatron-Bridge failing tests after their TE 2.15 bump. The
`release/0.44.0` cherry-pick was pushed directly (commit `c897fbeaaf`)
so Bridge could unblock; this PR carries the same fix forward to main.
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
1 parent d6e1973, commit dd445ba
1 file changed
Lines changed: 8 additions & 0 deletions
| Hunk | Original file lines | Diff lines | Change |
|---|---|---|---|
| 1 | 100-105 | 100-109 | 4 lines added at 103-106 |
| 2 | 195-200 | 199-208 | 4 lines added at 202-205 |