
Fix training gradient underflow in quantization tests #13539

Open
jiqing-feng wants to merge 4 commits into huggingface:main from jiqing-feng:torchao-fix-training-underflow

Conversation

@jiqing-feng
Contributor

What does this PR do?

This PR changes the autocast dtype from float16 to bfloat16 in _test_quantization_training. Float16's limited dynamic range (max ~65504, smallest subnormal ~5.96e-8) causes gradients to underflow to zero when passing through quantized tensor-subclass operations; bfloat16 shares float32's exponent range and avoids this.
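
To make the failure mode concrete, here is a minimal sketch (not from the PR) comparing the two dtypes; the printed values are approximately what PyTorch reports:

    import torch

    # float16 has a 5-bit exponent: magnitudes below ~5.96e-8 flush to zero.
    # bfloat16 shares float32's 8-bit exponent, so tiny gradients survive.
    tiny_grad = torch.tensor(1e-10)
    print(tiny_grad.to(torch.float16))   # tensor(0., dtype=torch.float16) -- underflow
    print(tiny_grad.to(torch.bfloat16))  # ~1e-10, still nonzero

    print(torch.finfo(torch.float16).smallest_normal)   # ~6.10e-05
    print(torch.finfo(torch.bfloat16).smallest_normal)  # ~1.18e-38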

@jiqing-feng
Contributor Author

Hi @sayakpaul. Would you please review the PR? Thanks!

inputs = self.get_dummy_inputs()

# Use bfloat16 instead of float16 to avoid gradient underflow with quantized layers
with torch.amp.autocast(torch_device, dtype=torch.bfloat16):
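
For context, a minimal sketch of the kind of training step this line guards (the model, inputs, and gradient check below are illustrative stand-ins, not the actual diffusers test harness):

    import torch

    torch_device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Linear(8, 8).to(torch_device)
    inputs = torch.randn(2, 8, device=torch_device)

    # Forward under bfloat16 autocast; backward runs outside the context,
    # as recommended for torch.amp.
    with torch.amp.autocast(torch_device, dtype=torch.bfloat16):
        loss = model(inputs).pow(2).mean()
    loss.backward()

    # Under float16 autocast, tiny gradients flowing through quantized
    # tensor-subclass ops can underflow to exactly zero and fail a check
    # like this one; bfloat16's wider exponent range avoids that.
    assert all(p.grad is not None and p.grad.abs().sum() > 0 for p in model.parameters())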
Member


Is this quantization backend agnostic?


Labels

size/S (PR with diff < 50 LOC), tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants