[WIP][Common,pytorch] Add FlexQuantizer for customized quantization via CuTeDSL by kainzhong · Pull Request #3110 · NVIDIA/TransformerEngine

kainzhong · 2026-06-10T01:12:39Z

Description

Please include a brief summary of the changes, relevant motivation and context.

Fixes # (issue)

Type of change

Documentation change (change only to the documentation, either a fix or a new content)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Infra/Build change
Code refactoring

Changes

Please list the changes introduced in this PR:

Change A
Change B

Checklist:

I have read and followed the contributing guidelines
The functionality is complete
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Signed-off-by: Kaining Zhong <kainingz@nvidia.com>

kainzhong · 2026-06-10T01:14:05Z

  NVTE_NVFP4_1D_SCALING = 4,
+  /*! Flex scaling. The quantization is implemented by users via CuTeDSL.
+   */
+  NVTE_FLEX_1D_SCALING = 5,


Probably a bad idea. I should take a look at how this is used and figure it out. If row-wise and column-wise quantization are different, what should this be then...?
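
To make the concern concrete, here is a minimal sketch, not code from this PR. `ScalingModeSketch`, `PerDirectionModes`, and `single_mode` are hypothetical names; only the two enum values shown in the diff are mirrored. It assumes a tensor could carry one mode per direction, and shows that a single enum value is well defined only when both directions agree.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class ScalingModeSketch(IntEnum):
    """Hypothetical stand-in; mirrors only the two values visible in the diff."""

    NVFP4_1D_SCALING = 4
    FLEX_1D_SCALING = 5  # the value proposed in this PR


@dataclass(frozen=True)
class PerDirectionModes:
    """Hypothetical: one scaling mode per quantization direction."""

    rowwise: ScalingModeSketch
    columnwise: ScalingModeSketch


def single_mode(modes: PerDirectionModes) -> Optional[ScalingModeSketch]:
    """Return the shared mode if both directions agree, otherwise None."""
    if modes.rowwise == modes.columnwise:
        return modes.rowwise
    # The two directions differ, so no single enum value describes the tensor.
    return None
```

With mixed directions, `single_mode` returns `None`, which is exactly the case a lone `NVTE_FLEX_1D_SCALING` value leaves unanswered.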

kainzhong · 2026-06-11T18:21:09Z

Closed in favor of #2817. It looks like I can build flex quantization on top of his HybridQuantizedTensor, which will save me a lot of trouble integrating with the PyTorch ecosystem.

kainzhong and others added 2 commits June 10, 2026 01:09

Add FlexQuantization

231b635

Signed-off-by: Kaining Zhong <kainingz@nvidia.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

ce94edd

for more information, see https://pre-commit.ci

kainzhong closed this Jun 11, 2026

kainzhong wants to merge 2 commits into NVIDIA:main from kainzhong:feat/flex_quantization
