Enable asymmetric quantization for all MultiHeadAttention qdq layers by wpietka · Pull Request #2468 · intel/neural-compressor

wpietka · 2026-05-12T13:10:41Z

Type of Change

Improvement

Description

Currently, QDynamicMultiHeadAttention creates four separate qdq layers. Two of them can be asymmetric, depending on the activation dtype, and the other two are always symmetric. This causes two problems. First, the symmetric/asymmetric policy differs from the static version, which allows asymmetric computation for all qdq layers. Second, the dynamic version does not need four separate qdq layers: because the scale is computed at runtime and is not stored in the layer itself, a single qdq layer can be reused for queries, keys, and values. The attention qdq layer stays separate because of its fixed range.
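For readers unfamiliar with dynamic qdq, here is a minimal NumPy sketch of the idea described above. It is not the code from this PR: `dynamic_qdq` is a hypothetical helper, and the uint8 range for the asymmetric case and the int8 range for the symmetric case are assumptions made for illustration. It shows why one stateless quantize-dequantize step can serve queries, keys, and values (the range is measured from each input at call time) and why an asymmetric range can help on skewed inputs.

```python
import numpy as np


def dynamic_qdq(x, asymmetric=True, num_bits=8):
    """Quantize then dequantize x using a range measured from x itself.

    Hypothetical helper for illustration only. Nothing is stored between
    calls, so the same function can be applied to queries, keys, and values.
    """
    x = np.asarray(x, dtype=np.float32)
    if asymmetric:
        # Assumed unsigned range [0, 2^bits - 1] with a zero point.
        qmin, qmax = 0, 2 ** num_bits - 1
        lo = min(float(x.min()), 0.0)  # keep 0.0 exactly representable
        hi = max(float(x.max()), 0.0)
        scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale
        zero_point = int(round(qmin - lo / scale))
    else:
        # Assumed signed range [-(2^(bits-1) - 1), 2^(bits-1) - 1], no zero point.
        qmax = 2 ** (num_bits - 1) - 1
        qmin = -qmax
        scale = float(np.abs(x).max()) / qmax or 1.0
        zero_point = 0
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return ((q - zero_point) * scale).astype(np.float32)


# The same stateless step is reused for tensors with very different ranges.
rng = np.random.default_rng(0)
queries = rng.uniform(0.0, 6.0, size=(8, 128)).astype(np.float32)
keys = rng.uniform(-0.5, 0.5, size=(8, 128)).astype(np.float32)
values = rng.uniform(-3.0, 1.0, size=(8, 128)).astype(np.float32)

for name, tensor in (("queries", queries), ("keys", keys), ("values", values)):
    err_asym = np.abs(dynamic_qdq(tensor, asymmetric=True) - tensor).mean()
    err_sym = np.abs(dynamic_qdq(tensor, asymmetric=False) - tensor).mean()
    print(f"{name}: asymmetric={err_asym:.5f} symmetric={err_sym:.5f}")
```

In this sketch the one-sided `queries` tensor wastes half of the symmetric range, so the asymmetric variant has roughly half the step size. For inputs centered on zero, such as `keys`, the two variants behave about the same.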

Expected Behavior & Potential Risk

Slightly increased accuracy of the dynamic layers.

How has this PR been tested?

A ViT benchmark was run with different configurations, and the results show better accuracy with asymmetric layers.

Dependency Change?

No dependency changes

Signed-off-by: Wojciech Piętka <wojciechx.pietka@intel.com>

bkowalskiINTEL

LGTM

wpietka force-pushed the dev/wpietkax/always-asymmetric-qdq-for-int8 branch from e07f851 to 5a2d00b on May 13, 2026 07:23

Enable asymmetric quantization for all MultiHeadAttention qdq layers

5a2d00b

Signed-off-by: Wojciech Piętka <wojciechx.pietka@intel.com>

bkowalskiINTEL approved these changes May 13, 2026

wpietka merged commit 03f79f3 into master on May 13, 2026 (14 checks passed)

wpietka deleted the dev/wpietkax/always-asymmetric-qdq-for-int8 branch on May 13, 2026 12:14