
feat: Add int8 sd conversion function for aiu #95

Merged

chichun-charlie-liu merged 13 commits into foundation-model-stack:main from andrea-fasoli:int8_sd_conversion
May 9, 2025

Conversation

@andrea-fasoli
Collaborator

Description of the change

This PR introduces a conversion function for the state dictionary of an INT8-quantized model created with fms-mo.
Calling save_sd_for_aiu generates and saves a new state dictionary that complies with AIU requirements.
This new state dictionary / checkpoint can be loaded with the fms get_model function in combination with the INT8 add-ons already present in fms-mo (see fms_mo/aiu_addons/).

The conversion function performs the following processing steps:

  • a new smoothquant scale is created by combining smoothquant weight and activation scales (and smoothquant alpha)
  • weights are first scaled, then converted to signed integer format (torch.int8)
  • zero_shift is computed and added to the state dictionary, if needed
  • keys/values not needed on the AIU are purged from the dictionary
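The steps above can be sketched in plain PyTorch. This is a minimal illustration only, not the fms-mo implementation: the function names, the symmetric per-tensor quantization scheme, the alpha default, and the purge substrings are all assumptions.

```python
import torch

def quantize_layer_for_aiu(weight, act_scale, weight_scale, alpha=0.5):
    """Illustrative per-layer conversion: combine smoothquant scales,
    apply them to the weight, then cast to torch.int8."""
    # combine smoothquant activation and weight scales via the smoothquant alpha
    sq_scale = act_scale.pow(alpha) / weight_scale.pow(1.0 - alpha)
    scaled = weight * sq_scale                  # weights are first scaled...
    q_step = scaled.abs().amax() / 127.0        # symmetric per-tensor step
    w_int8 = torch.clamp(
        torch.round(scaled / q_step), -127, 127
    ).to(torch.int8)                            # ...then converted to torch.int8
    return w_int8, q_step, sq_scale

def purge_for_aiu(sd, drop_substrings=("calib", "obsrv")):
    """Drop keys/values not needed on the AIU (substrings are made up)."""
    return {k: v for k, v in sd.items()
            if not any(s in k for s in drop_substrings)}
```

The zero_shift step is omitted here since it is only added "if needed" and its exact formula is not stated in this PR.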

Was the PR tested?

  • I have ensured all unit tests pass

@andrea-fasoli andrea-fasoli changed the title Add int8 sd conversion function for aiu feat: Add int8 sd conversion function for aiu Apr 14, 2025
@github-actions github-actions Bot added the feat label Apr 14, 2025
Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>
@andrea-fasoli
Collaborator Author

By adding SAWB recomputation in the presence of narrow INT8 weight distributions, this PR also addresses issue #109.
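SAWB specifics aside, the gist of handling a narrow integer weight distribution can be illustrated as follows. This is a hypothetical sketch, not the fms-mo code: the function name, the 25% threshold, and the rescaling strategy are made up. The idea is that when stored int8 weights occupy only a small slice of the [-127, 127] range, the scale can be recomputed so the dequantized values retain precision.

```python
import torch

def recompute_narrow_scale(w_int8, scale, min_used=0.25):
    """If the weights use less than min_used of the int8 range, rescale
    the integers to span the full range and shrink the step accordingly,
    so that w_int8 * scale is approximately preserved."""
    used = w_int8.abs().amax().item() / 127.0   # fraction of range in use
    if 0.0 < used < min_used:
        w_int8 = torch.clamp(
            torch.round(w_int8.float() / used), -127, 127
        ).to(torch.int8)
        scale = scale * used                    # smaller step, same product
    return w_int8, scale
```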

@andrea-fasoli
Collaborator Author

needs testing and unit testing...

@andrea-fasoli
Collaborator Author

The PR has been tested with multiple INT8 quantization configurations and is ready to be merged.

Unit tests are missing; they will be added at a later date (cc: @BrandonGroth).

Conversion of a checkpoint trained with smoothquant is not supported when combined with SAWB-based weight recomputation. That feature will be implemented as part of issue #112. It is not strictly needed for INT8 RoBERTa, where smoothquant-free quantization configurations perform well, but it may be needed for INT8 LLM enablement.

@chichun-charlie-liu chichun-charlie-liu linked an issue May 8, 2025 that may be closed by this pull request
@chichun-charlie-liu chichun-charlie-liu merged commit 09c7761 into foundation-model-stack:main May 9, 2025
11 checks passed
@andrea-fasoli andrea-fasoli deleted the int8_sd_conversion branch May 9, 2025 16:14

Successfully merging this pull request may close these issues.

INT8 AIU support
