
feat: Add int8 sd conversion function for aiu #95

Merged

chichun-charlie-liu merged 13 commits into foundation-model-stack:main from andrea-fasoli:int8_sd_conversion
May 9, 2025

Conversation

@andrea-fasoli
Collaborator

Description of the change

This PR introduces a conversion function for the state dictionary of an INT8-quantized model created with fms-mo.
Calling save_sd_for_aiu generates and saves a new state dictionary that complies with AIU requirements.
This new state dictionary / checkpoint can be loaded with the fms get_model function in combination with the INT8 add-ons already present in fms-mo (see fms_mo/aiu_addons/).

The conversion function performs the following processing steps:

  • a new smoothquant scale is created by combining smoothquant weight and activation scales (and smoothquant alpha)
  • weights are first scaled, then converted to signed integer format (torch.int8)
  • zero_shift is computed and added to the state dictionary, if needed
  • keys/values not needed on the AIU are purged from the dictionary
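The steps above can be sketched in plain PyTorch. This is a minimal illustration only, not the fms-mo implementation: the function names, the symmetric per-tensor quantization scheme, the alpha default, and the purge substrings are all assumptions.

```python
import torch

def quantize_layer_for_aiu(weight, act_scale, weight_scale, alpha=0.5):
    """Illustrative per-layer conversion: combine smoothquant scales,
    apply them to the weight, then cast to torch.int8."""
    # combine smoothquant activation and weight scales via the smoothquant alpha
    sq_scale = act_scale.pow(alpha) / weight_scale.pow(1.0 - alpha)
    scaled = weight * sq_scale                  # weights are first scaled...
    q_step = scaled.abs().amax() / 127.0        # symmetric per-tensor step
    w_int8 = torch.clamp(
        torch.round(scaled / q_step), -127, 127
    ).to(torch.int8)                            # ...then converted to torch.int8
    return w_int8, q_step, sq_scale

def purge_for_aiu(sd, drop_substrings=("calib", "obsrv")):
    """Drop keys/values not needed on the AIU (substrings are made up)."""
    return {k: v for k, v in sd.items()
            if not any(s in k for s in drop_substrings)}
```

The zero_shift step is omitted here since it is only added "if needed" and its exact formula is not stated in this PR.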

Was the PR tested?

  • I have ensured all unit tests pass

@andrea-fasoli andrea-fasoli changed the title Add int8 sd conversion function for aiu feat: Add int8 sd conversion function for aiu Apr 14, 2025
@github-actions github-actions Bot added the feat label Apr 14, 2025
Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>
@andrea-fasoli
Collaborator Author

By adding SAWB recomputation in the presence of narrow INT8 weight distributions, this PR also addresses issue #109.
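SAWB specifics aside, the gist of handling a narrow integer weight distribution can be illustrated as follows. This is a hypothetical sketch, not the fms-mo code: the function name, the 25% threshold, and the rescaling strategy are made up. The idea is that when stored int8 weights occupy only a small slice of the [-127, 127] range, the scale can be recomputed so the dequantized values retain precision.

```python
import torch

def recompute_narrow_scale(w_int8, scale, min_used=0.25):
    """If the weights use less than min_used of the int8 range, rescale
    the integers to span the full range and shrink the step accordingly,
    so that w_int8 * scale is approximately preserved."""
    used = w_int8.abs().amax().item() / 127.0   # fraction of range in use
    if 0.0 < used < min_used:
        w_int8 = torch.clamp(
            torch.round(w_int8.float() / used), -127, 127
        ).to(torch.int8)
        scale = scale * used                    # smaller step, same product
    return w_int8, scale
```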

@andrea-fasoli
Collaborator Author

needs testing and unit testing...

@andrea-fasoli
Collaborator Author

The PR has been tested with multiple INT8 quantization configurations and is ready to be merged.

Unit tests are missing; they will be added at a later date (cc: @BrandonGroth).

Conversion of a checkpoint trained with smoothquant is not supported when combined with SAWB-based weight recomputation. That feature will be implemented as part of issue #112. It is not strictly needed for INT8 RoBERTa, where smoothquant-free quantization configurations perform well, but it may be needed for INT8 LLM enablement.

@chichun-charlie-liu chichun-charlie-liu linked an issue May 8, 2025 that may be closed by this pull request
@chichun-charlie-liu chichun-charlie-liu merged commit 09c7761 into foundation-model-stack:main May 9, 2025
11 checks passed
@andrea-fasoli andrea-fasoli deleted the int8_sd_conversion branch May 9, 2025 16:14

Successfully merging this pull request may close these issues.

INT8 AIU support
