Add guardrails against narrow INT distributions #109

@andrea-fasoli

Is your feature request related to a problem? Please describe.

Depending on the model, narrow distributions of INT weights can lead to underflow and accuracy degradation when running inference with limited accumulation lengths. When a narrow weight distribution is detected, it could be adjusted automatically by tweaking the clip values (quantization boundaries).
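As a rough illustration of what "narrow" could mean here, the sketch below measures the spread of a layer's INT weights relative to the available integer range. The function names, the choice of standard deviation as the metric, and the `min_spread` threshold are all assumptions made for illustration; they are not taken from the repository.

```python
import statistics


def int_spread(int_weights, n_bits=8):
    """Standard deviation of the INT weights as a fraction of the symmetric range.

    Hypothetical metric: values near 0 mean the weights cluster around zero
    and use only a few of the available integer levels.
    """
    q_max = 2 ** (n_bits - 1) - 1  # e.g. 127 for INT8
    return statistics.pstdev(int_weights) / q_max


def is_narrow(int_weights, n_bits=8, min_spread=0.05):
    """Flag a distribution as narrow; min_spread is an assumed, tunable threshold."""
    return int_spread(int_weights, n_bits) < min_spread
```

A standard-deviation metric is used instead of the peak value so that a single outlier does not hide an otherwise tightly clustered distribution. The right metric and threshold would need to be validated against accuracy on real checkpoints.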

Describe the solution you'd like

Analysis of the INT weight distributions in a trained checkpoint is needed. If a distribution is deemed too narrow, recompute the clip values and INT weights using the SAWB quantizer, and save the modified checkpoint.
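A minimal sketch of the recompute step, assuming the commonly cited SAWB form `clip = c1 * sqrt(E[w^2]) - c2 * E[|w|]` and symmetric quantization. The function names are hypothetical, the coefficients `c1` and `c2` depend on bit width and must be supplied by the caller, and nothing here reflects the actual quantizer code in the repository. Checkpoint loading and saving are left out.

```python
import math


def sawb_clip(weights, c1, c2):
    """SAWB-style clip value from the first and second moments of the weights.

    c1 and c2 are bit-width dependent coefficients (assumed inputs here).
    """
    n = len(weights)
    rms = math.sqrt(sum(w * w for w in weights) / n)
    mean_abs = sum(abs(w) for w in weights) / n
    clip = c1 * rms - c2 * mean_abs
    if clip <= 0:
        raise ValueError("SAWB clip is not positive; check c1, c2 and the weights")
    return clip


def requantize_with_sawb(int_weights, old_clip, c1, c2, n_bits=8):
    """Dequantize with the old clip, recompute the clip, and requantize.

    Returns (new_int_weights, new_clip). Assumes symmetric quantization with
    scale = clip / q_max.
    """
    q_max = 2 ** (n_bits - 1) - 1
    old_scale = old_clip / q_max
    weights = [q * old_scale for q in int_weights]  # back to float values

    new_clip = sawb_clip(weights, c1, c2)
    new_scale = new_clip / q_max
    # Round to the nearest level and clamp to the symmetric INT range.
    new_int = [max(-q_max, min(q_max, round(w / new_scale))) for w in weights]
    return new_int, new_clip
```

Because this starts from the stored INT weights instead of the original floating-point weights, any rounding error already present in the checkpoint is carried over. If the unquantized weights are still available, recomputing from those would likely be preferable.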

Additional context

This feature was present in the older sq1e repository and can be carried over.
