
Add diffusion FLUX fp8_static quantization#37

Merged
yghstill merged 10 commits into
Tencent:mainfrom
yghstill:add_diffusion_quant
Aug 26, 2025


@yghstill yghstill commented Aug 10, 2025

Features

  • Support fp8 static quantization calibration for FLUX's Transformer blocks.
  • Export the quantized model in safetensors format.
  • Support inference with the quantized FLUX model; the quantization config is loaded from angelslim_config.json. For example:
from angelslim.engine import InferEngine
slim_engine = InferEngine()
slim_engine.from_pretrained(model_path="your/quant/angelslim/model/")
output = slim_engine.generate("A beautiful landscape with mountains and a river.")
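
For readers unfamiliar with the term, "static" quantization generally means the scale is fixed ahead of time from calibration data instead of being recomputed per input at inference. The sketch below is a minimal, pure-Python illustration of that idea and is not AngelSlim's implementation: the function names `calibrate_static_scale` and `fake_quantize` are hypothetical, a single per-tensor scale is assumed, and fp8 mantissa rounding is deliberately left out. The only hard fact used is that the largest finite value of the float8 e4m3fn format is 448.

```python
# Illustrative sketch only; names and structure are assumptions, not AngelSlim code.
FP8_E4M3_MAX = 448.0  # largest finite value of the float8 e4m3fn format


def calibrate_static_scale(calibration_batches):
    """Return one per-tensor scale from the absolute maximum seen during calibration."""
    absmax = 0.0
    for batch in calibration_batches:
        for value in batch:
            absmax = max(absmax, abs(value))
    if absmax == 0.0:
        return 1.0  # avoid a zero scale when all calibration data is zero
    return absmax / FP8_E4M3_MAX


def fake_quantize(values, scale):
    """Scale into the fp8 range, clamp, and scale back (mantissa rounding is not modeled)."""
    out = []
    for value in values:
        scaled = value / scale
        clamped = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, scaled))
        out.append(clamped * scale)
    return out


# Calibration: the scale is computed once and then reused for every later input.
batches = [[0.5, -2.0, 1.25], [3.5, -7.0, 0.0]]
scale = calibrate_static_scale(batches)
print(scale)                               # 0.015625 (7.0 / 448.0)
print(fake_quantize([1.0, -14.0], scale))  # [1.0, -7.0]
```

Because the scale is frozen after calibration, a value outside the calibrated range (here -14.0) saturates at the calibrated maximum, which is why representative calibration data matters for static schemes.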

TODO

  • For better reviewability, the combined compression strategy (e.g., cache + quant) will be submitted in the next PR.

@yghstill yghstill changed the title Add diffusion FLUX fp8_static quantization [WIP]Add diffusion FLUX fp8_static quantization Aug 10, 2025
@yghstill yghstill changed the title [WIP]Add diffusion FLUX fp8_static quantization Add diffusion FLUX fp8_static quantization Aug 19, 2025
@yghstill yghstill merged commit 9177027 into Tencent:main Aug 26, 2025
5 checks passed
@yghstill yghstill deleted the add_diffusion_quant branch August 26, 2025 08:43
WOODchen7 pushed a commit to WOODchen7/AngelSlim that referenced this pull request Aug 27, 2025
dawnranger pushed a commit to dawnranger/AngelSlim that referenced this pull request Mar 11, 2026