feat: Add Modular Pipeline for Stable Diffusion 3 (SD3) #13324

Merged
yiyixuxu merged 37 commits into huggingface:main from AlanPonnachan:feat/sd3-modular-pipeline on May 7, 2026

Conversation

@AlanPonnachan
Contributor

@AlanPonnachan AlanPonnachan commented Mar 24, 2026

What does this PR do?

This PR introduces the modular architecture for Stable Diffusion 3 (SD3), implementing both Text-to-Image (T2I) and Image-to-Image (I2I) pipelines.

Key additions:

  • Added SD3ModularPipeline and SD3AutoBlocks to the dynamic modular pipeline resolver.
  • Migrated SD3-specific mechanics to the new BlockState.
  • Added corresponding dummy objects and lazy-loading fallbacks.
  • Added TestSD3ModularPipelineFast and TestSD3Img2ImgModularPipelineFast test suites.

Related issue: #13295

Before submitting

Usage Example

import torch
from IPython.display import display
from diffusers import ComponentsManager
from diffusers.modular_pipelines.stable_diffusion_3 import StableDiffusion3ModularPipeline, StableDiffusion3AutoBlocks
from diffusers.utils import load_image

from diffusers import FlowMatchEulerDiscreteScheduler, SD3Transformer2DModel, AutoencoderKL
from diffusers.guiders import ClassifierFreeGuidance
from diffusers.image_processor import VaeImageProcessor
from transformers import CLIPTokenizer, CLIPTextModelWithProjection

components = ComponentsManager()
components.enable_auto_cpu_offload(device="cuda")

# Instantiate the Modular Pipeline 
blocks = StableDiffusion3AutoBlocks()
pipeline = StableDiffusion3ModularPipeline(blocks=blocks, components_manager=components)

repo_id = "stabilityai/stable-diffusion-3-medium-diffusers"
print("Loading components...")

# Load ONLY CLIP tokenizers
tokenizer = CLIPTokenizer.from_pretrained(repo_id, subfolder="tokenizer")
tokenizer_2 = CLIPTokenizer.from_pretrained(repo_id, subfolder="tokenizer_2")

# Load diffusers components
scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
guider = ClassifierFreeGuidance.from_config({"guidance_scale": 7.0})
image_processor = VaeImageProcessor(vae_scale_factor=8, vae_latent_channels=16)

# Load ONLY CLIP text encoders
text_encoder = CLIPTextModelWithProjection.from_pretrained(repo_id, subfolder="text_encoder", torch_dtype=torch.float16)
text_encoder_2 = CLIPTextModelWithProjection.from_pretrained(repo_id, subfolder="text_encoder_2", torch_dtype=torch.float16)

# Load Transformer and VAE
transformer = SD3Transformer2DModel.from_pretrained(repo_id, subfolder="transformer", torch_dtype=torch.float16)
vae = AutoencoderKL.from_pretrained(repo_id, subfolder="vae", torch_dtype=torch.float16)

# Inject components directly into the pipeline
pipeline.update_components(
    tokenizer=tokenizer,
    tokenizer_2=tokenizer_2,
    tokenizer_3=None,    # Dropped to prevent OOM
    scheduler=scheduler,
    guider=guider,
    image_processor=image_processor,
    text_encoder=text_encoder,
    text_encoder_2=text_encoder_2,
    text_encoder_3=None, # Dropped to prevent OOM
    transformer=transformer,
    vae=vae
)

print("Components loaded successfully! Memory saved.")


# TEXT-TO-IMAGE 

prompt = "A highly detailed macro photography of a glowing bioluminescent blue butterfly resting on a vibrant red rose, dark magical forest background, cinematic lighting, 8k resolution, masterpiece"

print("Running Text-to-Image...")
t2i_output = pipeline(
    prompt=prompt,
    num_inference_steps=28,
    guidance_scale=7.0,
    generator=torch.manual_seed(42)
)
t2i_output.images[0].save("sd3_modular_t2i.png")
print("Saved sd3_modular_t2i.png")
display(t2i_output.images[0])


# IMAGE-TO-IMAGE 

init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png").resize((1024, 1024))

prompt_i2i = "A beautiful classic impressionist oil painting of a cat looking at the camera, thick expressive brushstrokes, vibrant colors, museum masterpiece"

print("Running Image-to-Image...")
i2i_output = pipeline(
    prompt=prompt_i2i,
    image=init_image,
    strength=0.8,
    num_inference_steps=28,
    guidance_scale=7.0,
    generator=torch.manual_seed(42)
)
i2i_output.images[0].save("sd3_modular_i2i.png")
print("Saved sd3_modular_i2i.png")
display(i2i_output.images[0])

Colab notebook: https://colab.research.google.com/drive/18_tZWIQdObq8EX0Vyd9ysGA-oACDwpf8?usp=sharing
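
As a rough illustration of what strength=0.8 means for the Image-to-Image call above, the sketch below computes how many denoising steps would actually run. It assumes the common img2img convention in which strength truncates the schedule from the front; whether the modular SD3 blocks round in exactly this way is an assumption, and the helper name img2img_steps is hypothetical (it is not part of diffusers).

```python
def img2img_steps(num_inference_steps: int, strength: float) -> tuple[int, int]:
    """Return (t_start, steps_run) for a front-truncated img2img schedule.

    Hypothetical helper for illustration only; not a diffusers API.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    # Portion of the schedule that is kept, capped at the full schedule.
    init_timestep = min(num_inference_steps * strength, num_inference_steps)
    # Index of the first timestep that is actually denoised.
    t_start = int(max(num_inference_steps - init_timestep, 0))
    return t_start, num_inference_steps - t_start


t_start, steps_run = img2img_steps(28, 0.8)
print(f"skip {t_start} steps, run {steps_run} steps")
```

Under this assumption, a lower strength skips more of the schedule and keeps the output closer to the input image, while strength=1.0 runs the full schedule.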

Outputs

Text-to-Image:

sd3_modular_t2i

Image-to-Image:

sd3_modular_i2i

Who can review?

@sayakpaul @asomoza

@sayakpaul sayakpaul requested review from asomoza and yiyixuxu March 25, 2026 02:22
@sayakpaul
Member

sayakpaul commented Mar 25, 2026

@AlanPonnachan thanks for this PR! Could you also provide some test code and sample outputs?

Member

@sayakpaul sayakpaul left a comment

Thanks for getting started on this! I left some comments (mainly on the use of guidance).
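
For readers unfamiliar with the guidance step under discussion, here is a minimal sketch of the standard classifier-free guidance combination, written with plain Python lists so it runs without any dependencies. It illustrates only the general formula; the function name apply_cfg is hypothetical, and how the SD3 modular blocks wire this through the guider component is not shown here.

```python
def apply_cfg(pred_uncond: list[float], pred_cond: list[float], guidance_scale: float) -> list[float]:
    """Combine unconditional and conditional predictions elementwise.

    Standard classifier-free guidance: uncond + scale * (cond - uncond).
    Hypothetical helper for illustration only; not a diffusers API.
    """
    if len(pred_uncond) != len(pred_cond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for u, c in zip(pred_uncond, pred_cond)]


# With a scale of 1.0 the result equals the conditional prediction;
# larger scales push the result further away from the unconditional one.
guided = apply_cfg([0.0, 1.0], [1.0, 3.0], guidance_scale=7.0)
print(guided)
```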

Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/before_denoise.py Outdated
Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/before_denoise.py Outdated
Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/denoise.py Outdated
Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/denoise.py Outdated
Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/encoders.py
Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/encoders.py
Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/encoders.py Outdated
@sayakpaul
Member

@claude can you review this?

@claude

claude Bot commented Mar 28, 2026

Claude Code is working…

I'll analyze this and get back to you.

View job run

@sayakpaul
Member

@bot /style

@github-actions
Contributor

github-actions Bot commented Mar 28, 2026

Style bot fixed some files and pushed the changes.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@AlanPonnachan
Contributor Author

@sayakpaul
The tests in test_modular_pipeline_stable_diffusion_3.py are passing.

You can find sample outputs here: #13324 (comment)

Collaborator

@yiyixuxu yiyixuxu left a comment

Thanks for working on this!
I left one comment.

Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/denoise.py Outdated
@AlanPonnachan AlanPonnachan requested a review from yiyixuxu April 1, 2026 17:53
logger = logging.get_logger(__name__)


# auto_docstring
Collaborator

I added a doc page on this here: #13382
Basically, you need to run

python utils/modular_auto_docstring.py --fix_and_overwrite

and then look through the generated docstrings to check that all the parameters are properly defined.

Contributor Author

@yiyixuxu, I added descriptions to most of the InputParam and OutputParam entries and ran the above script.
I skimmed through the docstrings once and they looked right.
Let me know your thoughts!

@yiyixuxu
Collaborator

yiyixuxu commented Apr 1, 2026

@claude
can you do a review here?

@github-actions
Contributor

github-actions Bot commented Apr 1, 2026

Claude Code is working…

I'll analyze this and get back to you.

View job run

@yiyixuxu
Collaborator

@AlanPonnachan
Can you look into points 3/4/5 from the Claude review?

Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/encoders.py Outdated
Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/encoders.py Outdated
@github-actions github-actions Bot added size/L PR with diff > 200 LOC and removed size/L PR with diff > 200 LOC labels Apr 28, 2026
@AlanPonnachan AlanPonnachan requested a review from yiyixuxu April 28, 2026 16:43
@yiyixuxu
Collaborator

@bot /style

@github-actions
Contributor

github-actions Bot commented Apr 28, 2026

Style bot fixed some files and pushed the changes.

@github-actions github-actions Bot added size/L PR with diff > 200 LOC and removed size/L PR with diff > 200 LOC labels Apr 28, 2026
return components, state


# auto_docstring
Collaborator

Oh, the formatter complained here, I think because it expected to see a docstring here.

Can you run python utils/modular_auto_docstring.py --fix_and_overwrite?
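
To check locally which markers might trigger this kind of complaint, something like the standalone sketch below could help. It is not how utils/modular_auto_docstring.py works internally; it only assumes that an `# auto_docstring` comment is meant to sit directly above a class definition that ends up with a docstring. The function name markers_missing_docstring is hypothetical.

```python
import ast

MARKER = "# auto_docstring"


def markers_missing_docstring(source: str) -> list[int]:
    """Return line numbers of markers whose next class has no docstring.

    Hypothetical helper for illustration only; a marker with no class
    after it is also reported.
    """
    marker_lines = [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if line.strip() == MARKER
    ]
    # Map the line of each `class` statement to its AST node.
    classes = {
        node.lineno: node
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.ClassDef)
    }
    missing = []
    for marker_line in marker_lines:
        following = [lineno for lineno in classes if lineno > marker_line]
        if not following or ast.get_docstring(classes[min(following)]) is None:
            missing.append(marker_line)
    return missing
```

Calling it on the text of a blocks file returns the 1-based line numbers of the markers worth a second look; an empty list means every marker is followed by a documented class.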

Contributor Author

I re-ran this command locally, but it still produced no changes except in modular_blocks_hunyuan_video1_5.py.
Since that file is not part of this PR, I didn't push that change.

Collaborator

I removed the # auto_docstring here.
We still need make fix-copies, I think.

Contributor Author

I ran make fix-copies and pushed the changes

@github-actions github-actions Bot added size/L PR with diff > 200 LOC and removed size/L PR with diff > 200 LOC labels Apr 29, 2026
Comment thread src/diffusers/modular_pipelines/stable_diffusion_3/denoise.py Outdated
@github-actions github-actions Bot added size/L PR with diff > 200 LOC and removed size/L PR with diff > 200 LOC labels Apr 29, 2026
@yiyixuxu
Collaborator

@bot /style

@github-actions
Contributor

github-actions Bot commented Apr 29, 2026

Style fix runs successfully without any file modified.

@github-actions github-actions Bot removed the size/L PR with diff > 200 LOC label Apr 30, 2026
@github-actions github-actions Bot added size/L PR with diff > 200 LOC and removed size/L PR with diff > 200 LOC labels Apr 30, 2026
@yiyixuxu yiyixuxu merged commit 7b107d3 into huggingface:main May 7, 2026
17 of 19 checks passed

4 participants