[MAX] Add Qwen-Image pipeline #11

Draft
jglee-sqbits wants to merge 1 commit into main from add/qwen-image/pipeline

Collaborator

jglee-sqbits commented Mar 10, 2026

Summary

  • add the Qwen image transformer and diffusion pipeline
  • register the Qwen image architecture
  • update the offline generation example
  • add Qwen image integration tests
  • reduce avoidable reshape recompiles in the Qwen image pipeline glue code

Testing

  • ./bazelw run format
  • ./bazelw run lint
  • validated end-to-end Qwen-Image generation locally

Checklist

  • The PR is small and focused on one thing.
  • The code was formatted.
  • The code was tested.

jglee-sqbits changed the title from "[MAX] Add Qwen image pipeline" to "[MAX] Add Qwen-Image pipeline" on Mar 10, 2026
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces support for the Qwen-Image diffusion pipeline, a new text-to-image generation architecture. It integrates the Qwen image transformer and its associated components, updates the existing offline generation example to accommodate Qwen-Image specific parameters like true CFG scale and multiple input images, and includes comprehensive integration tests to validate the implementation against the reference.
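For readers unfamiliar with the "true CFG scale" mentioned above, the following is a minimal sketch of the standard classifier-free guidance blend. It is not taken from this PR: the function name `cfg_blend` and the plain-list representation are assumptions for illustration, and the pipeline's own `_cfg_blend` may apply additional steps (for example, rescaling) that are not shown here.

```python
def cfg_blend(cond, uncond, scale):
    """Standard classifier-free guidance blend (illustrative only).

    Starts from the unconditional prediction and moves toward the
    conditional one by a factor of `scale`.
    """
    if len(cond) != len(uncond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# Hypothetical per-element noise predictions.
conditional = [1.0, 2.0]
unconditional = [0.0, 1.0]
blended = cfg_blend(conditional, unconditional, scale=4.0)
```

With `scale=1.0` the blend returns the conditional prediction unchanged, and with `scale=0.0` it returns the unconditional one, which is why a scale above 1 is needed for guidance to have any effect.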

Highlights

  • Qwen-Image Pipeline Integration: Added the Qwen-Image transformer and diffusion pipeline, enabling text-to-image generation with this new architecture.
  • Architecture Registration: Registered the Qwen-Image architecture within the pipeline system, making it discoverable and usable.
  • Offline Generation Example Update: Updated the simple_offline_generation.py example to support Qwen-Image specific parameters, including true-cfg-scale and the ability to handle multiple input images.
  • Comprehensive Integration Tests: Included a suite of integration tests to ensure parity with the reference Qwen-Image implementation for the transformer block, scheduler, and text encoder.
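
As a rough illustration of the example-script change described in the highlights, the sketch below shows one way a repeatable `--input-image` flag and a `--true-cfg-scale` flag could be declared with `argparse`. Only the flag names come from the summary; the defaults, help strings, and the non-negativity check are assumptions and may not match what `simple_offline_generation.py` actually does.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Offline generation sketch")
    parser.add_argument("--guidance-scale", type=float, default=None)
    parser.add_argument("--true-cfg-scale", type=float, default=None)
    # action="append" lets the flag be repeated, collecting one path per use.
    parser.add_argument(
        "--input-image",
        action="append",
        default=None,
        help="Path to an input image; may be given more than once.",
    )
    args = parser.parse_args(argv)
    # Normalize "flag never given" to an empty list.
    args.input_image = args.input_image or []
    # Assumed validation: reject negative scales.
    for name in ("guidance_scale", "true_cfg_scale"):
        value = getattr(args, name)
        if value is not None and value < 0:
            parser.error(f"--{name.replace('_', '-')} must be non-negative")
    return args


args = parse_args(
    ["--input-image", "a.png", "--input-image", "b.png", "--true-cfg-scale", "4.0"]
)
```

Here `args.input_image` holds both paths in the order given, and omitting the flag entirely yields an empty list rather than `None`.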

Changelog
  • max/examples/diffusion/simple_offline_generation.py
    • Added constants for Qwen-Image architecture names and default guidance scales.
    • Modified parse_args to include true-cfg-scale and allow multiple input-image arguments.
    • Updated argument validation for guidance-scale and true-cfg-scale.
    • Extended max_length logic for Qwen-Image architectures.
    • Refactored image loading to handle multiple input images and Qwen-Image specific guidance logic.
    • Updated parameter logging to include true_cfg_scale.
    • Adjusted warmup image loading for multiple input images.
  • max/python/max/pipelines/architectures/__init__.py
    • Imported qwen_image_arch, qwen_image_edit_arch, and qwen_image_edit_plus_arch.
    • Registered the new Qwen-Image architectures.
  • max/python/max/pipelines/architectures/qwen_image/__init__.py
    • Added __init__.py to define the QwenImageTransformerModel and qwen_image_arch for the Qwen-Image architecture.
  • max/python/max/pipelines/architectures/qwen_image/arch.py
    • Defined QwenImageArchConfig for pipeline-level configuration.
    • Registered QwenImagePipeline as a SupportedArchitecture for pixel generation.
  • max/python/max/pipelines/architectures/qwen_image/layers/__init__.py
    • Added __init__.py to export embedding and attention layers for Qwen-Image.
  • max/python/max/pipelines/architectures/qwen_image/layers/embeddings.py
    • Implemented get_timestep_embedding for sinusoidal timestep embeddings.
    • Implemented apply_rotary_emb for applying rotary embeddings.
    • Implemented get_1d_rotary_pos_embed for 1D rotary position embeddings.
    • Defined Timesteps and TimestepEmbedding modules.
    • Implemented QwenImageTimestepProjEmbeddings for timestep-only projection.
    • Implemented QwenImagePosEmbed for 3D Rotary Position Embeddings.
  • max/python/max/pipelines/architectures/qwen_image/layers/normalizations.py
    • Implemented LayerNormNoAffine for LayerNorm without learned affine parameters.
    • Implemented AdaLayerNormContinuous for adaptive layer normalization.
  • max/python/max/pipelines/architectures/qwen_image/layers/qwen_image_attention.py
    • Implemented _QwenImageGELU and _QwenImageDropout for FeedForward.
    • Implemented QwenImageFeedForward matching diffusers key naming.
    • Implemented QwenImageAttention for dual-stream attention.
    • Implemented _SiLUPlaceholder and _make_block_modulation for per-block modulation.
    • Implemented QwenImageTransformerBlock for dual-stream transformer blocks with per-block modulation.
  • max/python/max/pipelines/architectures/qwen_image/model.py
    • Implemented QwenImageTransformerModel as a ComponentModel to load and execute the Qwen-Image transformer.
  • max/python/max/pipelines/architectures/qwen_image/model_config.py
    • Defined QwenImageConfigBase and QwenImageConfig for model configuration.
  • max/python/max/pipelines/architectures/qwen_image/pipeline_qwen_image.py
    • Defined QwenImageModelInputs and QwenImagePipelineOutput data classes.
    • Implemented QwenImagePipeline for text-to-image generation, wiring together VAE, text encoder, and transformer.
    • Added methods for _cfg_blend, prepare_prompt_embeddings, _prepare_text_ids, _prepare_image_ids, _get_reshape_fn, decode_latents, _postprocess_latents, _to_numpy, _image_to_flat_hwc, preprocess_latents, _patchify_and_pack, scheduler_step, prepare_scheduler, and execute.
  • max/python/max/pipelines/architectures/qwen_image/qwen_image.py
    • Implemented the core Qwen-Image Transformer 2D model.
  • max/python/max/pipelines/architectures/qwen_image/weight_adapters.py
    • Added a placeholder file indicating no specific weight remapping is required for Qwen-Image.
  • max/tests/integration/architectures/qwen_image/BUILD.bazel
    • Configured Bazel build rules for Qwen-Image integration tests.
  • max/tests/integration/architectures/qwen_image/conftest.py
    • Provided Pytest fixtures for Qwen-Image integration tests, including configurations and dummy data.
  • max/tests/integration/architectures/qwen_image/test_attention.py
    • Added an integration test to verify the Qwen-Image transformer block's output against a reference.
  • max/tests/integration/architectures/qwen_image/test_scheduler_parity.py
    • Added an integration test to confirm the Qwen-Image scheduler's sigma schedule matches the reference.
  • max/tests/integration/architectures/qwen_image/test_text_encoder_parity.py
    • Added an integration test to validate the Qwen2.5-VL text encoder's output against a reference.
  • max/tests/integration/architectures/qwen_image/testdata/BUILD.bazel
    • Configured Bazel build rules for Qwen-Image test data.
  • max/tests/integration/architectures/qwen_image/testdata/config.json
    • Added a configuration file for Qwen-Image test data.
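
The changelog mentions sinusoidal timestep embeddings in layers/embeddings.py. For orientation, here is a minimal pure-Python sketch of the general technique. It is not the PR's `get_timestep_embedding`: the function name, the `max_period` default, and the sin-then-cos ordering are assumptions, and the real implementation may differ in frequency scaling or ordering.

```python
import math


def sinusoidal_timestep_embedding(t, dim, max_period=10000.0):
    """Map a scalar timestep to a `dim`-length sinusoidal embedding.

    Illustrative only: uses geometrically spaced frequencies from 1 down
    toward 1 / max_period, with sine terms first and cosine terms second.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    angles = [t * f for f in freqs]
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]


emb = sinusoidal_timestep_embedding(0.0, 8)
```

At timestep 0 every sine term is 0 and every cosine term is 1, which gives a quick sanity check for the layout of the two halves.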
Activity
  • The pull request was created by jglee-sqbits with a detailed summary of changes and testing performed. No other reviewer activity is noted.

jglee-sqbits force-pushed the add/qwen-image/pipeline branch from 19a8fd3 to 8f286a7 on March 24, 2026 08:34
jglee-sqbits force-pushed the add/qwen-image/pipeline branch from 8f286a7 to 621af71 on March 24, 2026 08:40