Add Ulysses attention #376

Merged

copybara-service[bot] merged 1 commit into main from ulysses-attention-benchmark on Apr 17, 2026

Conversation

@csgoogle csgoogle commented Apr 13, 2026

Summary

This PR adds Ulysses attention support for WAN TPU inference in MaxDiffusion and documents how to enable it.

Design Doc: https://docs.google.com/document/d/1_hrPGaIwj84iF8vFJrcdKdmwfKJPvW6O2Sy5ftLVn60/edit?usp=sharing&resourcekey=0-p0zkvHa_NJDwHPqLwNxNCg

What Changed

  • added a TPU Ulysses attention path for WAN that performs sequence-to-head all_to_all before local splash attention and restores the original layout afterward
  • refactored the TPU flash/Ulysses block-size resolution logic so both paths use the same helper
  • added a fail-fast ValueError raised when the attention head count is not divisible by the context-parallel shard count
  • added tests
  • updated the README to document Ulysses support for WAN inference, including the required attention="ulysses" and ici_context_parallelism>1 override pattern
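The sequence-to-head all_to_all at the heart of the Ulysses path can be illustrated with a single-host NumPy sketch (the function name and shapes here are hypothetical; the actual implementation uses a JAX all_to_all across the context-parallel mesh axis). Before the exchange, each of the cp shards holds a local sequence chunk of all heads; after it, each shard holds the full sequence for heads/cp heads, so local splash attention sees an unsharded sequence:

```python
import numpy as np

def ulysses_all_to_all(shards, cp):
    """Simulate the seq-to-head all_to_all on one host.

    shards: list of cp arrays, each (batch, seq_local, heads, dim).
    Returns: list of cp arrays, each (batch, seq_local * cp, heads // cp, dim).
    """
    batch, seq_local, heads, dim = shards[0].shape
    if heads % cp != 0:
        # mirrors the fail-fast check added in this PR
        raise ValueError(f"heads={heads} not divisible by context shards cp={cp}")
    h_per = heads // cp
    out = []
    for j in range(cp):
        # shard j gathers its head block from every sequence shard,
        # concatenated along the sequence axis -> full sequence, fewer heads
        out.append(np.concatenate(
            [s[:, :, j * h_per:(j + 1) * h_per, :] for s in shards], axis=1))
    return out

# tiny demo: cp=4 shards, 8 heads, local sequence of 3
cp = 4
shards = [np.random.rand(1, 3, 8, 16) for _ in range(cp)]
out = ulysses_all_to_all(shards, cp)
assert out[0].shape == (1, 12, 2, 16)  # full sequence, heads / cp heads
```

After local attention runs on the (full sequence, heads/cp) layout, the inverse all_to_all restores the original sequence-sharded, all-heads layout.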

Performance

TPU v6e

Wan2.2 I2V

Setup:

  • model: Wan-AI/Wan2.2-I2V-A14B-Diffusers
  • hardware: 8x TPU v6e
  • parallelism: dp=2, cp=4, fsdp=1, tp=1
  • timing config: 40 inference steps, 81 frames, 720x1280
| Global Batch Size | Flash | Ulysses | Delta |
|---|---|---|---|
| 1 | 285.56s | 251.45s | -11.9% |
| 2 | 533.67s | 491.22s | -8.0% |
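The Delta column throughout is the relative change of Ulysses over Flash wall-clock time; for example, the batch-1 row above:

```python
# relative speedup of Ulysses vs. Flash for the batch-1 I2V run
flash, ulysses = 285.56, 251.45
delta = (ulysses - flash) / flash * 100
print(f"{delta:.1f}%")  # -11.9%
```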

Wan2.2 T2V

Setup:

  • model: Wan-AI/Wan2.2-T2V-A14B-Diffusers
  • hardware: 8x TPU v6e
  • parallelism: dp=2, cp=4, fsdp=1, tp=1
  • timing config: 40 inference steps, 81 frames, 720x1280
| Global Batch Size | Flash | Ulysses | Delta |
|---|---|---|---|
| 1 | 275.54s | 246.90s | -10.4% |
| 2 | 535.40s | 480.24s | -10.3% |

TPU v7x

Wan2.2 I2V

Setup:

  • model: Wan-AI/Wan2.2-I2V-A14B-Diffusers
  • hardware: TPU v7-8 (8 chips)
  • parallelism: ici_context_parallelism=4, ici_data_parallelism=2
  • timing config: 40 inference steps, 81 frames, 720x1280
  • flash block sizes: block_q=2048, block_kv=2048, block_kv_compute=1024
| Global Batch Size | Flash | Ulysses | Delta |
|---|---|---|---|
| 1 | 209s | 199s | -5% |
| 2 | 414s | 394s | -5% |
| 4 | 829s | 780s | -6% |
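The shared block-size resolution mentioned under "What Changed" can be sketched roughly as follows (a hypothetical helper; the real names and defaults live in attention_flax.py and may differ). The idea is that both the flash and Ulysses paths clamp the configured splash-attention block sizes to the local sequence length instead of duplicating the logic:

```python
def resolve_block_sizes(seq_len, block_q=2048, block_kv=2048, block_kv_compute=1024):
    """Clamp configured splash-attention block sizes to the local
    sequence length. Hypothetical sketch of the shared helper; the
    defaults here match the v7x run above, not necessarily the config."""
    block_q = min(block_q, seq_len)
    block_kv = min(block_kv, seq_len)
    block_kv_compute = min(block_kv_compute, block_kv)
    return block_q, block_kv, block_kv_compute

# a short local sequence forces all three sizes down
print(resolve_block_sizes(1536))  # (1536, 1536, 1024)
```

With Ulysses, the local sequence length seen by the kernel is the full (unsharded) sequence, so the same helper serves both paths.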


@csgoogle csgoogle changed the title from "working code" to "Add Ulysses attention" Apr 15, 2026
@csgoogle csgoogle marked this pull request as ready for review April 15, 2026 09:18
@csgoogle csgoogle requested a review from entrpn as a code owner April 15, 2026 09:18
Comment thread on src/maxdiffusion/models/attention_flax.py
entrpn previously approved these changes Apr 15, 2026
Perseus14 previously approved these changes Apr 16, 2026
@Perseus14 (Collaborator) commented:

@csgoogle Please squash your commits

@csgoogle csgoogle dismissed stale reviews from Perseus14 and entrpn via 292fd84 April 16, 2026 17:05
@csgoogle csgoogle force-pushed the ulysses-attention-benchmark branch from 656e150 to 1b3bbe2 Compare April 16, 2026 17:31
Perseus14 previously approved these changes Apr 16, 2026
@csgoogle csgoogle force-pushed the ulysses-attention-benchmark branch 2 times, most recently from 673b2a9 to 5f75432 Compare April 16, 2026 17:48
@csgoogle csgoogle force-pushed the ulysses-attention-benchmark branch from 5f75432 to a4e0ae7 Compare April 16, 2026 17:54
@csgoogle (Collaborator, Author) commented:

> @csgoogle Please squash your commits

done

copybara-service[bot] merged commit 702cadd into main on Apr 17, 2026
12 checks passed
3 participants