Commit bc3bca0 (1 parent: c60e65d)

Update README and ignore kernels in CI (#387)

2 files changed: 22 additions & 3 deletions

.github/workflows/UnitTests.yml (3 additions, 3 deletions)
```diff
@@ -57,11 +57,11 @@ jobs:
       - name: PyTest
         run: | #--deselect=src/maxdiffusion/tests/input_pipeline_interface_test.py
           export LIBTPU_INIT_ARGS='--xla_tpu_scoped_vmem_limit_kib=65536'
-          HF_HUB_CACHE=/mnt/disks/github-runner-disk/ HF_HOME=/mnt/disks/github-runner-disk/ TOKENIZERS_PARALLELISM=false python3 -m pytest --deselect=src/maxdiffusion/tests/ltx_transformer_step_test.py -x
-  # add_pull_ready:
+          HF_HUB_CACHE=/mnt/disks/github-runner-disk/ HF_HOME=/mnt/disks/github-runner-disk/ TOKENIZERS_PARALLELISM=false python3 -m pytest --ignore=src/maxdiffusion/kernels/ --deselect=src/maxdiffusion/tests/ltx_transformer_step_test.py -x
+  # add_pull_ready
   # if: github.ref != 'refs/heads/main'
   # permissions:
   #   checks: read
   #   pull-requests: write
   # needs: build
-  # uses: ./.github/workflows/AddLabel.yml
+  # uses: ./.github/workflows/AddLabel.yml
```
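For context on the CI change: `--ignore` prunes whole paths before collection, so the kernel test files are never even imported, while `--deselect` drops individual items after they have been collected. A hedged, illustrative model of that difference (this is not pytest's actual implementation, and the test IDs below are made up for the example):

```python
# Illustrative sketch of pytest's collection filtering semantics.
# --ignore: skip matching paths entirely (files never imported).
# --deselect: remove specific items after collection.
def filter_tests(collected, ignore_prefixes=(), deselected=()):
    kept = []
    for item in collected:
        if any(item.startswith(p) for p in ignore_prefixes):
            continue  # pruned at collection time, like --ignore
        if item in deselected:
            continue  # collected, then dropped, like --deselect
        kept.append(item)
    return kept

# Hypothetical test IDs, for illustration only.
collected = [
    "src/maxdiffusion/kernels/ring_attention_test.py::test_fwd",
    "src/maxdiffusion/tests/ltx_transformer_step_test.py::test_step",
    "src/maxdiffusion/tests/pipeline_test.py::test_generate",
]
kept = filter_tests(
    collected,
    ignore_prefixes=("src/maxdiffusion/kernels/",),
    deselected=("src/maxdiffusion/tests/ltx_transformer_step_test.py::test_step",),
)
print(kept)  # → ['src/maxdiffusion/tests/pipeline_test.py::test_generate']
```

The practical upshot of preferring `--ignore` here is that files under `src/maxdiffusion/kernels/` are not imported at all during CI collection, so import-time failures in those files cannot break the run.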

README.md (19 additions, 0 deletions)
````diff
@@ -17,6 +17,7 @@
 [![Unit Tests](https://github.com/AI-Hypercomputer/maxdiffusion/actions/workflows/UnitTests.yml/badge.svg)](https://github.com/AI-Hypercomputer/maxdiffusion/actions/workflows/UnitTests.yml)
 
 # What's new?
+- **`2026/04/16`**: Support for the Tokamax Ring Attention kernel has been added.
 - **`2026/03/31`**: Wan2.2 SenCache inference is now supported for T2V and I2V (up to 1.4x speedup)
 - **`2026/03/25`**: Wan2.1 and Wan2.2 Magcache inference is now supported
 - **`2026/03/25`**: LTX-2 Video Inference is now supported
@@ -623,6 +624,24 @@ To generate images, run the following command:
 ...
 ```
 
+### Ring Attention
+We added ring attention support for Wan models. Below are the stats for a single `720p` (81-frame) video generation (with CFG DP):
+| Accelerator | Model | Attention Type | Inference Steps | Sharding | e2e Generation Time (s) |
+| -- | -- | -- | -- | -- | -- |
+| v7x-8 | WAN 2.1 | Tokamax Flash | 50 | dp2-fsdp1-context4-tp1 | 264.2 |
+| v7x-8 | WAN 2.1 | Tokamax Ring | 50 | dp2-fsdp1-context4-tp1 | **252.4** |
+| v7x-8 | WAN 2.2 | Tokamax Flash | 40 | dp2-fsdp1-context4-tp1 | 212.7 |
+| v7x-8 | WAN 2.2 | Tokamax Ring | 40 | dp2-fsdp1-context4-tp1 | **201.7** |
+
+| Accelerator | Model | Attention Type | Inference Steps | Sharding | e2e Generation Time (s) |
+| -- | -- | -- | -- | -- | -- |
+| v7x-16 | WAN 2.1 | Tokamax Flash | 50 | dp2-fsdp1-context8-tp1 | 146.6 |
+| v7x-16 | WAN 2.1 | Tokamax Ring | 50 | dp2-fsdp1-context8-tp1 | **137.2** |
+| v7x-16 | WAN 2.2 | Tokamax Flash | 40 | dp2-fsdp1-context8-tp1 | **117.8** |
+| v7x-16 | WAN 2.2 | Tokamax Ring | 40 | dp2-fsdp1-context8-tp1 | 137.5 |
+
+(* There are known stability issues for ring attention on 16 TPUs; please use `tokamax_flash` attention instead.)
+
 ## Flux
 
 First make sure you have permissions to access the Flux repos in Huggingface.
````
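The "Tokamax Ring" rows in the added tables refer to ring attention, which shards the sequence across devices and passes key/value blocks around a ring, merging each partial result with an online (streaming) softmax. As a hedged illustration only, here is a single-host NumPy sketch of that merge step; it models the accumulation math, not the Tokamax TPU kernel:

```python
import numpy as np

def full_attention(q, k, v):
    """Reference: ordinary softmax attention over the full KV sequence."""
    s = (q @ k.T) / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    return (p / p.sum(axis=-1, keepdims=True)) @ v

def ring_attention_sim(q, k, v, num_chunks):
    """Process KV in chunks with an online softmax, as each ring step
    would see one neighbor's KV block. Numerically stable: keeps a
    running max (m), normalizer (l), and unnormalized output (o)."""
    d = q.shape[-1]
    m = np.full(q.shape[0], -np.inf)   # running row-max of scores
    l = np.zeros(q.shape[0])           # running softmax denominator
    o = np.zeros_like(q)               # running weighted sum of V
    for k_c, v_c in zip(np.array_split(k, num_chunks),
                        np.array_split(v, num_chunks)):
        s = (q @ k_c.T) / np.sqrt(d)
        m_new = np.maximum(m, s.max(axis=-1))
        scale = np.exp(m - m_new)          # rescale old accumulators
        p = np.exp(s - m_new[:, None])
        l = l * scale + p.sum(axis=-1)
        o = o * scale[:, None] + p @ v_c
        m = m_new
    return o / l[:, None]

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k = rng.standard_normal((12, 8))
v = rng.standard_normal((12, 8))
assert np.allclose(full_attention(q, k, v), ring_attention_sim(q, k, v, 3))
```

Because the chunked result is exactly equal to full attention, the kernel can overlap these per-chunk updates with device-to-device KV transfers, which is the intended source of the speedups shown in the tables above.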
