Commit 33f785b

dg845, yuanshenghai, SHYuanBest, yiyixuxu, and sayakpaul authored
Add Helios-14B Video Generation Pipelines (#13208)
* [1/N] add helios
* fix test
* make fix-copies
* change script path
* fix cus script
* update docs
* fix documented check
* update links for docs and examples
* change default config
* small refactor
* add test
* Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com>
* remove register_buffer for _scale_cache
* fix non-cuda devices error
* remove "handle the case when timestep is 2D"
* refactor HeliosMultiTermMemoryPatch and process_input_hidden_states
* Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* fix calculate_shift
* Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* rewritten `einops` in pure `torch`
* fix: pass patch_size to apply_schedule_shift instead of hardcoding
* remove the logics of 'vae_decode_type'
* move some validation into check_inputs()
* rename helios scheduler & merge all into one step()
* add some details to doc
* move dmd step() logics from pipeline to scheduler
* change to Python 3.9+ style type
* fix NoneType error
* refactor DMD scheduler's set_timestep
* change rope related vars name
* fix stage2 sample
* fix dmd sample
* Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com>
* Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com>
* remove redundant & refactor norm_out
* Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com>
* change "is_keep_x0" to "keep_first_frame"
* use a more intuitive name
* refactor dynamic_time_shifting
* remove use_dynamic_shifting args
* remove usage of UniPCMultistepScheduler
* separate stage2 sample to HeliosPyramidPipeline
* Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com>
* Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com>
* Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com>
* Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com>
* fix transformer
* use a more intuitive name
* update example script
* fix requirements
* remove redudant attention mask
* fix
* optimize pipelines
* make style .
* update TYPE_CHECKING
* change to use torch.split Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* derive memory patch sizes from patch_size multiples
* remove some hardcoding
* move some checks into check_inputs
* refactor sample_block_noise
* optimize encoding chunks logits for v2v
* use num_history_latent_frames = sum(history_sizes)
* Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* remove redudant optimized_scale
* Update src/diffusers/pipelines/helios/pipeline_helios_pyramid.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* use more descriptive name
* optimize history_latents
* remove not used "num_inference_steps"
* removed redudant "pyramid_num_stages"
* add "is_cfg_zero_star" and "is_distilled" to HeliosPyramidPipeline
* remove redudant
* change example scripts name
* change example scripts name
* correct docs
* update example
* update docs
* Update tests/models/transformers/test_models_transformer_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Update tests/models/transformers/test_models_transformer_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* separate HeliosDMDScheduler
* fix numerical stability issue:
* Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* remove redudant
* small refactor
* remove use_interpolate_prompt logits
* simplified model test
* fallbackt to BaseModelTesterConfig
* remove _maybe_expand_t2v_lora_for_i2v
* fix HeliosLoraLoaderMixin
* update docs
* use randn_tensor for test
* fix doc typo
* optimize code
* mark torch.compile xfail
* change paper name
* Make get_dummy_inputs deterministic using self.generator
* Set less strict threshold for test_save_load_float16 test for Helios pipeline
* make style and make quality
* Preparation for merging
* add torch.Generator
* Fix HeliosPipelineOutput doc path
* Fix Helios related (optimize docs & remove redudant) (#13210)
* fix docs
* remove redudant
* remove redudant
* fix group offload
* Removed fixes for group offload

---------

Co-authored-by: yuanshenghai <yuanshenghai@bytedance.com>
Co-authored-by: Shenghai Yuan <140951558+SHYuanBest@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: SHYuanBest <shyuan-cs@hotmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
1 parent 06ccde9 commit 33f785b

33 files changed: +5655 -2 lines changed

docs/source/en/_toctree.yml

Lines changed: 10 additions & 1 deletion
@@ -194,6 +194,8 @@
 title: Model accelerators and hardware
 - isExpanded: false
 sections:
+- local: using-diffusers/helios
+title: Helios
 - local: using-diffusers/consisid
 title: ConsisID
 - local: using-diffusers/sdxl
@@ -350,6 +352,8 @@
 title: FluxTransformer2DModel
 - local: api/models/glm_image_transformer2d
 title: GlmImageTransformer2DModel
+- local: api/models/helios_transformer3d
+title: HeliosTransformer3DModel
 - local: api/models/hidream_image_transformer
 title: HiDreamImageTransformer2DModel
 - local: api/models/hunyuan_transformer2d
@@ -625,7 +629,6 @@
 title: Image-to-image
 - local: api/pipelines/stable_diffusion/inpaint
 title: Inpainting
-
 - local: api/pipelines/stable_diffusion/latent_upscale
 title: Latent upscaler
 - local: api/pipelines/stable_diffusion/ldm3d_diffusion
@@ -674,6 +677,8 @@
 title: ConsisID
 - local: api/pipelines/framepack
 title: Framepack
+- local: api/pipelines/helios
+title: Helios
 - local: api/pipelines/hunyuan_video
 title: HunyuanVideo
 - local: api/pipelines/hunyuan_video15
@@ -745,6 +750,10 @@
 title: FlowMatchEulerDiscreteScheduler
 - local: api/schedulers/flow_match_heun_discrete
 title: FlowMatchHeunDiscreteScheduler
+- local: api/schedulers/helios_dmd
+title: HeliosDMDScheduler
+- local: api/schedulers/helios
+title: HeliosScheduler
 - local: api/schedulers/heun
 title: HeunDiscreteScheduler
 - local: api/schedulers/ipndm

docs/source/en/api/loaders/lora.md

Lines changed: 5 additions & 0 deletions
@@ -23,6 +23,7 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
 - [`AuraFlowLoraLoaderMixin`] provides similar functions for [AuraFlow](https://huggingface.co/fal/AuraFlow).
 - [`LTXVideoLoraLoaderMixin`] provides similar functions for [LTX-Video](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video).
 - [`SanaLoraLoaderMixin`] provides similar functions for [Sana](https://huggingface.co/docs/diffusers/main/en/api/pipelines/sana).
+- [`HeliosLoraLoaderMixin`] provides similar functions for [Helios](https://huggingface.co/docs/diffusers/main/en/api/pipelines/helios).
 - [`HunyuanVideoLoraLoaderMixin`] provides similar functions for [HunyuanVideo](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hunyuan_video).
 - [`Lumina2LoraLoaderMixin`] provides similar functions for [Lumina2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/lumina2).
 - [`WanLoraLoaderMixin`] provides similar functions for [Wan](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan).
@@ -86,6 +87,10 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
 
 [[autodoc]] loaders.lora_pipeline.SanaLoraLoaderMixin
 
+## HeliosLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.HeliosLoraLoaderMixin
+
 ## HunyuanVideoLoraLoaderMixin
 
 [[autodoc]] loaders.lora_pipeline.HunyuanVideoLoraLoaderMixin
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License. -->
+
+# HeliosTransformer3DModel
+
+A 14B real-time autoregressive Diffusion Transformer model for 3D video-like data, supporting T2V, I2V, and V2V, from [Helios](https://github.com/PKU-YuanGroup/Helios) was introduced in [Helios: Real Real-Time Long Video Generation Model](https://huggingface.co/papers/) by Peking University, ByteDance, and others.
+
+The model can be loaded with the following code snippet.
+
+```python
+import torch
+from diffusers import HeliosTransformer3DModel
+# Best Quality
+transformer = HeliosTransformer3DModel.from_pretrained("BestWishYsh/Helios-Base", subfolder="transformer", torch_dtype=torch.bfloat16)
+# Intermediate Weight
+transformer = HeliosTransformer3DModel.from_pretrained("BestWishYsh/Helios-Mid", subfolder="transformer", torch_dtype=torch.bfloat16)
+# Best Efficiency
+transformer = HeliosTransformer3DModel.from_pretrained("BestWishYsh/Helios-Distilled", subfolder="transformer", torch_dtype=torch.bfloat16)
+```
+
+## HeliosTransformer3DModel
+
+[[autodoc]] HeliosTransformer3DModel
+
+## Transformer2DModelOutput
+
+[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
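
Note on the loading snippet in the file above: it assigns `transformer` three times, so running it as written would load all three checkpoints in turn. A minimal sketch of loading only one of them follows. The `pick_checkpoint` and `load_transformer` helpers and the preference keys are hypothetical names introduced for illustration; only the repository ids, `subfolder="transformer"`, and `torch_dtype=torch.bfloat16` are taken from the snippet. The heavy imports are deferred so the lookup can be checked without downloading weights.

```python
# Hypothetical mapping from a preference to the checkpoints named in the doc snippet.
CHECKPOINTS = {
    "quality": "BestWishYsh/Helios-Base",            # "Best Quality"
    "intermediate": "BestWishYsh/Helios-Mid",        # "Intermediate Weight"
    "efficiency": "BestWishYsh/Helios-Distilled",    # "Best Efficiency"
}


def pick_checkpoint(preference: str) -> str:
    """Return the repository id for a preference, or raise ValueError."""
    try:
        return CHECKPOINTS[preference]
    except KeyError:
        raise ValueError(
            f"unknown preference {preference!r}; expected one of {sorted(CHECKPOINTS)}"
        ) from None


def load_transformer(preference: str = "quality"):
    """Load exactly one transformer, mirroring the from_pretrained call in the snippet."""
    # Imported lazily: this needs torch, a diffusers build that contains this
    # commit, and enough memory for a 14B-parameter model.
    import torch
    from diffusers import HeliosTransformer3DModel

    return HeliosTransformer3DModel.from_pretrained(
        pick_checkpoint(preference),
        subfolder="transformer",
        torch_dtype=torch.bfloat16,
    )
```

Calling `load_transformer("efficiency")` would then fetch only the distilled weights, assuming the repository ids in the snippet are available on the Hub.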
