huggingface
diff --git a/.ai/skills/model-integration/SKILL.md b/.ai/skills/model-integration/SKILL.md
Lines changed: 30 additions & 3 deletions
diff --git a/.ai/skills/parity-testing/SKILL.md b/.ai/skills/parity-testing/SKILL.md
Lines changed: 2 additions & 0 deletions
diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
Lines changed: 4 additions & 0 deletions
diff --git a/docs/source/en/api/models/motif_video_transformer_3d.md b/docs/source/en/api/models/motif_video_transformer_3d.md
Lines changed: 32 additions & 0 deletions
diff --git a/docs/source/en/api/pipelines/motif_video.md b/docs/source/en/api/pipelines/motif_video.md
Lines changed: 123 additions & 0 deletions
diff --git a/docs/source/en/api/pipelines/overview.md b/docs/source/en/api/pipelines/overview.md
Lines changed: 1 addition & 0 deletions
diff --git a/docs/source/en/conceptual/contribution.md b/docs/source/en/conceptual/contribution.md
Lines changed: 24 additions & 6 deletions
diff --git a/src/diffusers/__init__.py b/src/diffusers/__init__.py
Lines changed: 12 additions & 0 deletions
diff --git a/src/diffusers/hooks/_helpers.py b/src/diffusers/hooks/_helpers.py
Lines changed: 20 additions & 0 deletions
@@ -73,10 +73,37 @@ See [../../models.md](../../models.md) for the attention pattern, implementation
 
 **Don't combine structural changes with behavioral changes.** Restructuring code to fit diffusers APIs (ModelMixin, ConfigMixin, etc.) is unavoidable. But don't also "improve" the algorithm, refactor computation order, or rename internal variables for aesthetics. Keep numerical logic as close to the reference as possible, even if it looks unclean. For standard → modular, this is stricter: copy loop logic verbatim and only restructure into blocks. Clean up in a separate commit after parity is confirmed.
 
-### Test setup
+### Testing
 
-- Slow tests gated with `@slow` and `RUN_SLOW=1`
-- All model-level tests must use the `BaseModelTesterConfig`, `ModelTesterMixin`, `MemoryTesterMixin`, `AttentionTesterMixin`, `LoraTesterMixin`, and `TrainingTesterMixin` classes initially to write the tests. Any additional tests should be added after discussions with the maintainers. Use `tests/models/transformers/test_models_transformer_flux.py` as a reference.
+Two test layers must be added for any new pipeline: pipeline-level tests, and (if a new model is introduced) model-level tests. Integration/slow tests and LoRA tests are **not** added in the initial PR — they come later, after discussion with maintainers.
+
+**General rules (apply to both layers):**
+- Keep component sizes tiny so the suite runs fast — small `num_layers`, small hidden/attention dims, low resolution, few frames. Reference `tests/pipelines/wan/test_wan.py` (`get_dummy_components` and `get_dummy_inputs`) for the size scale to target.
+- No LoRA tests in the initial PR (no `LoraTesterMixin`, no `tests/lora/test_lora_layers_<model>.py`).
+- No integration / slow tests in the initial PR — don't add anything gated on `@slow` / `RUN_SLOW=1` yet.
+
+#### Pipeline-level tests
+
+- Location: `tests/pipelines/<model>/test_<model>.py` (one file per pipeline variant, e.g. T2V, I2V).
+- Subclass both `PipelineTesterMixin` (from `..test_pipelines_common`) and `unittest.TestCase`.
+- Set `pipeline_class`, `params`, `batch_params`, `image_params` from `..pipeline_params`, and any `required_optional_params` / capability flags (`test_xformers_attention`, `supports_dduf`, etc.) that apply.
+- Implement `get_dummy_components()` (build all sub-modules with tiny configs and a fixed `torch.manual_seed(0)` before each) and `get_dummy_inputs(device, seed=0)`.
+- Skip any inherited tests that don't apply with `@unittest.skip("Test not supported")` rather than deleting them.
+- Reference: `tests/pipelines/wan/test_wan.py`.
+
+#### Model-level tests
+
+Only required if the pipeline introduces a new model class (transformer, VAE, etc.). Don't write these by hand — generate them (example command below):
+
+```bash
+python utils/generate_model_tests.py src/diffusers/models/transformers/transformer_<model>.py
+```
+
+- Run with **no `--include` flags** initially. The generator auto-detects mixins/attributes and emits the always-on testers (`ModelTesterMixin`, `MemoryTesterMixin`, `TorchCompileTesterMixin`, plus `AttentionTesterMixin` / `ContextParallelTesterMixin` / `TrainingTesterMixin` as applicable). Optional testers (quantization, caching, single-file, IP adapter, etc.) are added later, after maintainer discussion.
+- The generator writes to `tests/models/transformers/test_models_transformer_<model>.py` (or the matching `unets/` / `autoencoders/` subdir).
+- Fill in the `TODO`s in the generated `<Model>TesterConfig`: `pretrained_model_name_or_path`, `get_init_dict()` (tiny config), `get_dummy_inputs()`, `input_shape`, `output_shape`. Keep init dims small for speed.
+- Do **not** add `LoraTesterMixin` at the start, even if the model subclasses `PeftAdapterMixin` — strip it from the generated file for the initial PR.
+- Reference: `tests/models/transformers/test_models_transformer_flux.py`.
 
 ---
 
 
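As an illustration of the skip rule in the pipeline-level tests section, the sketch below overrides an inherited test with `@unittest.skip("Test not supported")` instead of deleting it. It uses a hypothetical stand-in mixin (`StubPipelineTesterMixin`, not the real `PipelineTesterMixin`) and a made-up test class name so that it runs without diffusers installed; only the skip pattern itself is taken from the guideline.

```python
import io
import unittest


class StubPipelineTesterMixin:
    """Hypothetical stand-in for PipelineTesterMixin; the real mixin defines many more tests."""

    def test_inference_runs(self):
        # Placeholder for an inherited test that applies to every pipeline.
        self.assertTrue(True)

    def test_xformers_attention(self):
        # Placeholder for an inherited test that only some pipelines support.
        self.assertTrue(True)


class ExampleVideoPipelineFastTests(StubPipelineTesterMixin, unittest.TestCase):
    """Illustrative test class; the name is an assumption, not part of the PR."""

    @unittest.skip("Test not supported")
    def test_xformers_attention(self):
        # Overriding with a skip keeps the test visible as "skipped" in reports.
        pass


def run_suite():
    """Run the example class quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleVideoPipelineFastTests)
    return unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)


result = run_suite()
for test, reason in result.skipped:
    print(f"skipped {test.id().split('.')[-1]}: {reason}")
```

Because the override is recorded in `result.skipped` with its reason, a reviewer can see which inherited checks were consciously opted out of, which is harder to tell when a test is simply removed.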
@@ -7,6 +7,8 @@ description: >
   visual artifacts — as these are usually parity bugs.
 ---
 
+> **Note**: Parity testing is **separate from** the unit-level tests that ship in `tests/`. If you are integrating a new model, the model-level test suite under `tests/models/` is still required — follow the **"#### Model-level tests"** section in [`../model-integration/SKILL.md`](../model-integration/SKILL.md) (generate via `utils/generate_model_tests.py`, no `--include` flags initially, no `LoraTesterMixin`). Parity tests verify numerical correctness during development; the generated test suite is what CI runs.
+
 ## Setup — gather before starting
 
 Before writing any test code, gather:
 
@@ -388,6 +388,8 @@
         title: LuminaNextDiT2DModel
       - local: api/models/mochi_transformer3d
         title: MochiTransformer3DModel
+      - local: api/models/motif_video_transformer_3d
+        title: MotifVideoTransformer3DModel
       - local: api/models/omnigen_transformer
         title: OmniGenTransformer2DModel
       - local: api/models/ovisimage_transformer2d
@@ -684,6 +686,8 @@
         title: LTXVideo
       - local: api/pipelines/mochi
         title: Mochi
+      - local: api/pipelines/motif_video
+        title: Motif-Video
       - local: api/pipelines/skyreels_v2
         title: SkyReels-V2
       - local: api/pipelines/stable_diffusion/svd
 
@@ -0,0 +1,32 @@
+<!-- Copyright 2026 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License. -->
+
+# MotifVideoTransformer3DModel
+
+A Diffusion Transformer model for 3D video-like data was introduced in Motif-Video by the Motif Technologies Team.
+
+The model uses a three-stage architecture with 12 dual-stream + 16 single-stream + 8 DDT decoder layers and rotary positional embeddings (RoPE) for video generation.
+
+The model can be loaded with the following code snippet.
+
+```python
+import torch
+from diffusers import MotifVideoTransformer3DModel
+
+transformer = MotifVideoTransformer3DModel.from_pretrained("Motif-Technologies/Motif-Video-2B", subfolder="transformer", torch_dtype=torch.bfloat16)
+```
+
+## MotifVideoTransformer3DModel
+
+[[autodoc]] MotifVideoTransformer3DModel
+
+## Transformer2DModelOutput
+
+[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
@@ -0,0 +1,123 @@
+<!-- Copyright 2026 The HuggingFace Team. All rights reserved. -->
+
+# Motif-Video
+
+[Technical Report](https://arxiv.org/abs/2604.16503)
+
+Motif-Video is a 2B-parameter diffusion transformer designed for text-to-video and image-to-video generation. It features a three-stage architecture with 12 dual-stream + 16 single-stream + 8 DDT decoder layers, Shared Cross-Attention for stable text-video alignment under long video sequences, a T5Gemma2 text encoder, and rectified flow matching for velocity prediction.
+
+<p align="center">
+  <img src="https://huggingface.co/Motif-Technologies/Motif-Video-2B/resolve/main/assets/architecture.png" width="90%" alt="Motif-Video architecture"/>
+</p>
+
+## Text-to-Video Generation
+
+Use `MotifVideoPipeline` for text-to-video generation:
+
+```python
+import torch
+from diffusers import MotifVideoPipeline
+from diffusers.utils import export_to_video
+
+
+pipe = MotifVideoPipeline.from_pretrained(
+    "Motif-Technologies/Motif-Video-2B",
+    torch_dtype=torch.bfloat16,
+)
+pipe.to("cuda")
+
+prompt = "A woman with long brown hair and light skin smiles at another woman with long blonde hair."
+negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"
+
+video = pipe(
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    width=1280,
+    height=736,
+    num_frames=121,
+    num_inference_steps=50,
+).frames[0]
+export_to_video(video, "output.mp4", fps=24)
+```
+
+## Image-to-Video Generation
+
+Use `MotifVideoImage2VideoPipeline` for image-to-video generation:
+
+```python
+import torch
+from diffusers import MotifVideoImage2VideoPipeline
+from diffusers.utils import export_to_video, load_image
+
+
+pipe = MotifVideoImage2VideoPipeline.from_pretrained(
+    "Motif-Technologies/Motif-Video-2B",
+    torch_dtype=torch.bfloat16,
+)
+pipe.to("cuda")
+
+image = load_image("input_image.png")
+prompt = "A cinematic scene with vivid colors."
+negative_prompt = "worst quality, blurry, jittery, distorted"
+
+video = pipe(
+    image=image,
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    width=1280,
+    height=736,
+    num_frames=121,
+    num_inference_steps=50,
+).frames[0]
+export_to_video(video, "i2v_output.mp4", fps=24)
+```
+
+### Memory-efficient Inference
+
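+For GPUs with less than 30 GB of VRAM (e.g., RTX 4090), use model CPU offloading. Setting `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` before launching can also reduce allocator fragmentation: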
+
+```bash
+export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
+```
+
+```python
+import torch
+from diffusers import MotifVideoPipeline
+from diffusers.utils import export_to_video
+
+
+pipe = MotifVideoPipeline.from_pretrained(
+    "Motif-Technologies/Motif-Video-2B",
+    torch_dtype=torch.bfloat16,
+)
+pipe.enable_model_cpu_offload()
+
+prompt = "A woman with long brown hair and light skin smiles at another woman with long blonde hair."
+negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"
+
+video = pipe(
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    width=1280,
+    height=736,
+    num_frames=121,
+    num_inference_steps=50,
+).frames[0]
+export_to_video(video, "output.mp4", fps=24)
+```
+
+## MotifVideoPipeline
+
+[[autodoc]] MotifVideoPipeline
+  - all
+  - __call__
+
+## MotifVideoImage2VideoPipeline
+
+[[autodoc]] MotifVideoImage2VideoPipeline
+  - all
+  - __call__
+
+## MotifVideoPipelineOutput
+
+[[autodoc]] pipelines.motif_video.pipeline_output.MotifVideoPipelineOutput
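As a rough sanity check on the example settings in this page, the arithmetic below works out the clip length implied by `num_frames=121` at `fps=24`, the pixel count per frame at 1280x736, and the approximate size of the transformer weights in bfloat16. It assumes the rounded "2B" figure means exactly 2e9 parameters, so the memory number is only an estimate for the transformer weights; it does not cover the text encoder, the VAE, or activations.

```python
# Values taken from the example pipeline calls in this page.
NUM_FRAMES = 121
FPS = 24
WIDTH, HEIGHT = 1280, 736

# Assumption: "2B" is treated as exactly 2e9 parameters for this estimate.
APPROX_PARAMS = 2_000_000_000
BYTES_PER_BF16 = 2  # bfloat16 stores each value in 16 bits

duration_s = NUM_FRAMES / FPS
pixels_per_frame = WIDTH * HEIGHT
weights_gib = APPROX_PARAMS * BYTES_PER_BF16 / 2**30

print(f"clip duration: {duration_s:.2f} s")  # 5.04 s
print(f"pixels per frame: {pixels_per_frame:,}")  # 942,080
print(f"transformer weights (bf16, approx.): {weights_gib:.2f} GiB")  # 3.73 GiB
```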
@@ -57,6 +57,7 @@ The table below lists all the pipelines currently available in 🤗 Diffusers an
 | [LLaDA2](llada2) | text2text |
 | [Lumina-T2X](lumina) | text2image |
 | [Marigold](marigold) | depth-estimation, normals-estimation, intrinsic-decomposition |
+| [Motif-Video](motif_video) | text2video, image2video |
 | [PAG](pag) | text2image |
 | [PixArt-α](pixart) | text2image |
 | [PixArt-Σ](pixart_sigma) | text2image |
 
@@ -570,11 +570,29 @@ For documentation strings, 🧨 Diffusers follows the [Google style](https://goo
 
 ## Coding with AI agents
 
-The repository keeps AI-agent configuration in `.ai/` and exposes local agent files via symlinks.
-
-- **Source of truth** — edit files under `.ai/` (`AGENTS.md` for coding guidelines, `skills/` for on-demand task knowledge)
-- **Don't edit** generated root-level `AGENTS.md`, `CLAUDE.md`, or `.agents/skills`/`.claude/skills` — they are symlinks
-- Setup commands:
+The repository keeps AI-agent configuration in [`.ai/`](https://github.com/huggingface/diffusers/tree/main/.ai) and exposes local agent files via symlinks. If you use a coding agent (Claude Code, OpenAI Codex, etc.) to help with a contribution, point it at this directory — it contains the project conventions and on-demand task knowledge maintainers expect contributors to follow.
+
+- **Read-only for contributors** — `.ai/` is maintained by the core maintainers. Please do not edit files under `.ai/` (or the generated root-level `AGENTS.md`, `CLAUDE.md`, `.agents/skills`, `.claude/skills`, which are symlinks) in your PR. If you find something missing or wrong, open an issue or flag it on the PR and a maintainer will update it.
+- **Guidelines** (loaded into every agent session):
+  - [`.ai/AGENTS.md`](https://github.com/huggingface/diffusers/blob/main/.ai/AGENTS.md) — top-level coding guidelines
+  - [`.ai/models.md`](https://github.com/huggingface/diffusers/blob/main/.ai/models.md) — attention pattern, model implementation rules, common conventions
+  - [`.ai/pipelines.md`](https://github.com/huggingface/diffusers/blob/main/.ai/pipelines.md) — pipeline conventions
+  - [`.ai/modular.md`](https://github.com/huggingface/diffusers/blob/main/.ai/modular.md) — modular pipeline conventions and conversion checklist
+  - [`.ai/review-rules.md`](https://github.com/huggingface/diffusers/blob/main/.ai/review-rules.md) — what reviewers look for
+- **Skills** (under [`.ai/skills/`](https://github.com/huggingface/diffusers/tree/main/.ai/skills), loaded on demand for specific tasks):
+  - `model-integration` — adding a new model or pipeline to diffusers end-to-end (file structure, integration checklist, testing layout, weight conversion)
+  - `parity-testing` — verifying numerical parity between the diffusers implementation and a reference implementation
+- **Setup commands**:
   - `make codex` — symlink guidelines + skills for OpenAI Codex
   - `make claude` — symlink guidelines + skills for Claude Code
-  - `make clean-ai` — remove all generated symlinks
+  - `make clean-ai` — remove all generated symlinks
+
+### AI-assisted and agentic contributions
+
+AI-assisted contributions are welcome, but they must be coordinated, scoped, and verified to keep review load manageable. PRs that do not follow these guidelines may be closed without detailed review.
+
+- **Coordinate before opening a PR.** Find or open an issue, review similar PRs (open and recently closed), and wait for an explicit acknowledgment from a maintainer on that issue before opening a PR. This gives us a chance to discuss scope, avoid duplicate work, and confirm the approach.
+- **Fix patterns, not one-offs.** If you spot a recurring issue, search the codebase for similar instances and open a *single* issue with a clear, systematic scope (e.g. "fix mutable defaults across all schedulers") rather than many issues or PRs for individual instances.
+- **Include in the PR description:**
+  - A **coordination link** to the issue or discussion where a maintainer acknowledged the work.
+  - The **test commands you ran** and their results (paste relevant output, not just "tests pass").
@@ -266,6 +266,7 @@
             "LuminaNextDiT2DModel",
             "MochiTransformer3DModel",
             "ModelMixin",
+            "MotifVideoTransformer3DModel",
             "MotionAdapter",
             "MultiAdapter",
             "MultiControlNetModel",
@@ -621,7 +622,9 @@
             "LongCatImageEditPipeline",
             "LongCatImagePipeline",
             "LTX2ConditionPipeline",
+            "LTX2HDRPipeline",
             "LTX2ImageToVideoPipeline",
+            "LTX2InContextPipeline",
             "LTX2LatentUpsamplePipeline",
             "LTX2Pipeline",
             "LTXConditionPipeline",
@@ -638,6 +641,9 @@
             "MarigoldIntrinsicsPipeline",
             "MarigoldNormalsPipeline",
             "MochiPipeline",
+            "MotifVideoImage2VideoPipeline",
+            "MotifVideoPipeline",
+            "MotifVideoPipelineOutput",
             "MusicLDMPipeline",
             "NucleusMoEImagePipeline",
             "OmniGenPipeline",
@@ -1088,6 +1094,7 @@
             LuminaNextDiT2DModel,
             MochiTransformer3DModel,
             ModelMixin,
+            MotifVideoTransformer3DModel,
             MotionAdapter,
             MultiAdapter,
             MultiControlNetModel,
@@ -1418,7 +1425,9 @@
             LongCatImageEditPipeline,
             LongCatImagePipeline,
             LTX2ConditionPipeline,
+            LTX2HDRPipeline,
             LTX2ImageToVideoPipeline,
+            LTX2InContextPipeline,
             LTX2LatentUpsamplePipeline,
             LTX2Pipeline,
             LTXConditionPipeline,
@@ -1435,6 +1444,9 @@
             MarigoldIntrinsicsPipeline,
             MarigoldNormalsPipeline,
             MochiPipeline,
+            MotifVideoImage2VideoPipeline,
+            MotifVideoPipeline,
+            MotifVideoPipelineOutput,
             MusicLDMPipeline,
             NucleusMoEImagePipeline,
             OmniGenPipeline,
 
@@ -188,6 +188,10 @@ def _register_transformer_blocks_metadata():
     from ..models.transformers.transformer_kandinsky import Kandinsky5TransformerDecoderBlock
     from ..models.transformers.transformer_ltx import LTXVideoTransformerBlock
     from ..models.transformers.transformer_mochi import MochiTransformerBlock
+    from ..models.transformers.transformer_motif_video import (
+        MotifVideoSingleTransformerBlock,
+        MotifVideoTransformerBlock,
+    )
     from ..models.transformers.transformer_qwenimage import QwenImageTransformerBlock
     from ..models.transformers.transformer_wan import WanTransformerBlock
     from ..models.transformers.transformer_z_image import ZImageTransformerBlock
@@ -290,6 +294,22 @@ def _register_transformer_blocks_metadata():
         ),
     )
 
+    # MotifVideo
+    TransformerBlockRegistry.register(
+        model_class=MotifVideoTransformerBlock,
+        metadata=TransformerBlockMetadata(
+            return_hidden_states_index=0,
+            return_encoder_hidden_states_index=1,
+        ),
+    )
+    TransformerBlockRegistry.register(
+        model_class=MotifVideoSingleTransformerBlock,
+        metadata=TransformerBlockMetadata(
+            return_hidden_states_index=0,
+            return_encoder_hidden_states_index=1,
+        ),
+    )
+
     # Wan
     TransformerBlockRegistry.register(
         model_class=WanTransformerBlock,