# Diffusers — Agent Guide

## Coding style

Strive to write code that is as simple and explicit as possible.

- Minimize small helper/utility functions — inline the logic instead. A reader should be able to follow the full flow without jumping between functions.
- No defensive code or unused code paths — do not add fallback paths, safety checks, or configuration options "just in case". When porting from a research repo, delete training-time code paths, experimental flags, and ablation branches entirely — keep only the inference path you are actually integrating.
- Do not guess user intent and silently correct behavior. Make the expected inputs clear in the docstring, and raise a concise error for unsupported cases rather than adding complex fallback logic.

---

### Dependencies
- No new mandatory dependency without discussion (e.g. `einops`)
- Optional deps guarded with `is_X_available()` and a dummy in `utils/dummy_*.py`
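The guard pattern looks roughly like the sketch below (the real helpers live in `src/diffusers/utils/import_utils.py`; the function here is an illustrative stand-in, not the actual implementation):

```python
import importlib.util


def is_einops_available() -> bool:
    # True when the optional package is importable; find_spec avoids import side effects.
    return importlib.util.find_spec("einops") is not None


# Callers branch on the guard instead of importing unconditionally:
if is_einops_available():
    import einops  # noqa: F401
```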

## Code formatting
- `make style` and `make fix-copies` should be run as the final step before opening a PR

### Copied Code
- Many classes are kept in sync with a source via a `# Copied from ...` header comment
- Do not edit a `# Copied from` block directly — run `make fix-copies` to propagate changes from the source
- Remove the header to intentionally break the link
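For reference, the header looks like this; the source path and rename mapping below are illustrative, not a real link in the codebase:

```python
# Copied from diffusers.models.transformers.transformer_flux.FluxAttnProcessor with Flux->MyModel
class MyModelAttnProcessor:
    ...
```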

### Models
- All layer calls should be visible directly in `forward` — avoid helper functions that hide `nn.Module` calls.
- Avoid introducing graph breaks wherever possible, for better compatibility with `torch.compile`. For example, DO NOT arbitrarily insert NumPy operations in forward implementations.
- Attention must follow the diffusers pattern: both the `Attention` class and its processor are defined in the model file. The processor's `__call__` performs the actual compute and must use `dispatch_attention_fn` rather than calling `F.scaled_dot_product_attention` directly. The attention class inherits `AttentionModuleMixin` and declares `_default_processor_cls` and `_available_processors`.

```python
# transformer_mymodel.py
from torch import nn

from ..attention import AttentionModuleMixin
from ..attention_dispatch import dispatch_attention_fn


class MyModelAttnProcessor:
    _attention_backend = None
    _parallel_config = None

    def __call__(self, attn, hidden_states, attention_mask=None, **kwargs):
        query = attn.to_q(hidden_states)
        key = attn.to_k(hidden_states)
        value = attn.to_v(hidden_states)
        # reshape, apply rope, etc.
        hidden_states = dispatch_attention_fn(
            query,
            key,
            value,
            attn_mask=attention_mask,
            backend=self._attention_backend,
            parallel_config=self._parallel_config,
        )
        hidden_states = hidden_states.flatten(2, 3)
        hidden_states = attn.to_out[0](hidden_states)
        hidden_states = attn.to_out[1](hidden_states)
        return hidden_states


class MyModelAttention(nn.Module, AttentionModuleMixin):
    _default_processor_cls = MyModelAttnProcessor
    _available_processors = [MyModelAttnProcessor]

    def __init__(self, query_dim, heads=8, dim_head=64):
        super().__init__()
        self.to_q = nn.Linear(query_dim, heads * dim_head, bias=False)
        self.to_k = nn.Linear(query_dim, heads * dim_head, bias=False)
        self.to_v = nn.Linear(query_dim, heads * dim_head, bias=False)
        self.to_out = nn.ModuleList([nn.Linear(heads * dim_head, query_dim), nn.Dropout(0.0)])
        self.set_processor(MyModelAttnProcessor())

    def forward(self, hidden_states, attention_mask=None, **kwargs):
        return self.processor(self, hidden_states, attention_mask, **kwargs)
```

Consult the implementations in `src/diffusers/models/transformers/` if you need further references.

### Pipeline
- All pipelines must inherit from `DiffusionPipeline`. Consult implementations in `src/diffusers/pipelines` in case you need references.
- DO NOT reuse an existing pipeline class (e.g., `FluxPipeline`) to override another pipeline (e.g., `FluxImg2ImgPipeline`) that will be part of the core codebase (`src`).


### Tests
- Slow tests are gated with `@slow` and `RUN_SLOW=1`
- All model-level tests must initially be written with the `BaseModelTesterConfig`, `ModelTesterMixin`, `MemoryTesterMixin`, `AttentionTesterMixin`, `LoraTesterMixin`, and `TrainingTesterMixin` classes. Any additional tests should be added only after discussion with the maintainers. Use `tests/models/transformers/test_models_transformer_flux.py` as a reference.
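The gating works roughly as follows; this is a simplified stand-in for the real `slow` decorator shipped in `diffusers.utils.testing_utils`, not its actual code:

```python
import os
import unittest


def slow(test_case):
    # Skip the test unless the environment opts in with RUN_SLOW=1.
    run_slow = os.environ.get("RUN_SLOW", "0").upper() in {"1", "TRUE", "YES"}
    return unittest.skipUnless(run_slow, "test is slow")(test_case)


class MySlowTests(unittest.TestCase):
    @slow
    def test_full_pipeline(self):
        pass  # expensive end-to-end check would go here
```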