Commit f09b1cc

start the section on sequential pipelines
1 parent 285f877 commit f09b1cc

1 file changed

Lines changed: 96 additions & 44 deletions

docs/source/en/modular_diffusers/write_own_pipeline_block.md

@@ -17,7 +17,7 @@ In Modular Diffusers, you build your workflow using `ModularPipelineBlocks`. We
1717
In this tutorial, we will focus on how to write a basic `PipelineBlock` and how it interacts with other components in the system. We will also cover how to connect them together using the multi-blocks: `SequentialPipelineBlocks`, `LoopSequentialPipelineBlocks`, and `AutoPipelineBlocks`.
1818

1919

20-
### Understanding the Foundation: `PipelineState`
20+
## Understanding the Foundation: `PipelineState`
2121

2222
Before we dive into creating `PipelineBlock`s, we need to have a basic understanding of `PipelineState` - the core data structure that all blocks operate on. This concept is fundamental to understanding how blocks interact with each other and the pipeline system.
2323

@@ -44,7 +44,7 @@ PipelineState(
4444
},
4545
```
4646

47-
### Creating a `PipelineBlock`
47+
## Creating a `PipelineBlock`
4848

4949
To write a `PipelineBlock` class, you need to define a few properties that determine how your block interacts with the pipeline state. Understanding these properties is crucial - they define what data your block can access and what it can produce.
5050

@@ -182,17 +182,17 @@ def make_block(inputs=[], intermediate_inputs=[], intermediate_outputs=[], block
182182
Let's create a simple block to see how these definitions interact with the pipeline state. To better understand what's happening, we'll print out the states before and after updates to inspect them:
183183

184184
```py
185-
user_inputs = [
185+
inputs = [
186186
InputParam(name="image", type_hint="PIL.Image", description="raw input image to process")
187187
]
188188

189-
user_intermediate_inputs = [InputParam(name="batch_size", type_hint=int)]
189+
intermediate_inputs = [InputParam(name="batch_size", type_hint=int)]
190190

191-
user_intermediate_outputs = [
191+
intermediate_outputs = [
192192
OutputParam(name="image_latents", description="latents representing the image")
193193
]
194194

195-
def user_block_fn(block_state, pipeline_state):
195+
def image_encoder_block_fn(block_state, pipeline_state):
196196
print(f"pipeline_state (before update): {pipeline_state}")
197197
print(f"block_state (before update): {block_state}")
198198

@@ -206,21 +206,23 @@ def user_block_fn(block_state, pipeline_state):
206206
return block_state
207207

208208
# Create a block with our definitions
209-
block = make_block(
210-
inputs=user_inputs,
211-
intermediate_inputs=user_intermediate_inputs,
212-
intermediate_outputs=user_intermediate_outputs,
213-
block_fn=user_block_fn
209+
image_encoder_block = make_block(
210+
inputs=inputs,
211+
intermediate_inputs=intermediate_inputs,
212+
intermediate_outputs=intermediate_outputs,
213+
block_fn=image_encoder_block_fn,
214+
description="Encode raw image into its latent representation"
214215
)
215-
pipe = block.init_pipeline()
216+
pipe = image_encoder_block.init_pipeline()
216217
```
217218

218219
Let's check the pipeline's docstring to see what inputs it expects:
219-
220220
```py
221221
>>> print(pipe.doc)
222222
class TestBlock
223223

224+
Encode raw image into its latent representation
225+
224226
Inputs:
225227

226228
image (`PIL.Image`, *optional*):
@@ -246,37 +248,6 @@ state = pipe(image=image, batch_size=2)
246248
print(f"pipeline_state (after update): {state}")
247249
```
248250

249-
```out
250-
pipeline_state (before update): PipelineState(
251-
inputs={
252-
image: <PIL.Image.Image image mode=RGB size=512x512 at 0x7F226024EB90>
253-
},
254-
intermediates={
255-
batch_size: 2
256-
},
257-
)
258-
block_state (before update): BlockState(
259-
image: <PIL.Image.Image image mode=RGB size=512x512 at 0x7F2260260220>
260-
batch_size: 2
261-
)
262-
263-
block_state (after update): BlockState(
264-
image: Tensor(dtype=torch.float32, shape=torch.Size([1, 3, 512, 512]))
265-
batch_size: 4
266-
processed_image: List[4] of Tensors with shapes [torch.Size([1, 3, 512, 512]), torch.Size([1, 3, 512, 512]), torch.Size([1, 3, 512, 512]), torch.Size([1, 3, 512, 512])]
267-
image_latents: Tensor(dtype=torch.float32, shape=torch.Size([1, 4, 64, 64]))
268-
)
269-
pipeline_state (after update): PipelineState(
270-
inputs={
271-
image: <PIL.Image.Image image mode=RGB size=512x512 at 0x7F226024EB90>
272-
},
273-
intermediates={
274-
batch_size: 4
275-
image_latents: Tensor(dtype=torch.float32, shape=torch.Size([1, 4, 64, 64]))
276-
},
277-
)
278-
```
279-
280251
**Key Observations:**
281252

282253
1. **Before the update**: `image` (the input) goes to the immutable inputs dict, while `batch_size` (the intermediate_input) goes to the mutable intermediates dict, and both are available in `block_state`.
@@ -288,3 +259,84 @@ pipeline_state (after update): PipelineState(
288259
- **`processed_image`** was not added to `pipeline_state` because it wasn't declared as an intermediate output
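The write-back rule described in these observations can be imitated with plain dictionaries. The sketch below is a toy model under stated assumptions, not the Modular Diffusers implementation: `run_toy_block` and `toy_encoder_fn` are hypothetical names, and the pipeline state is reduced to an `inputs` dict and an `intermediates` dict.

```python
from types import SimpleNamespace


def run_toy_block(inputs, intermediates, declared_outputs, block_fn):
    """Toy model of a single block run (hypothetical helper, not the diffusers API)."""
    # The block sees user inputs and intermediates merged into one block_state.
    block_state = block_fn(SimpleNamespace(**inputs, **intermediates))

    # Write back only values that were intermediates already or are declared outputs.
    # User inputs are never written back, so they stay unchanged.
    updated = dict(intermediates)
    for name, value in vars(block_state).items():
        if name in intermediates or name in declared_outputs:
            updated[name] = value
    return updated


def toy_encoder_fn(block_state):
    block_state.image = "tensor(image)"  # local change to a user input
    block_state.batch_size = block_state.batch_size * 2  # change to an intermediate input
    block_state.processed_image = ["tmp"] * block_state.batch_size  # undeclared value
    block_state.image_latents = "latents"  # declared intermediate output
    return block_state


inputs = {"image": "PIL image"}
intermediates = run_toy_block(inputs, {"batch_size": 2}, {"image_latents"}, toy_encoder_fn)
print(inputs)         # {'image': 'PIL image'}
print(intermediates)  # {'batch_size': 4, 'image_latents': 'latents'}
```

The undeclared `processed_image` never reaches the shared state, which is why declaring intermediate outputs matters.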
289260

290261
By now you should have a basic idea of how `PipelineBlock` manages state through inputs, intermediate inputs, and intermediate outputs. The real power comes when we connect multiple blocks together: their intermediate outputs become intermediate inputs for subsequent blocks, creating modular workflows. Let's explore how to build these connections using multi-blocks like `SequentialPipelineBlocks`.
262+
263+
## Creating `SequentialPipelineBlocks`
264+
265+
By this point, you should already be familiar with `SequentialPipelineBlocks` and how to create them with the `from_blocks_dict` API. It is one of the most common ways to use Modular Diffusers and is covered in the [quicktour](https://moon-ci-docs.huggingface.co/docs/diffusers/pr_9672/en/modular_diffusers/quicktour#modularpipelineblocks).
266+
267+
But how do blocks actually connect and work together? Understanding this is crucial for building effective modular workflows. Let's explore this through an example.
268+
269+
**How Blocks Connect in SequentialPipelineBlocks:**
270+
271+
The key insight is that blocks connect through their intermediate inputs and outputs, the "studs and anti-studs" we discussed earlier. Let's extend our example with a new block, `input_block`, that produces `batch_size`:
272+
273+
```py
274+
def input_block_fn(block_state, pipeline_state):
275+
276+
    # Wrap a single prompt in a list so len() counts prompts, not characters
277+
    prompt = block_state.prompt
278+
    prompt = prompt if isinstance(prompt, list) else [prompt]
279+
    batch_size = len(prompt)
280+
    block_state.batch_size = batch_size * block_state.num_images_per_prompt
281+
282+
    return block_state
283+
284+
input_block = make_block(
285+
inputs=[
286+
InputParam(name="prompt", type_hint=list, description="list of text prompts"),
287+
InputParam(name="num_images_per_prompt", type_hint=int, description="number of images per prompt")
288+
],
289+
intermediate_outputs=[
290+
OutputParam(name="batch_size", description="calculated batch size")
291+
],
292+
block_fn=input_block_fn,
293+
description="A block that determines batch_size based on the number of prompts and num_images_per_prompt argument."
294+
)
295+
```
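To make the arithmetic concrete, here is a standalone sketch of the batch-size computation this block is meant to perform, without any pipeline machinery. `compute_batch_size` is a hypothetical helper introduced only for illustration; it assumes a single string prompt should count as one prompt.

```python
def compute_batch_size(prompt, num_images_per_prompt=1):
    """Hypothetical helper: number of prompts times images per prompt."""
    # Wrap a lone string so len() counts prompts rather than characters.
    prompts = prompt if isinstance(prompt, list) else [prompt]
    return len(prompts) * num_images_per_prompt


print(compute_batch_size("a cat", 2))             # 2
print(compute_batch_size(["a cat", "a dog"], 2))  # 4
```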
296+
297+
Now let's connect these blocks to create a pipeline:
298+
299+
```py
300+
from diffusers.modular_pipelines import SequentialPipelineBlocks, InsertableDict
301+
blocks_dict = InsertableDict()
302+
blocks_dict["input"] = input_block
303+
blocks_dict["image_encoder"] = image_encoder_block
304+
blocks = SequentialPipelineBlocks.from_blocks_dict(blocks_dict)
305+
pipeline = blocks.init_pipeline()
306+
```
307+
308+
Now you have a pipeline with 2 blocks. When you inspect `pipeline.doc`, you can see that `batch_size` is not listed as an input. The pipeline automatically detects that the `input_block` can produce `batch_size` for the `image_encoder_block`, so it doesn't ask the user to provide it.
309+
310+
```py
311+
>>> print(pipeline.doc)
312+
class SequentialPipelineBlocks
313+
314+
Inputs:
315+
316+
prompt (`None`, *optional*):
317+
318+
num_images_per_prompt (`None`, *optional*):
319+
320+
image (`PIL.Image`, *optional*):
321+
raw input image to process
322+
323+
Outputs:
324+
325+
batch_size (`None`):
326+
327+
image_latents (`None`):
328+
latents representing the image
329+
```
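One way to picture how such an interface could be derived is to walk the blocks in order and track which intermediate values have already been produced. The sketch below is a simplified model under that assumption, not the library's actual logic; `ToyBlock` and `resolve_interface` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlock:
    """Hypothetical stand-in for a block's declared interface."""
    inputs: list = field(default_factory=list)
    intermediate_inputs: list = field(default_factory=list)
    intermediate_outputs: list = field(default_factory=list)


def resolve_interface(blocks):
    """Return (names the user must supply, names the blocks produce)."""
    required, produced = [], []
    for block in blocks.values():  # dicts preserve insertion order
        for name in block.inputs:
            if name not in required:
                required.append(name)
        for name in block.intermediate_inputs:
            # An intermediate input is only required if no earlier block produced it.
            if name not in produced and name not in required:
                required.append(name)
        for name in block.intermediate_outputs:
            if name not in produced:
                produced.append(name)
    return required, produced


blocks = {
    "input": ToyBlock(
        inputs=["prompt", "num_images_per_prompt"],
        intermediate_outputs=["batch_size"],
    ),
    "image_encoder": ToyBlock(
        inputs=["image"],
        intermediate_inputs=["batch_size"],
        intermediate_outputs=["image_latents"],
    ),
}
required, produced = resolve_interface(blocks)
print(required)  # ['prompt', 'num_images_per_prompt', 'image']
print(produced)  # ['batch_size', 'image_latents']
```

Swapping the order of the two toy blocks would make `batch_size` a required input again, because nothing would have produced it before `image_encoder` runs.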
330+
331+
At runtime, data flows like this:
332+
333+
![Data Flow Diagram](https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/modular_quicktour/sequential_mermaid.png)
334+
335+
**How SequentialPipelineBlocks Works:**
336+
337+
1. **Execution Order**: Blocks are executed in the order they're registered in the `blocks_dict`
338+
2. **Data Flow**: Outputs from one block become available as intermediate inputs to all subsequent blocks
339+
3. **Smart Input Resolution**: The pipeline automatically figures out which values need to be provided by the user and which will be generated by previous blocks
340+
4. **Consistent Interface**: Each block maintains its own behavior and operates through its defined interface, while collectively these interfaces determine what the entire pipeline accepts and produces
341+
342+
What happens within each block follows the same pattern we described earlier: each block gets its own `block_state` with the relevant inputs and intermediate inputs, performs its computation, and updates the pipeline state with its intermediate outputs.
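The runtime pattern can be sketched in a few lines of plain Python. This is a deliberately simplified model, not the Modular Diffusers executor: `run_sequential` and the two toy block functions are hypothetical, and each block publishes only its declared intermediate outputs.

```python
from types import SimpleNamespace


def run_sequential(blocks, user_inputs):
    """Toy sequential executor (hypothetical, not the diffusers implementation)."""
    intermediates = {}
    for block_fn, outputs in blocks.values():  # registration order
        # Each block gets its own block_state built from inputs and current intermediates.
        block_state = block_fn(SimpleNamespace(**user_inputs, **intermediates))
        # Publish the declared intermediate outputs for later blocks.
        for key in outputs:
            intermediates[key] = getattr(block_state, key)
    return intermediates


def toy_input_fn(state):
    prompts = state.prompt if isinstance(state.prompt, list) else [state.prompt]
    state.batch_size = len(prompts) * state.num_images_per_prompt
    return state


def toy_encoder_fn(state):
    # Consumes batch_size produced by the previous block.
    state.image_latents = [f"latent_{i}" for i in range(state.batch_size)]
    return state


blocks = {
    "input": (toy_input_fn, ["batch_size"]),
    "image_encoder": (toy_encoder_fn, ["image_latents"]),
}
result = run_sequential(
    blocks,
    {"prompt": ["a cat", "a dog"], "num_images_per_prompt": 2, "image": "PIL image"},
)
print(result["batch_size"])          # 4
print(len(result["image_latents"]))  # 4
```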
