
Commit 9a2923d

[docs] remove pipeline examples section (#13771)
* docs
* links
1 parent 65aff37 commit 9a2923d

34 files changed

Lines changed: 3448 additions & 4538 deletions

docs/source/en/_toctree.yml

Lines changed: 0 additions & 29 deletions
Original file line number | Diff line number | Diff line change
@@ -196,35 +196,6 @@
196196
- local: optimization/neuron
197197
title: AWS Neuron
198198
title: Model accelerators and hardware
199-
- isExpanded: false
200-
sections:
201-
- local: using-diffusers/helios
202-
title: Helios
203-
- local: using-diffusers/consisid
204-
title: ConsisID
205-
- local: using-diffusers/sdxl
206-
title: Stable Diffusion XL
207-
- local: using-diffusers/sdxl_turbo
208-
title: SDXL Turbo
209-
- local: using-diffusers/kandinsky
210-
title: Kandinsky
211-
- local: using-diffusers/omnigen
212-
title: OmniGen
213-
- local: using-diffusers/pag
214-
title: PAG
215-
- local: using-diffusers/inference_with_lcm
216-
title: Latent Consistency Model
217-
- local: using-diffusers/shap-e
218-
title: Shap-E
219-
- local: using-diffusers/diffedit
220-
title: DiffEdit
221-
- local: using-diffusers/inference_with_tcd_lora
222-
title: Trajectory Consistency Distillation-LoRA
223-
- local: using-diffusers/svd
224-
title: Stable Video Diffusion
225-
- local: using-diffusers/marigold_usage
226-
title: Marigold Computer Vision
227-
title: Specific pipeline examples
228199
- isExpanded: false
229200
sections:
230201
- sections:

docs/source/en/advanced_inference/outpaint.md

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ For example, remove the background from this image of a pair of shoes.
4646
</div>
4747
</div>
4848

49-
[Stable Diffusion XL (SDXL)](../using-diffusers/sdxl) models work best with 1024x1024 images, but you can resize the image to any size as long as your hardware has enough memory to support it. The transparent background in the image should also be replaced with a white background. Create a function (like the one below) that scales and pastes the image onto a white background.
49+
[Stable Diffusion XL (SDXL)](../api/pipelines/stable_diffusion/stable_diffusion_xl) models work best with 1024x1024 images, but you can resize the image to any size as long as your hardware has enough memory to support it. The transparent background in the image should also be replaced with a white background. Create a function (like the one below) that scales and pastes the image onto a white background.
5050

5151
```py
5252
import random

docs/source/en/api/pipelines/consisid.md

Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -49,6 +49,82 @@ ConsisID requires about 44 GB of GPU memory to decode 49 frames (6 seconds of vi
4949
| vae.enable_slicing | 16 GB | 22 GB |
5050
| vae.enable_tiling | 5 GB | 7 GB |
5151

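The two VAE rows in the table above correspond to method calls on the pipeline's VAE. The following is a minimal sketch, assuming `pipe` is an already loaded `ConsisIDPipeline` whose `vae` exposes `enable_slicing()` and `enable_tiling()` as the row labels suggest. The helper name `enable_vae_memory_savers` is hypothetical and not part of diffusers.

```python
def enable_vae_memory_savers(pipe, slicing=True, tiling=True):
    """Turn on the VAE memory savers named in the table above.

    Hypothetical helper (not part of diffusers). `pipe` is assumed to be a
    loaded pipeline whose `vae` provides enable_slicing() and enable_tiling().
    Returns the list of options that were switched on.
    """
    enabled = []
    if slicing:
        # Decode the batch in slices instead of all at once.
        pipe.vae.enable_slicing()
        enabled.append("vae.enable_slicing")
    if tiling:
        # Decode in tiles so peak memory stays lower.
        pipe.vae.enable_tiling()
        enabled.append("vae.enable_tiling")
    return enabled
```

Call it once after the pipeline is loaded and before generating a video. Both options typically trade some speed for lower peak memory, so enable only what your GPU requires.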
52+
## Load Model Checkpoints
53+
54+
Model weights may be stored in separate subfolders on the Hub or locally. In that case, load them with the [`~DiffusionPipeline.from_pretrained`] method.
55+
56+
```python
57+
# !pip install consisid_eva_clip insightface facexlib
58+
import torch
59+
from diffusers import ConsisIDPipeline
60+
from diffusers.pipelines.consisid.consisid_utils import prepare_face_models, process_face_embeddings_infer
61+
from huggingface_hub import snapshot_download
62+
63+
# Download ckpts
64+
snapshot_download(repo_id="BestWishYsh/ConsisID-preview", local_dir="BestWishYsh/ConsisID-preview")
65+
66+
# Load face helper model to preprocess input face image
67+
face_helper_1, face_helper_2, face_clip_model, face_main_model, eva_transform_mean, eva_transform_std = prepare_face_models("BestWishYsh/ConsisID-preview", device="cuda", dtype=torch.bfloat16)
68+
69+
# Load consisid base model
70+
pipe = ConsisIDPipeline.from_pretrained("BestWishYsh/ConsisID-preview", torch_dtype=torch.bfloat16)
71+
pipe.to("cuda")
72+
```
73+
74+
## Identity-Preserving Text-to-Video
75+
76+
For identity-preserving text-to-video, pass a text prompt and an image that contains a clear face (preferably a half-body or full-body shot). By default, ConsisID generates a 720x480 video for the best results.
77+
78+
```python
79+
from diffusers.utils import export_to_video
80+
81+
prompt = "The video captures a boy walking along a city street, filmed in black and white on a classic 35mm camera. His expression is thoughtful, his brow slightly furrowed as if he's lost in contemplation. The film grain adds a textured, timeless quality to the image, evoking a sense of nostalgia. Around him, the cityscape is filled with vintage buildings, cobblestone sidewalks, and softly blurred figures passing by, their outlines faint and indistinct. Streetlights cast a gentle glow, while shadows play across the boy's path, adding depth to the scene. The lighting highlights the boy's subtle smile, hinting at a fleeting moment of curiosity. The overall cinematic atmosphere, complete with classic film still aesthetics and dramatic contrasts, gives the scene an evocative and introspective feel."
82+
image = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_input.png?download=true"
83+
84+
id_cond, id_vit_hidden, image, face_kps = process_face_embeddings_infer(face_helper_1, face_clip_model, face_helper_2, eva_transform_mean, eva_transform_std, face_main_model, "cuda", torch.bfloat16, image, is_align_face=True)
85+
86+
video = pipe(image=image, prompt=prompt, num_inference_steps=50, guidance_scale=6.0, use_dynamic_cfg=False, id_vit_hidden=id_vit_hidden, id_cond=id_cond, kps_cond=face_kps, generator=torch.Generator("cuda").manual_seed(42))
87+
export_to_video(video.frames[0], "output.mp4", fps=8)
88+
```
89+
<table>
90+
<tr>
91+
<th style="text-align: center;">Face Image</th>
92+
<th style="text-align: center;">Video</th>
93+
<th style="text-align: center;">Description</th>
94+
</tr>
95+
<tr>
96+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_image_0.png?download=true" style="height: auto; width: 600px;"></td>
97+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_output_0.gif?download=true" style="height: auto; width: 2000px;"></td>
98+
<td>The video, in a beautifully crafted animated style, features a confident woman riding a horse through a lush forest clearing. Her expression is focused yet serene as she adjusts her wide-brimmed hat with a practiced hand. She wears a flowy bohemian dress, which moves gracefully with the rhythm of the horse, the fabric flowing fluidly in the animated motion. The dappled sunlight filters through the trees, casting soft, painterly patterns on the forest floor. Her posture is poised, showing both control and elegance as she guides the horse with ease. The animation's gentle, fluid style adds a dreamlike quality to the scene, with the woman’s calm demeanor and the peaceful surroundings evoking a sense of freedom and harmony.</td>
99+
</tr>
100+
<tr>
101+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_image_1.png?download=true" style="height: auto; width: 600px;"></td>
102+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_output_1.gif?download=true" style="height: auto; width: 2000px;"></td>
103+
<td>The video, in a captivating animated style, shows a woman standing in the center of a snowy forest, her eyes narrowed in concentration as she extends her hand forward. She is dressed in a deep blue cloak, her breath visible in the cold air, which is rendered with soft, ethereal strokes. A faint smile plays on her lips as she summons a wisp of ice magic, watching with focus as the surrounding trees and ground begin to shimmer and freeze, covered in delicate ice crystals. The animation’s fluid motion brings the magic to life, with the frost spreading outward in intricate, sparkling patterns. The environment is painted with soft, watercolor-like hues, enhancing the magical, dreamlike atmosphere. The overall mood is serene yet powerful, with the quiet winter air amplifying the delicate beauty of the frozen scene.</td>
104+
</tr>
105+
<tr>
106+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_image_2.png?download=true" style="height: auto; width: 600px;"></td>
107+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_output_2.gif?download=true" style="height: auto; width: 2000px;"></td>
108+
<td>The animation features a whimsical portrait of a balloon seller standing in a gentle breeze, captured with soft, hazy brushstrokes that evoke the feel of a serene spring day. His face is framed by a gentle smile, his eyes squinting slightly against the sun, while a few wisps of hair flutter in the wind. He is dressed in a light, pastel-colored shirt, and the balloons around him sway with the wind, adding a sense of playfulness to the scene. The background blurs softly, with hints of a vibrant market or park, enhancing the light-hearted, yet tender mood of the moment.</td>
109+
</tr>
110+
<tr>
111+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_image_3.png?download=true" style="height: auto; width: 600px;"></td>
112+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_output_3.gif?download=true" style="height: auto; width: 2000px;"></td>
113+
<td>The video captures a boy walking along a city street, filmed in black and white on a classic 35mm camera. His expression is thoughtful, his brow slightly furrowed as if he's lost in contemplation. The film grain adds a textured, timeless quality to the image, evoking a sense of nostalgia. Around him, the cityscape is filled with vintage buildings, cobblestone sidewalks, and softly blurred figures passing by, their outlines faint and indistinct. Streetlights cast a gentle glow, while shadows play across the boy's path, adding depth to the scene. The lighting highlights the boy's subtle smile, hinting at a fleeting moment of curiosity. The overall cinematic atmosphere, complete with classic film still aesthetics and dramatic contrasts, gives the scene an evocative and introspective feel.</td>
114+
</tr>
115+
<tr>
116+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_image_4.png?download=true" style="height: auto; width: 600px;"></td>
117+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/consisid/consisid_output_4.gif?download=true" style="height: auto; width: 2000px;"></td>
118+
<td>The video features a baby wearing a bright superhero cape, standing confidently with arms raised in a powerful pose. The baby has a determined look on their face, with eyes wide and lips pursed in concentration, as if ready to take on a challenge. The setting appears playful, with colorful toys scattered around and a soft rug underfoot, while sunlight streams through a nearby window, highlighting the fluttering cape and adding to the impression of heroism. The overall atmosphere is lighthearted and fun, with the baby's expressions capturing a mix of innocence and an adorable attempt at bravery, as if truly ready to save the day.</td>
119+
</tr>
120+
</table>
121+
122+
## Resources
123+
124+
Learn more about ConsisID with the following resources.
125+
- A [video](https://www.youtube.com/watch?v=PhlgC-bI5SQ) demonstrating ConsisID's main features.
126+
- Read the research paper, [Identity-Preserving Text-to-Video Generation by Frequency Decomposition](https://hf.co/papers/2411.17440), for more details.
127+
52128
## ConsisIDPipeline
53129

54130
[[autodoc]] ConsisIDPipeline

docs/source/en/api/pipelines/helios.md

Lines changed: 88 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -445,6 +445,94 @@ export_to_video(output, "helios_distilled_v2v_output.mp4", fps=24)
445445
</hfoptions>
446446

447447

448+
## Text-to-Video Showcases
449+
450+
<table>
451+
<tr>
452+
<th style="text-align: center;">Prompt</th>
453+
<th style="text-align: center;">Generated Video</th>
454+
</tr>
455+
<tr>
456+
<td><small>A Viking warrior driving a modern city bus filled with passengers. The Viking has long blonde hair tied back, a beard, and is adorned with a fur-lined helmet and armor. He wears a traditional tunic and trousers, but also sports a seatbelt as he focuses on navigating the busy streets. The interior of the bus is typical, with rows of seats occupied by diverse passengers going about their daily routines. The exterior shots show the bustling urban environment, including tall buildings and traffic. Medium shot focusing on the Viking at the wheel, with occasional close-ups of his determined expression.
457+
</small></td>
458+
<td>
459+
<video width="4000" controls>
460+
<source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/helios/t2v_showcases1.mp4" type="video/mp4">
461+
</video>
462+
</td>
463+
</tr>
464+
<tr>
465+
<td><small>A documentary-style nature photography shot from a camera truck moving to the left, capturing a crab quickly scurrying into its burrow. The crab has a hard, greenish-brown shell and long claws, moving with determined speed across the sandy ground. Its body is slightly arched as it burrows into the sand, leaving a small trail behind. The background shows a shallow beach with scattered rocks and seashells, and the horizon features a gentle curve of the coastline. The photo has a natural and realistic texture, emphasizing the crab's natural movement and the texture of the sand. A close-up shot from a slightly elevated angle.
466+
</small></td>
467+
<td>
468+
<video width="4000" controls>
469+
<source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/helios/t2v_showcases2.mp4" type="video/mp4">
470+
</video>
471+
</td>
472+
</tr>
473+
</table>
474+
475+
## Image-to-Video Showcases
476+
477+
<table>
478+
<tr>
479+
<th style="text-align: center;">Image</th>
480+
<th style="text-align: center;">Prompt</th>
481+
<th style="text-align: center;">Generated Video</th>
482+
</tr>
483+
<tr>
484+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/helios/i2v_showcases1.jpg" style="height: auto; width: 300px;"></td>
485+
<td><small>A sleek red Kia car speeds along a rural road under a cloudy sky, its modern design and dynamic movement emphasized by the blurred motion of the surrounding fields and trees stretching into the distance. The car's glossy exterior reflects the overcast sky, highlighting its aerodynamic shape and sporty stance. The license plate reads "KIA 626," and the vehicle's headlights are on, adding to the sense of motion and energy. The road curves gently, with the car positioned slightly off-center, creating a sense of forward momentum. A dynamic front three-quarter view captures the car's powerful presence against the serene backdrop of rolling hills and scattered trees.
486+
</small></td>
487+
<td>
488+
<video width="2000" controls>
489+
<source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/helios/i2v_showcases1.mp4" type="video/mp4">
490+
</video>
491+
</td>
492+
</tr>
493+
<tr>
494+
<td><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/helios/i2v_showcases2.jpg" style="height: auto; width: 300px;"></td>
495+
<td><small>A close-up captures a fluffy orange cat with striking green eyes and white whiskers, gazing intently towards the camera. The cat's fur is soft and well-groomed, with a mix of warm orange and cream tones. Its large, expressive eyes are a vivid green, reflecting curiosity and alertness. The cat's nose is small and pink, and its mouth is slightly open, revealing a hint of its pink tongue. The background is softly blurred, suggesting a cozy indoor setting with neutral tones. The photo has a shallow depth of field, focusing sharply on the cat's face while the background remains out of focus. A close-up shot from a slightly elevated perspective.
496+
</small></td>
497+
<td>
498+
<video width="2000" controls>
499+
<source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/helios/i2v_showcases2.mp4" type="video/mp4">
500+
</video>
501+
</td>
502+
</tr>
503+
</table>
504+
505+
## Interactive-Video Showcases
506+
507+
<table>
508+
<tr>
509+
<th style="text-align: center;">Prompt</th>
510+
<th style="text-align: center;">Generated Video</th>
511+
</tr>
512+
<tr>
513+
<td><small>The prompt can be found <a href="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/helios/interactive_showcases1.txt">here</a></small></td>
514+
<td>
515+
<video width="680" controls>
516+
<source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/helios/interactive_showcases1.mp4" type="video/mp4">
517+
</video>
518+
</td>
519+
</tr>
520+
<tr>
521+
<td><small>The prompt can be found <a href="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/helios/interactive_showcases2.txt">here</a></small></td>
522+
<td>
523+
<video width="680" controls>
524+
<source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/helios/interactive_showcases2.mp4" type="video/mp4">
525+
</video>
526+
</td>
527+
</tr>
528+
</table>
529+
530+
## Resources
531+
532+
Learn more about Helios with the following resources.
533+
- Watch [video1](https://www.youtube.com/watch?v=vd_AgHtOUFQ) and [video2](https://www.youtube.com/watch?v=1GeIU2Dn7UY) for a demonstration of Helios's key features.
534+
- Read the research paper, [Helios: Real Real-Time Long Video Generation Model](https://huggingface.co/papers/2603.04379), for more details.
535+
448536
## HeliosPipeline
449537

450538
[[autodoc]] HeliosPipeline

docs/source/en/api/pipelines/hunyuandit.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -32,7 +32,7 @@ HunyuanDiT has the following components:
3232
> Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
3333
3434
> [!TIP]
35-
> You can further improve generation quality by passing the generated image from [`HungyuanDiTPipeline`] to the [SDXL refiner](../../using-diffusers/sdxl#base-to-refiner-model) model.
35+
> You can further improve generation quality by passing the generated image from [`HungyuanDiTPipeline`] to the [SDXL refiner](./stable_diffusion/stable_diffusion_xl#base-to-refiner-model) model.
3636
3737
## Optimization
3838

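The link edits in this commit replace `using-diffusers/sdxl` targets with the SDXL API page, using a different relative path in each file. The sketch below checks that both new links point at the same page. It assumes relative links resolve against the directory of the file that contains them, which is an assumption about the docs build rather than something the diff states. The helper name `resolve_doc_link` is hypothetical.

```python
import posixpath


def resolve_doc_link(source_file, link):
    """Resolve a relative docs link against the file that contains it.

    Hypothetical helper. Assumes links are resolved relative to the
    directory of `source_file`; any `#anchor` suffix is dropped first.
    """
    target = link.split("#", 1)[0]
    base_dir = posixpath.dirname(source_file)
    return posixpath.normpath(posixpath.join(base_dir, target))


# The two updated links from this commit.
outpaint_target = resolve_doc_link(
    "docs/source/en/advanced_inference/outpaint.md",
    "../api/pipelines/stable_diffusion/stable_diffusion_xl",
)
hunyuandit_target = resolve_doc_link(
    "docs/source/en/api/pipelines/hunyuandit.md",
    "./stable_diffusion/stable_diffusion_xl#base-to-refiner-model",
)
print(outpaint_target == hunyuandit_target)
```

Under that assumption, both links land on `docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_xl`, so the two edits are consistent with each other.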