
Commit e2ac8bd

upd
Signed-off-by: Lancer <maruixiang6688@gmail.com>
1 parent 533d6b6 commit e2ac8bd

2 files changed

Lines changed: 3 additions & 9 deletions


docs/source/en/api/pipelines/longcat_audio_dit.md

Lines changed: 2 additions & 1 deletion
@@ -46,8 +46,9 @@ sf.write("longcat.wav", audio, pipeline.sample_rate)
 ## Tips
 
 - `audio_duration_s` is the most direct way to control output duration.
-- `seed` makes generation reproducible (optional, defaults to None).
+- Use `generator=torch.Generator("cuda").manual_seed(42)` to make generation reproducible.
 - Output shape is `(batch, channels, samples)` - use `.audios[0, 0]` to get a single audio sample.
+- The pipeline outputs mono audio (1 channel). If you need stereo, you can duplicate the channel: `audio.unsqueeze(0).repeat(1, 2, 1)`.
 
 ## LongCatAudioDiTPipeline
 
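
A note on the stereo tip added above: `unsqueeze` and `repeat` are `torch.Tensor` methods, while the `__call__` docstring in this commit lists `"np"` as the default `output_type`, so `.audios[0, 0]` is a NumPy array unless `output_type="pt"` is requested. The following is a minimal sketch of the same channel duplication for the NumPy case. The helper name `mono_to_stereo` is hypothetical (not part of diffusers), and a synthetic ramp stands in for real pipeline output.

```python
import numpy as np


def mono_to_stereo(audio: np.ndarray) -> np.ndarray:
    """Duplicate a 1-D mono signal into a (samples, 2) array.

    Hypothetical helper for illustration; not part of diffusers.
    """
    if audio.ndim != 1:
        raise ValueError(f"expected a 1-D mono signal, got shape {audio.shape}")
    # Stack the same channel twice along a new last axis: (samples,) -> (samples, 2)
    return np.stack([audio, audio], axis=-1)


# Synthetic stand-in for `pipeline(...).audios[0, 0]` with the default "np" output.
mono = np.linspace(-1.0, 1.0, num=8, dtype=np.float32)
stereo = mono_to_stereo(mono)
```

The `(samples, channels)` layout is the one `soundfile.write` expects for multichannel data, so `stereo` could be passed to `sf.write` in place of the mono array.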

src/diffusers/pipelines/longcat_audio_dit/pipeline_longcat_audio_dit.py

Lines changed: 1 addition & 8 deletions
@@ -43,9 +43,7 @@
         >>> pipe.to("cuda")
 
         >>> prompt = "A calm ocean wave ambience with soft wind in the background."
-        >>> audio = pipe(prompt, audio_duration_s=5.0, num_inference_steps=20, guidance_scale=4.0, seed=42).audios[
-        ...     0, 0
-        ... ]
+        >>> audio = pipe(prompt, audio_duration_s=5.0, num_inference_steps=20, guidance_scale=4.0, generator=torch.Generator("cuda").manual_seed(42)).audios[0, 0]
         >>> sf.write("output.wav", audio, pipe.sample_rate)
         ```
 """
@@ -240,7 +238,6 @@ def __call__(
                 Pre-generated noisy latents of shape `(batch_size, duration, latent_dim)`.
             num_inference_steps (`int`, defaults to 16): Number of denoising steps.
             guidance_scale (`float`, defaults to 4.0): Guidance scale for classifier-free guidance.
-            seed (`int`, *optional*): A seed used to make generation deterministic.
             generator (`torch.Generator` or `list[torch.Generator]`, *optional*): Random generator(s).
             output_type (`str`, defaults to `"np"`): Output format: `"np"`, `"pt"`, or `"latent"`.
             return_dict (`bool`, defaults to `True`): Whether to return `AudioPipelineOutput`.
@@ -252,10 +249,6 @@ def __call__(
 
         Examples:
         """
-        # Create generator from seed if provided
-        if generator is None and seed is not None:
-            generator = torch.Generator(device=self.device).manual_seed(seed)
-
         if prompt is None:
             prompt = []
         elif isinstance(prompt, str):
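
For callers that relied on the removed `seed` argument, the same behaviour can be recovered by building the generator before the call. The sketch below assumes a hypothetical helper named `make_generator` (not part of diffusers) and defaults to the CPU device so it runs without a GPU; the removed code used `self.device` instead.

```python
import torch


def make_generator(seed: int, device: str = "cpu") -> torch.Generator:
    """Build a seeded generator, mirroring the removed `seed` convenience.

    Hypothetical helper for illustration; not part of diffusers.
    """
    # manual_seed returns the generator itself, so the call can be chained.
    return torch.Generator(device=device).manual_seed(seed)


# Two generators with the same seed yield identical noise; a different seed does not.
a = torch.randn(4, generator=make_generator(42))
b = torch.randn(4, generator=make_generator(42))
c = torch.randn(4, generator=make_generator(43))
```

With such a helper, the call in the updated docstring would take `generator=make_generator(42, "cuda")` where it previously took `seed=42`.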
