Switch `--mode` to match the workflow you are targeting (`text2image`, `text2video`, `image2video`). The command writes the upsampled prompt(s) to the `--output` file as a JSON array (one object per non-empty line in `--input`); pass a `.jsonl` path instead to get one JSON object per line. For `image2video`, you must also supply the conditioning image via `--image-url` (a URL or local path) or `--image-list` (one image per prompt).
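Because the output layout depends on the file extension, downstream code may need to handle both. The following is a minimal sketch of a reader, assuming only the two layouts described above (a single JSON array, or one JSON object per line for `.jsonl`). The helper name `load_upsampled_prompts` is hypothetical and not part of the CLI.

```python
import json
from pathlib import Path


def load_upsampled_prompts(path):
    """Return the upsampled prompts stored in `path` as a list.

    Hypothetical helper: a `.jsonl` file is read as one JSON object per
    line, any other file as a single JSON array.
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if path.suffix == ".jsonl":
        # One JSON object per line; blank lines are skipped defensively.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError(f"expected a JSON array in {path}")
    return records
```

Either layout then yields the same list, so the rest of a script does not need to know which `--output` extension was used.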

<!-- TODO: Add prompt upsampling support for video inputs (video-to-video) to the upsampler CLI. -->

A pre-upsampled positive prompt (`assets/example_t2v_prompt.json`) and negative prompt (`assets/negative_prompt.json`) are provided for convenience, and are used by the generation examples below. The examples load these JSON files and pass them to the pipeline as JSON strings via `json.dumps(...)`.
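The load-then-serialize step can be sketched as follows. This is an illustration rather than the examples' exact code: the helper name `load_prompt_as_json_string` is hypothetical, and the sketch assumes only that the file holds valid JSON.

```python
import json
from pathlib import Path


def load_prompt_as_json_string(path):
    """Read a JSON prompt file and return its contents as one JSON string.

    Hypothetical helper: parsing first means a malformed file fails here
    with a clear error instead of reaching the pipeline as a bad prompt.
    """
    prompt = json.loads(Path(path).read_text(encoding="utf-8"))
    return json.dumps(prompt)
```

For instance, `load_prompt_as_json_string("assets/example_t2v_prompt.json")` would return a string suitable for use as the positive prompt, and the same call on `assets/negative_prompt.json` would give the negative prompt.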
Pass a conditioning clip via `video=` (e.g. from `load_video`). The pipeline anchors the leading latent frames given by `condition_frame_indexes_vision` (default `[0, 1]`) to the clip and denoises the rest. Use `condition_video_keep` (`"first"` or `"last"`) to choose which end of a longer source clip the conditioning frames are taken from. As with the other modes, the prompt should follow the descriptive JSON structure described in [Prompt upsampling](#prompt-upsampling).
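To make the `"first"` / `"last"` choice concrete, here is a small sketch of picking frames from one end of a longer clip. It illustrates the idea only and is not the pipeline's implementation: the helper `pick_conditioning_frames` and its `num_condition_frames` parameter are assumptions, and the mapping between pixel frames and latent frames is not modeled.

```python
def pick_conditioning_frames(frames, num_condition_frames, keep="first"):
    """Return `num_condition_frames` frames from one end of `frames`.

    Hypothetical helper: `keep="first"` takes frames from the start of the
    clip, `keep="last"` from the end. Shorter clips are returned unchanged.
    """
    if keep not in ("first", "last"):
        raise ValueError('keep must be "first" or "last"')
    if num_condition_frames <= 0:
        raise ValueError("num_condition_frames must be positive")
    frames = list(frames)
    if len(frames) <= num_condition_frames:
        return frames
    if keep == "first":
        return frames[:num_condition_frames]
    return frames[-num_condition_frames:]
```

With a ten-frame clip and three conditioning frames, `"first"` would select frames 0 to 2 and `"last"` frames 7 to 9.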

<!-- TODO: Add prompt upsampling support for video inputs (video-to-video) to the upsampler CLI. -->

<hfoptions id="model">
<hfoption id="Nano">

```python
import json
import torch
from diffusers import Cosmos3OmniPipeline
from diffusers.schedulers.scheduling_unipc_multistep import UniPCMultistepScheduler
from diffusers.utils import export_to_video, load_video

# JSON-upsampled positive and negative prompts (see "Prompt upsampling" above).
```

</hfoption>
</hfoptions>

When the checkpoint carries a `sound_tokenizer`, add `enable_sound=True` to the video-to-video call to jointly generate a synchronized audio track. The waveform is returned alongside the video and can be muxed into the MP4 with [`~utils.encode_video`].

<hfoptions id="model">
<hfoption id="Nano">

```python
import json
import torch
from diffusers import Cosmos3OmniPipeline
from diffusers.schedulers.scheduling_unipc_multistep import UniPCMultistepScheduler
from diffusers.utils import encode_video, load_video

# JSON-upsampled positive and negative prompts (see "Prompt upsampling" above).
```

</hfoption>
</hfoptions>

When the checkpoint carries a `sound_tokenizer`, pass `enable_sound=True` to jointly generate a synchronized audio track. The waveform is returned alongside the video and can be muxed into the MP4 with [`~utils.encode_video`].