Commit 88a5a93

[feat] [3/n] Improve API: extend support to cli (#1226)
1 parent c591d6d commit 88a5a93

50 files changed

Lines changed: 2023 additions & 1107 deletions

docs/design/inference_schema_parity_inventory.yaml

Lines changed: 1 addition & 217 deletions
@@ -495,230 +495,14 @@ cli:
   notes:
     - "CLI parity is checked against the actual generate/serve parser dest sets."
     - "The inventory tracks parser dest names, excluding argparse's implicit help action."
+    - "The refactored inference CLI is config-only: subcommands expose only --config, and any additional CLI input must use dotted override paths."
   generate:
     explicit_local_fields:
       - config
     expected_dests:
-      - VSA_sparsity
-      - boundary_ratio
-      - bsa_cdf_threshold
-      - bsa_chunk_k
-      - bsa_chunk_q
-      - bsa_sparsity
       - config
-      - disable_autocast
-      - dist_timeout
-      - distributed_executor_backend
-      - dit_config.prefix
-      - dit_config.quant_config
-      - dit_cpu_offload
-      - dit_layerwise_offload
-      - dit_precision
-      - dmd_denoising_steps
-      - embedded_cfg_scale
-      - enable_bsa
-      - enable_stage_verification
-      - enable_torch_compile
-      - flow_shift
-      - fps
-      - guidance_rescale
-      - guidance_scale
-      - height
-      - hsdp_replicate_dim
-      - hsdp_shard_dim
-      - image_encoder_cpu_offload
-      - image_encoder_precision
-      - image_path
-      - inference_mode
-      - init_weights_from_safetensors
-      - init_weights_from_safetensors_2
-      - lora_nickname
-      - lora_path
-      - lora_target_modules
-      - ltx2_initial_latent_path
-      - ltx2_vae_spatial_tile_overlap_in_pixels
-      - ltx2_vae_spatial_tile_size_in_pixels
-      - ltx2_vae_temporal_tile_overlap_in_frames
-      - ltx2_vae_temporal_tile_size_in_frames
-      - ltx2_vae_tiling
-      - master_port
-      - moba_config_path
-      - mode
-      - model_path
-      - negative_prompt
-      - num_cond_frames
-      - num_frames
-      - num_gpus
-      - num_inference_steps
-      - num_videos_per_prompt
-      - output_path
-      - output_type
-      - output_video_name
-      - override_pipeline_cls_name
-      - override_text_encoder_quant
-      - override_text_encoder_safetensors
-      - override_transformer_cls_name
-      - pin_cpu_memory
-      - pipeline_config_path
-      - preprocess.dataloader_num_workers
-      - preprocess.dataset_output_dir
-      - preprocess.dataset_path
-      - preprocess.dataset_type
-      - preprocess.do_temporal_sample
-      - preprocess.drop_short_ratio
-      - preprocess.flush_frequency
-      - preprocess.max_height
-      - preprocess.max_width
-      - preprocess.model_path
-      - preprocess.num_frames
-      - preprocess.preprocess_video_batch_size
-      - preprocess.samples_per_file
-      - preprocess.seed
-      - preprocess.speed_factor
-      - preprocess.train_fps
-      - preprocess.training_cfg_rate
-      - preprocess.video_length_tolerance_range
-      - preprocess.video_loader_type
-      - preprocess.with_audio
-      - prompt
-      - prompt_path
-      - prompt_txt
-      - refine_from
-      - return_frames
-      - return_trajectory_decoded
-      - return_trajectory_latents
-      - revision
-      - save_video
-      - seed
-      - sp_size
-      - spatial_refine_only
-      - t_thresh
-      - text_encoder_configs
-      - text_encoder_cpu_offload
-      - text_encoder_precisions
-      - torch_compile_kwargs
-      - tp_size
-      - trust_remote_code
-      - use_fsdp_inference
-      - vae_config.blend_num_frames
-      - vae_config.load_decoder
-      - vae_config.load_encoder
-      - vae_config.tile_sample_min_height
-      - vae_config.tile_sample_min_num_frames
-      - vae_config.tile_sample_min_width
-      - vae_config.tile_sample_stride_height
-      - vae_config.tile_sample_stride_num_frames
-      - vae_config.tile_sample_stride_width
-      - vae_config.use_parallel_tiling
-      - vae_config.use_temporal_tiling
-      - vae_config.use_tiling
-      - vae_cpu_offload
-      - vae_precision
-      - vae_sp
-      - vae_tiling
-      - video_path
-      - width
-      - workload_type
   serve:
     explicit_local_fields:
       - config
-      - host
-      - output_dir
-      - port
     expected_dests:
-      - VSA_sparsity
-      - bsa_cdf_threshold
-      - bsa_chunk_k
-      - bsa_chunk_q
-      - bsa_sparsity
       - config
-      - disable_autocast
-      - dist_timeout
-      - distributed_executor_backend
-      - dit_config.prefix
-      - dit_config.quant_config
-      - dit_cpu_offload
-      - dit_layerwise_offload
-      - dit_precision
-      - dmd_denoising_steps
-      - embedded_cfg_scale
-      - enable_bsa
-      - enable_stage_verification
-      - enable_torch_compile
-      - flow_shift
-      - host
-      - hsdp_replicate_dim
-      - hsdp_shard_dim
-      - image_encoder_cpu_offload
-      - image_encoder_precision
-      - inference_mode
-      - init_weights_from_safetensors
-      - init_weights_from_safetensors_2
-      - lora_nickname
-      - lora_path
-      - lora_target_modules
-      - ltx2_initial_latent_path
-      - ltx2_vae_spatial_tile_overlap_in_pixels
-      - ltx2_vae_spatial_tile_size_in_pixels
-      - ltx2_vae_temporal_tile_overlap_in_frames
-      - ltx2_vae_temporal_tile_size_in_frames
-      - ltx2_vae_tiling
-      - master_port
-      - mode
-      - model_path
-      - num_gpus
-      - output_dir
-      - output_type
-      - override_pipeline_cls_name
-      - override_text_encoder_quant
-      - override_text_encoder_safetensors
-      - override_transformer_cls_name
-      - pin_cpu_memory
-      - pipeline_config_path
-      - port
-      - preprocess.dataloader_num_workers
-      - preprocess.dataset_output_dir
-      - preprocess.dataset_path
-      - preprocess.dataset_type
-      - preprocess.do_temporal_sample
-      - preprocess.drop_short_ratio
-      - preprocess.flush_frequency
-      - preprocess.max_height
-      - preprocess.max_width
-      - preprocess.model_path
-      - preprocess.num_frames
-      - preprocess.preprocess_video_batch_size
-      - preprocess.samples_per_file
-      - preprocess.seed
-      - preprocess.speed_factor
-      - preprocess.train_fps
-      - preprocess.training_cfg_rate
-      - preprocess.video_length_tolerance_range
-      - preprocess.video_loader_type
-      - preprocess.with_audio
-      - prompt_txt
-      - revision
-      - sp_size
-      - text_encoder_cpu_offload
-      - text_encoder_precisions
-      - torch_compile_kwargs
-      - tp_size
-      - trust_remote_code
-      - use_fsdp_inference
-      - vae_config.blend_num_frames
-      - vae_config.load_decoder
-      - vae_config.load_encoder
-      - vae_config.tile_sample_min_height
-      - vae_config.tile_sample_min_num_frames
-      - vae_config.tile_sample_min_width
-      - vae_config.tile_sample_stride_height
-      - vae_config.tile_sample_stride_num_frames
-      - vae_config.tile_sample_stride_width
-      - vae_config.use_parallel_tiling
-      - vae_config.use_temporal_tiling
-      - vae_config.use_tiling
-      - vae_cpu_offload
-      - vae_precision
-      - vae_sp
-      - vae_tiling
-      - workload_type
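
Reviewer sketch: the inventory notes say parity is checked against the parser dest sets, excluding argparse's implicit help action. The snippet below is one way such a check might look, not the repository's actual test. `build_config_only_parser` is a hypothetical stand-in for the real `generate`/`serve` parser builders, and `parser._actions` is a private argparse attribute, used here only because argparse offers no public way to enumerate registered actions.

```python
import argparse


def parser_dests(parser: argparse.ArgumentParser) -> set[str]:
    """Collect dest names from a parser, skipping the implicit -h/--help action."""
    # NOTE: _actions is private to argparse; this is an assumption of the sketch.
    return {
        action.dest
        for action in parser._actions
        if "--help" not in action.option_strings
    }


def build_config_only_parser(prog: str) -> argparse.ArgumentParser:
    """Hypothetical stand-in for a config-only subcommand parser."""
    parser = argparse.ArgumentParser(prog=prog)
    parser.add_argument("--config")
    return parser


# expected_dests for both subcommands after this commit, per the inventory.
expected = {"generate": {"config"}, "serve": {"config"}}

# Record any difference between the inventory and the parser's dest set.
mismatches = {}
for name, dests in expected.items():
    actual = parser_dests(build_config_only_parser(f"fastvideo {name}"))
    if actual != dests:
        mismatches[name] = {
            "missing": dests - actual,
            "unexpected": actual - dests,
        }
```

With a config-only parser, `mismatches` stays empty. Re-adding a flag such as `--port` to the parser without updating the inventory would surface it under `unexpected`.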

docs/distillation/dmd.md

Lines changed: 2 additions & 1 deletion
@@ -16,7 +16,8 @@ Both models are trained on **61×448×832** resolution but support generating vi
 First install [VSA](../attention/vsa/index.md). Set `MODEL_BASE` to your own model path and run:
 
 ```bash
-bash scripts/inference/v1_inference_wan_dmd.sh
+FASTVIDEO_ATTENTION_BACKEND=VIDEO_SPARSE_ATTN \
+fastvideo generate --config scripts/inference/inference_wan_VSA_DMD_1_3B.yaml
 ```
 
 ## 🗂️ Dataset

docs/inference/architecture.md

Lines changed: 3 additions & 3 deletions
@@ -455,6 +455,6 @@ User: generator.generate_video(prompt, ...)
    `fastvideo/pipelines/stages/`, implement `forward()`, optionally
    implement `verify_input()`/`verify_output()`.
 
-7. **Verify** — Run `fastvideo generate --model-path <path> --prompt
-   "test" --num-inference-steps 2` to confirm the pipeline loads and
-   generates output.
+7. **Verify** — Run `fastvideo generate --config <config.yaml>` with a
+   minimal nested config to confirm the pipeline loads and generates
+   output.
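
Reviewer sketch: the inventory note in this commit says any CLI input beyond `--config` must use dotted override paths into the nested config. The commit page does not show the override syntax or the config schema, so the snippet below only illustrates the general idea of resolving a dotted path against a nested mapping. `apply_override` is a hypothetical helper, and the key names are borrowed from the removed dest list purely for illustration.

```python
from typing import Any


def apply_override(config: dict[str, Any], dotted_path: str, value: Any) -> None:
    """Set config[a][b][c] = value for the path 'a.b.c', creating dicts as needed."""
    *parents, leaf = dotted_path.split(".")
    node = config
    for key in parents:
        child = node.setdefault(key, {})
        if not isinstance(child, dict):
            raise TypeError(f"{key!r} in {dotted_path!r} is not a mapping")
        node = child
    node[leaf] = value


# Hypothetical nested config; these keys are illustrative, not the real schema.
config = {"vae_config": {"use_tiling": True}, "preprocess": {"seed": 0}}

apply_override(config, "vae_config.use_tiling", False)  # overwrite existing leaf
apply_override(config, "preprocess.seed", 42)           # overwrite existing leaf
apply_override(config, "dit_config.prefix", "demo")     # create a new branch
```

Resolving overrides against the same nested structure the YAML file uses is what lets a subcommand expose a single `--config` flag while still allowing per-run tweaks.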
