Skip to content

[feat] [3/n] Improve API: extend support to cli#1226

Merged
SolitaryThinker merged 9 commits intomainfrom
will/api_3
Apr 14, 2026
Merged

[feat] [3/n] Improve API: extend support to cli#1226
SolitaryThinker merged 9 commits intomainfrom
will/api_3

Conversation

@SolitaryThinker
Copy link
Copy Markdown
Collaborator

Motivation

The previous CLI surface mirrored every SamplingParam / FastVideoArgs field
as a flat argparse flag (~80+ flags on generate alone). This made it hard to
keep CLI parity with the config schema, created brittle translation code, and
required per-model shell scripts full of duplicate flag lists. The nested config
approach lets users maintain one YAML per model and override individual values
on the command line without needing to know the full flag set.

Purpose

  • Replaces the flat argparse-flag inference CLI with a config-driven model.
    fastvideo generate and fastvideo serve now accept only --config <path>
    plus optional dotted overrides (e.g. --request.sampling.seed 42).

  • Config-only CLI: Both subcommands take a single --config pointing to a
    nested YAML/JSON file with generator: and request: (or server:)
    top-level sections. All other CLI input uses dotted paths that map directly
    into the config tree.

  • Request metadata tracking: New request_metadata.py module tracks which
    GenerationRequest fields were explicitly provided by the user vs filled by
    schema defaults. Uses a 3-way merge (raw config + baseline snapshot + current
    state) with __setattr__-based dirty-path recording so that post-load
    mutations (e.g. config.request.sampling.seed = 7) are correctly reflected
    when translating to the legacy SamplingParam.

  • Typed config builders: New inference_config.py provides
    build_generate_run_config() and build_serve_config() with validation
    (prompt source checks, num_gpus > 0, dotted-override prefix enforcement).

  • Shell scripts → YAML configs: 12 per-model shell scripts replaced with
    15 declarative YAML configs plus a single run.sh launcher.

  • FlexibleArgumentParser: Extended with parse_known_args override and
    deferred config loading so dotted overrides pass through to the config
    builders instead of being consumed by argparse.

  • Dashed-key normalization: --request.output.output-path is automatically
    normalized to request.output.output_path in the override parser.

Changes

Test Plan

# Commands you ran

Test Results

Test output
# Paste output here

Checklist

  • I ran pre-commit run --all-files and fixed all issues
  • I added or updated tests for my changes
  • I updated documentation if needed
  • I considered GPU memory impact of my changes

For model/pipeline changes, also check:

  • I verified SSIM regression tests pass
  • I updated the support matrix if adding a new model

@mergify mergify Bot added scope: inference Inference pipeline, serving, CLI scope: infra CI, tests, Docker, build scope: docs Documentation labels Apr 9, 2026
@mergify
Copy link
Copy Markdown
Contributor

mergify Bot commented Apr 9, 2026

⚠️ PR title format required

Your PR title must start with a type tag in brackets. Examples:

  • [feat] Add new model support
  • [bugfix] Fix VAE tiling corruption
  • [refactor] Restructure training pipeline
  • [perf] Optimize attention kernel
  • [ci] Update test infrastructure
  • [docs] Add inference guide
  • [misc] Clean up configs
  • [new-model] Port Flux2 to FastVideo

Valid tags: feat, feature, bugfix, fix, refactor, perf, ci, doc, docs, misc, chore, kernel, new-model

Please update your PR title and the merge protection check will pass automatically.

@mergify
Copy link
Copy Markdown
Contributor

mergify Bot commented Apr 9, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 PR merge requirements

Waiting for:

  • #approved-reviews-by>=1
  • check-success=fastcheck-passed
  • check-success=full-suite-passed
This rule is failing.
  • #approved-reviews-by>=1
  • check-success=fastcheck-passed
  • check-success=full-suite-passed
  • check-success~=pre-commit
  • title~=(?i)^\[(feat|feature|bugfix|fix|refactor|perf|ci|doc|docs|misc|chore|kernel|new.?model)\]

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the FastVideo CLI to be "config-first," moving from a flat argument structure to a nested YAML/JSON configuration with dotted-path overrides. This change simplifies CLI usage by consolidating parameters into structured config files and introduces a new request_metadata module to track explicitly provided fields. The old CLI scripts have been replaced with new YAML config files and a generic run.sh wrapper. Documentation has been updated to reflect these changes. Review comments highlight a potential architectural concern regarding the global side effects of __setattr__ patching for request tracking and suggest improving error messages for users transitioning from the old flat config format.


original_setattr(self, name, value)

type.__setattr__(config_type, "__setattr__", _tracking_setattr)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Patching __setattr__ on the class level has global side effects for all instances of these dataclasses within the process. While this is necessary for the current tracking implementation, it should be noted that any part of the codebase using these types will now incur the overhead of the tracking logic, even if they are not part of a tracked request. Consider if a more localized approach (e.g., a wrapper or a specific subclass) could achieve the same goal without global monkey-patching.

Comment on lines +62 to +64
if not isinstance(raw.get("generator"), Mapping):
raise ValueError("Inference config must use the nested schema with a top-level "
"'generator' mapping")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The check for the generator key is used to distinguish between the new nested schema and the legacy flat schema. It might be beneficial to provide a more descriptive error message or a migration hint if a flat config is detected, as users transitioning from older versions might be confused by the requirement for a top-level mapping.

@Eigensystem
Copy link
Copy Markdown
Collaborator

/merge

@github-actions github-actions Bot added the ready PR is ready to merge label Apr 9, 2026
@Eigensystem Eigensystem changed the title [3/n] Improve API: extend support to cli [feat] [3/n] Improve API: extend support to cli Apr 9, 2026
@mergify mergify Bot added the type: feat New feature or capability label Apr 9, 2026
@mergify mergify Bot added the scope: kernel CUDA kernels, fastvideo-kernel label Apr 14, 2026
Convert test_vmoba_inference.py and test_bsa_inference.py from flat CLI
args (--model-path, --prompt, etc.) to nested JSON config files matching
the new fastvideo generate --config interface. Update stale CLI examples
in architecture.md and bench_serving.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@SolitaryThinker SolitaryThinker merged commit 88a5a93 into main Apr 14, 2026
13 of 19 checks passed
@SolitaryThinker SolitaryThinker deleted the will/api_3 branch April 14, 2026 22:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready PR is ready to merge scope: docs Documentation scope: inference Inference pipeline, serving, CLI scope: infra CI, tests, Docker, build scope: kernel CUDA kernels, fastvideo-kernel type: feat New feature or capability

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants