36 changes: 17 additions & 19 deletions jenkins/scripts/cbts/rules/README.md
@@ -162,33 +162,31 @@ still force VG stages.
Block selection — entry-pattern based only:
VisualGen has no `condition.terms.backend` of its own; VG entries
live in `backend: pytorch` and `backend: tensorrt` blocks. A block
"belongs to VG" iff any of its `tests:` entries matches one of the
three stable VG path families:

- `unittest/_torch/visual_gen/...` (28 entries)
- `examples/test_visual_gen.py...` (1 entry)
- `visual_gen/test_visual_gen_benchmark.py` (1 entry)
"belongs to VG" iff any of its `tests:` entries lives under a dedicated
`visual_gen/` test path or is the VisualGen perf-sanity entry under
`perf/`.

For each matched block, `block_filters` keeps only the VG entries.
Non-VG siblings in the same block stay governed by other rules.
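
The entry-pattern selection described above can be sketched as follows. This is a minimal illustration, not the rule's actual code: the two patterns are copied from the updated `_VG_ENTRY_PATTERNS`, while the helper names `is_vg_entry` and `filter_vg_entries` and the dict-shaped block are assumptions made for the example.

```python
from __future__ import annotations

# Patterns copied from the updated `_VG_ENTRY_PATTERNS` in this PR.
# Everything else below (helper names, block layout) is hypothetical.
VG_ENTRY_PATTERNS = ("visual_gen/", "perf/test_visual_gen_perf_sanity.py")


def is_vg_entry(entry: str) -> bool:
    """Return True if a test-db entry contains any VG substring pattern."""
    return any(pattern in entry for pattern in VG_ENTRY_PATTERNS)


def filter_vg_entries(block: dict) -> list[str]:
    """Keep only the VG entries of a block; an empty list means 'not a VG block'."""
    return [entry for entry in block.get("tests", []) if is_vg_entry(entry)]


# A mixed block: two VG entries and one non-VG sibling.
block = {
    "backend": "pytorch",
    "tests": [
        "unittest/_torch/visual_gen/test_wan_transformer.py",
        "examples/visual_gen/test_visual_gen.py::test_flux1_example",
        "accuracy/test_cli_flow.py::TestGptNext::test_auto_dtype",
    ],
}
vg_entries = filter_vg_entries(block)
```

Here `vg_entries` keeps the two `visual_gen/` entries and drops the accuracy test. Note that a plain substring match on `visual_gen/` would not catch the old `examples/test_visual_gen.py` path, which is presumably why the test-db entries in this PR move under `examples/visual_gen/`.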

Outward-facing fallback: unlike AutoDeploy, VG is imported eagerly
(top-level `from tensorrt_llm._torch.visual_gen.config import ...`
in `commands/serve.py`, `commands/utils.py`,
`serve/openai_server.py`). The 5 files that define / re-export the
public API symbols (`VisualGenArgs`, `ParallelConfig`, `VisualGen`,
`VisualGenParams`) are listed in `_VG_OUTWARD_FILES`; touching any
of them claims the changed files but emits `scope=None` so Selector
falls back to baseline. This protects trtllm-serve / trtllm-bench
startup paths from VG signature drift slipping through pre-merge.
Outward-facing fallback: unlike AutoDeploy, VG public symbols are
imported eagerly by non-VG startup paths such as `commands/serve.py`,
`commands/utils.py`, and `serve/openai_server.py`. The public API
package prefix (`tensorrt_llm/visual_gen/`) is listed in
`_VG_OUTWARD_PREFIXES`; touching any non-doc file under it claims the
changed files but emits `scope=None` so Selector falls back to baseline.
This protects trtllm-serve / trtllm-bench startup paths from VG
signature drift slipping through pre-merge.
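
A minimal sketch of the prefix check, assuming the claimed files arrive as a set of repo-relative paths. The prefix value is taken from `_VG_OUTWARD_PREFIXES` in this PR; the function name `outward_files` is hypothetical, and any doc-file exclusion is assumed to happen before this step.

```python
from __future__ import annotations

# Prefix copied from `_VG_OUTWARD_PREFIXES` in this PR.
VG_OUTWARD_PREFIXES: tuple[str, ...] = ("tensorrt_llm/visual_gen/",)


def outward_files(claimed: set[str]) -> set[str]:
    """Return the claimed files that live under an outward-facing prefix.

    `str.startswith` accepts a tuple, so one call tests every prefix.
    """
    return {path for path in claimed if path.startswith(VG_OUTWARD_PREFIXES)}


claimed = {
    "tensorrt_llm/visual_gen/args.py",
    "tensorrt_llm/_torch/visual_gen/config.py",
}
outward = outward_files(claimed)
```

With this input only `tensorrt_llm/visual_gen/args.py` is outward-facing. `tensorrt_llm/_torch/visual_gen/config.py`, which was listed in the removed `_VG_OUTWARD_FILES` set, no longer matches under the prefix rule.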

Outcomes:

- No VG source files in the diff → rule returns `None`.
- VG source touched, all internal → `scope=visualgenonly`; sanity
off (VG changes don't affect wheel sanity); perfsanity on iff a
matched block lives in `l0_perf` or `*perf_sanity*`.
- VG source touched, any outward-facing file → `scope=None`
- VG source touched, all internal (`examples/visual_gen/**` or
`tensorrt_llm/_torch/visual_gen/**`) → `scope=visualgenonly`;
sanity off (VG changes don't affect wheel sanity); perfsanity on iff
a matched block lives in `l0_perf` or `*perf_sanity*`.
- VG source touched, any outward-facing path under
`tensorrt_llm/visual_gen/` → `scope=None`
(fallback).
- VG source touched but no VG block found anywhere (defensive) →
`scope=None` (fallback).
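
The outcome list above can be read as a small decision function. The sketch below is illustrative only: the `Outcome` dataclass and the `decide` signature are assumptions and do not mirror the real `RuleResult`; the check order (outward-facing paths before the missing-block case) follows the order shown in `visual_gen_rule.py`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    """Hypothetical stand-in for the rule's result object."""

    scope: Optional[str]  # None means "fall back to baseline"
    reason: str


def decide(vg_files: set[str], outward: set[str], vg_block_count: int) -> Optional[Outcome]:
    """Map the four documented outcomes onto a return value."""
    if not vg_files:
        # No VG source in the diff: the rule does not apply at all.
        return None
    if outward:
        # Public API package touched: claim the files but do not narrow.
        return Outcome(None, "outward-facing path touched; fallback to baseline")
    if vg_block_count == 0:
        # Defensive case: VG source changed but no VG block was found.
        return Outcome(None, "no VG block found; fallback to baseline")
    return Outcome("visualgenonly", "all VG changes are internal")


internal = decide({"tensorrt_llm/_torch/visual_gen/config.py"}, set(), 3)
fallback = decide(
    {"tensorrt_llm/visual_gen/args.py"},
    {"tensorrt_llm/visual_gen/args.py"},
    3,
)
```

Returning `None` from the function and returning an `Outcome` whose `scope` is `None` are deliberately different: the first means the rule did not claim anything, the second means it claimed the files and asked for baseline coverage.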
47 changes: 17 additions & 30 deletions jenkins/scripts/cbts/rules/visual_gen_rule.py
@@ -20,21 +20,18 @@
Block selection — entry-pattern based only:
VisualGen does NOT have its own `condition.terms.backend`; VG test
entries live in `backend: pytorch` and `backend: tensorrt` blocks.
A block "belongs to VG" iff any of its `tests:` entries matches one
of the three stable VG entry path families:
- `unittest/_torch/visual_gen/...` (28 entries)
- `examples/test_visual_gen.py...` (1 entry)
- `visual_gen/test_visual_gen_benchmark.py` (1 entry)
A block "belongs to VG" iff any of its `tests:` entries lives under a
dedicated `visual_gen/` test path or is the VisualGen perf-sanity entry.

Outward-facing fallback:
Unlike AutoDeploy, VG is imported eagerly (module-level) by non-VG
code: `commands/serve.py`, `commands/utils.py`, and
`serve/openai_server.py` import `VisualGenArgs` / `ParallelConfig` /
`VisualGen` / `VisualGenParams` at top level. A signature change to
those symbols can break trtllm-serve startup, which would affect
non-VG tests. The 5 files that define / re-export those symbols are
listed in `_VG_OUTWARD_FILES`; touching any of them forces fallback
even if the rest of the diff is VG-internal.
non-VG tests. The public API package prefix is listed in
`_VG_OUTWARD_PREFIXES`; touching any file under it forces fallback even
if the rest of the diff is VG-internal.
"""

from __future__ import annotations
@@ -55,27 +52,17 @@
"tensorrt_llm/visual_gen/",
)

# Files inside _VG_SRC_PREFIXES that are imported eagerly by non-VG
# code (top-level `from ... import VisualGenArgs / ParallelConfig /
# VisualGen / VisualGenParams`). Touching any of these can break
# trtllm-serve / trtllm-bench startup paths, so the rule defers to
# baseline rather than narrowing.
_VG_OUTWARD_FILES: frozenset[str] = frozenset(
{
"tensorrt_llm/_torch/visual_gen/config.py",
"tensorrt_llm/visual_gen/__init__.py",
"tensorrt_llm/visual_gen/args.py",
"tensorrt_llm/visual_gen/params.py",
"tensorrt_llm/visual_gen/visual_gen.py",
}
)
# Public VisualGen API package imported eagerly by non-VG code. Touching
# any non-doc file under this prefix can break trtllm-serve / trtllm-bench
# startup paths, so the rule defers to baseline rather than narrowing.
_VG_OUTWARD_PREFIXES: tuple[str, ...] = ("tensorrt_llm/visual_gen/",)

# Substrings that mark a test entry as VG. Cover all three path
# families that appear in test-db YAMLs (audited 2026-05).
# Substrings that mark a test entry as VG. VG tests are expected to live
# under dedicated visual_gen test directories, except the perf-sanity
# frontend which stays with the shared perf tests.
_VG_ENTRY_PATTERNS: tuple[str, ...] = (
"unittest/_torch/visual_gen/",
"examples/test_visual_gen.py",
"visual_gen/test_visual_gen_benchmark.py",
"visual_gen/",
"perf/test_visual_gen_perf_sanity.py",
)


@@ -122,19 +109,19 @@ def apply(self, pr: PRInputs) -> Optional[RuleResult]:
if not claimed:
return None

# Outward-facing VG files break the "self-contained subsystem"
# Outward-facing VG paths break the "self-contained subsystem"
# assumption — they are imported eagerly by trtllm-serve /
# trtllm-bench. Claim the files (so they don't go unhandled and
# silently fallback) but emit scope=None so Selector falls back
# to baseline coverage instead of narrowing to VG-only stages.
outward = claimed & _VG_OUTWARD_FILES
outward = {f for f in claimed if f.startswith(_VG_OUTWARD_PREFIXES)}
if outward:
return RuleResult(
handled_files=claimed,
affected_stages=set(),
scope=None,
reason=(
f"visualgen: {len(outward)} outward-facing VG file(s) "
f"visualgen: {len(outward)} outward-facing VG path(s) "
f"touched ({sorted(outward)[0]}{'...' if len(outward) > 1 else ''}); "
"fallback to baseline"
),
2 changes: 1 addition & 1 deletion tensorrt_llm/bench/benchmark/visual_gen.py
@@ -197,8 +197,8 @@ def visual_gen_command(
"""Benchmark VisualGen (image/video generation) models offline."""
import yaml

from tensorrt_llm._torch.visual_gen.config import VisualGenArgs
from tensorrt_llm.visual_gen import VisualGen, VisualGenParams
from tensorrt_llm.visual_gen.args import VisualGenArgs

if prompt is None and prompt_file is None:
raise click.UsageError("Either --prompt or --prompt_file must be specified.")
2 changes: 1 addition & 1 deletion tensorrt_llm/commands/utils.py
@@ -7,8 +7,8 @@
import click
from click.core import ParameterSource

from tensorrt_llm._torch.visual_gen.config import ParallelConfig
from tensorrt_llm.llmapi.utils import download_hf_partial
from tensorrt_llm.visual_gen.args import ParallelConfig

logger = logging.getLogger(__name__)

Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from defs.examples.test_visual_gen import (
from defs.examples.visual_gen.test_visual_gen import (
WAN22_LPIPS_FRAME_RATE,
WAN22_LPIPS_GUIDANCE_SCALE,
WAN22_LPIPS_HEIGHT,
@@ -40,8 +40,8 @@
)

try:
from tensorrt_llm._torch.visual_gen.config import ParallelConfig
from tensorrt_llm._utils import get_free_port
from tensorrt_llm.visual_gen.args import ParallelConfig

MODULES_AVAILABLE = True
except ImportError:
2 changes: 1 addition & 1 deletion tests/integration/test_lists/test-db/l0_a10.yml
@@ -114,7 +114,7 @@ l0_a10:
# visual_gen
- unittest/_torch/visual_gen/test_visual_gen_params.py
- unittest/visual_gen/test_output.py
- unittest/media/test_encoding.py
- unittest/visual_gen/test_media_encoding.py
- unittest/_torch/visual_gen/test_tensor_payload.py
# llmapi
- unittest/llmapi/test_llm_utils.py
28 changes: 14 additions & 14 deletions tests/integration/test_lists/test-db/l0_b200.yml
@@ -251,13 +251,13 @@ l0_b200:
- unittest/_torch/visual_gen/test_wan_transformer.py
- unittest/_torch/visual_gen/test_cosmos3_transformer.py
- unittest/_torch/visual_gen/test_cosmos3_pipeline.py
- examples/test_visual_gen.py::test_wan_t2v_example
- examples/test_visual_gen.py::test_flux1_example
- examples/test_visual_gen.py::test_flux2_example
- examples/test_visual_gen.py::test_ltx2_example
- examples/test_visual_gen.py::test_wan_i2v_example
- examples/test_visual_gen.py::test_cosmos3_example
# - examples/test_visual_gen.py
- examples/visual_gen/test_visual_gen.py::test_wan_t2v_example
- examples/visual_gen/test_visual_gen.py::test_flux1_example
- examples/visual_gen/test_visual_gen.py::test_flux2_example
- examples/visual_gen/test_visual_gen.py::test_ltx2_example
- examples/visual_gen/test_visual_gen.py::test_wan_i2v_example
- examples/visual_gen/test_visual_gen.py::test_cosmos3_example
# - examples/visual_gen/test_visual_gen.py
# ------------- Host perf module regression tests (6 representative scenarios) ---------------
- perf/host_perf/test_module_scheduler.py::test_scheduler_production[production_gen_only_bs8]
- perf/host_perf/test_module_scheduler.py::test_scheduler_production[production_mixed_32gen_4ctx]
@@ -353,13 +353,13 @@ l0_b200:
- accuracy/test_llm_api_pytorch.py::TestGPTOSS::test_w4_1gpu[v2_kv_cache-True-True-trtllm-auto]
- accuracy/test_llm_api_pytorch_multimodal.py::TestNanoV3Omni::test_auto_dtype[bf16]
# ------------- VisualGen single-GPU tests ---------------
- examples/test_visual_gen.py::test_visual_gen_quickstart
- examples/test_visual_gen.py::test_visual_gen_api_walkthrough
- examples/test_visual_gen.py::test_flux1_lpips_against_golden
- examples/test_visual_gen.py::test_flux2_lpips_against_golden
- examples/test_visual_gen.py::test_ltx2_lpips_against_golden
- examples/test_visual_gen.py::test_wan21_t2v_lpips_against_golden
- examples/test_visual_gen.py::test_wan22_t2v_lpips_against_golden
- examples/visual_gen/test_visual_gen.py::test_visual_gen_quickstart
- examples/visual_gen/test_visual_gen.py::test_visual_gen_api_walkthrough
- examples/visual_gen/test_visual_gen.py::test_flux1_lpips_against_golden
- examples/visual_gen/test_visual_gen.py::test_flux2_lpips_against_golden
- examples/visual_gen/test_visual_gen.py::test_ltx2_lpips_against_golden
- examples/visual_gen/test_visual_gen.py::test_wan21_t2v_lpips_against_golden
- examples/visual_gen/test_visual_gen.py::test_wan22_t2v_lpips_against_golden
- visual_gen/test_visual_gen_benchmark.py::test_offline_benchmark
- visual_gen/test_visual_gen_benchmark.py::test_online_benchmark[openai-videos]
# ------------- AutoDeploy Backend Stages ---------------
28 changes: 14 additions & 14 deletions tests/integration/test_lists/test-db/l0_dgx_b200.yml
@@ -224,8 +224,8 @@ l0_dgx_b200:
- accuracy/test_llm_api_pytorch.py::TestMistralLarge3_675B::test_fp8[latency_moe_deepgemm] TIMEOUT (60)
- accuracy/test_llm_api_pytorch.py::TestNemotronV3Super::test_nvfp4_parallelism[TP8_PP1] TIMEOUT (60)
- test_e2e.py::test_deepseek_r1_mtp_bench TIMEOUT(60) # Cover https://nvbugs/5670108
- examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[cfg2_ulysses2_attn2d_2x1]
- examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[attn2d_2x2_ulysses2]
- examples/visual_gen/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[cfg2_ulysses2_attn2d_2x1]
- examples/visual_gen/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[attn2d_2x2_ulysses2]
- condition:
ranges:
system_gpu_count:
@@ -309,18 +309,18 @@ l0_dgx_b200:
- accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_nvfp4_4gpus[moe_backend=CUTEDSL-mtp_nextn=2-ep4-fp8kv=False-attention_dp=True-cuda_graph=False-overlap_scheduler=False-low_precision_combine=True-torch_compile=False]
- accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_nvfp4_4gpus[moe_backend=CUTEDSL-mtp_nextn=2-ep4-fp8kv=True-attention_dp=True-cuda_graph=True-overlap_scheduler=True-low_precision_combine=True-torch_compile=False]
- accuracy/test_llm_api_pytorch.py::TestLlama3_3_70BInstruct::test_fp4_tp2pp2[torch_compile=False-enable_gemm_allreduce_fusion=False]
- examples/test_visual_gen.py::test_wan_t2v_example
- examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[ulysses4]
- examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[cfg2_ulysses2]
- examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[attn2d_2x2]
- examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_tp[tp2]
- examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_tp[cfg2_tp2]
- examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_tp[tp2_ulysses2]
- examples/test_visual_gen.py::test_vbench_dimension_score_wan
- examples/test_visual_gen.py::test_vbench_dimension_score_wan22_a14b_fp8
- examples/test_visual_gen.py::test_vbench_dimension_score_wan22_a14b_nvfp4
- examples/test_visual_gen.py::test_vbench_dimension_score_ltx2_bf16
- examples/test_visual_gen.py::test_vbench_dimension_score_ltx2_fp8
- examples/visual_gen/test_visual_gen.py::test_wan_t2v_example
- examples/visual_gen/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[ulysses4]
- examples/visual_gen/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[cfg2_ulysses2]
- examples/visual_gen/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[attn2d_2x2]
- examples/visual_gen/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_tp[tp2]
- examples/visual_gen/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_tp[cfg2_tp2]
- examples/visual_gen/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_tp[tp2_ulysses2]
- examples/visual_gen/test_visual_gen.py::test_vbench_dimension_score_wan
- examples/visual_gen/test_visual_gen.py::test_vbench_dimension_score_wan22_a14b_fp8
- examples/visual_gen/test_visual_gen.py::test_vbench_dimension_score_wan22_a14b_nvfp4
- examples/visual_gen/test_visual_gen.py::test_vbench_dimension_score_ltx2_bf16
- examples/visual_gen/test_visual_gen.py::test_vbench_dimension_score_ltx2_fp8
- accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B_Instruct_2507::test_skip_softmax_attention_4gpus[target_sparsity_0.5-fp8kv=False]
- accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B_Instruct_2507::test_skip_softmax_attention_4gpus[target_sparsity_0.5-fp8kv=True]
- accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B_Instruct_2507::test_skip_softmax_attention_4gpus[target_sparsity_0.9-fp8kv=False]
4 changes: 2 additions & 2 deletions tests/integration/test_lists/test-db/l0_gh200.yml
@@ -23,8 +23,8 @@ l0_gh200:
- unittest/bindings
- unittest/llmapi/test_llm_quant.py
- llmapi/test_llm_examples.py::test_llmapi_quickstart_atexit
- examples/test_visual_gen.py::test_visual_gen_quickstart
- examples/test_visual_gen.py::test_visual_gen_api_walkthrough
- examples/visual_gen/test_visual_gen.py::test_visual_gen_quickstart
- examples/visual_gen/test_visual_gen.py::test_visual_gen_api_walkthrough
- unittest/test_model_runner_cpp.py
- accuracy/test_cli_flow.py::TestGptNext::test_auto_dtype
- examples/test_medusa.py::test_llm_medusa_with_qaunt_base_model_1gpu[fp8-use_py_session-medusa-vicuna-7b-v1.3-4-heads-float16-bs1] TIMEOUT (90)
4 changes: 2 additions & 2 deletions tests/integration/test_lists/test-db/l0_h100.yml
@@ -317,8 +317,8 @@ l0_h100:
- unittest/test_model_runner_cpp.py
- unittest/llmapi/test_llm_quant.py # 5.5 mins on H100
- llmapi/test_llm_examples.py::test_llmapi_quickstart_atexit
- examples/test_visual_gen.py::test_visual_gen_quickstart
- examples/test_visual_gen.py::test_visual_gen_api_walkthrough
- examples/visual_gen/test_visual_gen.py::test_visual_gen_quickstart
- examples/visual_gen/test_visual_gen.py::test_visual_gen_api_walkthrough
- unittest/trt/attention/test_gpt_attention_IFB.py
- accuracy/test_cli_flow.py::TestLlama3_1_8BInstruct::test_fp8_prequantized
- examples/test_multimodal.py::test_llm_multimodal_general[Llama-3.2-11B-Vision-pp:1-tp:1-bfloat16-bs:1-cpp_e2e:False-nb:1]
4 changes: 2 additions & 2 deletions tests/integration/test_lists/test-db/l0_l40s.yml
@@ -61,8 +61,8 @@ l0_l40s:
- examples/test_llama.py::test_llm_llama_v3_dora_1gpu[commonsense-llama-v3-8b-dora-r32-llama-v3-8b-hf-base_fp16]
- examples/test_nemotron_nas.py::test_nemotron_nas_summary_1gpu[DeciLM-7B]
- llmapi/test_llm_examples.py::test_llmapi_quickstart
- examples/test_visual_gen.py::test_visual_gen_quickstart
- examples/test_visual_gen.py::test_visual_gen_api_walkthrough
- examples/visual_gen/test_visual_gen.py::test_visual_gen_quickstart
- examples/visual_gen/test_visual_gen.py::test_visual_gen_api_walkthrough
- llmapi/test_llm_examples.py::test_llmapi_example_inference
- llmapi/test_llm_examples.py::test_llmapi_example_inference_async
- llmapi/test_llm_examples.py::test_llmapi_example_inference_async_streaming