Commit e47f26e

[TRTLLM-13027][ci] Relocate under-using tests to right-sized stages (#14684)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
1 parent 4279e5b commit e47f26e

3 files changed

Lines changed: 38 additions & 20 deletions

jenkins/L0_Test.groovy

Lines changed: 1 addition & 0 deletions
@@ -4053,6 +4053,7 @@ def launchTestJobs(pipeline, testFilter)
  "DGX_B200-Triton-Post-Merge-1": ["auto:dgx-b200-flex", "l0_b200", 1, 1, 1, 1, true],
  "DGX_B200-PyTorch-Post-Merge-1": ["auto:dgx-b200-flex", "l0_b200", 1, 2, 1, 1, true],
  "DGX_B200-PyTorch-Post-Merge-2": ["auto:dgx-b200-flex", "l0_b200", 2, 2, 1, 1, true],
+ "DGX_B200-2_GPUs-PyTorch-1": ["auto:dgx-b200-flex", "l0_dgx_b200", 1, 1, 2, 1, true],
  "DGX_B200-4_GPUs-PyTorch-1": ["auto:dgx-b200-flex", "l0_dgx_b200", 1, 3, 4, 1, true],
  "DGX_B200-4_GPUs-PyTorch-2": ["auto:dgx-b200-flex", "l0_dgx_b200", 2, 3, 4, 1, true],
  "DGX_B200-4_GPUs-PyTorch-3": ["auto:dgx-b200-flex", "l0_dgx_b200", 3, 3, 4, 1, true],

tests/integration/test_lists/test-db/l0_b200.yml

Lines changed: 11 additions & 0 deletions
@@ -120,6 +120,7 @@ l0_b200:
  - unittest/_torch/modules/moe/test_moe_module.py::test_configurable_moe_single_gpu -k "CUTEDSL"
  - unittest/_torch/modules/moe/test_moe_module.py::test_configurable_moe_single_gpu -k "DEEPGEMM"
  - unittest/_torch/modules/moe/test_moe_module.py::test_configurable_moe_single_gpu -k "DENSEGEMM"
+ - unittest/_torch/modules/moe/test_moe_module.py::test_configurable_moe_single_gpu -k "MEGAMOE_DEEPGEMM"
  # ------------- MoE: FlashInfer & TRTLLM symbol collision tests ---------------
  - unittest/_torch/flashinfer/test_trtllm_flashinfer_symbol_collision.py
  # --- MoE end
@@ -307,6 +308,16 @@ l0_b200:
  - accuracy/test_llm_api_pytorch.py::TestGPTOSS::test_w4_1gpu[v1_kv_cache-True-True-trtllm-auto]
  - accuracy/test_llm_api_pytorch.py::TestGPTOSS::test_w4_1gpu[v2_kv_cache-True-True-trtllm-auto]
  - accuracy/test_llm_api_pytorch_multimodal.py::TestNanoV3Omni::test_auto_dtype[bf16]
+ # ------------- VisualGen single-GPU tests ---------------
+ - examples/test_visual_gen.py::test_visual_gen_quickstart
+ - examples/test_visual_gen.py::test_visual_gen_api_walkthrough
+ - examples/test_visual_gen.py::test_flux1_lpips_against_golden
+ - examples/test_visual_gen.py::test_flux2_lpips_against_golden
+ - examples/test_visual_gen.py::test_ltx2_lpips_against_golden
+ - examples/test_visual_gen.py::test_wan21_t2v_lpips_against_golden
+ - examples/test_visual_gen.py::test_wan22_t2v_lpips_against_golden
+ - visual_gen/test_visual_gen_benchmark.py::test_offline_benchmark
+ - visual_gen/test_visual_gen_benchmark.py::test_online_benchmark[openai-videos]
  # ------------- AutoDeploy Backend Stages ---------------
  - condition:
  ranges:

tests/integration/test_lists/test-db/l0_dgx_b200.yml

Lines changed: 26 additions & 20 deletions
@@ -1,5 +1,30 @@
  version: 0.0.1
  l0_dgx_b200:
+ - condition:
+ ranges:
+ system_gpu_count:
+ gte: 2
+ lte: 2
+ wildcards:
+ gpu:
+ - '*b200*'
+ linux_distribution_name: ubuntu*
+ cpu: x86_64
+ terms:
+ stage: pre_merge
+ backend: pytorch
+ orchestrator: mpi
+ tests:
+ - unittest/_torch/misc/test_autotuner.py::test_autotuner_distributed_strategy
+ - accuracy/test_llm_api_pytorch.py::TestQwen3_5_35B_A3B::test_bf16[tp2-CUTLASS]
+ - accuracy/test_llm_api_pytorch.py::TestQwen3_5_35B_A3B::test_bf16[tp2-TRTLLM]
+ # ------------- KV Cache V2 Scheduler IT (multi-GPU) ---------------
+ - kv_cache/test_kv_cache_v2_scheduler.py::TestKVCacheV2DSv3Lite::test_mtp_draft_tokens
+ - kv_cache/test_kv_cache_v2_scheduler.py::TestKVCacheV2DSv3Lite::test_mtp_chunked_draft_tokens
+ - kv_cache/test_kv_cache_v2_scheduler.py::TestKVCacheV2DSv3Lite::test_mtp_eviction
+ # ------------- VisualGen multi-GPU tests ---------------
+ - unittest/_torch/visual_gen/test_flux_pipeline.py::TestFluxParallelism::test_ulysses_2gpu_correctness
+ - unittest/_torch/visual_gen/test_flux_pipeline.py::TestFluxCombinedOptimizations::test_all_optimizations_combined
  - condition:
  ranges:
  system_gpu_count:
system_gpu_count:
@@ -15,7 +40,6 @@ l0_dgx_b200:
  backend: pytorch
  orchestrator: mpi
  tests:
- - unittest/_torch/misc/test_autotuner.py::test_autotuner_distributed_strategy
  - accuracy/test_llm_api_pytorch.py::TestNemotronV3Super::test_auto_dtype_4gpus[4-4-False-True-True]
  - accuracy/test_llm_api_pytorch.py::TestNemotronV3Super::test_auto_dtype_4gpus[4-4-True-True-True]
  - accuracy/test_llm_api_pytorch.py::TestNemotronV3Super::test_nvfp4_4gpu_mtp_ar TIMEOUT (60)
@@ -30,8 +54,6 @@ l0_dgx_b200:
  - accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B::test_nvfp4[dep4_latency_moe_trtllm-torch_compile=False]
  - accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B::test_nvfp4[dep4_latency_moe_cutlass-torch_compile=False]
  - accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B::test_nvfp4[dep4_latency_moe_cutlass-torch_compile=True]
- - accuracy/test_llm_api_pytorch.py::TestQwen3_5_35B_A3B::test_bf16[tp2-CUTLASS]
- - accuracy/test_llm_api_pytorch.py::TestQwen3_5_35B_A3B::test_bf16[tp2-TRTLLM]
  - disaggregated/test_disaggregated.py::test_disaggregated_deepseek_v3_lite_fp8_ucx[DeepSeek-V3-Lite-fp8]
  - disaggregated/test_disaggregated.py::test_disaggregated_deepseek_v3_lite_fp8_nixl[DeepSeek-V3-Lite-fp8]
  - disaggregated/test_disaggregated.py::test_disaggregated_gpt_oss_120b_harmony[gpt_oss/gpt-oss-120b]
@@ -42,10 +64,6 @@ l0_dgx_b200:
  - accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_bfloat16_4gpus_python_scheduler[ep4-mtp_nextn=2]
  - accuracy/test_llm_api_pytorch.py::TestMiniMaxM2::test_4gpus[attention_dp=False-cuda_graph=True-overlap_scheduler=True-tp_size=4-ep_size=4] TIMEOUT (60)
  - accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_bfloat16_4gpus[pp4-mtp_nextn=0-attention_dp=False-cuda_graph=False-overlap_scheduler=False-torch_compile=False] TIMEOUT (60)
- # ------------- KV Cache V2 Scheduler IT (multi-GPU) ---------------
- - kv_cache/test_kv_cache_v2_scheduler.py::TestKVCacheV2DSv3Lite::test_mtp_draft_tokens
- - kv_cache/test_kv_cache_v2_scheduler.py::TestKVCacheV2DSv3Lite::test_mtp_chunked_draft_tokens
- - kv_cache/test_kv_cache_v2_scheduler.py::TestKVCacheV2DSv3Lite::test_mtp_eviction
  # ------------- NVBug 6025177: trtllm-serve cross-request KV contamination (OpenAI) ---------------
  - test_e2e.py::test_openai_kv_cache_contamination TIMEOUT (120)
  - condition:
- condition:
@@ -81,7 +99,6 @@ l0_dgx_b200:
  - unittest/_torch/modules/moe/test_moe_module.py::test_configurable_moe_multi_gpu -k "DEEPGEMM and not MEGAMOE_DEEPGEMM"
  # --- MEGAMOE_DEEPGEMM (W4A8_MXFP4_MXFP8 only) ---
  - unittest/_torch/modules/moe/test_moe_module.py::test_configurable_moe_multi_gpu -k "MEGAMOE_DEEPGEMM"
- - unittest/_torch/modules/moe/test_moe_module.py::test_configurable_moe_single_gpu -k "MEGAMOE_DEEPGEMM"
  # ------------- MoE: test_multi_gpu_eplb ---------------
  - unittest/_torch/modules/moe/test_moe_module.py::test_configurable_moe_multi_gpu_eplb
  - condition:
@@ -165,8 +182,6 @@ l0_dgx_b200:
  - accuracy/test_disaggregated_serving.py::TestQwen3NextInstruct::test_auto_dtype[use_py_transceiver=False] TIMEOUT (60)
  # ------------- VisualGen multi-GPU tests ---------------
  - unittest/_torch/visual_gen/multi_gpu
- - unittest/_torch/visual_gen/test_flux_pipeline.py::TestFluxParallelism::test_ulysses_2gpu_correctness
- - unittest/_torch/visual_gen/test_flux_pipeline.py::TestFluxCombinedOptimizations::test_all_optimizations_combined
  - condition:
  ranges:
  system_gpu_count:
@@ -192,7 +207,6 @@ l0_dgx_b200:
  - accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_fp8_blockscale[baseline_fp8kv] TIMEOUT (60)
  - accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_fp8_blockscale[latency] TIMEOUT (60)
  - accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_fp8_blockscale[disable_skip_indexer] TIMEOUT (60)
- - accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_nvfp4_attn_multi_gpus TIMEOUT (60)
  - accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_nvfp4_multi_gpus[baseline_fp8kv] TIMEOUT (60)
  - accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_nvfp4_multi_gpus[latency] TIMEOUT (60)
  - accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_nvfp4_multi_gpus[disable_skip_indexer] TIMEOUT (60)
@@ -305,30 +319,22 @@ l0_dgx_b200:
  - accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_nvfp4_4gpus[moe_backend=CUTEDSL-mtp_nextn=2-ep4-fp8kv=False-attention_dp=True-cuda_graph=False-overlap_scheduler=False-low_precision_combine=True-torch_compile=False]
  - accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_nvfp4_4gpus[moe_backend=CUTEDSL-mtp_nextn=2-ep4-fp8kv=True-attention_dp=True-cuda_graph=True-overlap_scheduler=True-low_precision_combine=True-torch_compile=False]
  - accuracy/test_llm_api_pytorch.py::TestLlama3_3_70BInstruct::test_fp4_tp2pp2[torch_compile=False-enable_gemm_allreduce_fusion=False]
- - examples/test_visual_gen.py::test_visual_gen_quickstart
- - examples/test_visual_gen.py::test_visual_gen_api_walkthrough
  - examples/test_visual_gen.py::test_wan_t2v_example
- - examples/test_visual_gen.py::test_flux1_lpips_against_golden
- - examples/test_visual_gen.py::test_flux2_lpips_against_golden
- - examples/test_visual_gen.py::test_ltx2_lpips_against_golden
- - examples/test_visual_gen.py::test_wan21_t2v_lpips_against_golden
- - examples/test_visual_gen.py::test_wan22_t2v_lpips_against_golden
  - examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[ulysses4]
  - examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[cfg2_ulysses2]
  - examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[ulysses2_ring2]
  - examples/test_visual_gen_multi_gpu.py::test_wan22_t2v_lpips_against_golden_multi_gpu[attn2d_2x2]
  - examples/test_visual_gen.py::test_vbench_dimension_score_wan
  - examples/test_visual_gen.py::test_vbench_dimension_score_wan22_a14b_fp8
  - examples/test_visual_gen.py::test_vbench_dimension_score_wan22_a14b_nvfp4
- - visual_gen/test_visual_gen_benchmark.py::test_offline_benchmark
- - visual_gen/test_visual_gen_benchmark.py::test_online_benchmark[openai-videos]
  - examples/test_visual_gen.py::test_vbench_dimension_score_ltx2_bf16
  - examples/test_visual_gen.py::test_vbench_dimension_score_ltx2_fp8
  - accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B_Instruct_2507::test_skip_softmax_attention_4gpus[target_sparsity_0.5-fp8kv=False]
  - accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B_Instruct_2507::test_skip_softmax_attention_4gpus[target_sparsity_0.5-fp8kv=True]
  - accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B_Instruct_2507::test_skip_softmax_attention_4gpus[target_sparsity_0.9-fp8kv=False]
  - accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B_Instruct_2507::test_skip_softmax_attention_4gpus[target_sparsity_0.9-fp8kv=True]
  - disaggregated/test_disaggregated.py::test_disaggregated_mamba_conc_greater_than_mbs[NVIDIA-Nemotron-3-Super-120B-A12B-FP8]
+ - accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_nvfp4_attn_multi_gpus TIMEOUT (60)
  # ------------- AutoDeploy Backend Stages ---------------
  - condition:
  ranges:
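
Note for readers of this diff: the new block in l0_dgx_b200.yml pins `system_gpu_count` to exactly 2 (`gte: 2`, `lte: 2`), which appears to pair with the new `DGX_B200-2_GPUs-PyTorch-1` stage. The Python sketch below illustrates one way such a condition block could be evaluated against a machine description. It is not the project's actual test-db code. The function names, the dictionary layout, and the nesting of `linux_distribution_name` and `cpu` under `wildcards` are assumptions, since the diff view has lost the YAML indentation.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory form of the new 2-GPU condition block.
# The nesting is assumed; the flattened diff does not show indentation.
TWO_GPU_CONDITION = {
    "ranges": {"system_gpu_count": {"gte": 2, "lte": 2}},
    "wildcards": {
        "gpu": ["*b200*"],
        "linux_distribution_name": "ubuntu*",
        "cpu": "x86_64",
    },
    "terms": {"stage": "pre_merge", "backend": "pytorch", "orchestrator": "mpi"},
}


def block_matches(condition, system):
    """Return True if `system` satisfies every clause of `condition`."""
    # ranges: numeric bounds, each bound is optional
    for key, bounds in condition.get("ranges", {}).items():
        value = system[key]
        if "gte" in bounds and value < bounds["gte"]:
            return False
        if "lte" in bounds and value > bounds["lte"]:
            return False
    # wildcards: a single glob pattern or a list of them, compared case-insensitively
    for key, patterns in condition.get("wildcards", {}).items():
        if isinstance(patterns, str):
            patterns = [patterns]
        value = str(system[key]).lower()
        if not any(fnmatchcase(value, p.lower()) for p in patterns):
            return False
    # terms: exact equality
    for key, expected in condition.get("terms", {}).items():
        if system[key] != expected:
            return False
    return True


def make_system(gpu_count, stage="pre_merge"):
    """Build a hypothetical machine/stage description for illustration."""
    return {
        "system_gpu_count": gpu_count,
        "gpu": "B200",
        "linux_distribution_name": "ubuntu",
        "cpu": "x86_64",
        "stage": stage,
        "backend": "pytorch",
        "orchestrator": "mpi",
    }
```

Under these assumptions, `block_matches(TWO_GPU_CONDITION, make_system(2))` holds while `make_system(4)` is rejected, so tests listed under this block would be picked up only on a 2-GPU machine.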
