Commit e43ae2f

[Klaud Cold] Update dsr1-fp8-b200-trt (+mtp) TRT-LLM image to v1.3.0rc14 (#1488)
* Update dsr1-fp8-b200-trt (+mtp) TRT-LLM image to v1.3.0rc14

  Update TensorRT-LLM image (off: v1.2.0rc6.post2 109d / mtp: v1.2.0rc6.post3 102d) to v1.3.0rc14 (latest pre-release)

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: fill pr-link for #1488

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 67230af commit e43ae2f

2 files changed

Lines changed: 9 additions & 2 deletions

.github/configs/nvidia-master.yaml

Lines changed: 2 additions & 2 deletions
@@ -2776,7 +2776,7 @@ kimik2.5-fp4-b300-vllm-agentic:
     - { tp: 8, ep: 1, offloading: cpu, conc-list: [1, 2, 4, 8, 16, 32, 40, 48, 56, 64] }
 
 dsr1-fp8-b200-trt:
-  image: nvcr.io#nvidia/tensorrt-llm/release:1.2.0rc6.post2
+  image: nvcr.io#nvidia/tensorrt-llm/release:1.3.0rc14
   model: deepseek-ai/DeepSeek-R1-0528
   model-prefix: dsr1
   runner: b200
@@ -2799,7 +2799,7 @@ dsr1-fp8-b200-trt:
     - { tp: 8, ep: 1, conc-start: 4, conc-end: 8 }
 
 dsr1-fp8-b200-trt-mtp:
-  image: nvcr.io#nvidia/tensorrt-llm/release:1.2.0rc6.post3
+  image: nvcr.io#nvidia/tensorrt-llm/release:1.3.0rc14
   model: deepseek-ai/DeepSeek-R1-0528
   model-prefix: dsr1
   runner: b200

perf-changelog.yaml

Lines changed: 7 additions & 0 deletions
@@ -2755,3 +2755,10 @@
     - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
     - "Disable JIT DeepGemm (SGL_ENABLE_JIT_DEEPGEMM=0) to bypass v0.5.12 DeepGemm TMA-descriptor regression on B300 — see sgl-project/sglang#25551"
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1421
+
+- config-keys:
+    - dsr1-fp8-b200-trt
+    - dsr1-fp8-b200-trt-mtp
+  description:
+    - "Update TensorRT-LLM image (off: v1.2.0rc6.post2 109d / mtp: v1.2.0rc6.post3 102d) to v1.3.0rc14 (latest pre-release)"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1488
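
A change like this touches two files that have to stay in sync: every key listed under `config-keys` in the changelog entry should exist in the master config and point at the new image. The following is a minimal sketch of such a consistency check, assuming both YAML files have already been parsed into plain dictionaries (parsing is omitted). The helper names `image_tag` and `check_changelog_entry` are hypothetical and not part of the repository; the sample data copies only the fields visible in the diff.

```python
from typing import Dict, List


def image_tag(image: str) -> str:
    """Return the tag of an image reference such as 'registry#path:tag'."""
    repo, sep, tag = image.rpartition(":")
    if not sep or not repo or not tag:
        raise ValueError(f"image reference has no tag: {image!r}")
    return tag


def check_changelog_entry(master: Dict[str, dict], entry: dict,
                          expected_tag: str) -> List[str]:
    """List problems for each config key named in a changelog entry.

    An empty list means every key exists in the master config and its
    image carries the expected tag.
    """
    problems = []
    for key in entry["config-keys"]:
        cfg = master.get(key)
        if cfg is None:
            problems.append(f"{key}: not found in master config")
            continue
        tag = image_tag(cfg["image"])
        if tag != expected_tag:
            problems.append(f"{key}: image tag is {tag}, expected {expected_tag}")
    return problems


# Sample data limited to the fields shown in the diff (assumed shape).
MASTER = {
    "dsr1-fp8-b200-trt": {
        "image": "nvcr.io#nvidia/tensorrt-llm/release:1.3.0rc14",
    },
    "dsr1-fp8-b200-trt-mtp": {
        "image": "nvcr.io#nvidia/tensorrt-llm/release:1.3.0rc14",
    },
}
ENTRY = {"config-keys": ["dsr1-fp8-b200-trt", "dsr1-fp8-b200-trt-mtp"]}

if __name__ == "__main__":
    for problem in check_changelog_entry(MASTER, ENTRY, "1.3.0rc14"):
        print(problem)
```

Note that the changelog description writes the version as `v1.3.0rc14` while the image tag is `1.3.0rc14`, so the expected tag is passed explicitly rather than parsed from the description.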
