Commit 4abc590

switch to native offloading
1 parent 907ad2e commit 4abc590

1 file changed

Lines changed: 1 addition & 0 deletions

.github/configs/nvidia-master.yaml

@@ -1782,6 +1782,7 @@ dsv4-fp4-b200-vllm-agentic:
  # vLLM's hybrid KV manager to be disabled, so this is not an HMA/CSA/HCA
  # parity run against the no-offload path.
  - { tp: 8, ep: 8, dp-attn: true, offloading: cpu, conc-list: [12, 16, 24, 32, 48, 64] }
+ - { tp: 8, ep: 8, dp-attn: true, offloading: none, conc-list: [12, 16, 24, 32, 48, 64] }

  dsv4-fp4-b200-trt:
  image: ghcr.io#semianalysisai/trtllm-deepseek-v4:feat-deepseek_v4-9aa3715
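
To make the effect of the added entry concrete, here is a minimal sketch of how the two entries in this hunk could expand into individual benchmark points. It assumes each entry yields one run per value in `conc-list`. The actual harness that consumes `nvidia-master.yaml` is not shown in this commit and may expand entries differently. The helper names `expand` and `paired_runs` and the flattened `conc` key are hypothetical and exist only for illustration.

```python
# The two entries from the hunk, transcribed by hand as Python dicts.
ENTRIES = [
    {"tp": 8, "ep": 8, "dp-attn": True, "offloading": "cpu",
     "conc-list": [12, 16, 24, 32, 48, 64]},
    {"tp": 8, "ep": 8, "dp-attn": True, "offloading": "none",
     "conc-list": [12, 16, 24, 32, 48, 64]},
]


def expand(entries):
    """Yield one flat dict per (entry, concurrency) pair.

    Hypothetical helper: assumes one run per conc-list value.
    """
    for entry in entries:
        # Everything except the concurrency list is shared by all runs of an entry.
        fixed = {k: v for k, v in entry.items() if k != "conc-list"}
        for conc in entry["conc-list"]:
            yield {**fixed, "conc": conc}


def paired_runs(entries):
    """Map each concurrency level to the offloading modes scheduled for it."""
    by_conc = {}
    for run in expand(entries):
        by_conc.setdefault(run["conc"], []).append(run["offloading"])
    return by_conc


runs = list(expand(ENTRIES))
pairs = paired_runs(ENTRIES)

for conc in sorted(pairs):
    print(conc, pairs[conc])
```

Under that assumption, the change doubles the entry's run count from 6 to 12, and every concurrency level appears once with `offloading: cpu` and once with `offloading: none`.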