Commit 907ad2e

switch to native offloading
1 parent 21ed1eb commit 907ad2e

1 file changed

Lines changed: 1 addition & 1 deletion

.github/configs/nvidia-master.yaml

@@ -1781,7 +1781,7 @@ dsv4-fp4-b200-vllm-agentic:
 # Experimental LMCache MP offload. LMCacheMPConnector currently requires
 # vLLM's hybrid KV manager to be disabled, so this is not an HMA/CSA/HCA
 # parity run against the no-offload path.
-- { tp: 8, ep: 8, dp-attn: true, offloading: lmcache-mp, conc-list: [12, 16, 24, 32, 48, 64] }
+- { tp: 8, ep: 8, dp-attn: true, offloading: cpu, conc-list: [12, 16, 24, 32, 48, 64] }
 
 dsv4-fp4-b200-trt:
 image: ghcr.io#semianalysisai/trtllm-deepseek-v4:feat-deepseek_v4-9aa3715
