Commit 88a1153

cquil11 and claude committed
nvidia-master(kimik2.5-fp4-b200-vllm-agentic): bump vLLM v0.20.2 -> v0.21.0
v0.20.2's bundled huggingface_hub==1.14.0 silently fetches Git-LFS pointer files instead of LFS content for `hf download --repo-type dataset`. Every kimik2.5-fp4-b200-vllm-agentic job in run 26536606210 hit "pyarrow.lib.ArrowInvalid: JSON parse error: Missing a name for object member. in row 0" -- the signature of pyarrow trying to parse the literal `version https://git-lfs.github.com/spec/v1` line of an LFS pointer file as JSON.

b200-dgxc has no persistent /mnt/hf_hub_cache mount (per launcher diff), so every container re-downloads the dataset and re-hits the bug.

v0.21.0 ships a newer huggingface_hub that resolves LFS correctly. v0.20.x's flashinfer fix for the max_model_len=131072 + prefix-caching warmup crash is included in v0.21.0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
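
The failure signature described above can be caught before pyarrow ever sees the file. The following is a minimal sketch, not part of this commit: it assumes the dataset has already been downloaded to a local directory, and the helper names `is_lfs_pointer` and `find_lfs_pointers` are hypothetical. It relies only on the fact that a Git-LFS pointer file begins with the literal line `version https://git-lfs.github.com/spec/v1`.

```python
from pathlib import Path

# First bytes of every Git-LFS pointer file (the line pyarrow tried to parse as JSON).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts with the Git-LFS pointer header."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """Return a sorted list of files under `root` that are LFS pointers, not real content."""
    return sorted(
        p for p in Path(root).rglob("*") if p.is_file() and is_lfs_pointer(p)
    )
```

Running `find_lfs_pointers` on the download directory right after `hf download` and failing the job when the list is non-empty would surface the problem as an explicit "LFS content not resolved" error rather than a JSON parse error in row 0.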
1 parent eab58e9 commit 88a1153

1 file changed

Lines changed: 7 additions & 5 deletions

.github/configs/nvidia-master.yaml

@@ -2699,13 +2699,15 @@ kimik2.5-fp4-b200-vllm:
 # Diverged from kimik2.5-fp4-b200-vllm (agentic-coding sibling). Reasons below;
 # the original kimik2.5-fp4-b200-vllm entry is left identical to origin/main so
 # its fixed-seq-len sweep is unaffected.
-# - image: 'vllm/vllm-openai:v0.17.0' -> 'vllm/vllm-openai:v0.20.2'
+# - image: 'vllm/vllm-openai:v0.17.0' -> 'vllm/vllm-openai:v0.21.0'
 # - runner: 'b200' -> 'b200-dgxc'
 kimik2.5-fp4-b200-vllm-agentic:
-  # Same image as the INT4 sibling: v0.20.x carries the flashinfer fix that
-  # cleared the agentic-coding warmup crash on max_model_len=131072 +
-  # prefix caching.
-  image: vllm/vllm-openai:v0.20.2
+  # v0.21.0 ships a newer huggingface_hub that resolves LFS content correctly
+  # in `hf download` (1.14.0 in v0.20.x silently fetched LFS pointer files,
+  # which pyarrow then choked on with "Missing a name for object member" --
+  # see run 26536606210). v0.20.x's flashinfer fix for the agentic-coding
+  # warmup crash on max_model_len=131072 + prefix caching is included.
+  image: vllm/vllm-openai:v0.21.0
   model: nvidia/Kimi-K2.5-NVFP4
   model-prefix: kimik2.5
   runner: b200-dgxc
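
Since the bump is motivated by the huggingface_hub version bundled in the image, a small guard inside the container could confirm that the affected release is no longer present. This is a sketch under assumptions, not part of this commit: the helper name `hf_hub_is_affected` is hypothetical, and the affected set contains only 1.14.0 because that is the only version the commit message names. Other releases may or may not share the behavior.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: only the version named in the commit message is listed here.
AFFECTED_HF_HUB_VERSIONS = {"1.14.0"}


def hf_hub_is_affected(installed=None):
    """Return True if the given (or installed) huggingface_hub version is known-affected."""
    if installed is None:
        try:
            installed = version("huggingface_hub")
        except PackageNotFoundError:
            # Package not installed, so the download bug cannot be triggered by it.
            return False
    return installed in AFFECTED_HF_HUB_VERSIONS
```

Calling `hf_hub_is_affected()` with no argument inspects the running environment. Passing a version string makes the check usable without the package installed.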
