-
Notifications
You must be signed in to change notification settings - Fork 186
[Klaud Cold] Update dsv4-fp8-h200-vllm (+mtp) vLLM image to v0.22.0 #1597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
functionstackx
wants to merge
1
commit into
main
Choose a base branch
from
klaud-cold/dsv4-fp8-h200-vllm-v0.22.0
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+9
−2
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 The rationale comments above both
dsv4-fp8-h200-vllm(lines 2883–2885) anddsv4-fp8-h200-vllm-mtp(lines 2907–2909) are stale and now contradict the image lines this PR is editing. The non-MTP block still says "Uses the cu129 image" but the entry pins the stockvllm/vllm-openai:v0.22.0tag, and the MTP block claims it uses "the canonical v0.20.1 image" while the non-MTP entry above is "still on the deepseekv4-cu129 tag" — both factually wrong post-PR (MTP is v0.22.0, non-MTP is v0.22.0). Pre-existing staleness (the prior v0.21.0 bump already failed to refresh these), but since this PR edits both image lines, it would be a natural place to refresh the adjacent rationale. Severity: nit.Extended reasoning...
What the bug is
This PR bumps both
dsv4-fp8-h200-vllm(line 2887) anddsv4-fp8-h200-vllm-mtp(line 2911) fromvllm/vllm-openai:v0.21.0tovllm/vllm-openai:v0.22.0. Both entries have explanatory comment blocks immediately above them whose factual claims no longer match the image they document.Comment #1 — above
dsv4-fp8-h200-vllm(lines 2883–2885):The comment says "Uses the cu129 image" — historically this referred to a custom
deepseekv4-cu129build. The pinned tag is now the stock public semverv0.22.0, so the rationale-for-image-choice no longer applies.Comment #2 — above
dsv4-fp8-h200-vllm-mtp(lines 2907–2909):Both factual claims are wrong post-PR: (1) the MTP image is
v0.22.0, notv0.20.1; (2) the non-MTP entry directly above is alsov0.22.0, notdeepseekv4-cu129.Step-by-step proof
.github/configs/nvidia-master.yamlafter this PR is applied.image: vllm/vllm-openai:v0.22.0— a stock public tag, not a cu129 build. Contradiction confirmed.image: vllm/vllm-openai:v0.22.0. Contradiction [NVIDIA] Add TRT-LLM 70B FP8 via slurm #1 confirmed.image: vllm/vllm-openai:v0.22.0. Contradiction [NVIDIA] Add TRT 70B (FP8 and FP4) #2 confirmed.git log -pon this file shows PR [Handoff to @Oseltamivir Claude /loop] [Klaud Cold] Update dsv4-fp8-h200-vllm (+mtp) vLLM image to v0.21.0 #1461 bumped these entries from the original tags tov0.21.0without updating the comments; the staleness is therefore pre-existing. This PR widens the gap (v0.21.0 → v0.22.0) but is editing the exact image lines the comments describe, so it is the natural place to refresh the rationale in the same change.Impact
No runtime impact — these are YAML comments only. The hazard is reviewer/maintainer confusion: someone reading the file to understand why a specific image was chosen will get a misleading answer, and the MTP block in particular invites a "wait, the non-MTP entry must still be on cu129" mistake during the next bump.
How to fix
Either (a) drop the now-obsolete provenance sentences entirely and keep only the still-accurate parts (no FP4 path on H200, max-model-len 800k, MTP adds
--speculative-config), or (b) replace the stale lines with neutral rationale that doesn't pin to a specific tag — e.g. for the MTP block, just say "MTP variant of dsv4-fp8-h200-vllm; adds--speculative-config ...." Avoid embedding image tags or cross-references between entries in prose, since those go stale on every bump.Severity
Nit / pre-existing. The staleness was introduced before this PR (the v0.21.0 bump in #1461 already mismatched the cu129/v0.20.1 prose), and the runtime behavior is unaffected. Flagging because the PR is directly editing both image lines and is in the natural position to refresh the adjacent rationale in the same change.