Merged
2 changes: 1 addition & 1 deletion .github/configs/nvidia-master.yaml
@@ -4205,7 +4205,7 @@ gptoss-fp4-b200-trt:
- { tp: 8, conc-start: 4, conc-end: 4}

gptoss-fp4-b200-vllm:
-  image: vllm/vllm-openai:v0.15.1
+  image: vllm/vllm-openai:v0.20.2

🟡 The PR title and description say the vLLM image is being updated to v0.21.0, but the actual YAML change in .github/configs/nvidia-master.yaml (line 3972) bumps it to vllm/vllm-openai:v0.20.2, and the perf-changelog.yaml entry also says v0.20.2. Please reconcile by either bumping the image tag to v0.21.0 (matching the title/description) or updating the PR title/description/Ref to v0.20.2.

Extended reasoning...

What the bug is

The PR title is "Update gptoss-fp4-b200-vllm vLLM image to v0.21.0" and the description states it "Updates the vLLM image tag for gptoss-fp4-b200-vllm from v0.15.1 to v0.21.0". However, the actual diff updates the image to a different version:

  • .github/configs/nvidia-master.yaml line 3972: image: vllm/vllm-openai:v0.20.2
  • perf-changelog.yaml line 2460: "Update vLLM image from v0.15.1 to v0.20.2"

The two code-side artifacts are internally consistent at v0.20.2, but they disagree with the PR metadata (title + description), which both claim v0.21.0.

Step-by-step proof

  1. Read the PR title: Update gptoss-fp4-b200-vllm vLLM image to v0.21.0 → claims target = v0.21.0.
  2. Read the PR description: Updates the vLLM image tag for gptoss-fp4-b200-vllm from v0.15.1 to v0.21.0 → claims target = v0.21.0.
  3. Read the diff hunk in .github/configs/nvidia-master.yaml:
    gptoss-fp4-b200-vllm:
    -  image: vllm/vllm-openai:v0.15.1
    +  image: vllm/vllm-openai:v0.20.2
    Actual new tag = v0.20.2 (not v0.21.0).
  4. Read the diff hunk in perf-changelog.yaml:
    - config-keys:
        - gptoss-fp4-b200-vllm
      description:
        - "Update vLLM image from v0.15.1 to v0.20.2"
    Changelog also reports v0.20.2.
  5. Result: when this PR merges, the deployed/benchmarked image will be vllm/vllm-openai:v0.20.2, not v0.21.0 as the title claims.
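
The comparison in the steps above could be automated with a small check. The following is a minimal sketch, not part of this repository: the helper names `target_version` and `find_mismatches` are hypothetical, the three strings are copied from this PR, and it assumes the last version tag in each string is the upgrade target.

```python
import re
from typing import Dict, Optional

# Matches version tags such as v0.20.2 (a suffix like -cu130 is ignored).
VERSION_RE = re.compile(r"v\d+\.\d+\.\d+")


def target_version(text: str) -> Optional[str]:
    """Return the last version tag in text, assumed to be the upgrade target."""
    matches = VERSION_RE.findall(text)
    return matches[-1] if matches else None


def find_mismatches(sources: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return each source's target version if the sources disagree.

    Returns an empty dict when every source names the same target version.
    """
    targets = {name: target_version(text) for name, text in sources.items()}
    return targets if len(set(targets.values())) > 1 else {}


# Strings taken from this PR: the title, the YAML image line, the changelog entry.
pr_sources = {
    "title": "Update gptoss-fp4-b200-vllm vLLM image to v0.21.0",
    "image": "image: vllm/vllm-openai:v0.20.2",
    "changelog": "Update vLLM image from v0.15.1 to v0.20.2",
}
print(find_mismatches(pr_sources))
```

For this PR the check reports a disagreement: the title targets v0.21.0 while the image line and the changelog entry both target v0.20.2.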

Impact

The code side is internally consistent (YAML + changelog both v0.20.2), so runtime behavior is well-defined; this is a metadata/communication defect, not a runtime defect. However, this PR carries the full-sweep-enabled label, which triggers a full performance-benchmark sweep for this config. Reviewers and downstream consumers who rely on the PR title or description will be misled about which vLLM version was actually benchmarked, and may attribute the perf delta to the wrong version. The mismatch also leaves the linked tracking issue (Ref #1154) ambiguous about which version was shipped.

How to fix

The author needs to decide which side is canonical and align the other:

  • If v0.21.0 was the intent: change .github/configs/nvidia-master.yaml:3972 to image: vllm/vllm-openai:v0.21.0 and update the perf-changelog.yaml description string to "Update vLLM image from v0.15.1 to v0.21.0".
  • If v0.20.2 was the intent: update the PR title to Update gptoss-fp4-b200-vllm vLLM image to v0.20.2 and amend the description accordingly. The code does not need to change in this case.

Severity rationale

Filing as nit because the YAML and changelog are internally consistent — the deployment will not be broken, only the human-facing metadata is wrong. But it should be reconciled before merge so the benchmark results are correctly attributed.

model: openai/gpt-oss-120b
model-prefix: gptoss
runner: b200
6 changes: 6 additions & 0 deletions perf-changelog.yaml
@@ -2617,3 +2617,9 @@
   description:
     - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1416
+
+- config-keys:
+    - gptoss-fp4-b200-vllm
+  description:
+    - "Update vLLM image from v0.15.1 to v0.20.2"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1394