vllm: 0.16.0 -> 0.19.0 by CertainLach · Pull Request #498040 · NixOS/nixpkgs

CertainLach · 2026-03-08T21:35:34Z

Diff: vllm-project/vllm@releases/v0.16.0...v0.17.0
Changelog: https://github.com/vllm-project/vllm/releases/tag/v0.17.0

Things done

Only tested on ROCm (Strix Halo).

CertainLach · 2026-03-08T21:37:15Z

opentelemetry-api is updated because opentelemetry-semantic-conventions-ai requires a newer version.

github-actions

The PR's base branch is set to master, but this PR causes 4377 rebuilds.
It is therefore considered a mass rebuild.
Please change the base branch to the right base branch for your changes (probably staging).

CertainLach · 2026-03-08T21:44:11Z

Oh, maybe I should split the opentelemetry changes into another PR. Relaxing the dependency constraints of opentelemetry-semantic-conventions-ai doesn't help, because the package then fails at runtime.

CertainLach · 2026-03-08T22:08:28Z

For opentelemetry changes, might switch to depend on #489017 instead

GaetanLepage · 2026-03-08T22:09:49Z

Thanks for the PR. However, I would prefer its scope to be narrower.

CertainLach · 2026-03-08T22:11:43Z

I agree, I only aggregated all the required changes for now, and will split them into PRs

GaetanLepage · 2026-03-08T22:13:38Z

> I agree, I only aggregated all the required changes for now, and will split them into PRs

Thanks!

CertainLach · 2026-03-08T22:30:13Z

Dependency graph for this PR:

Will create a separate PR for amd-quark once #498053 and #498052 are merged

Opentelemetry PRs: #498050 #498051

Also kaldi-native-fbank: #498056

CertainLach · 2026-03-08T23:57:52Z

amd-quark pr: #498069

pkieltyka · 2026-04-18T11:38:55Z

As I recall, I was hitting tool-call warnings and errors and needed to upgrade the package to resolve them.

I think upgrading in this PR is a good idea

d-goldin · 2026-04-19T14:16:03Z

So, this is how I got it to work with CUDA for now. PR briefly explains why I think it's safe enough to drop those deps for now.

@CertainLach: CertainLach#3

At the same time though, I must say that 0.19 seems to work a bit worse for me: startup times are longer even with pre-compiled, already-filled caches. Sometimes during startup a background worker also seems to die after waiting for something else. But I have no reason to believe this has anything to do with our PR here; nothing in the logs points to it. It is more likely caused by some new memory-profiling features or similar. It also sometimes doesn't react well to a clean SIGTERM in my llama-swap setup.

So I'll have to dig around a bit deeper to see, for my personal setup and use.

Ports the vLLM 0.19.0 package from NixOS/nixpkgs#498040 into the overlay, along with new dependencies (kaldi-native-fbank, opentelemetry-semantic-conventions-ai) and bumped opentelemetry packages.

Key changes:
- vLLM 0.19.0 with Qwen3.5, Gemma 4, and many other new model archs
- triton-kernels v3.6.0
- Cap MAX_JOBS to 4 for CUDA compilation (nvcc OOMs at -j=20 on Spark)
- Remove flashinfer-cubin and nvidia-cudnn-frontend from runtime deps (not packaged, optional)
- New packages: kaldi-native-fbank, opentelemetry-semantic-conventions-ai

Tested: builds and runs on DGX Spark (aarch64-linux, CUDA 13.2, SM 12.0).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

stefanboca · 2026-04-23T08:31:33Z

For mistral-common, I've opened #512667. vllm 0.19.0 depends on mistral-common>=1.10.0, and it looks like vllm 0.20.0, which depends on mistral-common>=1.11.0, is about to be released.

@d-goldin @CertainLach might I suggest using pythonRemoveDeps instead of patching the requirements files? For example:

  pythonRemoveDeps = [
    "flashinfer-cubin"
    "nvidia-cudnn-frontend"

    # QuACK and Cutlass DSL seem to be added only for FA4
    # which in our case handles its own deps
    "nvidia-cutlass-dsl"
    "quack-kernels"
  ];

CertainLach · 2026-04-24T00:30:48Z

Oof. The HEAD of this branch is correct for a vllm deployment, and I have also included mistral-common as a parent of this request, but GitHub doesn't seem to like this PR being pointed at staging instead of master, and we can't point it at master because the opentelemetry changes are still unmerged. Hence the many unnecessary commits.

Should I point it at master for now? It should be pointed at master anyway; the only reason it is pointed at staging is that we have opentelemetry changes here, and those do require staging.

What a mess...

Pointing this PR at master is not correct, but since it is stacked, pointing it at staging makes GitHub very unhappy and might request reviews from lots of people. Should I mark it as a draft until we sort out the situation with the Python opentelemetry packages?

Diff: mistralai/mistral-common@v1.8.8...v1.11.0

1.11.0 is published on PyPI but, for whatever reason, is not listed in the GitHub releases.

All of the opentelemetry-instrumentation-requests tests hardcode the requests version, and since the requests package in nixpkgs is newer than the one the tests expect, all of them fail. This should be fixed upstream; I do not see a good way to patch it on the nixpkgs side.

- Bumps triton to a newer version; the older one didn't work for me with 0.17.
- Drops quack-kernels and CuTe DSL from the dependencies. From what I can tell, those are only needed for FA4 and would also require some NVIDIA blobs. We are at FA2 right now, so this shouldn't remove any functionality that was present before.
- Adds NCCL to the wrapper args, for better UX.
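The "NCCL in the wrapper args" item above could look roughly like the following. This is a sketch and not the PR's actual change: it assumes the package is wrapped through makeWrapperArgs and that prefixing LD_LIBRARY_PATH with NCCL's library directory is the intended effect. The real change may use a different variable or mechanism.

```nix
# Hypothetical sketch: the function arguments and the LD_LIBRARY_PATH
# approach are assumptions for illustration, not taken from the PR.
{ lib, python3Packages, cudaPackages }:

python3Packages.vllm.overridePythonAttrs (old: {
  # Append to any wrapper args the package already defines.
  makeWrapperArgs = (old.makeWrapperArgs or [ ]) ++ [
    "--prefix"
    "LD_LIBRARY_PATH"
    ":"
    (lib.makeLibraryPath [ cudaPackages.nccl ])
  ];
})
```

Appending to `old.makeWrapperArgs` rather than replacing it keeps whatever the package already passes to its wrapper.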

Vllm also wants bash for aiter

d-goldin · 2026-04-25T09:00:53Z

> @d-goldin @CertainLach might I suggest using pythonRemoveDeps instead of patching the requirements files? For example:
>
>   pythonRemoveDeps = [
>     "flashinfer-cubin"
>     "nvidia-cudnn-frontend"
>
>     # QuACK and Cutlass DSL seem to be added only for FA4
>     # which in our case handles its own deps
>     "nvidia-cutlass-dsl"
>     "quack-kernels"
>   ];

Will try to switch to it. Not sure why I did it the other way around, maybe it didn't work with the structure they have with the requirements/* folder.

Edit: Tried, seems to work fine. So yeah, cleaner to do it that way. Will add a commit later.
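For anyone carrying this change in an overlay until the PR lands, the pythonRemoveDeps approach discussed above can be sketched as follows. The overlay shape (pythonPackagesExtensions plus overridePythonAttrs) is an assumption for illustration; in the PR itself the attribute would simply sit in the vllm package definition.

```nix
# Hypothetical overlay sketch; only the dependency names come from the thread.
final: prev: {
  pythonPackagesExtensions = prev.pythonPackagesExtensions ++ [
    (pyFinal: pyPrev: {
      vllm = pyPrev.vllm.overridePythonAttrs (old: {
        # Strip these requirements from the built package's metadata
        # instead of patching files under requirements/.
        pythonRemoveDeps = (old.pythonRemoveDeps or [ ]) ++ [
          "flashinfer-cubin"
          "nvidia-cudnn-frontend"
          "nvidia-cutlass-dsl"
          "quack-kernels"
        ];
      });
    })
  ];
}
```

Compared with patching the requirements files, this does not depend on how upstream lays them out, so it is less likely to break on the next version bump.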

leo60228 · 2026-05-06T02:06:18Z

> For mistral-common, I've opened #512667. vllm 0.19.0 depends on mistral-common>=1.10.0, and it looks like vllm 0.20.0, which depends on mistral-common>=1.11.0, is about to be released.

It seems like vllm 0.16.0 already has this dependency. I have the same issue on nixos-unstable.

peperunas · 2026-05-07T13:45:30Z

Is there anything I can help with to speed up the PR merge?

CertainLach · 2026-05-09T12:40:49Z

> Is there anything I can help with to speed up the PR merge?

Merging this PR is blocked on the opentelemetry Python packages update anyway, and I have no idea what to do with them.

github-actions Bot reviewed Mar 8, 2026

nixpkgs-ci Bot requested review from GaetanLepage, LunNova, daniel-fahey, happysalada and natsukium March 8, 2026 21:44

CertainLach force-pushed the push-lklxouywkrnv branch from d2d5fff to 5b12a24 Compare March 8, 2026 21:47

CertainLach changed the base branch from master to staging March 8, 2026 22:05

nixpkgs-ci Bot closed this Mar 8, 2026

nixpkgs-ci Bot reopened this Mar 8, 2026

CertainLach force-pushed the push-lklxouywkrnv branch from 5b12a24 to 224d2d6 Compare March 8, 2026 22:23

CertainLach force-pushed the push-lklxouywkrnv branch from 1158b57 to 45815df Compare March 9, 2026 01:35

stefanboca mentioned this pull request Apr 21, 2026

python3Packages.opentelemetry-{api, instrumentation}: bump #498050

CertainLach force-pushed the push-lklxouywkrnv branch from 6a06753 to f19541c Compare April 24, 2026 00:24

CertainLach force-pushed the push-lklxouywkrnv branch from f19541c to d5c9bc9 Compare April 24, 2026 00:31

CertainLach changed the base branch from staging to master April 24, 2026 00:32

nixpkgs-ci Bot closed this Apr 24, 2026

nixpkgs-ci Bot reopened this Apr 24, 2026

CertainLach force-pushed the push-lklxouywkrnv branch 3 times, most recently from 8dc0790 to dc73532 Compare April 24, 2026 00:46

nixpkgs-ci Bot requested a review from bgamari April 24, 2026 01:02

stefanboca and others added 8 commits April 24, 2026 03:39

python3Packages.mistral-common: 1.8.8 -> 1.11.0

712a137

python3Packages.opentelemetry-api: 1.34.0 -> 1.40.0

d83fc4c

python3Packages.opentelemetry-instrumentation: 0.55b0 -> 0.61b0

7816cf5

python3Packages.opentelemetry-semantic-conventions-ai: init at 0.4.15

43401a4

vllm: 0.16.0 -> 0.19.0

3ba7eda

vllm: set rocm env in wrapper

ba2de00

CertainLach force-pushed the push-lklxouywkrnv branch from dc73532 to ba2de00 Compare April 24, 2026 01:42
