python3Packages.vllm: 0.17 update: updating CUDA support#2

Merged
CertainLach merged 1 commit into CertainLach:push-lklxouywkrnv from d-goldin:vllm017-cuda-498040
Apr 6, 2026

@d-goldin d-goldin commented Mar 21, 2026

I reduced this to what I consider the bare minimum. Everything else relates to newer flash-attn (quack kernels, etc.) and is best handled separately.

- Bump triton to a newer version; the older one didn't
  work for me with 0.17.
- Drop quack-kernels and cuteDSL from the dependencies.
  From what I can tell, those are only needed for FA4
  and would also require some NVIDIA blobs. We are at FA2
  right now, so this shouldn't remove any functionality
  that was present before.
- Add NCCL to the wrapper args, for better UX.
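
To sanity-check the NCCL point locally, a minimal sketch along these lines may help. It is not part of this change. It assumes the wrapper makes NCCL discoverable either through vLLM's `VLLM_NCCL_SO_PATH` environment variable or through the normal shared-library search path; which of the two the wrapper actually uses is an assumption here. The helper name `resolve_nccl` is hypothetical.

```python
import ctypes.util
import os
from typing import Callable, Mapping, Optional


def resolve_nccl(
    env: Mapping[str, str] = os.environ,
    find_library: Callable[[str], Optional[str]] = ctypes.util.find_library,
) -> Optional[str]:
    """Return a candidate libnccl path or soname, or None if nothing is found."""
    # An explicit override wins. VLLM_NCCL_SO_PATH is the variable vLLM reads
    # to locate NCCL; whether the wrapper sets it is an assumption.
    explicit = env.get("VLLM_NCCL_SO_PATH")
    if explicit:
        return explicit
    # Otherwise fall back to the platform's usual shared-library lookup.
    return find_library("nccl")
```

Calling `resolve_nccl()` inside the wrapped environment and getting `None` back would suggest that NCCL is still not visible to the process. The lookup function is injectable so the logic can be exercised without NCCL installed.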

@d-goldin d-goldin force-pushed the vllm017-cuda-498040 branch 3 times, most recently from bccecdf to 8579d71 on March 21, 2026 23:09
@d-goldin d-goldin force-pushed the vllm017-cuda-498040 branch from b5ea0f3 to efda726 on March 28, 2026 14:08
@d-goldin d-goldin changed the title from "vllm 0.17 cuda: adding quack-kernels, troch-c-dlpack-ext" to "python3Packages.vllm: 0.17 update: updating CUDA support" on Mar 28, 2026
@CertainLach CertainLach merged commit a7fff51 into CertainLach:push-lklxouywkrnv Apr 6, 2026
10 checks passed

2 participants