[ROCm] Enable bitsandbytes quantization support on ROCm by Abdennacer-Badaoui · Pull Request #34688 · vllm-project/vllm

Abdennacer-Badaoui · 2026-02-17T10:46:33Z

Description:

Summary

Enable bitsandbytes quantization on ROCm GPUs (including gfx9 architectures)
This is made possible by the upstream bitsandbytes PR that enables blocksize=64 for 4-bit quantization on ROCm: "Add blocksize=64 4-bit quantization support for ROCm CDNA (warp64) GPUs" (bitsandbytes-foundation/bitsandbytes#1856)
Remove the gfx9 warp size 64 limitation guard in vllm/platforms/rocm.py
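
For readers unfamiliar with what "blocksize=64" refers to, here is a minimal sketch of blockwise absmax quantization, where each block of 64 values shares one scale. It is an illustration only: it uses plain signed integer codes in [-7, 7] rather than the NF4/FP4 code tables bitsandbytes uses, and the function names are hypothetical, not part of bitsandbytes or vLLM.

```python
from typing import List, Tuple

BLOCKSIZE = 64  # the block size named in the summary above


def quantize_blockwise(
    values: List[float], blocksize: int = BLOCKSIZE
) -> Tuple[List[int], List[float]]:
    """Quantize values to signed 4-bit codes in [-7, 7], one absmax scale per block.

    Hypothetical helper for illustration; not the bitsandbytes kernel.
    """
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(values), blocksize):
        block = values[start:start + blocksize]
        absmax = max(abs(v) for v in block)
        # Map the largest magnitude in the block to code 7; avoid dividing by zero.
        scale = absmax / 7 if absmax > 0 else 1.0
        scales.append(scale)
        codes.extend(round(v / scale) for v in block)
    return codes, scales


def dequantize_blockwise(
    codes: List[int], scales: List[float], blocksize: int = BLOCKSIZE
) -> List[float]:
    """Recover approximate values by multiplying each code by its block's scale."""
    return [c * scales[i // blocksize] for i, c in enumerate(codes)]
```

A smaller block size means more scales are stored, but each scale covers fewer values, so a single outlier distorts fewer neighbours.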

Test plan

pytest tests/models/test_transformers.py::test_quantization passes locally on MI325X (gfx942)

mergify · 2026-02-17T10:47:21Z

Documentation preview: https://vllm--34688.org.readthedocs.build/en/34688/

gemini-code-assist

Code Review

This pull request enables bitsandbytes quantization support on ROCm by updating the bitsandbytes dependency to a version that supports it, removing test skips, and adjusting version checks in the code. The changes look good and are consistent with the goal of the PR. I've identified one area for improvement regarding code duplication in the version checking logic, which should be refactored to improve maintainability.

gemini-code-assist · 2026-02-17T10:49:21Z

+        min_version = "0.49.2" if current_platform.is_rocm() else "0.46.1"
        try:
            import bitsandbytes

-            if version.parse(bitsandbytes.__version__) < version.parse("0.46.1"):
+            if version.parse(bitsandbytes.__version__) < version.parse(min_version):
                raise ImportError(
                    "bitsandbytes version is wrong. Please "
-                    "install bitsandbytes>=0.46.1."
+                    f"install bitsandbytes>={min_version}."
                )
        except ImportError as err:
            raise ImportError(
-                "Please install bitsandbytes>=0.46.1 via "
-                "`pip install bitsandbytes>=0.46.1` to use "
+                f"Please install bitsandbytes>={min_version} via "
+                f"`pip install bitsandbytes>={min_version}` to use "
                "bitsandbytes quantizer."
            ) from err


This version check logic is duplicated from BitsAndBytesLinearMethod.__init__ (lines 186-200). To improve maintainability and avoid future inconsistencies, consider extracting this logic into a shared helper function at the module level.

For example:

def _check_bitsandbytes_version():
    min_version = "0.49.2" if current_platform.is_rocm() else "0.46.1"
    try:
        import bitsandbytes
        from packaging import version

        if version.parse(bitsandbytes.__version__) < version.parse(min_version):
            raise ImportError(
                "bitsandbytes version is wrong. Please "
                f"install bitsandbytes>={min_version}."
            )
    except ImportError as err:
        raise ImportError(
            f"Please install bitsandbytes>={min_version} via "
            f"`pip install bitsandbytes>={min_version}` to use "
            "bitsandbytes quantizer."
        ) from err

Then both __init__ methods can simply call _check_bitsandbytes_version().
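
As a rough illustration of the refactor suggested above, the following self-contained sketch shows one shared check called from two constructors. The two version thresholds come from the diff. Everything else is an assumption made so the example runs without bitsandbytes or vLLM installed: the function signature (taking the installed version and platform as arguments), the tuple-based `_parse` helper (a simplified stand-in for `packaging.version.parse`), and both class names.

```python
from typing import Optional, Tuple

MIN_VERSION_DEFAULT = "0.46.1"  # threshold from the diff
MIN_VERSION_ROCM = "0.49.2"     # ROCm threshold from the diff


def _parse(v: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' string into ints.

    Simplified stand-in for packaging.version.parse; it does not handle
    pre-release or dev suffixes.
    """
    return tuple(int(part) for part in v.split("."))


def check_bitsandbytes_version(installed: Optional[str], is_rocm: bool) -> None:
    """Raise ImportError if bitsandbytes is missing (None) or too old."""
    min_version = MIN_VERSION_ROCM if is_rocm else MIN_VERSION_DEFAULT
    if installed is None or _parse(installed) < _parse(min_version):
        raise ImportError(
            f"Please install bitsandbytes>={min_version} via "
            f"`pip install bitsandbytes>={min_version}` to use "
            "bitsandbytes quantizer."
        )


class LinearMethodSketch:
    """Hypothetical stand-in for a quantization method class."""

    def __init__(self, installed: Optional[str], is_rocm: bool) -> None:
        check_bitsandbytes_version(installed, is_rocm)


class OtherMethodSketch:
    """Second hypothetical caller sharing the same check."""

    def __init__(self, installed: Optional[str], is_rocm: bool) -> None:
        check_bitsandbytes_version(installed, is_rocm)
```

With a single helper, a later bump to the ROCm minimum only needs to change in one place.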

hmellor · 2026-02-17T10:52:12Z

I like Gemini's suggestion, could you extract the check to a method as described in its review comment?

Abdennacer-Badaoui · 2026-02-17T10:53:09Z

Yes of course :)

hmellor · 2026-02-17T15:35:18Z

cc @AndreasKaratzas can you confirm if any of the AMD failures are caused by this PR or if they already existed?

Titus-von-Koeller · 2026-02-19T11:42:09Z

Hey all, thanks to everyone!

Lgtm for me as well.

AndreasKaratzas · 2026-02-19T15:59:28Z

> cc @AndreasKaratzas can you confirm if any of the AMD failures are caused by this PR or if they already existed?

Sorry for the delay. Will try to look into it today.

AndreasKaratzas · 2026-02-19T21:55:42Z

@Abdennacer-Badaoui Could you please rebase before I take a look into AMD CI failures?

AndreasKaratzas · 2026-02-19T21:58:04Z

Also, can we add a test that runs a bitsandbytes model if the bitsandbytes package is found? This would immediately give us feedback on the correctness of bitsandbytes on ROCm.

Also, let's add the package requirement to rocm-test.txt as well in this PR.

EDIT: Oops, I missed the transformers test with bitsandbytes there. Do you think we need any other bitsandbytes correctness test?

hmellor · 2026-02-20T10:28:54Z

+@pytest.mark.parametrize(
+    "model",
+    [
+        ("unsloth/tinyllama-bnb-4bit"),
+    ],
+)


Could we add this as a parametrisation to the test_quantization test instead of creating a new one? Then we have a case for online quantisation and pre-quantised for bnb

Yes, it would be cleaner. Thanks

Abdennacer-Badaoui · 2026-02-20T10:35:22Z

@AndreasKaratzas
test_quantization (in the transformers tests) now covers both in-flight quantization and pre-quantized 4-bit checkpoints for bitsandbytes, comparing the auto and transformers backends for logprob consistency. I think this gives us good coverage for now.
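
One way to read "logprob consistency" between two backends is sketched below: walk both generations token by token and, at the first position where they diverge, require each backend's chosen token to appear among the other backend's top-k candidates. This is a simplified, hypothetical stand-in and not the helper the vLLM test suite uses; the `Output` layout and the function name are assumptions.

```python
from typing import Dict, List, Tuple

# Assumed layout: (generated token ids, per-step top-k logprobs keyed by token id)
Output = Tuple[List[int], List[Dict[int, float]]]


def logprobs_close(out_a: Output, out_b: Output) -> bool:
    """Return True if two generations agree, or diverge only in a benign way.

    A divergence counts as benign when each side's token at the first
    mismatching step is present in the other side's top-k logprobs.
    """
    tokens_a, logprobs_a = out_a
    tokens_b, logprobs_b = out_b
    for step, (tok_a, tok_b) in enumerate(zip(tokens_a, tokens_b)):
        if tok_a == tok_b:
            continue
        # After the first divergence the sequences are no longer comparable,
        # so the decision is made at this step.
        return tok_a in logprobs_b[step] and tok_b in logprobs_a[step]
    return True
```

Comparing top-k membership rather than exact tokens tolerates the small numerical differences that quantized kernels can introduce.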

AndreasKaratzas · 2026-02-20T17:49:49Z

> This appears to have been done already

@hmellor Version is there, but not pinned.


hmellor · 2026-02-21T08:28:47Z

> This appears to have been done already

> @hmellor Version is there, but not pinned.

Oh my mistake, I misunderstood what you meant

dosubot · 2026-02-21T08:35:03Z

Related Documentation

Checked 0 published document(s) in 1 knowledge base(s). No updates required.


Abdennacer-Badaoui requested review from hmellor, mgoin, pavanimajety, robertgshaw2-redhat, tjtanaa, tlrmchlsmth and yewentao256 as code owners February 17, 2026 10:46

mergify Bot added the documentation, ci/build, and rocm labels Feb 17, 2026

github-project-automation Bot added this to AMD Feb 17, 2026

github-project-automation Bot moved this to Todo in AMD Feb 17, 2026

gemini-code-assist Bot reviewed Feb 17, 2026


hmellor approved these changes Feb 17, 2026


hmellor enabled auto-merge (squash) February 17, 2026 11:12

github-actions Bot added the ready label Feb 17, 2026

auto-merge was automatically disabled February 20, 2026 10:25
Head branch was pushed to by a user without write access

Abdennacer-Badaoui force-pushed the bnb-support-in-rocm branch from f473139 to d86ba03 Compare February 20, 2026 10:25

hmellor reviewed Feb 20, 2026


Comment thread vllm/model_executor/layers/quantization/bitsandbytes.py Outdated

hmellor mentioned this pull request Feb 20, 2026

Update to transformers v5 #30566

Merged

Abdennacer-Badaoui added 8 commits February 20, 2026 17:51

add BnB · 2aca22d
fix tests · 5c265bf
doc · f6666b5
helper func · cc37d2f
rocm-test · 4e642ba
add a new test · 332bb94
merge tests · 82c01ca
pin version · 418fc14

Each commit: Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>

Abdennacer-Badaoui force-pushed the bnb-support-in-rocm branch from 7fb4ae2 to 418fc14 Compare February 20, 2026 17:52

version for transformers v5 · 118bd0c

Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>

AndreasKaratzas approved these changes Feb 20, 2026


vllm-bot merged commit 8dc8a99 into vllm-project:main Feb 21, 2026
109 of 114 checks passed

github-project-automation Bot moved this from Todo to Done in AMD Feb 21, 2026

