[ROCm] Enable bitsandbytes quantization support on ROCm#34688
Conversation
|
Documentation preview: https://vllm--34688.org.readthedocs.build/en/34688/ |
There was a problem hiding this comment.
Code Review
This pull request enables bitsandbytes quantization support on ROCm by updating the bitsandbytes dependency to a version that supports it, removing test skips, and adjusting version checks in the code. The changes look good and are consistent with the goal of the PR. I've identified one area for improvement regarding code duplication in the version checking logic, which should be refactored to improve maintainability.
| min_version = "0.49.2" if current_platform.is_rocm() else "0.46.1" | ||
| try: | ||
| import bitsandbytes | ||
|
|
||
| if version.parse(bitsandbytes.__version__) < version.parse("0.46.1"): | ||
| if version.parse(bitsandbytes.__version__) < version.parse(min_version): | ||
| raise ImportError( | ||
| "bitsandbytes version is wrong. Please " | ||
| "install bitsandbytes>=0.46.1." | ||
| f"install bitsandbytes>={min_version}." | ||
| ) | ||
| except ImportError as err: | ||
| raise ImportError( | ||
| "Please install bitsandbytes>=0.46.1 via " | ||
| "`pip install bitsandbytes>=0.46.1` to use " | ||
| f"Please install bitsandbytes>={min_version} via " | ||
| f"`pip install bitsandbytes>={min_version}` to use " | ||
| "bitsandbytes quantizer." | ||
| ) from err |
There was a problem hiding this comment.
This version check logic is duplicated from BitsAndBytesLinearMethod.__init__ (lines 186-200). To improve maintainability and avoid future inconsistencies, consider extracting this logic into a shared helper function at the module level.
For example:
def _check_bitsandbytes_version():
min_version = "0.49.2" if current_platform.is_rocm() else "0.46.1"
try:
import bitsandbytes
from packaging import version
if version.parse(bitsandbytes.__version__) < version.parse(min_version):
raise ImportError(
"bitsandbytes version is wrong. Please "
f"install bitsandbytes>={min_version}."
)
except ImportError as err:
raise ImportError(
f"Please install bitsandbytes>={min_version} via "
f"`pip install bitsandbytes>={min_version}` to use "
"bitsandbytes quantizer."
) from errThen both __init__ methods can simply call _check_bitsandbytes_version().
|
I like Gemini's suggestion, could you extract the check to a method as described in its review comment? |
|
Yes of course :) |
|
cc @AndreasKaratzas can you confirm if any of the AMD failures are caused by this PR or if they already existed? |
|
Hey all, thanks to everyone! Lgtm for me as well. |
Sry for delay. Will look try and into it today. |
|
@Abdennacer-Badaoui Could you please rebase before I take a look into AMD CI failures? |
|
Also, can we add a test that runs a bits and bytes model if bits and bytes package is found? This will immediately give us a feedback regarding correctness of bitsandbytes on ROCm. Also, lets add the package requirement on EDIT: Oops missed the transformers test there with bitsandbytes. Do you think we need any other bitsandbytes correctness test? |
Head branch was pushed to by a user without write access
f473139 to
d86ba03
Compare
| @pytest.mark.parametrize( | ||
| "model", | ||
| [ | ||
| ("unsloth/tinyllama-bnb-4bit"), | ||
| ], | ||
| ) |
There was a problem hiding this comment.
Could we add this as a parametrisation to the test_quantization test instead of creating a new one? Then we have a case for online quantisation and pre-quantised for bnb
There was a problem hiding this comment.
Yes, it would be cleaner. Thanks
|
@AndreasKaratzas |
@hmellor Version is there, but not pinned. |
Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
7fb4ae2 to
418fc14
Compare
Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
Oh my mistake, I misunderstood what you meant |
…#34688) Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
…#34688) Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
…#34688) Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
…#34688) Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
…#34688) Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
…#34688) Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
…#34688) Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
…#34688) Signed-off-by: badaoui <abdennacerbadaoui0@gmail.com>
Description:
Summary
vllm/platforms/rocm.pyTest plan
pytest tests/models/test_transformers.py::test_quantizationpasses locally on MI325X (gfx942)