Commit deda1ba

TimDettmers and claude authored
Remove dead bitsandbytes CxB code from 8-bit inference path (vllm-project#34633)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 1b88965 commit deda1ba

1 file changed

Lines changed: 0 additions & 10 deletions

vllm/model_executor/layers/quantization/bitsandbytes.py

@@ -336,16 +336,6 @@ def _apply_8bit_weight(
 
             current_index += output_size
 
-            # only update the matmul_states if it is not profile_run
-            if (
-                generation > 0
-                and not self.quant_config.llm_int8_has_fp16_weight
-                and matmul_states[i].CB is not None
-                and matmul_states[i].CxB is not None
-            ):
-                del matmul_states[i].CB
-                qweight[offsets[i] : offsets[i + 1]] = matmul_states[i].CxB
-
         out = out.to(original_type)
 
         if reshape_after_matmul:
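
The commit title calls the removed block dead code. A minimal sketch of why such a guard can never fire, assuming (as the title implies, but this page does not state) that `CxB` is never populated on the 8-bit inference path. `FakeMatmulState` and `removed_guard_fires` are hypothetical stand-ins for illustration, not bitsandbytes or vLLM APIs:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeMatmulState:
    """Hypothetical stand-in for a matmul state object; not the real class."""

    CB: Optional[list] = None
    CxB: Optional[list] = None  # assumption: never set on this path


def removed_guard_fires(state, generation, has_fp16_weight):
    """Mirror of the condition deleted in this commit."""
    return (
        generation > 0
        and not has_fp16_weight
        and state.CB is not None
        and state.CxB is not None
    )


# CB is set but CxB stays None, so the guard is False for every generation.
state = FakeMatmulState(CB=[1, 2, 3])
results = [removed_guard_fires(state, g, False) for g in range(3)]
```

Under that assumption the `del matmul_states[i].CB` and `qweight[...] = ...CxB` statements were unreachable, so deleting them should leave behavior unchanged. The guard would only fire if both `CB` and `CxB` were non-`None`.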
