@@ -390,8 +390,7 @@ bitsandbytes.optim.optimizer.Optimizer8bit(params, defaults, optim_bits=32, is_p
 bitsandbytes.optim.optimizer.Optimizer2State(
     optimizer_name, params, lr=1e-3, betas=(0.9, 0.999),
     eps=1e-8, weight_decay=0.0, optim_bits=32, args=None,
-    min_8bit_size=4096, percentile_clipping=100,
-    block_wise=True, max_unorm=0.0, skip_zeros=False,
+    min_8bit_size=4096, max_unorm=0.0, skip_zeros=False,
     is_paged=False, alpha=0.0, t_alpha=None, t_beta3=None,
 )
 ```
@@ -405,8 +404,7 @@ bitsandbytes.optim.optimizer.Optimizer2State(
 bitsandbytes.optim.optimizer.Optimizer1State(
     optimizer_name, params, lr=1e-3, betas=(0.9, 0.0),
     eps=1e-8, weight_decay=0.0, optim_bits=32, args=None,
-    min_8bit_size=4096, percentile_clipping=100,
-    block_wise=True, max_unorm=0.0, skip_zeros=False,
+    min_8bit_size=4096, max_unorm=0.0, skip_zeros=False,
     is_paged=False,
 )
 ```
@@ -532,8 +530,6 @@ All bnb optimizers share these parameters beyond the standard PyTorch ones:
 | -----------| ------| ---------| -------------|
 | `optim_bits` | `int` | 32 | 32 for full precision state, 8 for quantized state |
 | `min_8bit_size` | `int` | 4096 | Parameters smaller than this use 32-bit state even in 8-bit mode |
-| `percentile_clipping` | `int` | 100 | Gradient clipping at a percentile. 100 = disabled |
-| `block_wise` | `bool` | `True` | Block-wise quantization of optimizer states (vs global) |
 | `max_unorm` | `float` | 0.0 | Maximum update norm relative to weight norm. 0 = disabled |
 | `skip_zeros` | `bool` | `False` | Skip zero gradients in sparse models |
 | `is_paged` | `bool` | `False` | Use CUDA managed memory for state offloading |
@@ -1313,7 +1309,6 @@ removed in a future release.
 | `quantize_no_absmax` | `functional` | `quantize_blockwise` |
 | `dequantize_no_absmax` | `functional` | `dequantize_blockwise` |
 | `optimizer_update_8bit` | `functional` | `optimizer_update_8bit_blockwise` |
-| `percentile_clipping` | `functional` | N/A (still used internally by non-blockwise path) |
 
 ---
 