Commit c6a7e02
Production-hardening: fix v2.1.0rc1 quantization-state and architecture-fallback bugs
Addresses the issues identified in the codebase review on top of PR #27.
Correctness fixes
-----------------
* TurboModel._is_quantized is now a property derived from the loaded
model's config.quantization_config and BitsAndBytes layer types,
with an opt-in override slot used by from_gguf. This fixes:
  - from_config_only=True returning a random-weights model that was
    misreported as quantized;
  - missing bitsandbytes installs falling through silently while the
    flag stayed True;
  - pre-quantized HF repos (GPTQ/AWQ/etc.) not being recognized when
    the user passed quantize=False.
* resolve_model_type now consults DEFAULT_ARCHITECTURE_FALLBACKS for
unknown HF model_types and recognizes version-suffix patterns
(qwen3 -> qwen2, llama4 -> llama, phi4 -> phi3, gemma3 -> gemma2,
...). The old logic only consulted the table when the config's
model_type was empty, which never happens in practice.
* register_architecture(model_class=...) is now discoverable under
the original architecture name as well as the resolved base family,
matching the documented API.
* Removed an accidentally duplicated 'if is_bnb and is_8bit ...'
block in the existing-quant detection branch.
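
The fallback lookup described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: the table excerpt, the KNOWN_MODEL_TYPES set, the regex, and the function body are assumptions made for the example, and the real resolve_model_type in quantllm may be structured differently.

```python
import re

# Hypothetical excerpt of the fallback table; the real
# DEFAULT_ARCHITECTURE_FALLBACKS may contain different entries.
DEFAULT_ARCHITECTURE_FALLBACKS = {
    "qwen3": "qwen2",
    "llama4": "llama",
    "phi4": "phi3",
    "gemma3": "gemma2",
}

# Families assumed to be natively supported (illustrative only).
KNOWN_MODEL_TYPES = {"llama", "qwen2", "phi3", "gemma2"}

# Splits names such as "gemma4" or "qwen_3" into (family, version).
_VERSION_SUFFIX = re.compile(r"^([a-z_]+?)[-_]?(\d+)$")


def resolve_model_type(model_type: str) -> str:
    """Map an HF model_type to a supported base family (sketch)."""
    name = model_type.lower()
    if name in KNOWN_MODEL_TYPES:
        return name
    # Consult the table for any unknown type, not only for empty ones.
    if name in DEFAULT_ARCHITECTURE_FALLBACKS:
        return DEFAULT_ARCHITECTURE_FALLBACKS[name]
    # Version-suffix pattern: try earlier versions, then the bare family.
    match = _VERSION_SUFFIX.match(name)
    if match:
        family, version = match.group(1), int(match.group(2))
        for earlier in range(version - 1, 0, -1):
            candidate = f"{family}{earlier}"
            if candidate in KNOWN_MODEL_TYPES:
                return candidate
        if family in KNOWN_MODEL_TYPES:
            return family
    # Nothing matched: return the name unchanged so the caller can decide.
    return name
```

Under these assumptions, a name missing from the table (for example "gemma4") still resolves through the suffix rule, while a name with no version suffix is returned unchanged.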
Robustness for new architectures and consumer hardware
-----------------------------------------------------
* Greatly expanded DEFAULT_ARCHITECTURE_FALLBACKS (Llama 2/3/4, Qwen
2/2-MoE/3, Phi/3/4, Gemma/2/3, DeepSeek V2/V3, Cohere/Command-R,
OLMo/2, SmolLM/2/3, Yi, StarCoder/2, InternLM/2, Baichuan, ChatGLM,
StableLM, Falcon).
* Pre-quantized HF repo names (Unsloth-style *-bnb-4bit, *-AWQ,
*-GPTQ, *-INT4, *-FP8, etc.) are detected and surfaced as a hint;
the embedded quantization_config is honoured.
* GGUF-only repo names trigger a friendly hint pointing at from_gguf.
* New TurboModel.report() returns a structured snapshot of the actual
loaded model state (quant_method, device, dtype, params_billion).
* The public TurboModel.is_quantized property is now the canonical
answer, replacing an instance flag that could drift.
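
A minimal sketch of the derived-property idea, assuming a stand-in class rather than the real TurboModel: TurboModelSketch, _BNB_LAYER_NAMES, and _quant_method are hypothetical names, and this report() returns only a subset of the fields listed above (device, dtype and params_billion are omitted). SimpleNamespace objects stand in for loaded models so the example runs without torch or transformers.

```python
from types import SimpleNamespace
from typing import Any, Optional

# Class names assumed to identify bitsandbytes layers (illustrative).
_BNB_LAYER_NAMES = {"Linear4bit", "Linear8bitLt"}


class TurboModelSketch:
    """Stand-in for TurboModel; not the implementation from this commit."""

    def __init__(self, model: Any) -> None:
        self.model = model
        # Opt-in override slot, e.g. for a GGUF loading path.
        self._quantized_override: Optional[bool] = None

    def _quant_method(self) -> Optional[str]:
        config = getattr(self.model, "config", None)
        qcfg = getattr(config, "quantization_config", None)
        if qcfg is not None:
            if isinstance(qcfg, dict):
                method = qcfg.get("quant_method")
            else:
                method = getattr(qcfg, "quant_method", None)
            return method or "unknown"
        # No embedded config: fall back to inspecting layer types.
        modules = getattr(self.model, "modules", None)
        layers = modules() if callable(modules) else []
        if any(type(layer).__name__ in _BNB_LAYER_NAMES for layer in layers):
            return "bitsandbytes"
        return None

    @property
    def is_quantized(self) -> bool:
        # Derived on every access, so it cannot drift from the model.
        if self._quantized_override is not None:
            return self._quantized_override
        return self._quant_method() is not None

    @is_quantized.setter
    def is_quantized(self, value: Optional[bool]) -> None:
        self._quantized_override = value

    def report(self) -> dict:
        return {
            "is_quantized": self.is_quantized,
            "quant_method": self._quant_method(),
        }


# A config-only model with no quantization_config and no quantized layers.
plain = SimpleNamespace(
    config=SimpleNamespace(quantization_config=None),
    modules=lambda: [],
)
plain_report = TurboModelSketch(plain).report()
```

Because the answer is recomputed from the model on each access, a config-only or unquantized model cannot be reported as quantized unless the override is set explicitly.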
Production hygiene
------------------
* New .github/workflows/ci.yml runs ruff + pytest on Python
3.10/3.11/3.12 and validates the build with python -m build / twine check.
* New pyproject.toml provides PEP 517/518 build metadata plus a
conservative ruff lint profile (only blocker-class rules) and
pytest defaults.
* New .pre-commit-config.yaml for local pre-commit enforcement.
* New CHANGELOG.md documenting every change.
Tests
-----
* tests/test_quantization_state.py covers the from_config_only and
is_quantized property fixes, the report() schema, and the override
setter.
* tests/test_resolve_model_type.py covers the fallback-table
consultation, family-suffix matching, and registry-class lookup
ergonomics.
Docs
----
* docs/guide/loading-models.md updated to reflect the now-automatic
fallbacks, the pre-quantized repo detection, and report().
* docs/guide/consumer-hardware.md added with per-tier guidance for
CPU-only, Apple Silicon, 4-8 GB / 12-24 GB / multi-GPU.
1 parent c32c63d commit c6a7e02
9 files changed
Lines changed: 1220 additions & 60 deletions
File tree
- .github/workflows
- docs/guide
- quantllm/core
- tests