Commit 2e59606

Address PR review refinements for architecture fallback loading
Agent-Logs-Url: https://github.com/codewithdark-git/QuantLLM/sessions/8867f3b4-18ae-4207-b2e8-51444418c7aa
Co-authored-by: codewithdark-git <144595403+codewithdark-git@users.noreply.github.com>
1 parent 83b66ae commit 2e59606

3 files changed: 876 additions & 505 deletions

File tree

docs/guide/loading-models.md (18 additions, 1 deletion)
@@ -87,11 +87,14 @@ register_architecture("newmodel", base_model_type="llama")
 model = turbo(
     "new-model-org/NewModel-7B",
     model_type_override="llama", # optional explicit override
-    base_model_fallback=True, # retry with resolved fallback config
+    base_model_fallback=True, # enabled by default; can be disabled
     trust_remote_code=True,
 )
 ```
 
+> ⚠️ **Security note:** `trust_remote_code=True` executes model-provided code.
+> Only enable it for trusted publishers, especially when loading unregistered or very new architectures.
+
 You can also load from config only (no checkpoint weights) while waiting for upstream support:
 
 ```python
@@ -110,6 +113,20 @@ model = turbo(
 - `turbo("org/model", base_model_fallback=True, trust_remote_code=True)`
 3. Add/extend a focused test in `tests/test_architecture_fallback.py`.
 
+#### Real-world style "released yesterday" example
+
+```python
+from quantllm import turbo, register_architecture
+
+# Example: transformers doesn't recognize Qwen3 yet
+register_architecture("qwen3", base_model_type="qwen2")
+
+model = turbo(
+    "Qwen/Qwen3-8B",
+    trust_remote_code=True,
+)
+```
+
 ### Memory Options
 
 ```python
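For readers skimming the diff, the fallback mechanism it documents can be sketched in isolation. The snippet below is a hypothetical, self-contained stand-in that mirrors the names used in the doc (`register_architecture`, a fallback toggle); it is not QuantLLM's actual implementation:

```python
# Standalone sketch of the architecture-fallback idea described in this diff.
# The registry and resolver are illustrative, not QuantLLM's real code.

_ARCH_REGISTRY: dict[str, str] = {}


def register_architecture(model_type: str, base_model_type: str) -> None:
    """Map an unrecognized model_type to a known base architecture."""
    _ARCH_REGISTRY[model_type] = base_model_type


def resolve_model_type(model_type: str, known_types: set[str],
                       base_model_fallback: bool = True) -> str:
    """Return a loadable model_type, falling back to a registered base."""
    if model_type in known_types:
        return model_type
    if base_model_fallback and model_type in _ARCH_REGISTRY:
        return _ARCH_REGISTRY[model_type]
    raise ValueError(f"Unknown architecture: {model_type!r}")


# Mirrors the doc's example: treat "qwen3" as "qwen2" until upstream support lands.
register_architecture("qwen3", base_model_type="qwen2")
print(resolve_model_type("qwen3", known_types={"llama", "qwen2"}))  # qwen2
```

With `base_model_fallback=False` (the opt-out the commit documents as non-default), the resolver raises instead of substituting the base type.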
