)
```

### New Architecture Fallbacks (for very recent model releases)

If `transformers` does not recognize a just-released architecture yet, register a fallback family:

```python
from quantllm import turbo, register_architecture

# Map the new architecture/model_type to a compatible base family
register_architecture("newmodel", base_model_type="llama")

model = turbo(
    "new-model-org/NewModel-7B",
    model_type_override="llama",  # optional explicit override
    base_model_fallback=True,     # enabled by default; can be disabled
    trust_remote_code=True,
)
```
94+
> ⚠️ **Security note:** `trust_remote_code=True` executes model-provided code.
> Only enable it for trusted publishers, especially when loading unregistered or very new architectures.
97+
You can also load from config only (no checkpoint weights) while waiting for upstream support:

```python
model = turbo(
    "new-model-org/NewModel-7B",
    from_config_only=True,
    trust_remote_code=True,
)
```
107+
#### Fast contribution template for new architectures

1. Add a registration in your code or PR:
   - `register_architecture("new-arch", base_model_type="llama")`
2. Validate loading with:
   - `turbo("org/model", base_model_fallback=True, trust_remote_code=True)`
3. Add or extend a focused test in `tests/test_architecture_fallback.py`.
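Conceptually, the registration in step 1 amounts to a small mapping from unrecognized `model_type` strings to an already-supported base family, which the loader consults before giving up. A minimal self-contained sketch of that idea (illustrative only, not QuantLLM's actual internals; `FALLBACKS`, `SUPPORTED`, and `resolve_model_type` are hypothetical names):

```python
# Hypothetical sketch of an architecture-fallback registry;
# names below are illustrative, not QuantLLM's real internals.
FALLBACKS: dict[str, str] = {}

# Families the loader is assumed to already handle
SUPPORTED = {"llama", "qwen2", "mistral"}

def register_architecture(model_type: str, base_model_type: str) -> None:
    """Map an unknown model_type to a compatible, supported base family."""
    if base_model_type not in SUPPORTED:
        raise ValueError(f"unsupported base family: {base_model_type}")
    FALLBACKS[model_type] = base_model_type

def resolve_model_type(model_type: str) -> str:
    """Return model_type itself if supported, else its registered fallback."""
    if model_type in SUPPORTED:
        return model_type
    try:
        return FALLBACKS[model_type]
    except KeyError:
        raise ValueError(f"unknown architecture: {model_type}") from None

register_architecture("qwen3", base_model_type="qwen2")
print(resolve_model_type("qwen3"))   # falls back to "qwen2"
print(resolve_model_type("llama"))   # already supported, returned as-is
```

A focused test for step 3 can then assert both the fallback path and the error path (an unregistered, unsupported `model_type` should fail loudly rather than silently load the wrong family).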
115+
#### Real-world style "released yesterday" example

```python
from quantllm import turbo, register_architecture

# Example: transformers doesn't recognize Qwen3 yet
register_architecture("qwen3", base_model_type="qwen2")

model = turbo(
    "Qwen/Qwen3-8B",
    trust_remote_code=True,
)
```
129+
### Memory Options

```python