Commit 51f528c
Feat:(model) qwen image vae checkpoint (invoke-ai#9108)
* feat(qwen-image): standalone VAE checkpoint and Qwen2.5-VL encoder support
Add standalone model types so Qwen Image can be run without downloading the
full ~40 GB Diffusers pipeline. The VAE and Qwen2.5-VL encoder can now each
come from their own model, with the Component Source (Diffusers) acting as a
fallback for any submodel not provided separately.
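The fallback rule described above can be sketched as follows. This is a minimal illustration rather than InvokeAI's actual code: `resolve_submodel` and the plain-string model identifiers are hypothetical names chosen for the example.

```python
from typing import Optional


def resolve_submodel(
    name: str,
    standalone: Optional[str],
    component_source: Optional[str],
) -> str:
    """Pick the model that supplies one submodel (e.g. "vae" or "text_encoder")."""
    # A separately selected standalone model always wins.
    if standalone is not None:
        return standalone
    # Otherwise fall back to the Diffusers Component Source, if one is set.
    if component_source is not None:
        return f"{component_source}/{name}"
    # Neither source is available, so the submodel cannot be loaded.
    raise ValueError(f"No source available for submodel '{name}'")
```

With this rule, a user who has installed only a standalone VAE can still take the encoder from the Component Source, and the reverse.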
* feat(qwen-image): support ComfyUI single-file Qwen2.5-VL encoder
Add a checkpoint loader for ComfyUI-style consolidated Qwen2.5-VL encoder
files (e.g. qwen_2.5_vl_7b_fp8_scaled.safetensors), which bundle the language
model and visual tower into one safetensors with FP8 + per-tensor weight_scale
quantization. This drops the standalone encoder footprint from ~16 GB
(Diffusers folder, FP16) to ~7 GB.
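Per-tensor scaled FP8 storage keeps each weight in 8 bits plus one scale factor per tensor, and loading multiplies the stored values by that scale. Below is a minimal sketch of that step, assuming the scale sits next to the weight under a `.weight_scale` suffix as described above. Plain Python floats stand in for tensors so the example runs without PyTorch, and `dequantize_state_dict` is a hypothetical name; the real loader may differ.

```python
from typing import Dict, List

SCALE_SUFFIX = ".weight_scale"


def dequantize_state_dict(sd: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Fold each per-tensor scale into its weight and drop the scale entries."""
    out: Dict[str, List[float]] = {}
    for key, values in sd.items():
        if key.endswith(SCALE_SUFFIX):
            continue  # scales are consumed below, not kept as parameters
        scale_key = None
        if key.endswith(".weight"):
            # "layer.weight" -> "layer.weight_scale"
            scale_key = key[: -len(".weight")] + SCALE_SUFFIX
        if scale_key is not None and scale_key in sd:
            scale = sd[scale_key][0]  # one scalar per tensor
            out[key] = [v * scale for v in values]
        else:
            out[key] = list(values)  # unquantized entries pass through
    return out
```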
* feat(qwen-image): register standalone components as starter models
Add three new starter models so users can install a complete GGUF Qwen Image
setup in one click without ever touching the full ~40 GB Diffusers pipeline:
- "Qwen Image VAE" — single-file VAE checkpoint pulled from the Qwen-Image
repo (~250 MB).
- "Qwen2.5-VL Encoder (fp8 scaled)" — ComfyUI single-file FP8 encoder
(~7 GB).
- "Qwen2.5-VL Encoder (Diffusers)" — full-precision encoder via multi-folder
HF download (text_encoder+tokenizer+processor, ~16 GB).
The 8 GGUF main starters (Q2_K / Q4_K_M / Q6_K / Q8_0 for both Edit and
txt2img) now declare the VAE + fp8 encoder as dependencies, so installing
any of them automatically pulls in everything needed to generate. The
fp8 encoder is preferred as the default dependency since it's smaller and
the on-the-fly dequantization is essentially free at runtime.
The Qwen Image starter bundle gets the VAE and fp8 encoder prepended so
the bundled Lightning LoRA variants also benefit.
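The dependency behaviour described above amounts to installing a starter's dependencies before the starter itself. Below is a rough sketch under stated assumptions: the `Starter` dataclass, `install_order`, and the entry name "Qwen Image (GGUF Q4_K_M)" are hypothetical stand-ins, and the dependency graph is assumed to be acyclic.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Starter:
    name: str
    dependencies: List[str] = field(default_factory=list)


def install_order(name: str, catalog: Dict[str, Starter]) -> List[str]:
    """Return the models to install: dependencies first, then the model itself."""
    order: List[str] = []

    def visit(n: str) -> None:
        if n in order:
            return  # already scheduled, e.g. shared by two starters
        for dep in catalog[n].dependencies:
            visit(dep)
        order.append(n)

    visit(name)
    return order


# Hypothetical catalog: one GGUF starter depending on the VAE and fp8 encoder.
catalog = {
    "Qwen Image VAE": Starter("Qwen Image VAE"),
    "Qwen2.5-VL Encoder (fp8 scaled)": Starter("Qwen2.5-VL Encoder (fp8 scaled)"),
    "Qwen Image (GGUF Q4_K_M)": Starter(
        "Qwen Image (GGUF Q4_K_M)",
        dependencies=["Qwen Image VAE", "Qwen2.5-VL Encoder (fp8 scaled)"],
    ),
}
```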
* Chore Ruff Format
* fix(qwen-image): backfill VAE/encoder fields on persisted state, recall in metadata, optimize scan
- bump params slice persisted state to v3 with a v2→v3 migration that
backfills qwenImageVaeModel and qwenImageQwenVLEncoderModel to null,
preventing existing users from losing all persisted params on upgrade
- emit qwen_image_vae and qwen_image_qwen_vl_encoder into graph metadata
and add recall handlers so generations using standalone components are
reproducible
- clear the two new fields in the modelSelected listener when switching
away from qwen-image, matching the existing cleanup pattern
- identify single-file Qwen VL encoder checkpoints by reading only the
safetensors key index via safe_open, instead of loading the full ~7GB
state dict into RAM during model scan
- log a clear info message and raise an actionable RuntimeError when the
first-time HuggingFace tokenizer/config download is needed but offline,
pointing users to the diffusers folder layout as an offline alternative
- add unit tests for the migration, metadata recall, and identification
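Reading only the key index is cheap because a safetensors file begins with an 8-byte little-endian header length followed by a JSON header that lists every tensor name. The commit uses `safe_open` for this. The sketch below reads the header directly with the standard library, purely to show why no tensor data needs to be loaded. `read_safetensors_keys` and `looks_like_qwen_vl_encoder` are hypothetical helpers, and the prefix heuristic is an assumption, not the commit's actual identification rule.

```python
import json
import struct
from typing import List


def read_safetensors_keys(path: str) -> List[str]:
    """Return tensor names from a safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian u64 giving the JSON header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return [k for k in header if k != "__metadata__"]


def looks_like_qwen_vl_encoder(keys: List[str]) -> bool:
    # Assumed heuristic: a consolidated encoder file carries both a visual
    # tower and language-model layers, in either the legacy or the new layout.
    has_visual = any(k.startswith(("visual.", "model.visual.")) for k in keys)
    has_language = any(
        k.startswith(("model.layers.", "model.language_model.")) for k in keys
    )
    return has_visual and has_language
```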
* fix(qwen-image): auto-select VAE/encoder, clarify GGUF tip, fix fp8 single-file encoder crash
- Auto-select first available standalone VAE and Qwen2.5-VL encoder when
switching to a Qwen Image model, so GGUF users are ready-to-go without
digging into Advanced. Prefers the diffusers-folder encoder over the
single-file checkpoint.
- Update the "Required for GGUF models" placeholder to clarify that
the diffusers source is only required when a standalone VAE and
encoder are not installed.
- Fix QwenVLEncoderCheckpointLoader crash on ComfyUI fp8_scaled
single-file encoders. Two issues: (1) handle the `.scale_weight` /
`.scale_input` quantization key scheme alongside `.weight_scale`,
and (2) apply Qwen2_5_VLForConditionalGeneration's
_checkpoint_conversion_mapping before load_state_dict so legacy
`visual.*` / `model.*` keys map onto the new `model.visual.*` /
`model.language_model.*` layout expected by transformers ≥4.50.
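The key renaming in point (2) can be illustrated with a small regex-based remapper. This is a sketch under stated assumptions: the rules in `CONVERSION_RULES` are modeled on the prefixes named above rather than copied from transformers, and `remap_keys` is a hypothetical helper, not the loader's real function.

```python
import re
from typing import Dict, List, Tuple

# Assumed rules: legacy prefix pattern -> new-layout prefix.
CONVERSION_RULES: List[Tuple[str, str]] = [
    (r"^visual\.", "model.visual."),
    # Only rename "model.*" keys that are not already in the new layout.
    (r"^model\.(?!visual\.|language_model\.)", "model.language_model."),
]


def remap_keys(state_dict: Dict[str, object]) -> Dict[str, object]:
    """Rename legacy checkpoint keys so they match the new module layout."""
    remapped: Dict[str, object] = {}
    for key, value in state_dict.items():
        new_key = key
        for pattern, replacement in CONVERSION_RULES:
            new_key, n = re.subn(pattern, replacement, new_key, count=1)
            if n:
                break  # first matching rule wins; renamed keys are not renamed again
        remapped[new_key] = value
    return remapped
```

Applying the remap before `load_state_dict` keeps the checkpoint file untouched on disk, and keys that are already in the new layout pass through unchanged.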
---------
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
1 parent b9bd8ef commit 51f528c
26 files changed
Lines changed: 1408 additions & 60 deletions
File tree
- invokeai
- app/invocations
- backend/model_manager
- configs
- load/model_loaders
- frontend/web
- public/locales
- src
- app/store/middleware/listenerMiddleware/listeners
- features
- controlLayers/store
- metadata
- modelManagerV2
- subpanels/ModelManagerPanel
- nodes
- types
- util/graph/generation
- parameters/components/Advanced
- queue/store
- services/api
- hooks
- tests/backend/model_manager/configs
Lines changed: 154 additions & 0 deletions