Commit 5be34e6
to_hf.py: rescue chat_template.jinja before deleting it
Recent Hugging Face transformers releases (~4.42+) move long
chat_template strings out of tokenizer_config.json into a separate
chat_template.jinja file to keep the JSON readable. Qwen3-1.7B's
4168-char template triggers this split; Nemotron-Nano's shorter
template stays inline.
The old code deleted chat_template.jinja before reading
tokenizer_config.json, assuming the inline copy was always complete.
For Qwen3 that meant the exported checkpoint shipped with an empty
chat_template -- vLLM's apply_chat_template returned a prompt
without the <|audio|> placeholder, which broke multimodal prompt
replacement ("Failed to apply prompt replacement for mm_items['audio'][0]").
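The failure mode above can be caught before a checkpoint reaches vLLM. The following is a minimal sketch of such a check, not code from this commit: the function name `find_template_problems` and its `placeholder` argument are hypothetical, and whether the template text itself must contain `<|audio|>` is an assumption, so that check is optional.

```python
import json
from pathlib import Path


def find_template_problems(export_dir, placeholder=None):
    """Return a list of problems with the exported inline chat template.

    Hypothetical helper: `placeholder` is only checked when given, since
    whether the template text must contain it is an assumption.
    """
    config_path = Path(export_dir) / "tokenizer_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    template = config.get("chat_template")

    problems = []
    if not isinstance(template, str) or not template.strip():
        problems.append("chat_template is missing, empty, or not a plain string")
    elif placeholder is not None and placeholder not in template:
        problems.append(f"chat_template does not mention {placeholder}")
    return problems
```

An empty list means the inline template is present; running this on the export directory right after conversion would have flagged the Qwen3 case described above.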
Now read chat_template.jinja, inline it into tokenizer_config.json
when non-empty, and only then delete the file. Nemotron's inline-only
path is unchanged because chat_template.jinja is not written for
short templates.
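The read, inline, then delete order described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual change to to_hf.py: the function name `rescue_chat_template` and the assumption that both files sit directly in the export directory are hypothetical.

```python
import json
from pathlib import Path


def rescue_chat_template(export_dir):
    """Inline chat_template.jinja into tokenizer_config.json, then delete it.

    Returns True if a non-empty template was inlined. Hypothetical helper;
    file locations are assumed to be directly under export_dir.
    """
    export_dir = Path(export_dir)
    jinja_path = export_dir / "chat_template.jinja"
    config_path = export_dir / "tokenizer_config.json"

    # Short templates stay inline, so there may be no .jinja file at all.
    if not jinja_path.exists():
        return False

    # Read the template BEFORE deleting the file.
    template = jinja_path.read_text(encoding="utf-8")
    inlined = False
    if template.strip():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        config["chat_template"] = template
        config_path.write_text(
            json.dumps(config, indent=2, ensure_ascii=False) + "\n",
            encoding="utf-8",
        )
        inlined = True

    # Only now is it safe to remove the separate file.
    jinja_path.unlink()
    return inlined
```

An empty .jinja file is still removed but leaves any existing inline template untouched, which mirrors the "when non-empty" condition in the message.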
Made-with: Cursor
1 parent 6f8a26f commit 5be34e6
1 file changed
Lines changed: 11 additions & 4 deletions
to_hf.py: one hunk, @@ -183,15 +183,22 @@ (line contents not captured)