-
|
Hi everyone, Is anyone using Gemma 4 with speculative decoding? If so, do you happen to know if thinking still works with it? I'm sending --chat-template-kwargs '{"enable_thinking":true}' with just Gemma 31B, and it works, but if I do the same thing and use Gemma 26B as a draft model, thinking doesn't work anymore. |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 2 replies
-
|
Nevermind I don't think the draft model receives any parameter like "chat_template_kwargs" either through the api or as a starting flag in the launcher. |
Beta Was this translation helpful? Give feedback.
Nevermind I don't think the draft model receives any parameter like "chat_template_kwargs" either through the api or as a starting flag in the launcher.
I solved the issue by simply adding <|think|> as a system message, which perfectly worked.