Commit 5f3dc2b
fix Gemma 4 multimodal chat-template markers in processor_gemma4
The Gemma 4 multimodal SFT path was emitting Gemma 3 chat-template markers
("<start_of_turn>", "<end_of_turn>") which are NOT special tokens in the
Gemma 4 tokenizer. They BPE-tokenize into 7-token noise sequences each, so a
training label like "A<end_of_turn>" became an 8-token sequence
([236776 'A', 236820 '<', 643 'end', 236779 '_', 1340 'of', 236779 '_',
887 'turn', 236813 '>']).
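
A check along the following lines could flag this kind of mismatch before training. It is a minimal sketch, not code from this CL: `find_non_atomic_markers` and `toy_encode` are hypothetical names, and the toy encoder only reproduces the token ids quoted above. A real check would pass the actual Gemma 4 tokenizer's encode function instead.

```python
from typing import Callable, Dict, List, Sequence


def find_non_atomic_markers(
    encode: Callable[[str], List[int]],
    markers: Sequence[str],
) -> Dict[str, List[int]]:
    """Return every marker that does not encode to exactly one token id."""
    bad: Dict[str, List[int]] = {}
    for marker in markers:
        ids = encode(marker)
        if len(ids) != 1:
            bad[marker] = ids
    return bad


# Toy stand-in for a tokenizer (assumption: ids copied from the commit
# message; everything else about the real vocabulary is left out).
SPECIAL = {"<bos>": 2, "<|turn>": 105, "<turn|>": 106}
PIECES = {"<": 236820, "end": 643, "_": 236779, "of": 1340,
          "turn": 887, ">": 236813}


def toy_encode(text: str) -> List[int]:
    """Encode special tokens atomically, everything else greedily by piece."""
    if text in SPECIAL:
        return [SPECIAL[text]]
    ids: List[int] = []
    i = 0
    while i < len(text):
        # Greedy longest-match over the known pieces.
        for piece in sorted(PIECES, key=len, reverse=True):
            if text.startswith(piece, i):
                ids.append(PIECES[piece])
                i += len(piece)
                break
        else:
            raise ValueError(f"no piece matches at offset {i} in {text!r}")
    return ids


bad_markers = find_non_atomic_markers(toy_encode, ["<end_of_turn>", "<turn|>"])
```

With the toy encoder, `bad_markers` contains only "<end_of_turn>", mapped to the seven ids listed above; "<turn|>" passes because it encodes to a single id.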
With sft_train_on_completion_only=true the model learned to reproduce this
noise sequence after every answer, producing severe response-format collapse
post-SFT (e.g. "A<B<C<D<...").
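
To illustrate why completion-only training amplifies the problem, here is a small sketch of label construction. It assumes the common convention of masking prompt positions with -100; `completion_only_labels`, `num_supervised`, and the prompt ids are hypothetical, and the actual trainer may build labels differently. The completion ids are the ones quoted in this message.

```python
from typing import List

IGNORE_INDEX = -100  # assumption: conventional "ignore this position" label


def completion_only_labels(prompt_ids: List[int],
                           completion_ids: List[int]) -> List[int]:
    """Mask the prompt so that only completion tokens contribute to the loss."""
    return [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)


def num_supervised(labels: List[int]) -> int:
    """Count label positions that are actually trained on."""
    return sum(1 for label in labels if label != IGNORE_INDEX)


PROMPT_IDS = [11, 12, 13]  # hypothetical placeholder prompt ids

# "A<end_of_turn>" as tokenized by the Gemma 4 tokenizer (from this message).
BROKEN_COMPLETION = [236776, 236820, 643, 236779, 1340, 236779, 887, 236813]
# "A<turn|>" with the Gemma 4 end-of-turn special token (id 106).
FIXED_COMPLETION = [236776, 106]

broken_labels = completion_only_labels(PROMPT_IDS, BROKEN_COMPLETION)
fixed_labels = completion_only_labels(PROMPT_IDS, FIXED_COMPLETION)
```

In the broken case 7 of the 8 supervised positions are marker noise, so most of the per-example loss goes to reproducing it. In the fixed case there are 2 supervised positions: the answer and one end-of-turn token.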
The Gemma 4 chat template uses different special tokens:
  <bos>    (id 2)
  <|turn>  (id 105)
  <turn|>  (id 106)
This CL switches the prompt and response formatters to use them.
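
The diff itself is not reproduced here, so the following is only a sketch of what formatters built on these tokens might look like. The function names, the role lines ("user", "model"), and the newline placement are assumptions carried over from the Gemma 3 layout, not confirmed by this CL; only the three special tokens come from this message.

```python
BOS = "<bos>"
TURN_START = "<|turn>"
TURN_END = "<turn|>"

# Gemma 3 markers that must no longer appear in Gemma 4 training text.
LEGACY_MARKERS = ("<start_of_turn>", "<end_of_turn>")


def format_prompt(user_text: str) -> str:
    """Hypothetical prompt layout: one user turn, then an open model turn."""
    return (f"{BOS}{TURN_START}user\n{user_text}{TURN_END}\n"
            f"{TURN_START}model\n")


def format_response(answer: str) -> str:
    """Hypothetical response layout: the answer closed by the end-of-turn token."""
    return f"{answer}{TURN_END}"


def assert_no_legacy_markers(text: str) -> None:
    """Raise if a Gemma 3 marker leaked into Gemma 4 formatted text."""
    for marker in LEGACY_MARKERS:
        if marker in text:
            raise ValueError(f"legacy marker {marker!r} found in formatted text")


example = format_prompt("Which option is correct?") + format_response("A")
assert_no_legacy_markers(example)
```

A guard like `assert_no_legacy_markers` is cheap to run in a unit test or at data-loading time, and would fail loudly if the old markers were reintroduced.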
PiperOrigin-RevId: 9313965451
parent 3190805 commit 5f3dc2b
2 files changed
Lines changed: 2 additions & 2 deletions
| File | Original file line number | Diff line number | Diff line change |
|---|---|---|---|
| 1 of 2 | 138 | 138 | 1 line replaced (1 deletion, 1 addition) |
| 2 of 2 | 97 | 97 | 1 line replaced (1 deletion, 1 addition) |