Commit bedd3c6
gemma4: save/restore target_feat in prefix cache snapshot
Matching Qwen35's approach: save target_feat (BF16 feature ring buffer) and
last_tok as part of the KV snapshot. On restore, target_feat is copied back
to GPU before the delta prefill + feature mirror resync.
Previously, only K/V tensors were snapshotted. After restore, the feature
mirror contained stale data from the previous request's decode phase, causing
the draft model to make poor predictions and halving speculative decode
acceptance rate (52% → 24%).
With this fix, the full feature state is restored along with K/V, and the
subsequent draft_feature_mirror_sync_tail ensures the mirror matches target_feat.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 8e24fc5 commit bedd3c6
3 files changed
Lines changed: 35 additions & 4 deletions
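The snapshot/restore behavior described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from this commit: `ToyState`, `Snapshot`, `step`, and the `restore_features` flag are hypothetical names, plain Python lists stand in for the GPU tensors and the BF16 ring buffer, and the final line of `restore` is a stand-in for what `draft_feature_mirror_sync_tail` is described as doing.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    """Hypothetical prefix-cache snapshot: K/V plus the feature state."""
    kv: List[float]           # stand-in for the K/V tensors
    target_feat: List[float]  # stand-in for the BF16 feature ring buffer
    last_tok: int


class ToyState:
    """Hypothetical stand-in for the target model's per-request state."""

    def __init__(self) -> None:
        self.kv: List[float] = []
        self.target_feat: List[float] = []
        self.last_tok: int = -1
        self.mirror: List[float] = []  # what the draft model reads

    def step(self, tok: int) -> None:
        # One prefill/decode step: extend K/V and the feature buffer.
        self.kv.append(float(tok))
        self.target_feat.append(tok * 0.5)
        self.last_tok = tok
        self.mirror = list(self.target_feat)

    def snapshot(self) -> Snapshot:
        return Snapshot(list(self.kv), list(self.target_feat), self.last_tok)

    def restore(self, snap: Snapshot, restore_features: bool = True) -> None:
        self.kv = list(snap.kv)
        if restore_features:
            # Behavior after the fix: features and last_tok come back too.
            self.target_feat = list(snap.target_feat)
            self.last_tok = snap.last_tok
        # Stand-in for the mirror resync: copy whatever target_feat holds.
        self.mirror = list(self.target_feat)


def run(restore_features: bool) -> ToyState:
    state = ToyState()
    for tok in (1, 2, 3):   # shared prefix
        state.step(tok)
    snap = state.snapshot()
    for tok in (7, 8):      # previous request's decode phase
        state.step(tok)
    state.restore(snap, restore_features=restore_features)
    return state


old = run(restore_features=False)  # K/V only, as before the fix
new = run(restore_features=True)   # K/V + target_feat + last_tok
print(old.mirror)  # still contains features from tokens 7 and 8
print(new.mirror)  # only the prefix features
```

In the K/V-only case the resync faithfully copies a buffer that still holds the previous request's decode features, which is the stale-mirror condition the commit message blames for the drop in acceptance rate.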
Diff bodies were not captured; only the line-number columns survive. Per-file counts recovered from them:
| File (name not captured) | Additions | Deletions | Hunks (new line numbers) |
|---|---|---|---|
| File 1 | 28 | 3 | 633-647, 773-781, 795-815, 839-853 |
| File 2 | 3 | 0 | 193-203 |
| File 3 | 4 | 1 | 478-487 |