Commit 9e0d34d
Clean up some potential runtime bugs
Summary:
1. Zero cache on allocation (lines 65-66): std::fill on cache_data_ and update_data_ after allocator_.allocate(). This eliminates uninitialized-memory garbage that varies across devices.
2. Zero cache on reset (line 191): std::fill on cache_data_ in reset(). This ensures the stale KV cache from a previous prompt is fully cleared, not just the position counters.
3. Zero padding in last prefill chunk (lines 618-621): when batch_len < input_len, fill the tail of the input buffer with zeros. This prevents stale tokens from a previous chunk from leaking through the embedding layer.
sa_runner.cpp
4. Call runner.reset() before each prompt in the multi-prompt loop, the stdin prompt loop, and the stdin tokens loop. This ensures the KV cache, masks, and input_pos_ are fully reset between prompts.
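The diff body itself is not preserved on this page, so the following is only a minimal sketch of the four fixes described above, not the actual change. ToyRunner, load_chunk(), run_prompts(), cache_len_, and input_ are hypothetical names introduced for illustration; only cache_data_, update_data_, input_pos_, and reset() come from the summary. A plain new[] stands in for allocator_.allocate(), since both hand back memory with indeterminate contents.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for the real runner; the layout is an assumption.
class ToyRunner {
 public:
  ToyRunner(std::size_t cache_len, std::size_t input_len)
      : cache_len_(cache_len),
        cache_data_(new float[cache_len]),   // contents indeterminate here
        update_data_(new float[cache_len]),  // contents indeterminate here
        input_(input_len, 0) {
    // Fix 1: zero both buffers right after allocation.
    std::fill(cache_data_.get(), cache_data_.get() + cache_len_, 0.0f);
    std::fill(update_data_.get(), update_data_.get() + cache_len_, 0.0f);
  }

  // Fix 2: clear the cache contents, not just the position counter.
  void reset() {
    std::fill(cache_data_.get(), cache_data_.get() + cache_len_, 0.0f);
    input_pos_ = 0;
  }

  // Fix 3: copy one prefill chunk, then zero the unused tail of the input
  // buffer so tokens from a previous, longer chunk cannot leak through.
  void load_chunk(const std::vector<std::int32_t>& tokens) {
    const std::size_t batch_len = std::min(tokens.size(), input_.size());
    std::copy(tokens.begin(), tokens.begin() + batch_len, input_.begin());
    if (batch_len < input_.size()) {
      std::fill(input_.begin() + batch_len, input_.end(), 0);
    }
    // Toy "forward pass": record the tokens in the cache so that reset()
    // has visible state to clear.
    for (std::size_t i = 0; i < batch_len && input_pos_ < cache_len_; ++i) {
      cache_data_[input_pos_++] = static_cast<float>(tokens[i]);
    }
  }

  const std::vector<std::int32_t>& input() const { return input_; }
  float cache_at(std::size_t i) const { return cache_data_[i]; }
  std::size_t input_pos() const { return input_pos_; }

 private:
  std::size_t cache_len_;
  std::unique_ptr<float[]> cache_data_;
  std::unique_ptr<float[]> update_data_;
  std::vector<std::int32_t> input_;
  std::size_t input_pos_ = 0;
};

// Fix 4: reset before every prompt so no state carries over between prompts.
inline void run_prompts(ToyRunner& runner,
                        const std::vector<std::vector<std::int32_t>>& prompts) {
  for (const auto& prompt : prompts) {
    runner.reset();
    runner.load_chunk(prompt);
  }
}
```

In this sketch, loading a four-token chunk followed by a one-token chunk leaves the last three input slots at zero, and a second prompt processed through run_prompts() starts from an all-zero cache at position 0.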
Differential Revision: D1046159931
parent a49171d commit 9e0d34d
1 file changed
Lines changed: 7 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| 64 | 64 | | |
| 65 | 65 | | |
| 66 | 66 | | |
| | 67 | + | |
| | 68 | + | |
| 67 | 69 | | |
| 68 | 70 | | |
| 69 | 71 | | |
| | | | |
| 186 | 188 | | |
| 187 | 189 | | |
| 188 | 190 | | |
| | 191 | + | |
| 189 | 192 | | |
| 190 | 193 | | |
| 191 | 194 | | |
| | | | |
| 613 | 616 | | |
| 614 | 617 | | |
| 615 | 618 | | |
| | 619 | + | |
| | 620 | + | |
| | 621 | + | |
| | 622 | + | |
| 616 | 623 | | |
| 617 | 624 | | |
| 618 | 625 | | |