Commit 311be20
Enable infinite generation with RoPE position remapping for attention sink (#19011)
Summary:
Pull Request resolved: #19011
Previously, attention sink models could not generate beyond max_context_len
because RoPE used the raw, monotonically increasing input_pos to index into the
pre-computed freqs_cis table, causing an out-of-bounds (OOB) access when
pos >= max_context_len.
This change adds position remapping in RopeWithAttentionSink:
- Sink token positions (< sink_size) are preserved as-is
- Window token positions are wrapped into the ring buffer range
[sink_size, sink_size + ring_size) using modular arithmetic
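The remapping rule above can be sketched in a few lines of plain Python. This is only an illustration of the two cases in the list. The function name `remap_position` and the exact formula (subtract sink_size, take the remainder modulo ring_size, add sink_size back) are assumptions, not code taken from RopeWithAttentionSink.

```python
def remap_position(pos: int, sink_size: int, ring_size: int) -> int:
    """Map a raw, ever-growing position to a bounded RoPE index (illustrative)."""
    if pos < sink_size:
        # Sink tokens keep their original position.
        return pos
    # Assumed formula: offset into the window region, wrapped modulo ring_size,
    # so the result always lies in [sink_size, sink_size + ring_size).
    return sink_size + (pos - sink_size) % ring_size


sink_size, window_size = 4, 4
ring_size = 2 * window_size  # 2x ring buffer, as described in the commit message

for pos in (0, 3, 4, 11, 12, 1000):
    print(pos, "->", remap_position(pos, sink_size, ring_size))
```

With these example sizes, positions 0 to 3 are returned unchanged and every later position lands in the range 4 to 11, no matter how large the raw position grows.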
The 2x ring buffer (ring_size = 2 * window_size) ensures the live window
of tokens never spans a wrap boundary, preserving correct relative
distances in RoPE space.
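Two properties of a 2x ring can be checked directly under the same assumed remapping formula (sink_size + (pos - sink_size) % ring_size). Within any live window of window_size consecutive tokens, the remapped positions are all distinct, and the true distance between the newest token and any live token can be recovered from the remapped positions modulo ring_size. The helper names below are hypothetical, and the sketch does not claim to reproduce how RopeWithAttentionSink itself computes or uses these distances.

```python
def remap_position(pos: int, sink_size: int, ring_size: int) -> int:
    """Assumed remapping: sinks unchanged, window positions wrapped into the ring."""
    if pos < sink_size:
        return pos
    return sink_size + (pos - sink_size) % ring_size


def check_live_window(sink_size: int, window_size: int, steps: int) -> bool:
    """Check slot uniqueness and distance recovery for every live window."""
    ring_size = 2 * window_size
    start = sink_size + window_size
    for cur in range(start, start + steps):
        # Raw positions of the last window_size tokens, newest last.
        live = range(cur - window_size + 1, cur + 1)
        slots = [remap_position(p, sink_size, ring_size) for p in live]
        # No two live tokens may share a remapped position.
        if len(set(slots)) != window_size:
            return False
        newest = remap_position(cur, sink_size, ring_size)
        for raw, slot in zip(live, slots):
            # The true distance is recoverable modulo ring_size.
            if (newest - slot) % ring_size != cur - raw:
                return False
    return True


print(check_live_window(sink_size=4, window_size=4, steps=1000))
```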
This enables attention sink models to generate indefinitely — the KV cache
ring buffer recycles space while RoPE positions stay bounded.
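To illustrate why bounded positions allow unbounded generation, the sketch below runs a long decode loop against a small lookup table. The table is a simplified stand-in for freqs_cis (plain rotation angles rather than complex values), and its size of sink_size + ring_size rows is an assumption for this example. The commit itself only states that the real table holds max_context_len entries. `remap_position` and `build_angle_table` are hypothetical helpers.

```python
def remap_position(pos: int, sink_size: int, ring_size: int) -> int:
    """Assumed remapping: sinks unchanged, window positions wrapped into the ring."""
    if pos < sink_size:
        return pos
    return sink_size + (pos - sink_size) % ring_size


def build_angle_table(num_positions: int, dim: int = 8, theta: float = 10000.0):
    """Simplified stand-in for freqs_cis: one row of rotation angles per position."""
    return [
        [pos / theta ** (2 * i / dim) for i in range(dim // 2)]
        for pos in range(num_positions)
    ]


sink_size, window_size = 4, 4
ring_size = 2 * window_size
table = build_angle_table(sink_size + ring_size)  # 12 rows in this example

max_index = 0
for raw_pos in range(100_000):  # far beyond the table length
    idx = remap_position(raw_pos, sink_size, ring_size)
    _ = table[idx]  # always in range; table[raw_pos] would raise IndexError
    max_index = max(max_index, idx)

print("table rows:", len(table), "largest index used:", max_index)
```

Indexing the same table with the raw position would raise an IndexError as soon as the position reached the table length, which corresponds to the out-of-bounds failure described in the summary.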
Reviewed By: lucylq
Differential Revision: D100728748
1 parent 799bf5a, commit 311be20
2 files changed
Lines changed (total): 64 additions & 14 deletions
Lines changed (first file): 46 additions & 14 deletions
[Diff body not captured. The hunks cover original lines 29-41 and 47-65, which become lines 29-44 and 50-97 after the change.]
Lines changed (second file): 18 additions & 0 deletions
[Diff body not captured. 18 lines are inserted after original line 400, becoming lines 401-418.]