Commit 6e80430
PR-K1.B: K/V merge primitive for v0.4 dLM K/V Restoration
Second foundational PR of the K-series implementing ADR 0008 §11.7
phase K1. Builds directly on PR #70's K1.A capture infrastructure.
This PR implements the merge half: given the verifier's locally-
computed K/V at all positions plus captured proposer K/V at evicted
positions, return merged K/V tensors where evicted slots are
overridden by the captured values. ADR 0008 §11.5: 'The verifier
performs standard softmax attention over (sink+window K/V from cache)
⊕ (reconstructed K/V from proposer transient).'
This PR is RoPE-agnostic on purpose — it merges raw projection outputs
at any consistent RoPE state. K1.B uses pre-RoPE merge (matching K1.A
captures); K1.C will reuse HF's apply_rotary_pos_emb inside the
verifier's standard attention forward (after the merge), so we don't
duplicate Gemma3-specific RoPE machinery in this layer.
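As a rough illustration of the merge semantics described above, here is a minimal PyTorch sketch. It is not the code in kv_merge.py. The name merge_kv_sketch is hypothetical, the [B, T, H, D] layout is inferred only from the rank-check note in the Files section (K_local.size(1) is T), and the assumption that the captured tensors hold exactly one slot per evicted position, in order, is mine. Position and device validation are left out for brevity.

```python
from typing import List, Tuple

import torch


def merge_kv_sketch(
    k_local: torch.Tensor,
    v_local: torch.Tensor,
    k_captured: torch.Tensor,
    v_captured: torch.Tensor,
    evicted_positions: List[int],
) -> Tuple[torch.Tensor, torch.Tensor]:
    """Hypothetical sketch: override evicted slots of local K/V with captured K/V.

    Assumed layout: local tensors are [B, T, H, D]; captured tensors are
    [B, len(evicted_positions), H, D], ordered like evicted_positions.
    """
    # Rank check first, so that size(1) can be read as the sequence length T.
    if k_local.dim() != 4 or k_local.shape != v_local.shape:
        raise ValueError("local K/V must be rank-4 tensors of identical shape")

    expected = (k_local.size(0), len(evicted_positions), k_local.size(2), k_local.size(3))
    for name, cap in (("K_captured", k_captured), ("V_captured", v_captured)):
        if tuple(cap.shape) != expected or cap.dtype != k_local.dtype:
            raise ValueError(f"{name}: expected shape {expected} and dtype {k_local.dtype}")

    # Identity case: nothing evicted, return clones of the local tensors.
    if not evicted_positions:
        return k_local.clone(), v_local.clone()

    idx = torch.tensor(evicted_positions, dtype=torch.long, device=k_local.device)
    # Out-of-place index_copy returns new tensors; the inputs are not mutated.
    return k_local.index_copy(1, idx, k_captured), v_local.index_copy(1, idx, v_captured)
```

Because index_copy is out-of-place and differentiable, gradients reach the captured tensors and the non-evicted local slots, while the overridden local slots receive zero gradient.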
Files:
inference_engine/v04/kv_merge.py (272 lines)
* merge_kv_at_evicted_positions(K_local, V_local, K_captured,
V_captured, evicted_positions) -> (K_merged, V_merged).
Returns clones so callers can mutate freely. Rank check first
(so K_local.size(1) is meaningfully T), then position validation
(sortedness/dedup/range), then full shape consistency
(batch/heads/dim/dtype/device). Empty evicted list is the
identity case. ADR 0008 §6.2: no silent fallback on validation.
* compute_evicted_positions(seq_len, sink_size, window_size) ->
List[int]. Computes the contiguous range
[sink_size, seq_len - window_size) that the v0.4 architecture
evicts from the verifier's permanent cache. Returns [] when
sink+window covers the whole sequence (no eviction needed).
* _validate_positions, _validate_shapes — internal helpers,
separated for unit testability and to keep the public function
readable.
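The range arithmetic behind compute_evicted_positions can be sketched in a few lines of plain Python. This is an illustration of the description above, not the function shipped in kv_merge.py: the _sketch suffix marks the name as hypothetical, and raising ValueError on negative inputs is an assumption (the commit message only says negative inputs raise).

```python
from typing import List


def compute_evicted_positions_sketch(
    seq_len: int, sink_size: int, window_size: int
) -> List[int]:
    """Hypothetical sketch of the range [sink_size, seq_len - window_size)."""
    # Assumption: negative inputs are rejected with ValueError.
    if seq_len < 0 or sink_size < 0 or window_size < 0:
        raise ValueError(
            f"seq_len, sink_size and window_size must be >= 0, "
            f"got {seq_len}, {sink_size}, {window_size}"
        )

    start = sink_size              # first position after the sink
    stop = seq_len - window_size   # first position of the trailing window

    # When sink + window covers the whole sequence, nothing is evicted.
    if stop <= start:
        return []
    return list(range(start, stop))


# Typical case: 2 sink tokens, 3 window tokens, 10 tokens total.
print(compute_evicted_positions_sketch(10, 2, 3))  # [2, 3, 4, 5, 6]
# Sink + window covers the sequence: no eviction.
print(compute_evicted_positions_sketch(5, 2, 3))   # []
```

The result is always a sorted, duplicate-free, contiguous block, which is the shape of input the merge function's position validation expects.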
inference_engine/v04/__init__.py
Updated to re-export compute_evicted_positions and
merge_kv_at_evicted_positions. Public API now covers K1.A
(capture) + K1.B (merge); K1.C will add the verifier wrapper
on top.
tests/inference_engine/v04/test_kv_merge.py (366 lines, 39 cases,
all <0.10 s on Linux CI)
Test classes:
* TestComputeEvictedPositions — sink+window range arithmetic
(typical, no-eviction-when-covered, zero-sink, zero-window,
everything-evicted, seq_len=0 boundary, negative inputs raise).
* TestMergeKVHappyPath — basic correctness on small fixtures:
evicted positions get captured values bit-exactly; non-evicted
positions preserve local values bit-exactly; output shape matches
local; output is a clone (mutating it doesn't affect inputs).
* TestMergeKVPositionValidation — unsorted, duplicate, negative,
at-seqlen, beyond-seqlen positions all raise ValueError with
descriptive messages.
* TestMergeKVShapeValidation — local K/V mismatch, rank-3 inputs,
batch mismatch, num_kv_heads mismatch, head_dim mismatch,
captured T-dim mismatch, dtype mismatch, captured K/V internal
mismatch all raise ValueError.
* TestMergeKVDifferentiability — gradient flows through K_captured
/ V_captured (so K2/K3's learnable f_θ projection can be trained
end-to-end through the merge); gradient flows through K_local /
V_local at non-evicted positions; gradient is severed at evicted
positions on the local branch (those values are overridden by
the merge — this is a deliberate v0.4 boundary condition).
* TestMergeKVEdgeCases — empty evicted list returns clone of local;
all-positions-evicted gives merged == captured; single position;
boundary positions 0 and seqlen-1; consecutive position blocks
(the common case from compute_evicted_positions); B>1; bf16
dtype preservation.
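To make the position-validation cases listed above concrete, here is a small pure-Python sketch of the kind of check they exercise. It is only an approximation of the internal _validate_positions helper: the function name, signature, and error messages are hypothetical, and only the rejected categories (unsorted, duplicate, negative, at or beyond seq_len) come from the test list.

```python
from typing import Sequence


def validate_positions_sketch(evicted_positions: Sequence[int], seq_len: int) -> None:
    """Hypothetical check: positions must be strictly increasing and in [0, seq_len)."""
    prev = -1
    for i, pos in enumerate(evicted_positions):
        # Range check covers negative, at-seqlen and beyond-seqlen positions.
        if pos < 0 or pos >= seq_len:
            raise ValueError(
                f"evicted_positions[{i}] = {pos} is outside [0, {seq_len})"
            )
        # Strictly increasing order rules out both duplicates and unsorted input.
        if pos <= prev:
            kind = "duplicate" if pos == prev else "unsorted"
            raise ValueError(
                f"evicted_positions[{i}] = {pos} is {kind} (previous was {prev})"
            )
        prev = pos


# Each bad input below should raise; there is no silent fallback.
for bad in ([3, 1], [1, 1], [-1, 2], [5], [7]):
    try:
        validate_positions_sketch(bad, seq_len=5)
    except ValueError as exc:
        print(f"{bad}: {exc}")
```

An empty list passes the check, matching the identity case described for the merge function.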
Running tests/inference_engine/v04/ now passes all 71 cases (32 K1.A
+ 39 K1.B) in <0.10 s on Linux CI without any HF model download.
What's next:
K1.C — DLMRestoredVerifier wrapper that ties capture + merge into
the verifier's actual attention forward (Gemma3Attention
monkey-patch or subclass), reusing HF's apply_rotary_pos_emb
to apply RoPE post-merge. End-to-end inference path on
Linux smoke + Mac M4 NIAH validation.
K1.D — NIAH recall validation harness on Mac M4 against the v0.3
sink+window baseline and the full-attention oracle. ADR
0008 §11.8 v0.4 GA gate (a): NIAH mid-context recall ≥ 95 %
at 100 k-token context.
This PR's base branch is main (now containing PR #69 ADR 0008 v0.4
amendment and PR #70 K1.A K/V capture).
Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
1 parent 8dea55f commit 6e80430
3 files changed
Lines changed: 773 additions & 0 deletions