Commit 58447b2
authored
[MLX] Support multiple KV cache sessions, with shared constant data (#20408)
MLX backend already has mutable state in a separate execution context
from its constant data. This PR exposes a way to configure that for
external callers, and uses this to support serve.py on MLX like CUDA
backend.1 parent 4433d1f commit 58447b2
15 files changed
Lines changed: 888 additions & 61 deletions
File tree
- .github/workflows
- backends/mlx
- custom_kernel_ops
- test
- runtime
- test
- examples/models/qwen3_5_moe
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
66 | 66 | | |
67 | 67 | | |
68 | 68 | | |
69 | | - | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
70 | 74 | | |
71 | 75 | | |
72 | 76 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
255 | 255 | | |
256 | 256 | | |
257 | 257 | | |
258 | | - | |
259 | | - | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
260 | 262 | | |
261 | 263 | | |
262 | 264 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
53 | 53 | | |
54 | 54 | | |
55 | 55 | | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
56 | 65 | | |
57 | 66 | | |
58 | 67 | | |
| |||
101 | 110 | | |
102 | 111 | | |
103 | 112 | | |
| 113 | + | |
104 | 114 | | |
105 | 115 | | |
106 | 116 | | |
| |||
450 | 460 | | |
451 | 461 | | |
452 | 462 | | |
| 463 | + | |
| 464 | + | |
| 465 | + | |
| 466 | + | |
| 467 | + | |
| 468 | + | |
| 469 | + | |
| 470 | + | |
| 471 | + | |
| 472 | + | |
| 473 | + | |
| 474 | + | |
| 475 | + | |
| 476 | + | |
| 477 | + | |
| 478 | + | |
| 479 | + | |
| 480 | + | |
| 481 | + | |
| 482 | + | |
| 483 | + | |
| 484 | + | |
| 485 | + | |
| 486 | + | |
| 487 | + | |
| 488 | + | |
| 489 | + | |
453 | 490 | | |
454 | 491 | | |
455 | 492 | | |
| |||
Lines changed: 2 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
96 | 96 | | |
97 | 97 | | |
98 | 98 | | |
99 | | - | |
100 | | - | |
101 | | - | |
| 99 | + | |
| 100 | + | |
102 | 101 | | |
103 | 102 | | |
104 | 103 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
9 | 9 | | |
10 | 10 | | |
11 | 11 | | |
| 12 | + | |
12 | 13 | | |
13 | 14 | | |
14 | 15 | | |
| |||
277 | 278 | | |
278 | 279 | | |
279 | 280 | | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
280 | 287 | | |
281 | 288 | | |
282 | 289 | | |
| |||
366 | 373 | | |
367 | 374 | | |
368 | 375 | | |
| 376 | + | |
| 377 | + | |
| 378 | + | |
| 379 | + | |
| 380 | + | |
| 381 | + | |
| 382 | + | |
| 383 | + | |
369 | 384 | | |
370 | 385 | | |
371 | 386 | | |
| |||
431 | 446 | | |
432 | 447 | | |
433 | 448 | | |
| 449 | + | |
434 | 450 | | |
435 | 451 | | |
436 | 452 | | |
| |||
0 commit comments