Commit e23a541
Revert DSv4 B200/B300 TRT (non-MTP) to 2dd03e6 + disable SWA scratch reuse

The c914d6d image's kv_cache_manager_v2 patch was wrong: freeing SWA scratch
slots on the attention-DP revert->resize(shrink) path hits finish_event=None
(a deferred request never forwarded), crashing every dpa=true job and hanging
the engine. Root cause is a V2-scheduler / SWA-scratch-reuse conflict: the V2
scheduler grows a context request's KV cache (incl. SWA scratch) before delay
batching can defer it, so revert_allocate_context -> resize(shrink) must release
scratch slots that have no finish_event.

Revert both non-MTP images to feat-deepseek_v4-2dd03e6 and set
TRTLLM_DSV4_ENABLE_SWA_SCRATCH_REUSE=0 in the launchers so no scratch slots are
allocated and the revert shrinks cleanly. MTP configs untouched (9aa3715).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

1 parent 242ab88 commit e23a541
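
The conflict described in the commit message can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyRequest`, `grow_context`, `revert_and_shrink`, and `defer` are hypothetical names invented for illustration and are not the kv_cache_manager_v2 API. The `scratch_reuse` argument stands in for the effect of TRTLLM_DSV4_ENABLE_SWA_SCRATCH_REUSE, whose real parsing is not shown in the commit.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyRequest:
    # Hypothetical stand-in: finish_event stays None until the request
    # is actually forwarded.
    finish_event: Optional[object] = None
    kv_blocks: int = 0
    scratch_slots: List[int] = field(default_factory=list)


def grow_context(req: ToyRequest, scratch_reuse: bool) -> None:
    """Scheduler step: grow the KV cache before any deferral decision."""
    req.kv_blocks += 4
    if scratch_reuse:
        req.scratch_slots.extend([0, 1])  # SWA scratch is allocated as well


def revert_and_shrink(req: ToyRequest) -> None:
    """Deferral path: undo the growth. Releasing scratch needs a finish_event."""
    if req.scratch_slots and req.finish_event is None:
        raise RuntimeError("scratch slot to release has no finish_event")
    req.scratch_slots.clear()
    req.kv_blocks = 0


def defer(scratch_reuse: bool) -> str:
    """Grow a context request, then defer it before it is ever forwarded."""
    req = ToyRequest()
    grow_context(req, scratch_reuse)
    try:
        revert_and_shrink(req)
    except RuntimeError as exc:
        return f"crash: {exc}"
    return "clean shrink"


print(defer(scratch_reuse=True))   # crash: scratch slot to release has no finish_event
print(defer(scratch_reuse=False))  # clean shrink
```

With scratch reuse off, the deferred request never owns scratch slots, so the shrink has nothing that depends on a finish_event. That is the behavior the commit relies on when it sets the flag to 0.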
4 files changed
Lines changed: 17 additions & 3 deletions
File tree
- .github/configs
- benchmarks/single_node/fixed_seq_len
Diff summary (changed-line content not preserved), files in page order:
- File 1: line 1804 changed (-1/+1); line 3052 changed (-1/+1)
- File 2: 7 lines added (new lines 50-56)
- File 3: 7 lines added (new lines 62-68)
- File 4: line 3385 changed (-1/+1)