fix(disagg): unstuck decode aborts under prealloc pressure by whybeyoung · Pull Request #25561 · sgl-project/sglang

whybeyoung · 2026-05-18T03:53:52Z

Fix three abort-handling bugs that caused aborted decode requests to linger until WAITING_TIMEOUT instead of being released immediately.

decode.py _update_handshake_waiters: skip the early-return when any receiver has been flipped to KVPoll.Failed (e.g. by an abort), so aborted reqs are not held until transfer begins.
scheduler.py abort_request (DECODE): in addition to calling kv_receiver.abort(), mark req.finished_reason = FINISH_ABORT for reqs in prealloc/transfer queues, so pop_preallocated / pop_transferred actually drop them.
tokenizer_manager.py abort_request: always forward to the scheduler when tokenizer_worker_num > 1 (the local rid_to_state is per-worker and load balancing may route abort to a non-owner). Add a guard against empty rid being treated as a startswith-prefix match for every request.
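To illustrate the third fix, here is a minimal sketch of the two checks, not the actual tokenizer_manager.py code. The helper names `matching_rids` and `should_forward_abort` are hypothetical, and `rid_to_state` is modeled as a plain dict; only `rid_to_state`, `tokenizer_worker_num`, and the startswith-prefix match come from the description above.

```python
from typing import Dict, List


def matching_rids(rid_to_state: Dict[str, object], rid: str) -> List[str]:
    """Return the locally tracked request ids that an abort for `rid` would hit."""
    if not rid:
        # Every string starts with "", so without this guard an empty rid
        # would prefix-match (and abort) every tracked request.
        return []
    return [r for r in rid_to_state if r.startswith(rid)]


def should_forward_abort(
    rid_to_state: Dict[str, object], rid: str, tokenizer_worker_num: int
) -> bool:
    """Decide whether this worker should forward the abort to the scheduler."""
    if tokenizer_worker_num > 1:
        # rid_to_state is per-worker, so a local miss does not mean that
        # no other worker owns the request.
        return True
    return len(matching_rids(rid_to_state, rid)) > 0
```

With a single tokenizer worker a local miss is authoritative, so the sketch only forwards on a match; with several workers it forwards unconditionally and leaves the lookup to the scheduler.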
CC @ShangmingCai

CI States

Latest PR Test (Base): Run #26017657590
Latest PR Test (Extra): ⚠️ Not enabled — add run-ci-extra label to opt in.

Fix three abort-handling bugs that caused aborted decode requests to linger until WAITING_TIMEOUT (~15min) instead of being released immediately.

1. decode.py _update_handshake_waiters: skip the early-return when any receiver has been flipped to KVPoll.Failed (e.g. by an abort), so aborted reqs are not held until transfer begins.
2. scheduler.py abort_request (DECODE): in addition to calling kv_receiver.abort(), mark req.finished_reason = FINISH_ABORT for reqs in prealloc/transfer queues, so pop_preallocated / pop_transferred actually drop them.
3. tokenizer_manager.py abort_request: always forward to the scheduler when tokenizer_worker_num > 1 (the local rid_to_state is per-worker and load balancing may route abort to a non-owner). Add a guard against empty rid being treated as a startswith-prefix match for every request.
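The first fix can be pictured with a small sketch. It is only an illustration under stated assumptions: `Poll` is a stand-in for `KVPoll`, and the `has_new_progress` flag that triggers the early return is hypothetical, since the real condition in `_update_handshake_waiters` is not shown in this PR text. The point is only the ordering: check for failed receivers before taking the fast path.

```python
from enum import Enum, auto
from typing import List


class Poll(Enum):
    """Stand-in for KVPoll; only the states needed for the sketch."""
    FAILED = auto()
    BOOTSTRAPPING = auto()
    WAITING_FOR_INPUT = auto()


def update_waiters(polls: List[Poll], has_new_progress: bool) -> List[int]:
    """Return the indices of receivers that should be released now."""
    any_failed = any(p is Poll.FAILED for p in polls)
    if not has_new_progress and not any_failed:
        # Fast path: nothing changed and nothing failed, so there is no work.
        return []
    # A failed receiver (e.g. one flipped by an abort) is released right away
    # instead of waiting for the transfer to begin.
    return [i for i, p in enumerate(polls) if p is Poll.FAILED]
```

Without the `any_failed` check, the call with no new progress would return early and the aborted receiver would stay queued.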

whybeyoung · 2026-05-18T03:54:23Z

/tag-and-rerun-ci

ShangmingCai · 2026-05-18T08:18:21Z

+                    if not isinstance(decode_req.req.finished_reason, FINISH_ABORT):
+                        decode_req.req.finished_reason = FINISH_ABORT()


nit: could be redundant since we have prepare_abort in PD module
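For readers following the snippet above, here is a rough sketch of what the guarded assignment achieves. `FinishAbort`, `Req`, `mark_aborted`, and `drop_aborted` are hypothetical stand-ins (for `FINISH_ABORT`, the request object, and the pop_preallocated / pop_transferred filtering); the real classes and queue code may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FinishAbort:
    """Stand-in for FINISH_ABORT; the message field is an assumption."""
    message: str = "Aborted"


@dataclass
class Req:
    rid: str
    finished_reason: Optional[object] = None


def mark_aborted(req: Req) -> None:
    # Same shape as the guard in the diff: an existing abort reason
    # (and whatever detail it carries) is kept, not overwritten.
    if not isinstance(req.finished_reason, FinishAbort):
        req.finished_reason = FinishAbort()


def drop_aborted(queue: List[Req]) -> List[Req]:
    """Remove aborted requests from the queue in place; return the survivors."""
    kept = [r for r in queue if not isinstance(r.finished_reason, FinishAbort)]
    queue[:] = kept
    return kept
```

Once a queued request carries an abort reason, the next pass over the queue can drop it instead of waiting for a timeout.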

ShangmingCai

LGTM

ShangmingCai · 2026-05-18T14:57:11Z

CI has passed.

whybeyoung requested review from ByronHsu, ShangmingCai, Ying1123, hnyls2002, merrymercy and xiezhq-hermann as code owners May 18, 2026 03:53

github-actions Bot added the run-ci label May 18, 2026

chore(disagg): trim verbose abort-fix comments

51ea446

ShangmingCai reviewed May 18, 2026

Comment thread python/sglang/srt/managers/tokenizer_manager.py Outdated

chore(tokenizer): drop unclear multi-worker comment

b958bbb

ShangmingCai reviewed May 18, 2026

ShangmingCai approved these changes May 18, 2026

sgl-project deleted a comment from github-actions Bot May 18, 2026

ShangmingCai merged commit d1acd62 into sgl-project:main May 18, 2026
187 of 199 checks passed

Shunkangz pushed a commit to Shunkangz/sglang that referenced this pull request May 27, 2026

fix(disagg): unstuck decode aborts under prealloc pressure (sgl-proje…

a755e9a

…ct#25561)

ShangmingCai merged 3 commits into sgl-project:main from whybeyoung:fix_abort_multi

2 participants
