[https://nvbugs/6330273][fix] In StorageManager.__init__, when typical_batch is supplied, append a synthetic… by tensorrt-cicd · Pull Request #15465 · NVIDIA/TensorRT-LLM

tensorrt-cicd · 2026-06-17T21:37:36Z

Summary

Root cause: typical_batch concurrency was never used to floor _min_slots, so windowed pool groups (window_size < tokens_per_block) collapsed to non_stale=1 per request and the absolute floor of 1 slot per pool group caused scheduler deadlock at concurrency > 1.
Fix: In StorageManager.__init__, when typical_batch is supplied, append a synthetic BatchDesc of len(typical_batch.kv_caches) decode requests with capacity=tokens_per_block, history_length=tokens_per_block-1 (yielding non_stale=1 in every PG) before computing _min_slots. This floors every pool group at len(typical_batch.kv_caches).
Automated fix generated by repair-bot
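
The flooring argument in the summary can be sketched with a toy model. Everything here is hypothetical: `toy_min_slots` only stands in for `_compute_min_slots_from_constraints` (its real signature is not shown in this PR), and each request is reduced to its non-stale block count in a single pool group.

```python
from typing import List, Sequence


def toy_min_slots(constraints: Sequence[Sequence[int]]) -> int:
    """Toy stand-in for the min-slots computation of ONE pool group.

    Each constraint is a batch, given as a list of per-request
    non-stale block counts (an assumption made for illustration).
    """
    needed = [sum(batch) for batch in constraints]
    # Absolute floor of 1 slot per pool group, as described in the root cause.
    return max([1, *needed])


def with_typical_batch_floor(
    constraints: Sequence[Sequence[int]], num_typical_requests: int
) -> List[List[int]]:
    """Append a synthetic batch: one decode request per typical_batch KV cache,
    each holding exactly one non-stale block."""
    return [*map(list, constraints), [1] * num_typical_requests]


before = toy_min_slots([])                               # no constraints: only the floor of 1
after = toy_min_slots(with_typical_batch_floor([], 8))   # floor raised to the batch size
print(before, after)  # prints "1 8"
```

Because the result is a maximum over all constraints, the synthetic batch can only raise the floor; a user-supplied constraint that already needs more slots still wins.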

Test plan

Verify fix on the same GPU type as the original failure
Check for regressions in related tests

Links

Bug: https://nvbugs/6330273

Summary by CodeRabbit

Refactor
- Enhanced runtime resource allocation efficiency through improved constraint computation logic.

…id windowed-pool deadlock

When KVCacheManagerV2 is built with a typical_batch describing the working set (e.g., max_batch_size concurrent decode requests with capacity=max_seq_len), windowed pool groups whose window_size is smaller than tokens_per_block previously collapsed to min_slots=1 because get_stale_range() consumed all but one block per request, and _compute_min_slots_from_constraints() only enforced an absolute floor of 1 slot per pool group. With more than 1 concurrent decode request, the V2 scheduler could not find a free slot in windowed pools and deadlocked.

Synthesize a constraint from the typical_batch: one KVCacheDesc(capacity=tokens_per_block, history_length=tokens_per_block-1) per request. For every pool group, this yields non_stale=1 per request, so the new floor is len(typical_batch.kv_caches), large enough to support the scheduler's full concurrency.

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
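
The per-request arithmetic in the commit message can be illustrated with a simplified stale-range model. This is an assumption for illustration only, not the real `get_stale_range()`: a block is treated as stale when every token in it lies before `history_length - window_size`, and `toy_non_stale_blocks` is a hypothetical helper.

```python
from typing import Optional


def toy_non_stale_blocks(
    capacity: int,
    history_length: int,
    tokens_per_block: int,
    window_size: Optional[int],
) -> int:
    """Blocks a request must keep resident in one pool group (toy model)."""
    total_blocks = (capacity + tokens_per_block - 1) // tokens_per_block
    if window_size is None:
        # Full-attention pool group: nothing ever goes stale.
        return total_blocks
    # Tokens before this index have slid out of the attention window.
    first_live_token = max(0, history_length - window_size)
    # Only blocks that lie entirely before first_live_token are stale.
    stale_blocks = first_live_token // tokens_per_block
    return total_blocks - stale_blocks


TOKENS_PER_BLOCK = 32  # illustrative values, not taken from the PR
WINDOW = 8             # window_size < tokens_per_block

# Long decode request: the window keeps only the last block alive.
long_windowed = toy_non_stale_blocks(4096, 4095, TOKENS_PER_BLOCK, WINDOW)
# Synthetic request from the fix: one block, non-stale in every pool group.
synth_windowed = toy_non_stale_blocks(32, 31, TOKENS_PER_BLOCK, WINDOW)
synth_full = toy_non_stale_blocks(32, 31, TOKENS_PER_BLOCK, None)
print(long_windowed, synth_windowed, synth_full)  # prints "1 1 1"
```

Under this model the synthetic request contributes exactly one block to every pool group, windowed or not, which is why a batch of such requests yields a floor equal to the number of requests.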

coderabbitai · 2026-06-17T21:39:43Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 577c21a7-dd0a-4224-a923-bde5ffbb63fc

📥 Commits

Reviewing files that changed from the base of the PR and between 42a3e55 and 5740b31.

📒 Files selected for processing (1)

tensorrt_llm/runtime/kv_cache_manager_v2/_storage_manager.py

📝 Walkthrough

In StorageManager.__init__, a synthetic BatchDesc is now derived from typical_batch.kv_caches when present, using a single-decode KVCacheDesc per cache (capacity=tokens_per_block, history_length=tokens_per_block - 1). This synthetic constraint is appended to form effective_constraints, which is then passed to _compute_min_slots_from_constraints instead of the raw constraints or [].

Changes

StorageManager min-slots constraint synthesis

Layer / File(s)	Summary
Effective constraints computation in `StorageManager.__init__` `tensorrt_llm/runtime/kv_cache_manager_v2/_storage_manager.py`	Adds logic to build `effective_constraints` by appending a synthesized `BatchDesc` (one decode-step `KVCacheDesc` per KV cache from `typical_batch`) to the provided `constraints`, then passes `effective_constraints` to `_compute_min_slots_from_constraints` instead of `constraints or []`.
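
As a rough sketch of the change summarized in the table, the constraint assembly might look like the following. `ToyKVCacheDesc`, `ToyBatchDesc`, and `build_effective_constraints` are hypothetical stand-ins; the real `KVCacheDesc` and `BatchDesc` classes likely carry more fields, and the actual logic lives inline in `StorageManager.__init__`.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class ToyKVCacheDesc:
    capacity: int
    history_length: int


@dataclass(frozen=True)
class ToyBatchDesc:
    kv_caches: Sequence[ToyKVCacheDesc]


def build_effective_constraints(
    constraints: Optional[Sequence[ToyBatchDesc]],
    typical_batch: Optional[ToyBatchDesc],
    tokens_per_block: int,
) -> List[ToyBatchDesc]:
    # Copy so the caller's constraint list is never mutated.
    effective = list(constraints or [])
    if typical_batch is not None:
        # One decode-step cache per request: a single block with one free token.
        decode_step = ToyKVCacheDesc(
            capacity=tokens_per_block,
            history_length=tokens_per_block - 1,
        )
        effective.append(
            ToyBatchDesc(kv_caches=[decode_step] * len(typical_batch.kv_caches))
        )
    return effective


typical = ToyBatchDesc(kv_caches=[ToyKVCacheDesc(4096, 4095)] * 4)
effective = build_effective_constraints(None, typical, tokens_per_block=32)
print(len(effective), len(effective[0].kv_caches))  # prints "1 4"
```

When `typical_batch` is absent the function returns the same list as `constraints or []`, so behavior without a typical batch is unchanged in this sketch.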

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

[Bug]: KVCacheManagerV2: windowed pool min_slots underflow causes scheduler deadlock at high concurrency #15401: The synthetic KVCacheDesc(capacity=tokens_per_block, history_length=tokens_per_block - 1) constraint introduced here directly implements the fix proposed in that issue to prevent scheduler deadlock caused by underflowed _min_slots in windowed KV cache pools.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title references the NVBugs ticket and fix type, and describes the core change: appending a synthetic constraint in StorageManager.__init__ when typical_batch is supplied.
Description check	✅ Passed	The description provides root cause analysis, the specific fix implemented, test verification, and links to the bug. However, it lacks some template sections like explicit PR Checklist confirmation.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

lowsfer · 2026-06-24T06:02:45Z

A similar fix is already included in #15462.

tensorrt-cicd assigned lowsfer Jun 17, 2026

github-actions Bot assigned tensorrt-cicd Jun 17, 2026

lowsfer closed this Jun 24, 2026

tensorrt-cicd wants to merge 1 commit into NVIDIA:main from tensorrt-cicd:repair-bot-bug6330273

2 participants