[Bugfix] AS block leaks #7890
|
Thanks for your contribution! |
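CI report generated from the following code (updated every 30 minutes):
1 Task overview: No Required task has failed so far, but 1 Required task is still running; wait for it to finish before deciding whether the PR can be merged. Among Optional tasks, 3 failed and 1 is running; these do not block merging and are for reference only.
2 Task status summary. Note on the log column: failed tasks use the tool-generated Job link; running tasks use the corresponding Workflow link.
2.1 Required tasks: 9/10 passed
2.2 Optional tasks: 27/31 passed
3 Failure details (required only): No required tasks failed. This round did not invoke |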
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #7890 +/- ##
==========================================
Coverage ? 64.03%
==========================================
Files ? 467
Lines ? 64965
Branches ? 9962
==========================================
Hits ? 41601
Misses ? 20540
Partials ? 2824
|
|
/re-run all failed |
|
/re-run all-failed |
PaddlePaddle-bot
left a comment
🤖 Paddle-CI-Agent | pr_review |
2026-05-25 17:10:25
📋 Review Summary
PR overview: Fixes an AS block leak on the enable_prefix_caching + hierarchical KV Cache path, caused by can_allocate_gpu_blocks being called without the reserved-block threshold computed by _get_can_schedule_prefill_threshold_block.
Scope of changes: fastdeploy/engine/sched/resource_manager_v1.py
Impact tags: [Scheduler] [KVCache]
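Issues
| Severity | File | Summary |
|---|---|---|
| 🟡 Suggestion | resource_manager_v1.py:1073 | No unit test covers this fix path |
| 📝 PR conventions | N/A | Title tag capitalization is off; several sections of the description template are empty |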
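📝 PR Convention Check
The title tag [Bugfix] should be written as [BugFix] (the form used in the official tag list); the Modifications, Usage or Command, and Accuracy Tests sections are empty and need to be filled in.
Suggested title (can be copied as-is):
[BugFix][Scheduler][KVCache] Fix AS block leaks in enable_prefix_caching + hierarchical KV cache path
Suggested PR description (can be copied as-is; it must reproduce the full structure of the checklist §D2 template):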
## Motivation
Fixes an AS block leak on the enable_prefix_caching + hierarchical KV Cache path, where `can_allocate_gpu_blocks` was called without the reserved-block threshold computed by `_get_can_schedule_prefill_threshold_block`.
## Modifications
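- `fastdeploy/engine/sched/resource_manager_v1.py`: in the two hierarchical KV Cache pre-check paths of `_allocate_decode_and_extend`, stop passing the raw block count directly; first call `_get_can_schedule_prefill_threshold_block` to compute a threshold that includes reserve_blocks, then pass that threshold to `can_allocate_gpu_blocks`, consistent with the main scheduling path.
- Added a Warning comment before the `_free_blocks` call, noting that calling `_free_blocks` before `update_cache_blocks` may leak storage blocks.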
## Usage or Command
N/A
## Accuracy Tests
N/A
## Checklist
- [x] Add at least a tag in the PR title.
- Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
- You can add new tags based on the PR content, but the semantics must be clear.
- [x] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. Please write the reason in this PR if no unit tests.
- [x] Provide accuracy results.
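- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.
Overall Assessment
The fix is logically correct: it aligns the hierarchical KV Cache pre-check path with the main scheduling path and removes the risk of AS block leaks. Consider adding unit tests under tests/scheduler/ that cover this fix path, and completing the PR description.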
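To illustrate the alignment described in the review, here is a minimal sketch of why a pre-check against the raw block count can admit a request that a reserve-aware threshold would reject. Everything in it is a toy model: `ToyCacheManager`, `threshold_with_reserve`, and the "needed + reserve" formula are assumptions made for illustration, not the actual FastDeploy implementation of `_get_can_schedule_prefill_threshold_block`.

```python
from dataclasses import dataclass


@dataclass
class ToyCacheManager:
    """Hypothetical stand-in for the cache manager; tracks only a free-block count."""

    free_gpu_blocks: int

    def can_allocate_gpu_blocks(self, num_blocks: int) -> bool:
        return self.free_gpu_blocks >= num_blocks


def threshold_with_reserve(needed_blocks: int, reserve_blocks: int) -> int:
    """Assumed threshold form: requested blocks plus a fixed reserve."""
    return needed_blocks + reserve_blocks


def precheck_raw(cache: ToyCacheManager, needed_blocks: int) -> bool:
    # Shape of the pre-fix check: the raw block count goes straight in.
    return cache.can_allocate_gpu_blocks(needed_blocks)


def precheck_with_threshold(
    cache: ToyCacheManager, needed_blocks: int, reserve_blocks: int
) -> bool:
    # Shape of the post-fix check: compute the threshold first, then check it.
    threshold = threshold_with_reserve(needed_blocks, reserve_blocks)
    return cache.can_allocate_gpu_blocks(threshold)


cache = ToyCacheManager(free_gpu_blocks=10)
raw_ok = precheck_raw(cache, needed_blocks=8)
threshold_ok = precheck_with_threshold(cache, needed_blocks=8, reserve_blocks=4)
```

With 10 free blocks, 8 requested, and a reserve of 4, the raw check passes while the threshold check does not, which is the kind of disagreement between the two paths that the fix removes.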
or self.config.cache_config.kvcache_storage_backend
):
if not self.cache_manager.can_allocate_gpu_blocks(
can_schedule_block_num_threshold = self._get_can_schedule_prefill_threshold_block(
🟡 Suggestion: this fix path has no corresponding unit test.
Consider adding test cases under tests/scheduler/ that cover the scenario enable_prefix_caching=True + num_cpu_blocks > 0 (or a non-empty kvcache_storage_backend), and that verify the threshold computed by _get_can_schedule_prefill_threshold_block correctly prevents the block leak.
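A rough sketch of the shape such a test could take is shown below. It runs against toy stand-ins rather than the real `ResourceManagerV1`: `ToyCacheConfig`, `ToyResourceManager`, the path condition, the "needed + reserve" threshold, and the `"placeholder"` backend name are all assumptions for illustration. A real test would construct the actual scheduler objects and call the real methods.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyCacheConfig:
    # Field names mirror those mentioned in the review; the class is hypothetical.
    enable_prefix_caching: bool
    num_cpu_blocks: int = 0
    kvcache_storage_backend: Optional[str] = None


@dataclass
class ToyResourceManager:
    config: ToyCacheConfig
    free_gpu_blocks: int
    reserve_blocks: int

    def on_hierarchical_path(self) -> bool:
        # Assumed condition, inferred from the scenario described in the review.
        c = self.config
        return c.enable_prefix_caching and (
            c.num_cpu_blocks > 0 or bool(c.kvcache_storage_backend)
        )

    def precheck(self, needed_blocks: int) -> bool:
        # Assumed threshold: requested blocks plus the reserve.
        return self.free_gpu_blocks >= needed_blocks + self.reserve_blocks


# The two configurations named in the review comment.
HIERARCHICAL_CONFIGS = [
    ToyCacheConfig(enable_prefix_caching=True, num_cpu_blocks=16),
    ToyCacheConfig(enable_prefix_caching=True, kvcache_storage_backend="placeholder"),
]


def test_precheck_rejects_when_reserve_would_be_consumed():
    for cfg in HIERARCHICAL_CONFIGS:
        rm = ToyResourceManager(cfg, free_gpu_blocks=10, reserve_blocks=4)
        assert rm.on_hierarchical_path()
        assert not rm.precheck(8)  # 8 + 4 exceeds the 10 free blocks


def test_precheck_accepts_when_reserve_stays_intact():
    for cfg in HIERARCHICAL_CONFIGS:
        rm = ToyResourceManager(cfg, free_gpu_blocks=12, reserve_blocks=4)
        assert rm.on_hierarchical_path()
        assert rm.precheck(8)  # 8 + 4 fits in the 12 free blocks
```

The functions follow the pytest naming convention, so a runner would pick them up unchanged once the toy classes are swapped for the real ones.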
Motivation
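Fixes an AS block leak on the enable_prefix_caching + hierarchical KV Cache path, where can_allocate_gpu_blocks was called without the reserved-block threshold computed by _get_can_schedule_prefill_threshold_block.
Scope of changes: fastdeploy/engine/sched/resource_manager_v1.py (scheduling resource management)
Impact tags: [Scheduler] [KVCache]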
Modifications
Usage or Command
Accuracy Tests
Checklist
- Tag list: [`[FDConfig]`, `[APIServer]`, `[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
- Format your code, run `pre-commit` before commit.
- If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.