Commit 8dea20a

[KVCache][Scheduler] disable write_cache_to_storage* calls under cache manager v1
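## Motivation

Under cache manager v1, writing the KV cache back to storage is handled by v1's internal RadixTree mechanism, so the `write_cache_to_storage` / `write_cache_to_storage_decode` calls in resource_manager_v1 are redundant and should be skipped.

## Modifications

- resource_manager_v1.py: the two storage write-back calls on the preemption path (decode / non-decode) now carry the extra condition `and not self.enable_cache_manager_v1`, so they are no longer triggered under v1.
- cache_manager/v1/cache_manager.py: when prefix caching is not enabled, `request._match_result = MatchResult()` is now initialized, so later code does not access an unset attribute.

## Usage or Command

Start the service with `--enable-cache-manager-v1` to reproduce the fix:

```bash
python -m fastdeploy.entrypoints.openai.api_server \
    --enable-cache-manager-v1 \
    ...
```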
1 parent b50b6da commit 8dea20a

2 files changed

Lines changed: 3 additions & 2 deletions

fastdeploy/cache_manager/v1/cache_manager.py

Lines changed: 1 addition & 0 deletions
@@ -503,6 +503,7 @@ def match_prefix(
             None. Match result is stored in request._match_result.
         """
         if not self.enable_prefix_caching or self._radix_tree is None:
+            request._match_result = MatchResult()
             return

         with self._lock:
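
The early-return change in this hunk can be illustrated with a minimal, self-contained sketch. It is not the FastDeploy implementation: the `MatchResult` fields, the `Request` class, and the stripped-down `CacheManager` below are hypothetical stand-ins. They only show why assigning an empty result before returning keeps later reads of `request._match_result` from raising `AttributeError`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MatchResult:
    """Hypothetical stand-in: an empty result means no cached blocks matched."""

    matched_block_ids: List[int] = field(default_factory=list)


class Request:
    """Hypothetical request; _match_result is only ever set by match_prefix."""

    def __init__(self, request_id: str) -> None:
        self.request_id = request_id


class CacheManager:
    """Stripped-down sketch that keeps only the early-return path."""

    def __init__(self, enable_prefix_caching: bool) -> None:
        self.enable_prefix_caching = enable_prefix_caching
        self._radix_tree = None  # no tree is built in this sketch

    def match_prefix(self, request: Request) -> None:
        if not self.enable_prefix_caching or self._radix_tree is None:
            # Leave a well-defined (empty) result behind before returning early.
            request._match_result = MatchResult()
            return
        raise NotImplementedError("radix-tree lookup is out of scope for this sketch")


req = Request("req-0")
CacheManager(enable_prefix_caching=False).match_prefix(req)
# Safe to read now; without the assignment this line would raise AttributeError.
print(len(req._match_result.matched_block_ids))
```

Assigning a default object rather than `None` means callers can read the result's fields without an extra null check, which appears to be the intent of the one-line fix.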

fastdeploy/engine/sched/resource_manager_v1.py

Lines changed: 2 additions & 2 deletions
@@ -482,13 +482,13 @@ def _trigger_preempt(self, request, num_new_blocks, preempted_reqs, batch_reques
         del self.requests[preempted_req.request_id]
     if preempted_req.request_id in self.req_dict:
         del self.req_dict[preempted_req.request_id]
-    if envs.FD_SAVE_OUTPUT_CACHE_FOR_PREEMPTED_REQUEST:
+    if envs.FD_SAVE_OUTPUT_CACHE_FOR_PREEMPTED_REQUEST and not self.enable_cache_manager_v1:
         if self.config.cache_config.kvcache_storage_backend:
             self.cache_manager.write_cache_to_storage_decode(preempted_req)
     self._free_blocks(preempted_req)
     llm_logger.info(f"Preemption is triggered! Preempted request id: {preempted_req.request_id}")
 else:
-    if envs.FD_SAVE_OUTPUT_CACHE_FOR_PREEMPTED_REQUEST:
+    if envs.FD_SAVE_OUTPUT_CACHE_FOR_PREEMPTED_REQUEST and not self.enable_cache_manager_v1:
         if self.config.cache_config.kvcache_storage_backend:
             self.cache_manager.write_cache_to_storage(preempted_req)
     self._free_blocks(preempted_req)
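
The combined condition guarding both write-back calls can be read as a single predicate. The helper below is a hypothetical sketch, not a function in FastDeploy: its name and parameters are assumptions that mirror the three checks in the hunk (the `FD_SAVE_OUTPUT_CACHE_FOR_PREEMPTED_REQUEST` switch, `enable_cache_manager_v1`, and a configured `kvcache_storage_backend`).

```python
from typing import Optional


def should_write_back_on_preempt(
    save_output_cache: bool,
    enable_cache_manager_v1: bool,
    storage_backend: Optional[str],
) -> bool:
    """Hypothetical helper mirroring the nested conditions in _trigger_preempt.

    Write-back runs only if the save switch is on, cache manager v1 is off,
    and a storage backend is configured.
    """
    return bool(save_output_cache and not enable_cache_manager_v1 and storage_backend)


# "some-backend" is a placeholder, not a real backend name.
for v1_enabled in (False, True):
    decision = should_write_back_on_preempt(True, v1_enabled, "some-backend")
    print(f"enable_cache_manager_v1={v1_enabled}: write back = {decision}")
```

With v1 enabled the predicate is always false, which matches the commit's intent: the write-back is left to v1's own mechanism, while `_free_blocks` still runs unconditionally in both branches.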
