Commit 42afd8b

[Fix] Fix dump err when 0 < len(load_blocks) < fully hit (#927)
## Purpose

Fix the dump failure that occurs when len(load_blocks) < fully_hit in the external hit scenario with **use_lite=true**. The failure is caused by len(ucm_block_ids) < len(vllm_block_ids) during wait_for_save. This discrepancy arises because, in **_generate_dispatch_meta**, the **new_tokens** parameter still includes external hits, even though req_meta.token_processed has already accounted for those tokens.

## Modifications

- Modify func **get_num_new_matched_tokens** of UCMLiteConnector: use hbm_hit_tokens as request_meta.token_processed, because the function needs to return 0 hits in external storage.
- Modify func **_generate_dispatch_meta** of UCMLiteConnector: generate dump block infos only when req_meta.token_processed + new_tokens >= total_hit_tokens.

## Test

Tested with the online llmperf pipeline.
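The gating condition in the second modification can be sketched as follows. This is a minimal illustration, not the connector's code: the helper name `should_generate_dump` and the token counts are hypothetical, and only the comparison itself comes from the commit message.

```python
def should_generate_dump(
    token_processed: int, new_tokens: int, total_hit_tokens: int
) -> bool:
    """Hypothetical helper: True once the scheduled tokens reach the end of the hit prefix."""
    return token_processed + new_tokens >= total_hit_tokens


# Assumed numbers: 64 tokens hit in total, 32 of them already processed.
print(should_generate_dump(32, 16, 64))  # False: 48 tokens, still inside the hit prefix
print(should_generate_dump(32, 32, 64))  # True: 64 tokens, the hit prefix is fully covered
```

Under this reading, a step that only loads part of the hit blocks produces no dump block infos, which is the case the commit title describes as 0 < len(load_blocks) < fully hit.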
1 parent 3a11f16 commit 42afd8b

1 file changed

Lines changed: 11 additions & 1 deletion

ucm/integration/vllm/ucm_connector.py

@@ -1116,8 +1116,18 @@ def get_num_new_matched_tokens(self, request, num_computed_tokens):
         external_hit_blocks = (
             request_meta.total_hit_block_num - request_meta.hbm_hit_block_num
         )
+        need_dump_blks = request_meta.ucm_block_ids[
+            request_meta.total_hit_block_num :
+        ]
+        shard_indexs = [0] * len(need_dump_blks)
+        total_ptrs = [[0]] * len(need_dump_blks)
+        try:
+            task = self.store.dump_data(need_dump_blks, shard_indexs, total_ptrs)
+            self.store.wait(task)
+        except RuntimeError as e:
+            logger.error(f"request {request.request_id} wait dump task error. {e}")
+            self.requests_meta[request.request_id] = RequestMeta()
 
-        request_meta.total_hit_block_num = request_meta.hbm_hit_block_num
         self.total_hit_block_nums += external_hit_blocks
 
         logger.info(
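
The added lines in the hunk above slice the block IDs past the hit prefix and build two placeholder lists of the same length. A standalone sketch, with made-up block IDs and hit count, of what those three expressions evaluate to in plain Python:

```python
# Hypothetical values; in the connector these come from request_meta.
ucm_block_ids = ["b0", "b1", "b2", "b3", "b4"]
total_hit_block_num = 3

# Everything after the hit prefix still needs to be dumped.
need_dump_blks = ucm_block_ids[total_hit_block_num:]
shard_indexs = [0] * len(need_dump_blks)
total_ptrs = [[0]] * len(need_dump_blks)

print(need_dump_blks)  # ['b3', 'b4']
print(shard_indexs)    # [0, 0]
print(total_ptrs)      # [[0], [0]]

# List repetition copies references, so every entry is the same inner list.
print(total_ptrs[0] is total_ptrs[1])  # True
```

The shared inner list is harmless as long as the entries are read-only placeholders. If a caller mutated one entry, a comprehension such as `[[0] for _ in need_dump_blks]` would give each block its own list.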
