refactor: refactor instance_mgr, split components. #51
magicheng0816 wants to merge 1 commit into jd-opensource:main
Code Review
This pull request refactors the InstanceMgr by decomposing its responsibilities into three specialized components: InstanceMetrics, InstanceTopology, and InstanceKVCache. This modularization improves the management of cluster state, metrics tracking, and KV cache locations. As part of this change, GlobalKVCacheMgr has been renamed and integrated into InstanceMgr, and load balancing policies have been updated to use the refactored interfaces. A critical regression was identified in the shared round-robin routing logic, which now strictly requires a decode instance to be present for scheduling to succeed; a suggestion has been provided to restore the previous behavior and allow scheduling to proceed with only prefill instances.
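To make the regression concrete, here is a minimal sketch of a round-robin selection that still succeeds when only prefill instances are registered. Everything in it is an assumption for illustration: `Routing`, `RoundRobinPicker`, and `select` are hypothetical names, not the identifiers used in this PR, and the sketch ignores locking and instance health.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical result type: the instances chosen for one request.
struct Routing {
  std::string prefill_name;
  std::optional<std::string> decode_name;  // stays empty when no decode instance exists
};

// Hypothetical round-robin picker (not the project's actual class).
class RoundRobinPicker {
 public:
  // Fails (returns std::nullopt) only when there is no prefill instance.
  // A missing decode instance is tolerated, not treated as an error.
  std::optional<Routing> select(const std::vector<std::string>& prefill,
                                const std::vector<std::string>& decode) {
    if (prefill.empty()) {
      return std::nullopt;
    }
    Routing routing;
    routing.prefill_name = prefill[next_prefill_ % prefill.size()];
    ++next_prefill_;
    if (!decode.empty()) {
      routing.decode_name = decode[next_decode_ % decode.size()];
      ++next_decode_;
    }
    return routing;
  }

 private:
  std::size_t next_prefill_ = 0;  // rotating cursor over prefill instances
  std::size_t next_decode_ = 0;   // rotating cursor over decode instances
};
```

The point of the sketch is the asymmetry: an empty prefill list is a hard failure, while an empty decode list only leaves `decode_name` unset, so the caller can still schedule the request.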
The instance manager currently carries too many responsibilities: new logic keeps being added to it, and there are no clear boundaries between its parts. It therefore needs to be split up: