Skip to content

refactor: refactor instance managers and load balance policy.#53

Open
magicheng0816 wants to merge 3 commits intojd-opensource:mainfrom
magicheng0816:add_group_manager
Open

refactor: refactor instance managers and load balance policy.#53
magicheng0816 wants to merge 3 commits intojd-opensource:mainfrom
magicheng0816:add_group_manager

Conversation

@magicheng0816
Copy link
Copy Markdown
Collaborator

Currently, the responsibilities of the instance manager are too heavy, with increasing logic being added, and there is no clear boundary between various parts of the logic. Therefore, the instance manager needs to be split up:

instance_topology, responsible for instance registration and status maintenance; 2. instance_metrics, responsible for collecting and reporting instance metrics; 3. scheduling strategy, which is gradually being migrated to loadbalance_policy, but not completely yet, and requires further refactoring in the future.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the InstanceMgr by decomposing it into three specialized components: InstanceTopologyImpl, InstanceMetricsImpl, and InstanceKVCache. This improves modularity and clarifies the responsibilities within the scheduler's management layer. I have identified two significant issues: the current implementation relies on InstanceMgr directly accessing and locking private mutexes of its components, which violates encapsulation, and the use of -1 as a sentinel value for an unsigned uint64_t index is non-portable and error-prone.

Comment on lines +155 to +157
std::shared_lock<std::shared_mutex> topo_lock(topology_impl_->cluster_mutex_);
std::shared_lock<std::shared_mutex> metrics_lock(
metrics_impl_->metrics_mutex_);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This implementation directly accesses and locks private mutexes (cluster_mutex_, metrics_mutex_) from child components (topology_impl_, metrics_impl_). While this works because InstanceMgr is a friend class, it breaks encapsulation and creates a strong, hidden coupling between InstanceMgr and the internal implementation of its components.

This makes the code harder to maintain and reason about. Future changes to the locking strategy in InstanceTopologyImpl or InstanceMetricsImpl could break InstanceMgr in non-obvious ways.

A cleaner approach would be for InstanceMgr to coordinate the components through their public interfaces. For example, the mutexes could be owned by InstanceMgr and passed by reference to the components' constructors. This would make the ownership and locking strategy explicit and centralized in the InstanceMgr facade.

const std::string& name,
const InstanceMetaInfo& info) {
uint64_t index = info.instance_index;
if (index == static_cast<uint64_t>(-1)) return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using -1 as a sentinel value for an unsigned type like uint64_t is problematic. The conversion of -1 to an unsigned type is implementation-defined before C++20. While it might work on systems using two's complement arithmetic, it's not portable and can lead to subtle bugs.

It's better to use std::numeric_limits<uint64_t>::max() as the sentinel value. This requires including the <limits> header.

I recommend changing the check here, and also updating the default value for instance_index in InstanceMetaInfo (in common/types.h) to std::numeric_limits<uint64_t>::max() for consistency and correctness. You will need to #include <limits> in this file.

Suggested change
if (index == static_cast<uint64_t>(-1)) return;
if (index == std::numeric_limits<uint64_t>::max()) return;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant