Improve host_source locking and ring refresh concurrency#1944
Open
nanassito wants to merge 2 commits intoapache:trunkfrom
Open
Improve host_source locking and ring refresh concurrency#1944nanassito wants to merge 2 commits intoapache:trunkfrom
nanassito wants to merge 2 commits intoapache:trunkfrom
Conversation
Remove ringDescriber mutex around GetHosts I/O; prevHosts/prevPartitioner were never written. Use read locks for ConnectAddressAndPort and a read-fast path for HostnameAndPort. Read peer validity under a single RLock in isValidPeer. Release errorBroadcaster mutex before sending to listeners. Made-with: Cursor
worryg0d
reviewed
Apr 20, 2026
Contributor
worryg0d
left a comment
There was a problem hiding this comment.
Hey, I reviewed this patch - jfyi, I'm not an apache committer, so I'm just writing comments.
HostInfo methods might be on a hot path, so it is fine to narrow write-lock to read-lock where it is possible.
Having a benchmark would be useful to see the performance impact.
Comment on lines
-488
to
-490
| mu sync.Mutex | ||
| prevHosts []*HostInfo | ||
| prevPartitioner string |
Contributor
There was a problem hiding this comment.
I just realized they are never used
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

We discovered during a load test of one of our service that there is a lot of locking going on in
host_source.go. It turns out that a few can be optimized.Here are the set of changes in this PR:
1. ringDescriber / GetHosts
The mutex surrounded all of getLocalHostInfo and getClusterPeerInfo (control-connection I/O). prevHosts / prevPartitioner were never written anywhere, so error returns were always nil / "". The mutex only serialized unrelated ring refreshes for no benefit. Removed mu, prevHosts, and prevPartitioner, and return nil, "", err on failure so concurrent GetHosts no longer block each other on network work.
2. HostInfo.ConnectAddressAndPort
It only reads fields; it used Lock and blocked writers. Switched to RLock so it matches other read-only accessors.
This is the most costly one exposed by our load test.
3. HostInfo.HostnameAndPort
Uses a read-optimized path: RLock when hostname is already set (common case); Lock only when lazily filling hostname.
4. isValidPeer
Previously mixed RPCAddress() (under lock) with direct reads of hostId, dataCenter, etc. (racy). One RLock covers all fields and avoids extra lock cycles from a separate RPCAddress call.
5. errorBroadcaster.broadcast
Stopped holding the mutex while sending on listener channels (could stall other newListener / broadcast callers). Snapshot and clear listeners under the lock, then unlock and notify.