fix: close head tracker/broadcaster before balance monitor to prevent race #425
Merged
Reorder chain.Close() to stop event sources (headTracker, headBroadcaster) before event consumers (balanceMonitor). Previously, the balance monitor was closed while the head pipeline was still running, allowing headBroadcaster to deliver a last-second OnNewLongestChain that wakes the balance monitor's SleeperTask. When SleeperTask.Stop() then closes chStop, both chStop and chQueue are ready in the select — Go picks randomly, and 50% of the time runs one more Work() that calls BalanceAt on a torn-down mock client, triggering a data race detected by go_core_race_tests.
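The select behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the actual SleeperTask code: `pickCounts` is a hypothetical helper, and only the channel names `chStop` and `chQueue` are taken from the description. It readies both channels before each `select` and counts which case runs.

```go
package main

import "fmt"

// pickCounts readies two channels n times and records which case select
// takes. The names chStop and chQueue mirror the description above; the
// function itself is a hypothetical stand-in, not SleeperTask code.
func pickCounts(n int) (stops, works int) {
	for i := 0; i < n; i++ {
		chStop := make(chan struct{})
		chQueue := make(chan struct{}, 1)

		chQueue <- struct{}{} // a late WakeUp() has queued work
		close(chStop)         // Stop() has closed the stop channel

		// Both cases are ready, so the runtime chooses one at random.
		select {
		case <-chStop:
			stops++
		case <-chQueue:
			works++ // corresponds to one more Work() call after Stop()
		}
	}
	return stops, works
}

func main() {
	stops, works := pickCounts(10000)
	fmt.Printf("stop chosen: %d, work chosen: %d\n", stops, works)
}
```

Both counters end up non-zero and roughly equal, which is why the extra Work() call shows up only intermittently in CI.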
jmank88
approved these changes
Apr 16, 2026
pavel-raykov
approved these changes
Apr 16, 2026
Summary
Fixes the root cause of a long-standing data race in `go_core_race_tests` by reordering `chain.Close()` to stop event sources before event consumers.
Root Cause
`chain.Close()` stopped the balance monitor (event consumer) while the headTracker and headBroadcaster (event sources) were still running. This allowed a last-second `OnNewLongestChain` callback to wake the balance monitor's `SleeperTask` during shutdown:
Before (broken order):
The race sequence:
1. `chain.Close()` calls `balanceMonitor.Close()` → `sleeperTask.Stop()` → closes `chStop`
2. headBroadcaster delivers a last-second `OnNewLongestChain` → `sleeperTask.WakeUp()` → queues work in `chQueue`
3. In `workerLoop`'s `select`, both `chStop` and `chQueue` are ready; Go picks randomly
4. When it picks `chQueue`, the worker runs one more `Work(ctx)` → calls `BalanceAt` on the mock client
5. `CtxCancel` goroutine fires `cancel()` on the context
6. `callString()` reads context via `fmt.Sprintf("%#v", ctx)` while `cancel()` writes → DATA RACE
Fix
Move headTracker and headBroadcaster shutdown before balanceMonitor:
After (correct order):
No new heads can arrive → no `OnNewLongestChain` → no last-second `WakeUp()` → no race.
Also fixed
`merr = c.balanceMonitor.Close()` → `merr = multierr.Combine(merr, ...)`. The plain assignment was silently dropping prior errors.
Evidence
- Core Tests (`go_core_race_tests`), test: `TestJobsController_Index_HappyPath` (leaked goroutine from earlier test)
- `monitor.(*worker).checkAccountBalance` and `services.StopRChan.CtxCancel.func1`