Skip to content

fix: close head tracker/broadcaster before balance monitor to prevent race#425

Merged
jmank88 merged 1 commit intodevelopfrom
fix/chain-close-shutdown-order
Apr 16, 2026
Merged

fix: close head tracker/broadcaster before balance monitor to prevent race#425
jmank88 merged 1 commit intodevelopfrom
fix/chain-close-shutdown-order

Conversation

@Fletch153
Copy link
Copy Markdown
Contributor

Summary

Fixes the root cause of a long-standing data race in go_core_race_tests by reordering chain.Close() to stop event sources before event consumers.

Root Cause

chain.Close() stopped the balance monitor (event consumer) while the headTracker and headBroadcaster (event sources) were still running. This allowed a last-second OnNewLongestChain callback to wake the balance monitor's SleeperTask during shutdown:

Before (broken order):

1. logPoller.Close()
2. balanceMonitor.Close()    ← consumer stopped
3. logBroadcaster.Close()
4. headTracker.Close()       ← source still running at step 2!
5. headBroadcaster.Close()   ← source still running at step 2!
6. txm.Close()

The race sequence:

  1. chain.Close() calls balanceMonitor.Close()sleeperTask.Stop() → closes chStop
  2. Meanwhile, headBroadcaster (still alive) delivers OnNewLongestChainsleeperTask.WakeUp() → queues work in chQueue
  3. In workerLoop's select, both chStop and chQueue are ready — Go picks randomly
  4. 50% chance: picks chQueue, runs one more Work(ctx) → calls BalanceAt on the mock client
  5. Concurrently, CtxCancel goroutine fires cancel() on the context
  6. testify's callString() reads context via fmt.Sprintf("%#v", ctx) while cancel() writes → DATA RACE

Fix

Move headTracker and headBroadcaster shutdown before balanceMonitor:

After (correct order):

1. logPoller.Close()
2. logBroadcaster.Close()
3. headTracker.Close()       ← sources stopped FIRST
4. headBroadcaster.Close()   ← sources stopped FIRST
5. balanceMonitor.Close()    ← consumer stopped safely
6. txm.Close()

No new heads can arrive → no OnNewLongestChain → no last-second WakeUp() → no race.

Also fixed merr = c.balanceMonitor.Close()merr = multierr.Combine(merr, ...) which was silently dropping prior errors.

Evidence

Reorder chain.Close() to stop event sources (headTracker,
headBroadcaster) before event consumers (balanceMonitor). Previously,
the balance monitor was closed while the head pipeline was still
running, allowing headBroadcaster to deliver a last-second
OnNewLongestChain that wakes the balance monitor's SleeperTask.
When SleeperTask.Stop() then closes chStop, both chStop and chQueue
are ready in the select — Go picks randomly, and 50% of the time
runs one more Work() that calls BalanceAt on a torn-down mock client,
triggering a data race detected by go_core_race_tests.
@Fletch153 Fletch153 requested a review from a team as a code owner April 16, 2026 14:33
@github-actions
Copy link
Copy Markdown
Contributor

👋 Fletch153, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

@github-actions
Copy link
Copy Markdown
Contributor

📊 API Diff Results

No changes detected for module github.com/smartcontractkit/chainlink-evm

View full report

@jmank88 jmank88 requested a review from pavel-raykov April 16, 2026 14:44
@jmank88 jmank88 merged commit a43767d into develop Apr 16, 2026
35 checks passed
@jmank88 jmank88 deleted the fix/chain-close-shutdown-order branch April 16, 2026 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants