Skip to content

Fix check-then-set race in Host.Start()#11

Merged
danielsobrado merged 2 commits into
mainfrom
copilot/fix-host-lifecycle-race
Mar 28, 2026
Merged

Fix check-then-set race in Host.Start()#11
danielsobrado merged 2 commits into
mainfrom
copilot/fix-host-lifecycle-race

Conversation

Copilot AI commented Mar 27, 2026

Copy link
Copy Markdown
Contributor

Host.Start() checked h.running under the lock but released it before setting h.running = true, leaving a window where concurrent callers could both observe running == false and both proceed through startup.

Changes

  • pkg/p2p/host/host.go — set h.running = true atomically within the same critical section as the guard check, eliminating the window entirely:
// Before
h.mu.Lock()
if h.running { h.mu.Unlock(); return err }
h.mu.Unlock()
// ... startup work ...
h.mu.Lock()
h.running = true   // too late — concurrent callers already passed the guard
h.mu.Unlock()

// After
h.mu.Lock()
if h.running { h.mu.Unlock(); return err }
h.running = true   // set under the same lock; concurrent callers now see true
h.mu.Unlock()
// ... startup work ...
  • pkg/p2p/host/host_lifecycle_test.go — adds focused tests covering the guard:
    • TestHostStart_GuardAlreadyRunning — single call on a running host is rejected
    • TestHostStop_IdempotentWhenNotRunning — Stop on a stopped host is a no-op
    • TestHostStart_ConcurrentGuard — 20 concurrent callers on a running host all receive errors (run with -race)
    • TestHostIsRunning_ThreadSafe — 50 concurrent IsRunning() reads produce no data race
Original prompt

Create a small follow-up implementation pull request in repository danielsobrado/P2P-Market-Data to fix a remaining lifecycle race after PR #8.

Reference PR: #8

Problem

pkg/p2p/host/host.go improved Start() by checking h.running, but the current implementation still appears to have a race window:

  • goroutine A can lock, see h.running == false, unlock
  • goroutine B can do the same before either goroutine sets h.running = true
  • both may proceed with startup work concurrently

This means double-start may still be possible under concurrent calls.

Goal

Make the host lifecycle transition around Start() atomic enough to prevent concurrent double-start.

Scope

Keep this PR intentionally small and focused.
Likely files:

  • pkg/p2p/host/host.go
  • a small related test file if practical

Required implementation work

  1. Fix the race in Host.Start() so concurrent callers cannot both proceed through startup.
  2. Preserve the improvements from PR Follow-up: host.go thread-safety, config validation, and targeted P2P tests after PR #6 #8 where possible.
  3. If needed, introduce a minimal lifecycle state or equivalent guard, but avoid a large refactor.
  4. Add a focused test if practical to demonstrate the guard against repeated/concurrent starts.

Acceptance criteria

  • The PR contains real code changes.
  • Host.Start() no longer has the obvious check-then-set race window.
  • Scope stays small and reviewable.
  • This is a real implementation PR, not a planning-only PR.

The following is the prior conversation context from the user's chat exploration (may be truncated):

Assistant: The pull request to implement the detailed fixes described in PR #6 is being worked on now.
You can track progress directly at PR #6.

Once complete, it will include implementation updates matching its description. Let me know if you need anything else!

User: Check what are the next natural steps

Assistant: Next natural steps for PR #6:

  1. Finish implementation review

  2. Review the actual diff against the PR description

    • Confirm the code changes really match the listed fixes:
      • script lifecycle
      • P2P accounting/validator/signing
      • migration safety
      • frontend state refresh
  3. Run validation before merge
    Natural checks to run:

    • Go tests
    • frontend build/tests if available
    • smoke test script start/stop flow
    • smoke test peer connection behavior
    • smoke test migration generation behavior
  4. Mark ready for review

    • Since it is still a draft, the next natural repo action is to convert it from draft to ready-for-review once validation looks good.
  5. Review and merge

Important note

I could confirm:

  • PR is open
  • draft = true
  • mergeable = true
  • changed files = 10
  • commits = 3

I could not confirm check run results from the data returned here, so verifying CI/test status is still a natural next step.

Short version

The next natural step is:

review diff → run tests/checks → mark ready for review → merge

If you want, I can next help you inspect whether the diff actually matches the PR description.

User: Do that

Assistant: PR #6’s diff mostly matches its description.

Files changed

The changed files I could confirm are:

  • pkg/scripts/executor.go
  • pkg/scripts/manager.go
  • pkg/scripts/scripts_test.go
  • pkg/peer/connection.go
  • pkg/peer/discovery/validator.go
  • pkg/peer/discovery/validator_test.go
  • pkg/p2p/message/message.go
  • scripts/proto_to_sql.py
  • frontend/src/components/data/utils/handleScripts.ts
  • frontend/src/App.tsx

What the diff clearly covers

1. Script lifecycle hardening

Covered well.

Evidence in diff:

  • pkg/scripts/executor.go
    • introduces scriptRun{cmd, done}
    • removes the old double-Wait() pattern
    • StopScript now does graceful stop → timeout → kill
    • ExecuteScript now returns result+error together
    • ExecuteScriptWithOutputCapture now trims output and returns parsing errors with raw output
    • Python path validation now checks binary naming and exec.LookPath
  • pkg/scripts/manager.go
    • dependency validation temp files moved out of the script dir via os.CreateTemp
  • pkg/scripts/scripts_test.go
    • adds tests for stop behavior, missing Python, temp-file handling, and non-nil result on execution error

Assessment: this part strongly matches the PR description.


2. P2P correctness fixes

Mostly covered.

Evidence in diff:

  • pkg/peer/connection.go
    • guards connection counter updates on actual membership
    • removes manual decrement from DisconnectPeer
  • pkg/peer/discovery/validator.go
    • returns error if all records ...

This pull request was created from Copilot chat.


⌨️ Start Copilot coding agent tasks without leaving your editor — available in VS Code, Visual Studio, JetBrains IDEs and Eclipse.

Copilot AI changed the title [WIP] Fix remaining lifecycle race in Host.Start() method Fix check-then-set race in Host.Start() Mar 27, 2026
Copilot AI requested a review from danielsobrado March 27, 2026 13:41
@danielsobrado danielsobrado marked this pull request as ready for review March 28, 2026 05:44
Copilot AI review requested due to automatic review settings March 28, 2026 05:44
@danielsobrado danielsobrado merged commit 013b44c into main Mar 28, 2026
1 check passed
@danielsobrado danielsobrado deleted the copilot/fix-host-lifecycle-race branch March 28, 2026 05:45

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a lifecycle race in Host.Start() where concurrent callers could both pass the running == false guard before running was set, potentially allowing double-start.

Changes:

  • Set h.running = true inside the same critical section as the h.running guard check in Host.Start().
  • Remove the late h.running = true assignment at the end of Start().
  • Add focused lifecycle tests for Start/Stop guard behavior and IsRunning() thread-safety.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
pkg/p2p/host/host.go Eliminates the check-then-set window by setting running under the same lock as the guard check.
pkg/p2p/host/host_lifecycle_test.go Adds tests for already-running start rejection, idempotent stop when not running, and concurrent IsRunning() reads.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/p2p/host/host.go
Comment thread pkg/p2p/host/host_lifecycle_test.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants