Fix server hang after "Running scheduled diagnostics refresh" #3628
Conversation
- Make `RefreshDiagnostics` fire-and-forget: send the request but don't wait for a response (the client always returns null). Any response from the client is silently ignored by the read loop.
- Add a 1s timeout to `updateWatch` for `WatchFiles`/`UnwatchFiles` client requests to prevent holding `watchesMu` indefinitely if the client is slow or unresponsive.

Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/9d82d127-ea83-455b-bd36-297c9fe864b7
Co-authored-by: andrewbranch <3277153+andrewbranch@users.noreply.github.com>
Pull request overview
Addresses an LSP server hang triggered by “Running scheduled diagnostics refresh” by removing an unnecessary client-response wait and bounding how long watch registration holds a critical mutex.
Changes:
- Make `RefreshDiagnostics` send `workspace/diagnostic/refresh` as fire-and-forget (no response channel / no wait).
- Add a 1s timeout for `WatchFiles`/`UnwatchFiles` client requests made while holding `session.watchesMu`.
- Introduce a `sendClientRequestFireAndForget` helper for non-essential client responses.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| internal/project/session.go | Adds a watch request timeout context for client watcher register/unregister calls to avoid indefinite mutex holds. |
| internal/lsp/server.go | Changes diagnostics refresh to fire-and-forget and adds a helper to send client requests without awaiting responses. |
@copilot fix review comments
…lock, rollback + retry on WatchFiles failure

Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/b1cea66d-c309-4c26-97cc-f815ab98e5fe
Co-authored-by: andrewbranch <3277153+andrewbranch@users.noreply.github.com>
Addressed all review comments in
…apshot updates re-attempt naturally

Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/02f6f16f-047c-455f-981d-b83603fc1df8
Co-authored-by: andrewbranch <3277153+andrewbranch@users.noreply.github.com>
Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/b572b8f8-1888-4b53-a86f-59468b89d099
Co-authored-by: andrewbranch <3277153+andrewbranch@users.noreply.github.com>
All review comments have been addressed across
@copilot write a test for the updateWatch timeout and retry. I think you should be able to mock the client implementation using synctest.
Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/ddc2f2cd-a802-46ea-a7a2-3977a30c87f0
Co-authored-by: andrewbranch <3277153+andrewbranch@users.noreply.github.com>
Added in
Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/4d583367-5604-427b-8cfb-be0e60198035
Co-authored-by: andrewbranch <3277153+andrewbranch@users.noreply.github.com>
Rewrote the test using real session operations in
…ternal functions

Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/db5b9454-2d1d-4250-b97e-e9f919dd9116
Co-authored-by: andrewbranch <3277153+andrewbranch@users.noreply.github.com>
…hanged identity

When a WatchFiles call times out, the watcher ID is now marked as pending in the watchRegistry. On the next snapshot update, updateWatches checks for pending watchers and retries them as fresh adds, even when the watcher identity hasn't changed between snapshots. Rewrote the test to use a single character DidChangeFile edit (no watcher identity change) followed by GetLanguageService to flush the change and verify the retry succeeds.

Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/85f8d007-9ec1-47fa-bf47-4e9a07aba587
Co-authored-by: andrewbranch <3277153+andrewbranch@users.noreply.github.com>
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
internal/project/session.go:1203
- pending watcher IDs are only cleared on successful registration in the newWatcher path. If a watcher times out (MarkPending) and is later removed (updateWatch called with newWatcher == nil), its pending entry is never cleared, so watchRegistry.pending can grow over the lifetime of a session. Consider calling ClearPending for oldWatcher.Watchers().WatcherID when removing/unwatching, or otherwise pruning pending entries when the corresponding watcher is no longer present.
```go
if oldWatcher != nil {
	w := oldWatcher.Watchers()
	watchers := append(w.WorkspaceWatchers, w.OutsideWorkspaceWatchers...)
	if len(watchers) > 0 {
		var removedIDs []WatcherID
		for _, watcher := range watchers {
			if id, removed := session.watches.Release(watcher); removed {
				removedIDs = append(removedIDs, id)
			}
		}
		for _, id := range removedIDs {
			if err := session.client.UnwatchFiles(watchCtx, id); err != nil {
				errors = append(errors, err)
			} else if logger != nil && newWatcher == nil {
				logger.Log(fmt.Sprintf("Removed watch: %s", id))
			}
		}
	}
}
```
```go
// Retry config watchers whose IDs didn't change but whose previous registration failed.
for path, newEntry := range newSnapshot.ConfigFileRegistry.configs {
	if oldEntry, ok := oldSnapshot.ConfigFileRegistry.configs[path]; ok {
		if oldEntry.rootFilesWatch.ID() == newEntry.rootFilesWatch.ID() && s.watches.IsPending(newEntry.rootFilesWatch.ID()) {
			errors = append(errors, updateWatch(ctx, s, s.logger, nil, newEntry.rootFilesWatch)...)
		}
```
The pending retry path calls updateWatch(ctx, s, ..., nil, newWatcher). If the original updateWatch call partially succeeded (some WatchFiles calls returned nil before a later one timed out), the registry will still contain refcounts for the successful watchers. Retrying with oldWatcher==nil will Acquire again and permanently increment those refcounts (no matching Release), which can prevent future UnwatchFiles from ever firing and leak client registrations. To keep refcounts consistent, consider making the failure path roll back the entire parent watcher’s bookkeeping when any error occurs (not just the failed watcher), or implement a retry helper that registers only missing keys without incrementing refcounts for already-tracked entries.
```go
// watchRegistry tracks the current watch globs and how many individual
// WatchedFiles reference each glob. It guards concurrent access with a mutex
// and provides ref-count helpers so callers don't manipulate the map directly.
// It also tracks which parent watcher IDs have pending (failed) registrations
// so that updateWatches can retry them even when the watcher identity hasn't changed.
type watchRegistry struct {
	mu      sync.Mutex
	entries map[fileSystemWatcherKey]*fileSystemWatcherValue
	pending map[WatcherID]struct{}
}

func newWatchRegistry() *watchRegistry {
	return &watchRegistry{
		entries: make(map[fileSystemWatcherKey]*fileSystemWatcherValue),
		pending: make(map[WatcherID]struct{}),
	}
}

// Acquire increments the ref count for a watcher. If this is the first
// reference (count goes from 0 to 1), it returns true so the caller knows
// to register the watcher with the client.
func (r *watchRegistry) Acquire(watcher *lsproto.FileSystemWatcher, id WatcherID) (isNew bool) {
	key := toFileSystemWatcherKey(watcher)
	value := r.entries[key]
	if value == nil {
		value = &fileSystemWatcherValue{id: id}
		r.entries[key] = value
	}
	value.count++
	return value.count == 1
}
```
watchRegistry is documented as guarding concurrent access with its mutex, but Acquire/Release/MarkPending/ClearPending do not lock r.mu (only IsPending does). Since session.go now locks session.watches.mu directly, the locking discipline is split across call sites, which makes it easy to accidentally call these methods without holding the mutex (data race) and undermines the stated encapsulation. Consider either (1) moving the locking inside watchRegistry methods (and removing external r.mu access), or (2) updating the type/method docs and API to make the required locking explicit (e.g., AcquireLocked/ReleaseLocked or a WithLock helper) so future callers can’t misuse it.
```diff
+	session.watches.mu.Lock()
+	defer session.watches.mu.Unlock()
+	// Use a timeout for client requests to prevent holding the mutex indefinitely
+	// if the client is slow or unresponsive.
+	watchCtx, watchCancel := context.WithTimeout(ctx, watchRequestTimeout)
+	defer watchCancel()
 	if newWatcher != nil {
 		w := newWatcher.Watchers()
 		watchers := append(w.WorkspaceWatchers, w.OutsideWorkspaceWatchers...)
 		if len(watchers) > 0 {
 			var newWatchers collections.OrderedMap[WatcherID, *lsproto.FileSystemWatcher]
 			for i, watcher := range watchers {
-				key := toFileSystemWatcherKey(watcher)
-				value := session.watches[key]
 				globId := WatcherID(fmt.Sprintf("%s.%d", w.WatcherID, i))
-				if value == nil {
-					value = &fileSystemWatcherValue{id: globId}
-					session.watches[key] = value
-				}
-				value.count++
-				if value.count == 1 {
+				if session.watches.Acquire(watcher, globId) {
 					newWatchers.Set(globId, watcher)
 				}
 			}
 			for id, watcher := range newWatchers.Entries() {
-				if err := session.client.WatchFiles(ctx, id, []*lsproto.FileSystemWatcher{watcher}); err != nil {
+				if err := session.client.WatchFiles(watchCtx, id, []*lsproto.FileSystemWatcher{watcher}); err != nil {
+					session.watches.Release(watcher)
 					errors = append(errors, err)
 				} else if logger != nil {
```
updateWatch creates a single watchCtx with a 1s deadline and reuses it for all WatchFiles/UnwatchFiles calls in this update. If multiple registrations/unregistrations are needed, later calls may inherit very little time remaining (e.g., first call takes 900ms → second gets ~100ms), causing avoidable timeouts even when the client is responsive. Consider creating a fresh per-request timeout context inside the WatchFiles/UnwatchFiles loops (or otherwise resetting the deadline per call) while still ensuring the mutex is not held indefinitely.
```diff
 			for i, watcher := range watchers {
-				key := toFileSystemWatcherKey(watcher)
-				value := session.watches[key]
 				globId := WatcherID(fmt.Sprintf("%s.%d", w.WatcherID, i))
-				if value == nil {
-					value = &fileSystemWatcherValue{id: globId}
-					session.watches[key] = value
-				}
-				value.count++
-				if value.count == 1 {
+				if session.watches.Acquire(watcher, globId) {
 					newWatchers.Set(globId, watcher)
 				}
 			}
 			for id, watcher := range newWatchers.Entries() {
-				if err := session.client.WatchFiles(ctx, id, []*lsproto.FileSystemWatcher{watcher}); err != nil {
+				if err := session.client.WatchFiles(watchCtx, id, []*lsproto.FileSystemWatcher{watcher}); err != nil {
+					session.watches.Release(watcher)
 					errors = append(errors, err)
 				} else if logger != nil {
```
On WatchFiles error, updateWatch rolls back bookkeeping via session.watches.Release(watcher) exactly once for that watcher. If the input watcher list contains duplicate patterns (same fileSystemWatcherKey), Acquire will increment the refcount multiple times but only one client registration is attempted; a failed registration would then leave the refcount >0, preventing future retries from ever treating that watch as “new”. This is reachable because some glob lists are only sorted (not deduped) before becoming watchers. Consider deduping watcher keys before calling Acquire, or tracking how many increments were applied per key and fully rolling them back on registration failure.
```go
assert.Assert(t, len(successfulIDs) >= 1,
	"expected at least one watcher to be registered after retry, got %d", len(successfulIDs))
```
The assertion block says it verifies WatchFiles was retried “with the same watcher IDs”, but the test only checks that there were additional calls and that at least one succeeded. To make the test match its intent (and guard against accidental ID churn), consider asserting that the retry IDs set matches the initial attempted IDs set (or at least that each initial ID is retried).
Suggested change:

```go
assert.Assert(t, len(successfulIDs) >= 1,
	"expected at least one watcher to be registered after retry, got %d", len(successfulIDs))
firstAttemptSet := make(map[any]struct{}, len(firstAttemptIDs))
for _, id := range firstAttemptIDs {
	firstAttemptSet[id] = struct{}{}
}
retrySet := make(map[any]struct{}, len(retryIDs))
for _, id := range retryIDs {
	retrySet[id] = struct{}{}
}
assert.Equal(t, len(retrySet), len(firstAttemptSet),
	"expected retry to use the same watcher IDs as the first attempt, got first attempt IDs=%v retry IDs=%v",
	firstAttemptIDs, retryIDs)
for id := range firstAttemptSet {
	_, ok := retrySet[id]
	assert.Assert(t, ok,
		"expected watcher ID %v from the first attempt to be retried; first attempt IDs=%v retry IDs=%v",
		id, firstAttemptIDs, retryIDs)
}
successfulRetriedWatchers := 0
for _, id := range successfulIDs {
	if _, ok := retrySet[id]; ok {
		successfulRetriedWatchers++
	}
}
assert.Assert(t, successfulRetriedWatchers >= 1,
	"expected at least one retried watcher to be registered after retry, got %d successful retried watchers (successful IDs=%v, retry IDs=%v)",
	successfulRetriedWatchers, successfulIDs, retryIDs)
```
@copilot address review comments |
…up watchers, rollback all on failure, stronger test assertions

Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/81267cbb-20f5-41d0-b92b-9f1b30604e5d
Co-authored-by: andrewbranch <3277153+andrewbranch@users.noreply.github.com>
Addressed all five review comments in
Two unbounded waits on the critical path fired by "Running scheduled diagnostics refresh" can deadlock the server when the client is slow to respond, leaving the LSP completely unresponsive until manually restarted.

RefreshDiagnostics — fire and forget
- `workspace/diagnostic/refresh` always returns `null`; waiting for it is unnecessary

updateWatch — per-request 1s timeout on client requests with rollback
- `WatchFiles`/`UnwatchFiles` calls hold `session.watchesMu` while synchronously waiting on `client/registerCapability` responses
- Responses are delivered by the read loop, which can end up stuck behind `updateWatch` callers behind the mutex, and if `requestQueue` fills up, the read loop can't deliver responses — creating a self-locking deadlock
- Each `WatchFiles`/`UnwatchFiles` call gets its own 1s `context.WithTimeout` so earlier calls don't consume the deadline for later ones
- Watcher keys are deduped before `Acquire` to prevent refcount inflation from duplicate patterns
- On `WatchFiles` failure, ALL newly-acquired watchers in that batch are rolled back (not just the failed one) to keep refcounts clean for retry
- Failed watcher IDs are marked pending so `updateWatches` retries them on the next snapshot update, even when the watcher identity hasn't changed

watchRegistry — encapsulated watch bookkeeping
- New `watchRegistry` type in `internal/project/watch.go` encapsulates the ref-counted map and mutex and provides `Acquire` and `Release` methods
- `MarkPending`, `ClearPending`, and `IsPending` methods track watcher IDs whose registration failed
- All access goes through methods that take `mu`, making the locking discipline explicit and uniform
- `updateWatch` now delegates all bookkeeping to the registry instead of manipulating the map directly, making the function easier to follow
- `updateWatches` checks `IsPending` (with explicit lock/unlock around each check) for all watcher types (config, program files, typings, auto-imports) and retries as fresh adds when the watcher identity is unchanged but the previous registration failed

Tests
- `TestUpdateWatchTimeoutAndRollback` in `internal/project/watchtimeout_test.go` uses `synctest` and `projecttestutil` to verify timeout, rollback, and retry behavior with real session operations
- The test calls `DidOpenFile` to open a file (triggering watches that time out with a mocked slow client), then `DidChangeFile` with a single character edit (which does not change any watcher identity), followed by `GetLanguageService` to flush the pending change and trigger `updateWatches`
- Asserts that `updateWatches` retries the failed registrations and that the retry succeeds, even though the watcher identities are unchanged