Symptom
In typescript-native-preview.log the last line before a long, total silence is Running scheduled diagnostics refresh. After that the log shows nothing — no handled method, no Cloning snapshot, no periodic Runtime Metrics — for tens of minutes (observed cases of 12, 17, 30, 33, and 78 minutes on different projects).
While the server is in this state:
the editor receives no diagnostics, hover, completion, goto-definition, or code actions;
there's no error popup or progress indicator — VSCode just silently freezes for tsgo features;
the only way to recover is TypeScript Native Preview: Restart Server, and the first click hits Stopping server timed out (#3601: Restart Server: "Stopping timed out" on first click when server is hung), so two clicks are required to actually get a fresh process.
What the log looks like
... Scheduling new diagnostics refresh...
... Delaying scheduled diagnostics refresh...
... Delaying scheduled diagnostics refresh...
... Delaying scheduled diagnostics refresh...
... Delaying scheduled diagnostics refresh...
(60+ Delaying lines over ~200 ms — typical fingerprint of a file system event burst)
... Delaying scheduled diagnostics refresh...
... Running scheduled diagnostics refresh
[server stops responding]
... Restarting language server...
[error] Stopping server timed out
... Restarting language server...
... Resolved client capabilities: { ... } ← new process
Reproduction in real workspaces
Not deterministic. From observing several occurrences back-to-back, these conditions appear correlated:
Large workspace with multiple ConfiguredProjects (e.g. monorepo). Captured cases: workspace with 3 ConfiguredProjects, live Go heap ≈ 1.5–2 GB at the moment of the hang.
Tens of editors open at once. Captured case: 224 didOpen / 155 didClose before the hang ⇒ ~70 active editors at the time of Running scheduled diagnostics refresh.
Burst of file system events shortly before the hang. A long sequence of Delaying scheduled diagnostics refresh (60+ over ~200 ms) is the typical fingerprint of git checkout / monorepo sync / mass touch in the workspace. A sketch of the debounce behind those lines follows this list.
No special server-side load at the moment of the hang. In the captured case only 23 textDocument/diagnostic requests had been handled in the entire session up to that point; hover/completion latencies were sub-millisecond.
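For context on why a burst prints this way: the log reads as a resettable debounce, where every new watch event pushes the refresh deadline out, so a burst yields one Scheduling new line, a run of Delaying lines, and a single Running line once events stop. A minimal Go sketch of that pattern (our reading of the log; the names and delay are hypothetical, not tsgo code):

```go
package main

import (
	"fmt"
	"time"
)

// refreshScheduler coalesces bursts of file system events into a single
// diagnostics refresh: every new event pushes the deadline out, so a burst
// logs many "Delaying" lines and exactly one "Running" line at the end.
type refreshScheduler struct {
	timer *time.Timer
	delay time.Duration
}

func newRefreshScheduler(delay time.Duration, run func()) *refreshScheduler {
	s := &refreshScheduler{delay: delay}
	s.timer = time.AfterFunc(delay, func() {
		fmt.Println("Running scheduled diagnostics refresh")
		run()
	})
	s.timer.Stop() // start disarmed; the first event arms it
	return s
}

// onFileEvent is called for every watch event. Resetting an already-armed
// timer produces a "Delaying" line; the refresh only fires once the burst
// has been quiet for the full delay.
func (s *refreshScheduler) onFileEvent() {
	if s.timer.Stop() {
		fmt.Println("Delaying scheduled diagnostics refresh...")
	} else {
		fmt.Println("Scheduling new diagnostics refresh...")
	}
	s.timer.Reset(s.delay)
}

func main() {
	s := newRefreshScheduler(50*time.Millisecond, func() {})
	for i := 0; i < 5; i++ { // simulate a small event burst
		s.onFileEvent()
		time.Sleep(3 * time.Millisecond)
	}
	time.Sleep(100 * time.Millisecond) // let the refresh fire
}
```

Run it and the output mirrors the excerpt above: one Scheduling new line, four Delaying lines, then Running scheduled diagnostics refresh once the burst goes quiet.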
What we tried in synthetic stress testing
We drove a real tsgo --lsp --stdio from a Node-based test client, replaying the captured workload (220 open files from a real hang, 3 ConfiguredProjects, full VSCode-style capabilities, ~2 GB warmed heap) plus extra pressure: idle traffic at 120 rps across codeLens / inlayHint / semanticTokens / documentSymbol / foldingRange / hover, watch-event bursts, repeated snapshot updates contending with idleCacheCleanTimer, and delayed client replies to workspace/diagnostic/refresh. The server kept up — parallel textDocument/diagnostic finished in ~1 s and probe hovers stayed sub-millisecond throughout.
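For anyone who wants to replay a similar workload without our harness: the transport is plain LSP Content-Length framing over stdio. A compressed Go equivalent of the client's transport layer (the real harness is Node-based; this sketch assumes tsgo is on PATH and sends only a bare initialize):

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os/exec"
	"strconv"
	"strings"
)

// writeFrame sends one LSP message using the standard Content-Length framing.
func writeFrame(w io.Writer, body string) error {
	_, err := fmt.Fprintf(w, "Content-Length: %d\r\n\r\n%s", len(body), body)
	return err
}

// readFrame reads one framed message; note the first message back may be a
// server notification rather than the response to our request.
func readFrame(r *bufio.Reader) (string, error) {
	length := 0
	for {
		line, err := r.ReadString('\n')
		if err != nil {
			return "", err
		}
		line = strings.TrimRight(line, "\r\n")
		if line == "" {
			break // blank line ends the headers
		}
		if v, ok := strings.CutPrefix(line, "Content-Length: "); ok {
			if length, err = strconv.Atoi(v); err != nil {
				return "", err
			}
		}
	}
	buf := make([]byte, length)
	_, err := io.ReadFull(r, buf)
	return string(buf), err
}

func main() {
	cmd := exec.Command("tsgo", "--lsp", "--stdio")
	stdin, _ := cmd.StdinPipe()
	stdout, _ := cmd.StdoutPipe()
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	// Bare initialize; the real harness replays full VSCode-style
	// capabilities plus the captured didOpen workload.
	initReq := `{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"capabilities":{}}}`
	if err := writeFrame(stdin, initReq); err != nil {
		panic(err)
	}
	msg, err := readFrame(bufio.NewReader(stdout))
	if err != nil {
		panic(err)
	}
	fmt.Println(msg)
}
```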
We did reproduce a permanent stall once, after pushing further: 1 500 open files, the same 120 rps idle traffic, and a watch-event burst that triggered the diagnostic refresh. The server went silent for 2 minutes before our watchdog tripped and we captured a goroutine dump via SIGQUIT. The dump shows:
3 goroutines in sendClientRequest waiting on <-responseChan for 2 minutes: Server.WatchFiles (holding session.watchesMu), Server.RefreshDiagnostics, and serverProgressReporter.createWorkDoneProgress;
12 goroutines parked on session.watchesMu.Lock() inside updateWatch;
Server.readLoop parked in chansend on a full s.requestQueue (cap 100) — so the client responses to the three pending requests above could never be delivered, and the deadlock sustains itself.
This is consistent with the Running scheduled diagnostics refresh symptom, but the trigger (1 500 simultaneously open files plus 120 rps client traffic) is far more aggressive than anything real VSCode produces, and lower-volume settings did not reproduce. The symptom matches; the production root cause may still be different.
The full dump and a unit test that demonstrates the updateWatch mutex pattern violation in 3 s are available on request.
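To make the cycle concrete, here is a compressed, self-contained Go model of the shape the dump shows. The names mirror the goroutines above, but the code is ours, not tsgo's:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var watchesMu sync.Mutex
	responseChan := make(chan string)    // per-request reply channel, as in sendClientRequest
	requestQueue := make(chan string, 2) // stand-in for s.requestQueue (cap 100 in the dump)

	// "Server.WatchFiles": holds watchesMu across a synchronous client
	// round-trip, waiting for a reply that can never be delivered.
	go func() {
		watchesMu.Lock()
		defer watchesMu.Unlock()
		<-responseChan
	}()

	// The "12 goroutines in updateWatch": parked behind the mutex, so the
	// handlers that would drain requestQueue never run.
	for i := 0; i < 3; i++ {
		go func() {
			watchesMu.Lock()
			defer watchesMu.Unlock()
			<-requestQueue
		}()
	}

	// "Server.readLoop": must enqueue each incoming client request before it
	// can deliver any response. Once the queue is full it blocks in chansend,
	// and the reply WatchFiles is waiting for never arrives. Cycle closed.
	go func() {
		for i := 0; ; i++ {
			requestQueue <- fmt.Sprintf("client-request-%d", i)
		}
		// unreachable: responseChan <- "watch registered"
	}()

	time.Sleep(300 * time.Millisecond)
	fmt.Println("every party is blocked; no goroutine can make progress")
}
```

In the real server other goroutines (timers, metrics) presumably keep the process alive, so nothing panics or logs; the loop simply never makes progress again, which matches the total log silence.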
Code paths worth a look while we wait for a dump
Two unbounded waits sitting directly on the path that fires at Running scheduled diagnostics refresh:
Session.RefreshDiagnostics (internal/project/session.go line 414): sends workspace/diagnostic/refresh on s.backgroundCtx with no per-request timeout or cancel handle. If the client never responds, this goroutine waits forever — and it's exactly the request the log shows being issued right before the silence. (A bounded-wait sketch follows this list.)
updateWatch (internal/project/session.go lines 1136–1209): holds session.watchesMu across synchronous client.WatchFiles / client.UnwatchFiles calls. Any delay in the client's response to client/registerCapability blocks every other updateWatch behind this mutex. (We have a unit test that demonstrates this pattern violation in 3 s using only the existing ClientMock; happy to attach if useful. A lock-narrowing sketch also follows this list.)
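For the first item, the conventional fix is to bound the wait with a context deadline. A sketch against a hypothetical sendClientRequest stand-in (the real tsgo signature may differ; the point is only that workspace/diagnostic/refresh gets a deadline instead of inheriting s.backgroundCtx's unbounded lifetime):

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// sendClientRequest is a stand-in for the real helper: it waits for the
// client's reply on a per-request channel, or gives up when ctx is done.
func sendClientRequest(ctx context.Context, method string) error {
	responseChan := make(chan struct{})
	// In the real server the read loop closes responseChan when the reply
	// arrives; here nothing ever does, simulating an unresponsive client.
	select {
	case <-responseChan:
		return nil
	case <-ctx.Done():
		return fmt.Errorf("%s: %w", method, ctx.Err())
	}
}

// refreshDiagnostics bounds the wait instead of blocking forever, so a
// stuck client costs 30 s, not an open-ended hang.
func refreshDiagnostics(backgroundCtx context.Context) {
	ctx, cancel := context.WithTimeout(backgroundCtx, 30*time.Second)
	defer cancel()
	if err := sendClientRequest(ctx, "workspace/diagnostic/refresh"); err != nil {
		if errors.Is(err, context.DeadlineExceeded) {
			// Log and move on; the refresh is advisory, not critical.
			fmt.Println("client did not answer workspace/diagnostic/refresh:", err)
		}
	}
}

func main() {
	// Demo: a 1 s outer deadline stands in for the 30 s production timeout.
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	refreshDiagnostics(ctx) // returns after ~1 s instead of hanging
}
```

Since the refresh is advisory, timing out and logging is strictly better than parking a goroutine forever while it may be holding other resources.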
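For the second item, the standard cure is to shrink the critical section so watchesMu never spans a client round-trip: compute the watch diff under the lock, release it, then do the RPCs. A hedged sketch with hypothetical types (not the actual session code):

```go
package project

import "sync"

// client abstracts the two synchronous LSP round-trips updateWatch performs.
type client interface {
	WatchFiles(globs []string) error
	UnwatchFiles(globs []string) error
}

type watcher struct {
	mu      sync.Mutex // protects watched only; never held across an RPC
	watched map[string]bool
	client  client
}

func newWatcher(c client) *watcher {
	return &watcher{watched: make(map[string]bool), client: c}
}

func (w *watcher) updateWatch(add, remove []string) error {
	// Critical section: compute the diff and mutate shared state, nothing else.
	w.mu.Lock()
	var toAdd, toRemove []string
	for _, g := range add {
		if !w.watched[g] {
			w.watched[g] = true
			toAdd = append(toAdd, g)
		}
	}
	for _, g := range remove {
		if w.watched[g] {
			delete(w.watched, g)
			toRemove = append(toRemove, g)
		}
	}
	w.mu.Unlock() // released BEFORE any client round-trip

	// The blocking RPCs run outside the lock: a stalled
	// client/registerCapability reply now delays this call only, not every
	// goroutine contending on the mutex.
	if len(toAdd) > 0 {
		if err := w.client.WatchFiles(toAdd); err != nil {
			return err
		}
	}
	if len(toRemove) > 0 {
		return w.client.UnwatchFiles(toRemove)
	}
	return nil
}
```

The trade-off is that concurrent callers can now issue registrations out of order, so the real fix probably wants a small serialized dispatch queue behind the lock; the sketch only shows the lock-scope change.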
Environment
Extension: typescriptteam.native-preview 0.20260421.1 (darwin-arm64)
Server: microsoft/typescript-go, main at commit ba858e5c6