Skip to content

fix(language-server): wait for cancellation before re-acquiring lint worker#677

Open
MyWWWChat wants to merge 1 commit into
neo4j:mainfrom
MyWWWChat:fix-worker-pool-race-in-linting
Open

fix(language-server): wait for cancellation before re-acquiring lint worker#677
MyWWWChat wants to merge 1 commit into
neo4j:mainfrom
MyWWWChat:fix-worker-pool-race-in-linting

Conversation

@MyWWWChat
Copy link
Copy Markdown

Summary

Three stacked issues in packages/language-server/src/linting.ts cause publishDiagnostics: [] after the first textDocument/didChange. After this fix the lint pipeline survives rapid document changes and delivers diagnostics on every turn.

The bug

if (lastSemanticJob !== undefined && !lastSemanticJob.resolved) {
  void lastSemanticJob.cancel();              // (1) fire-and-forget
}
const proxyWorker = (await pool.proxy())      // (2) races a worker that's being torn down
  as unknown as LintWorker;
lastSemanticJob = proxyWorker.lintCypherQuery(...);
const result = await lastSemanticJob;          // (3) rejects "Worker is terminated"
  1. cancel() is not awaited; the next pool.proxy() may return the same worker mid-termination.
  2. workerTerminateTimeout: 0 removes any drain period, so (1) hits on every cancellation cycle. Workerpool's own default is 1e3 ms.
  3. The transient Worker is terminated rejection is treated as fatal noise — console.error and no retry — so the user-visible result is empty diagnostics.

Reproduction

LSP harness against a freshly installed @neo4j-cypher/language-server (works on every release tested, 2.0.0-next.62.0.0-next.32):

const cp = require('child_process');
const p = cp.spawn('cypher-language-server', ['--stdio']);
const send = o => { const j = Buffer.from(JSON.stringify(o));
  p.stdin.write(Buffer.concat([Buffer.from(`Content-Length: ${j.length}\r\n\r\n`), j])); };
send({ jsonrpc:'2.0', id:1, method:'initialize',
       params:{ processId:null, rootUri:null, capabilities:{} } });
setTimeout(() => send({ jsonrpc:'2.0', method:'initialized', params:{} }), 500);
setTimeout(() => send({ jsonrpc:'2.0', method:'textDocument/didOpen',
  params:{ textDocument:{ uri:'file:///t.cypher', languageId:'cypher', version:1,
                          text:'MATCH (n) RETURN n;' } } }), 1500);
// the change that triggers the race
setTimeout(() => send({ jsonrpc:'2.0', method:'textDocument/didChange',
  params:{ textDocument:{ uri:'file:///t.cypher', version:2 },
           contentChanges:[{ text:'MATCH (n: RETURN n;' }] } }), 3000);

Stderr fires Error: Worker is terminated on the second turn; publishDiagnostics for the syntactically-broken text comes back as [] instead of containing the parser error.

Fix

Three correlated changes — none of them by themselves is enough on every Node version / load profile we hit, but together they make the pipeline robust:

  1. Await the cancellation. lastSemanticJob.cancel(); try { await lastSemanticJob; } catch { /* CancellationError */ }. Wait for the resource to actually be released before re-acquiring it (classical producer-consumer-with-graceful-cancellation idiom; mapped to Node's promise model it's just awaiting the rejection).
  2. workerTerminateTimeout: 01000 in both pool() calls. Restores a non-zero grace period for worker drain. Matches workerpool's own default.
  3. Retry once on /worker is terminated/i as belt-and-braces for any residual race the first two changes don't cover. CancellationError is re-thrown unchanged so cancellation semantics remain identical.

Test plan

  • Reproducer above returns [{ severity: 1, message: 'Expected...' }] on the second turn (was []).
  • First-turn behavior unchanged (still produces correct diagnostics on initial didOpen).
  • CancellationError still propagates out of the try block — no behavior change for legitimate cancellation paths.
  • pnpm test in packages/language-server (local environment lacked the full ANTLR toolchain — please verify on CI).

Notes

While digging through this I also noticed that in 2.0.0-next.32 rawLintDocument reads result.diagnostics and result.symbolTables, but the bundled lintCypherQuery returns a bare array. Same user-visible symptom (empty diagnostics) by a different path. That looks like a separate refactor left half-finished and is not addressed in this PR — flagging in case it's news.

…worker

Three correlated issues caused diagnostics to silently disappear after the
first textDocument/didChange:

1. `void lastSemanticJob.cancel()` is fire-and-forget; the next
   `await pool.proxy()` may hand back the same worker that's mid-termination,
   and the call rejects with "Worker is terminated".
2. `workerTerminateTimeout: 0` removes any drain period, so (1) hits on every
   cancellation cycle. Workerpool's own default is 1000ms.
3. The catch block treats the transient termination error as fatal noise —
   `console.error` and no retry — so users see `publishDiagnostics: []`
   instead of any actual lint result.

Fix:
- Await the in-flight cancellation (try/catch swallows the expected
  CancellationError).
- Restore a 1s grace period on worker termination.
- Retry the lint once when the underlying error matches /worker is
  terminated/i; let CancellationError propagate untouched.

Reproducer (LSP harness): a didOpen followed by a didChange ~3s later
returns `[]` for syntactically-broken text on every release tested
(2.0.0-next.6 … 2.0.0-next.32). After the patch, the syntax error is
delivered as expected on every change.
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 3, 2026

⚠️ No Changeset found

Latest commit: d99ee26

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant