fix(servers): close GET migration race; reverse keychain/disk order (#1356)
Addresses code review:
1. **Concurrency**: GET /api/servers was reading + migrating outside the
write lock, then writing inside it — a concurrent POST/PUT/DELETE
between the read and the lock acquisition could be clobbered by the
migration write. Also affected the first-launch seed write. Fix:
take a fast unlocked peek; if the file is absent (seed branch) or
carries plaintext to migrate (migration branch), re-read inside the
write lock and decide there. Added `hasPlaintextSecrets()` as a
pure predicate so the common no-migration path stays lock-free.
2. **Write ordering**: POST and PUT wrote disk first, then keychain.
A keychain `set` failure mid-write left a half-configured entry on
disk and trapped retries at 409. Reversed to keychain-first so a
503 leaves both stores in their pre-write state. PUT in-place now
sets new fields, writes disk, then deletes obsolete fields (a
failed disk write leaves orphan keychain entries — recoverable —
rather than premature deletions of fields the user still expects
to see).
3. **Tolerance simplification**: dropped the no-entry regex in
`KeyringSecretStore.delete` — both "no entry" and "keychain
unavailable" collapse to the same desired outcome (the entry isn't
there anymore), so a uniform swallow is clearer.
4. **Parallelism**: `readKeychainEntriesFor` now `Promise.all`s the
per-field `get` calls. macOS Keychain round-trips are 10-50ms;
serializing 5+ env vars per server × N servers is a meaningful
wall-clock cost on GET.
5. **Spec doc**: reflects the tolerance contract (only `set`
hard-fails) and the keychain-before-disk write ordering rationale.
New regression test in servers-route.test.ts covers the retry-after-503
contract directly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
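The race fix in item 1 follows a peek-then-recheck pattern. Below is a minimal sketch of that pattern, not the route's actual code: `hasPlaintextSecrets` is named in the commit message, but its body here, the `WriteLock` mutex, the `ConfigFile` interface, and the `getServers` signature are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real route reads and writes mcp.json on disk.
type ServerEntry = { env?: Record<string, string>; oauth?: { clientSecret?: string } };
type Config = Record<string, ServerEntry>;

interface ConfigFile {
  read(): Promise<Config | null>; // null means the file is absent
  write(config: Config): Promise<void>;
}

// Pure predicate (assumed body): true when any entry still carries a non-empty secret.
function hasPlaintextSecrets(config: Config): boolean {
  return Object.values(config).some(
    (entry) =>
      Boolean(entry.oauth?.clientSecret) ||
      Object.values(entry.env ?? {}).some((value) => value !== ""),
  );
}

// Minimal promise-chain mutex (assumed; the route's own lock is not shown in the source).
class WriteLock {
  private tail: Promise<void> = Promise.resolve();

  run<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Keep the chain alive even when a task rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

async function getServers(
  file: ConfigFile,
  lock: WriteLock,
  migrate: (config: Config) => Promise<Config>,
  seed: () => Config,
): Promise<Config> {
  // Fast unlocked peek: the common no-migration path never takes the lock.
  const peek = await file.read();
  if (peek !== null && !hasPlaintextSecrets(peek)) return peek;

  // Seed or migration branch: re-read inside the lock and decide there,
  // so a write that landed after the peek is not clobbered.
  return lock.run(async () => {
    const fresh = await file.read();
    if (fresh === null) {
      const seeded = seed();
      await file.write(seeded);
      return seeded;
    }
    if (!hasPlaintextSecrets(fresh)) return fresh; // someone else already migrated
    const migrated = await migrate(fresh);
    await file.write(migrated);
    return migrated;
  });
}
```

In this sketch, two concurrent GETs that both peek at a plaintext file queue on the lock; the second one re-reads, finds nothing left to migrate, and returns without writing.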
specification/v2_servers_file.md (+3 -1: 3 additions & 1 deletion)
@@ -245,7 +245,9 @@ Each server entry may carry these Inspector-extension fields at the top level:
- **First-connect contract**: settings apply on the *first* outbound request after the entry loads from disk, with no need to open the settings form. The browser sends `settings` to the backend in the `/api/mcp/connect` body; the backend reads it from `RemoteConnectRequest` and threads it into `createTransportNode`.
- **Secret storage (#1356)**: `oauth.clientSecret` and stdio `env` values are persisted in the OS keychain (macOS Keychain Services / Windows Credential Manager / Linux libsecret via `@napi-rs/keyring`), keyed by `(serverId, field)` under the service name `mcp-inspector`. Field names: `oauth-client-secret`, `env:<KEY>` (one per stdio env variable). The on-disk `mcp.json` is stripped of these values — `oauth.clientSecret` is omitted entirely, stdio env keys are preserved with empty-string placeholders (`"env": { "API_KEY": "" }`) so the file still documents the env interface the server expects. The wire shape returned by `GET /api/servers` is unchanged from before #1356: the handler rehydrates values from the keychain so browser code sees the same JSON it has always seen. The keychain interactions live in `core/auth/node/secret-store.ts` behind a `SecretStore` interface; `KeyringSecretStore` is the production impl and `InMemorySecretStore` is the test double the integration suite injects via `RemoteServerOptions.secretStore`.
- **Migration**: on every `GET /api/servers`, the handler walks the freshly-read config and, for any entry that still carries plaintext secrets (older Inspector builds, hand-edited files, files imported from another tool), lifts each value into the keychain and rewrites the file with the stripped shape. The migration is idempotent: when the keychain already holds a value for `(serverId, field)`, the keychain wins and the disk plaintext is dropped unread. After the rewrite the disk file no longer contains the secret material.
-- **Linux without libsecret**: `@napi-rs/keyring` returns a hard error on any operation. The handlers translate that to a `503` response with a message pointing at libsecret/gnome-keyring install. macOS and Windows always have a working keychain so this only realistically fires on minimal Linux installs.
+- **Linux without libsecret**: `KeyringSecretStore` is *tolerant*: only the `set` operation throws `KeychainUnavailableError` (translated to a `503` by the handlers); `get` returns `null` and the destructive operations silently no-op. The result is that no-secret flows (creating a stdio server with no env values, deleting an entry, reading the list, the defensive sweep on POST) all work normally on a minimal Linux box without libsecret. Only the moments where a secret would actually be lost (saving an OAuth client secret, saving a stdio env value, or migrating a plaintext value into the keychain) surface a clear error. macOS and Windows always have a working keychain so this only matters on minimal Linux installs.
+- **Migration tolerance**: when migration encounters `KeychainUnavailableError`, the GET handler logs a warning, leaves the on-disk plaintext untouched, and serves the (still-plaintext) response. Subsequent reads retry, so installing libsecret later lifts the secrets on the next GET without any user action.
+- **Write ordering on POST/PUT**: keychain writes happen before the disk write, and obsolete-field deletions happen after. The intent is that a `set` failure (the only hard-fail path) leaves both stores in their pre-write state: no half-applied entry on disk that would trap a retry POST at `409`, and no premature deletion of an obsolete field whose disk write later fails.
- **Out of scope for this PR**: the OAuth handshake itself still runs in the browser via the MCP SDK, so during the token exchange the secret transits the wire (browser → MCP SDK → OAuth provider's token endpoint). The on-disk win this PR delivers is that the secret is no longer in the shareable / symlinked `mcp.json` and is no longer the source-of-truth on the filesystem. Moving the token exchange to the Node side is tracked separately.
- **Hard-cutover legacy behavior (per #1358 decision 4)**: files written by the one pre-#1358 build of v2/main have a nested `settings` block. `normalizeMcpServers` drops the node on read and logs a one-line warn including the server id; the persisted headers / metadata / timeouts / OAuth credentials are intentionally lost on first read. Users re-enter them via the settings form (or hand-edit the file into the flat shape). v2 has not shipped a stable release with the nested shape, so the blast radius is the small set of v2/main dogfooders who edited per-server settings between #1353 merging and this change.
- **UI**: `ServerSettingsModal` is opened from the server card's settings affordance. Saving routes through `useServers.updateServerSettings(id, settings)` which issues a settings-only `PUT /api/servers/:id` with `{ id, settings }`; the route preserves the on-disk transport config inside its write lock. Conversely, `useServers.updateServer` (driven by the basic-config modal) issues a config-only PUT with `{ id, config }` and the route preserves the on-disk settings fields. Edits in either modal cannot silently wipe the other half.
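The tolerance contract and the keychain-before-disk ordering described in the added spec lines can be illustrated with a small sketch. It is an assumption-laden illustration, not the handlers' real code: `KeychainUnavailableError`, `SecretStore`, the `env:<KEY>` field names, and the `409`/`503` statuses come from the spec text, while the method signatures, `FlakySecretStore`, `DiskConfig`, `postServer`, and the `201` success status are hypothetical.

```typescript
class KeychainUnavailableError extends Error {
  constructor(message = "keychain unavailable") {
    super(message);
    // Keeps `instanceof` working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Assumed signatures; the source only names the interface and its operations.
interface SecretStore {
  get(serverId: string, field: string): Promise<string | null>;
  set(serverId: string, field: string, value: string): Promise<void>;
  delete(serverId: string, field: string): Promise<void>;
}

// Hypothetical store whose backend can be switched off to imitate Linux without libsecret.
class FlakySecretStore implements SecretStore {
  available = true;
  private entries = new Map<string, string>();

  async get(serverId: string, field: string): Promise<string | null> {
    if (!this.available) return null; // tolerant: reads never throw
    return this.entries.get(`${serverId}/${field}`) ?? null;
  }

  async set(serverId: string, field: string, value: string): Promise<void> {
    if (!this.available) throw new KeychainUnavailableError(); // the only hard-fail path
    this.entries.set(`${serverId}/${field}`, value);
  }

  async delete(serverId: string, field: string): Promise<void> {
    if (this.available) this.entries.delete(`${serverId}/${field}`); // tolerant: silent no-op
  }
}

type DiskConfig = Map<string, { env: Record<string, string> }>;

// Returns an HTTP-style status code for a create request.
async function postServer(
  disk: DiskConfig,
  secrets: SecretStore,
  id: string,
  env: Record<string, string>,
): Promise<number> {
  if (disk.has(id)) return 409;
  try {
    // Keychain first: a failure here leaves the disk untouched.
    for (const [key, value] of Object.entries(env)) {
      if (value !== "") await secrets.set(id, `env:${key}`, value);
    }
  } catch (err) {
    if (err instanceof KeychainUnavailableError) return 503;
    throw err;
  }
  // Disk second: keys are kept, values become empty-string placeholders.
  const placeholders: Record<string, string> = {};
  for (const key of Object.keys(env)) placeholders[key] = "";
  disk.set(id, { env: placeholders });
  return 201;
}
```

With the store switched off, a POST carrying a secret returns `503` and writes nothing to disk, so a retry after the keychain becomes available is not trapped at `409`; a POST whose env values are all empty never calls `set` and succeeds regardless.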