Commit 379e95e
Connection health checks: liveness, identity, and expired-credential visibility (#1251)
* Connection health checks (liveness) with OpenAPI backing
A connection can declare one authenticated probe operation that answers
"is this credential still alive?". Core owns the vocabulary (HealthStatus,
HealthCheckSpec, HealthCheckResult, HealthCheckCandidate) and dispatch
(integrations.healthCheck.{get,candidates,set}, connections.{checkHealth,
validate}); the OpenAPI plugin implements the probe against its stored
operation bindings, ranking candidates non-destructive-first. The UI gets
a status dot + Check now per connection, a health-check editor sheet, and
a validate-the-key-first step in the Add Connection modal.
* Derive connection account info from the health-check probe
The same probe that answers "is this credential alive?" also answers
"whose account is this?": HealthCheckSpec gains an identityField dot-path
into the probe response, the editor gets a typed identity picker (response
schema fields, breadth-first so shallow scalars surface first) plus a live
preview against a pasted test key, and the Add Connection flow auto-fills
the display name from the probed identity.
Two corrections over the first attempt at this:
- The name auto-fill reads the label through the functional updater rather
than the closure snapshot, so a name typed while the probe is in flight
is never clobbered.
- The response sample is only taken from healthy responses; error bodies
stay out of the preview (non-healthy runs carry the classified detail).
A healthy probe whose spec-save fails now surfaces the failure instead
of silently pretending the check was configured.
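
A minimal sketch of how an identityField dot-path could be resolved against a probe response. The helper name `readIdentity` and the `Json` type are hypothetical and not taken from the diff. The sketch assumes that numeric segments index into arrays and that only scalar leaves are usable as a display name.

```typescript
// Hypothetical JSON shape for a parsed probe response.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Walk a dot-path such as "user.email" or "emailAddresses.0.value".
// Returns undefined when the path is missing or ends on a non-scalar.
function readIdentity(response: Json, identityField: string): string | undefined {
  let current: Json | undefined = response;
  for (const segment of identityField.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    if (Array.isArray(current)) {
      // Assumption: array elements are addressed by a plain numeric segment.
      current = /^\d+$/.test(segment) ? current[Number(segment)] : undefined;
    } else {
      current = Object.prototype.hasOwnProperty.call(current, segment)
        ? current[segment]
        : undefined;
    }
    if (current === undefined) return undefined;
  }
  // Only strings and numbers make a sensible display name.
  return typeof current === "string" || typeof current === "number"
    ? String(current)
    : undefined;
}
```

Returning undefined for objects and arrays keeps a mis-picked path from filling the display name with a serialized blob.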
* MCP connection liveness health checks
MCP connections get the liveness half of health checks: the probe dials
the server and lists tools (the same path tool discovery uses), so a live
token reads healthy and a revoked one reads expired. MCP has no usable
identity source, so no identity is derived and the operation/identity
editor stays hidden - only the status dot and Check now render.
The auth-wall signal is carried structurally instead of being fished out
of message strings: McpConnectionError/McpToolDiscoveryError gain an
httpStatus field populated from the transport cause at the handshake
boundary, and the liveness classifier reads it directly.
This also fixes the transport bug that made expiry undetectable on
default-configured connections: remoteTransport "auto" used to fall back
from streamable-http to SSE on ANY error, including a definitive 401/403;
the SSE retry then failed with an opaque error and the auth wall read as
merely degraded. The fallback now propagates 401/403 as-is (the credential
is the problem, not the transport). The e2e scenario runs on the DEFAULT
auto transport to pin exactly that, and failed MCP tool calls whose
handshake hits a 401/403 now surface the actionable auth failure instead
of a generic connection_rejected message.
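
The fallback rule above can be sketched as two small predicates. The `McpConnectFailure` shape and the function names are illustrative assumptions; the source only states that the errors carry an httpStatus field and that 401/403 must not trigger the SSE retry. The message-based fallback for failures without a status is left out of this sketch.

```typescript
// Hypothetical shape: a connect failure with an optional structured status.
interface McpConnectFailure {
  readonly message: string;
  readonly httpStatus?: number;
}

// A definitive 401/403 means the credential is the problem, not the transport.
function isAuthWall(failure: McpConnectFailure): boolean {
  return failure.httpStatus === 401 || failure.httpStatus === 403;
}

// "auto" transport: retry over SSE only when the failure might be transport-related.
function shouldFallBackToSse(failure: McpConnectFailure): boolean {
  return !isAuthWall(failure);
}

// Liveness verdict for a failed connect: an auth wall reads expired,
// any other failure reads degraded.
function livenessStatusForFailure(failure: McpConnectFailure): "expired" | "degraded" {
  return isAuthWall(failure) ? "expired" : "degraded";
}
```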
* Health checks for Google and Microsoft Graph
Both providers wire the OpenAPI health-check backing (same store, config
superset) and get a ZERO-CONFIG default probe at add time, so the answer
to "has this connection expired?" - the original ask, born of Google's
7-day dev-token policy - works out of the box:
- Google: when the bundle includes the People API, the default check is
people.get with resourceName=people/me pinned and the account email as
the identity field.
- Microsoft Graph: when the selected workloads include GET /me, the
default check is /me with userPrincipalName as the identity field.
(The first attempt at this left Graph manual-only; there is no reason
to - /me is as canonical for Graph as people/me is for Google.)
Both plugins declare healthCheck in their own config schemas so the spec
survives each provider's read-modify-write updateBundle/updateGraph
cycles (Schema.Struct decode drops undeclared keys).
* Use tagged guards for the MCP connect-failure status extraction
* Assert provider health-check defaults in plugin unit tests
The Google auto-default (People API people.get, email identity) and the
Microsoft Graph auto-default (GET /me, userPrincipalName) are pinned in
each plugin's own unit suite against the existing stubbed-fetch harness,
replacing the standalone provider e2e whose hand-served discovery doc
never matched the plugin's URL allowlist. Also updates the connect-modal
UI scenarios for the merged affix credential field (placeholder is now
the bare "token").
* Own health-check storage in core; persist verdicts for at-a-glance expiry
Two structural fixes over the inherited design:
Core-owned spec storage. The declared health check moves out of each
plugin's opaque config blob into its own column on the integration row.
The old shape required every plugin to declare healthCheck in its config
schema or any config write silently stripped it, and its read-modify-write
persistence was exposed to lost updates. Now core writes the column
directly (a plain UPDATE, no read-modify-write), plugins shrink to two
hooks (listHealthCheckCandidates + checkHealth with the spec passed in),
and the per-plugin describeHealthCheck/setHealthCheck hooks and schema
declarations are gone. Plugins declare zero-config defaults through
ctx.core.integrations.setHealthCheck.
Persisted verdicts. Every checkHealth run writes its result onto the
connection row (last_health), and connections.list returns it - so the
accounts list shows alive/expired at a glance with no per-row clicking,
which was the actual customer ask. A live Check now still overrides
in-session.
Also addressed from design review:
- Mutating operations are hard-blocked as probes (not just ranked last):
a health check runs unattended and repeatedly, and this path has no
approval gate, so POST/PUT/PATCH/DELETE probes refuse to run.
- Credential values are scrubbed from probe error details before they
leave the server - upstream error bodies can echo the request back,
including query-param-carried keys.
- Google and Microsoft Graph account panels now mount the health-check
editor, so their auto-configured probes are visible and adjustable
(previously the flagship integrations had a working probe and no UI
for it).
- The editor uses the shared health-display labels instead of a local
drifting copy.
Cloud migration 0008 adds integration.health_check and
connection.last_health.
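
A rough sketch of the two review-driven guards, the mutating-method block and the credential scrub. The function names and the "[redacted]" marker are assumptions for illustration. Scrubbing the URL-encoded form as well is one possible way to cover query-param-carried keys, not necessarily what the diff does.

```typescript
// Methods that must never run as an unattended probe.
const MUTATING_METHODS: ReadonlyArray<string> = ["POST", "PUT", "PATCH", "DELETE"];

// Hard block, not a ranking: a mutating probe is refused outright.
function isProbeAllowed(method: string): boolean {
  return MUTATING_METHODS.indexOf(method.toUpperCase()) === -1;
}

// Remove every occurrence of the credential from an error detail before it
// leaves the server. Both the raw and the URL-encoded form are covered
// because an upstream may echo the request URL back.
function scrubCredential(detail: string, credential: string): string {
  if (credential.length === 0) return detail;
  let scrubbed = detail;
  for (const variant of [credential, encodeURIComponent(credential)]) {
    scrubbed = scrubbed.split(variant).join("[redacted]");
  }
  return scrubbed;
}
```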
* Cover the review-driven guarantees in e2e
Three scenarios close the verification gaps:
- The connections list renders a persisted expired verdict on a FRESH
page load with no per-row clicking (browser scenario) - the at-a-glance
behavior the feature exists for.
- A mutating operation declared as the health check refuses to run
(unknown-with-reason, nothing reaches the upstream).
- A probe's error detail never echoes the credential back, even when the
upstream reflects the Authorization header into its error body.
* Probe-first key check in Add Connection
The old flow front-loaded a form: pick an operation and an identity field
from a schema before anything had shown you what the API returns, in a
block that dominated the modal. Inverted:
- The probe is AUTO-PICKED (top-ranked read-only zero-argument candidate)
and shown as a one-line "Calls GET me.getMe · change" caption; the full
operation form only opens behind "change", or when no zero-arg
candidate exists.
- Identity is chosen AFTER the probe, from the response the key actually
returned: the sample fields render as clickable path/value rows, and
clicking one upgrades the saved check with that identityField and
adopts the value as the display name. "Skip - status only" dismisses.
No schema guessing, and the pre-probe UI is one line instead of a form.
* Use the shared Button for the identity-picker affordances
* One checking signal, no layout shift on the key-check verdict
The in-flight state was double-communicated (button label swap to
"Checking..." plus a separate "Checking the key..." line) and the verdict
appearing pushed the layout down. Now the button carries the only
in-flight signal via its width-preserving loading spinner (label never
changes), and the verdict line's height is reserved from the start - the
reveal fills space that was always there.
* One Validate control: identity is a default, not a question
The key-check UI still asked for too much attention: a caption naming the
probe operation, a verdict line, and a whole ask-first panel ("Which
field names this account?"). Collapsed to a single control:
- One "Validate" button. The probe operation stays an invisible default
(top-ranked read-only zero-arg candidate).
- On healthy, the identity field is AUTO-PICKED from the response via a
shared heuristic (email > login/username > display name > id, shallower
paths first) and saved with the check.
- Everything lands in one verdict line beside the button:
"Healthy · alice@example.com · change". "change" opens the response
field list as a correction, not a question; the ask-first panel is gone.
pickIdentitySample lives in the core health-check vocabulary with unit
coverage; the connect e2e drives the collapsed flow including the
correction path.
* Pick mode: the key check becomes its own focused view
Two structural changes to the Add Connection modal, from review:
- The credential is step 1 and takes first focus; the display name moves
below it, framed as derived ("filled from the account when you check
the key") — you don't name a thing before proving it exists.
- "Check the key works" on an integration with no configured check now
swaps the modal body into pick mode, the same view-swap the OAuth app
registration uses. The user picks the read-only call (a deliberate,
taught choice — no auto-pick), runs it, sees the REAL response, and
clicks the field that names the account (or skips for status-only).
Picking returns to the main modal with the verdict and derived name.
The main modal's key-check footprint is now a button and a verdict line;
all teaching density lives in the focused subview. The "Confirms the key
authenticates" filler copy is gone, as is the auto-pick heuristic's UI
role (pickIdentitySample stays in the SDK for other surfaces).
* Two-step Add Connection: prove the key, then name and place it
The pick-mode submodal was jarring — it hid the key exactly when a failed
probe makes you want to edit it. Replaced with a two-step wizard in the
same modal (credential methods only; OAuth is unchanged):
Step 1 — get the key into a valid state. Auth method + key (first focus,
visible and editable throughout), and "Check the key works": a configured
check probes directly; with none the pick-a-call block expands INLINE
below the key — choose the call, run it, see the real response, click the
identity field. Continue is the only exit forward.
Step 2 — name it and place it. The verdict travels along as a one-line
recap, the display name arrives derived from the picked identity, the
saved-to picker and Add connection live here. Back returns to step 1 with
everything intact.
The pick block is gated on its expansion alone, not on hasHealthCheck: a
healthy probe saves the spec, which flips that flag mid-flow, and the
block must outlive its own success until the identity pick.
* Hoist the key check below the auth-method tabs
The check button and pick-a-call block lived inside the API-key tab's
content, so they moved with (and were clipped by) the tabs card. They now
render as a sibling below the whole tabs section: same position no matter
which credential method tab is active. canCheckKey already hides them for
OAuth/no-auth methods, where a pasted-key check doesn't apply.
* Redesign the key check as one request/response panel
The check was a button that grew a form that grew a bordered box of prose
and lists — hostile density. It is now the system's code-window pattern,
because that is literally what this is:
- ONE hairline-framed panel, always visible below the credential. Its
titlebar is the request line: a mono GET chip, the operation (pre-seeded
with the best read-only zero-arg candidate, editable in place; static
when the integration already has a check), and one Check button.
- The response renders inside the same frame: a mono status line
(dot · http status · verdict · identity) and, on a healthy first-time
run, the response fields as rows. Clicking a row makes it the label —
marked LABEL in place, no separate picker panel, no skip link (Continue
is the skip).
- Destructive candidates are filtered out of the picker entirely instead
of listed with a warning.
Nothing expands, nothing swaps views, no duplicate verdicts, and the copy
is one hint line under a mono section label.
* Form-pattern pass over the Add Connection wizard
Applied the form-design fundamentals to the flow:
- Credential fields get a show/hide reveal toggle (constant label,
aria-pressed state) — masked keys make typos invisible, and most keys
are pasted then eyeballed. One toggle state per field group.
- Placeholders no longer carry instructions ("paste the value / token",
"token"): the visible labels and the field frame do that job;
placeholder text that disappears on focus taught nothing.
- Continue is never disabled. A disabled submit hides the reason and drops
out of the tab order; clicking Continue with no key now says exactly
what's missing (role=alert line above the footer) and clears as soon as
a value lands.
- The wizard position is plain text in the dialog title ("STEP 1 OF 2",
mono sec-label style) — a text step indicator, not a progress bar.
e2e selectors moved off placeholders (gone) onto roles and labels.
* Identity-aware ranking, seamless request line, step-2 name picker, modal scroll test
Four refinements to the key-check panel from review:
- Candidates rank by what their response can NAME: calls whose schema
carries an email beat login beat display-name beat id, ahead of the
generic fewest-args/GET-first order (compareHealthCheckCandidatesByIdentity,
identityPathTier shared with the sample picker). The pre-seeded request
line now lands on the identity call, not an arbitrary list endpoint.
- The request line is seamless: the operation combobox renders frameless
inside the titlebar, so METHOD + operation + Check read as one request
rather than a form row in a box.
- The identity pick moved to step 2 where naming belongs: the display
name is a combobox seeded with the response's identity-looking fields
(value shown, path as the description); picking one also stores the
path as the check's identityField. Step 1's response is read-only,
identity fields ranked first, capped at 8 rows with a +N more line.
- The response view no longer nests a scroller (nested scroll areas trap
the wheel mid-modal); the modal is the one scroll context, and a new
e2e drives a short viewport, asserts real overflow, wheel-scrolls the
dialog, and reaches the footer.
* Memoize the response sample off the result object
* Rank every rendered response identity-first via one shared helper
rankResponseSample joins the core health-check vocabulary: rows whose
leaf key names the account (email > login > name > id) lead, the rest
keep response order, stable within tiers. The request panel, the step-2
name options, and the health-check editor's live preview all use it, so
what shows up first is always the field you came to see. Replaces the
panel's inline sort; unit-covered.
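
A minimal sketch of the identity-first ordering described above, assuming each row is a path/value pair and that a row's tier comes from an exact match on its leaf key. The `SampleRow` type, the key list, and the name `rankRowsIdentityFirst` are illustrative; the real rankResponseSample may match keys differently.

```typescript
interface SampleRow {
  readonly path: string;
  readonly value: string;
}

// Tier order named in the commit message: email > login > name > id.
const IDENTITY_LEAF_ORDER: ReadonlyArray<string> = ["email", "login", "name", "id"];

// Lower is better; rows that name nothing share the last tier.
function leafTier(path: string): number {
  const segments = path.split(".");
  const leaf = (segments[segments.length - 1] ?? "").toLowerCase();
  const tier = IDENTITY_LEAF_ORDER.indexOf(leaf);
  return tier === -1 ? IDENTITY_LEAF_ORDER.length : tier;
}

// Array.prototype.sort is stable (ES2019+), so rows within one tier keep
// their original response order.
function rankRowsIdentityFirst(rows: ReadonlyArray<SampleRow>): SampleRow[] {
  return rows.slice().sort((a, b) => leafTier(a.path) - leafTier(b.path));
}
```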
* Rank whoami calls above lists; free the modal wheel for real
Two fixes from testing against the actual Vercel spec:
- candidateIdentityTier now ignores identity keys under array segments:
aliases.listAliases exposes aliases.0.creator.email (people in a
collection, not the caller) and was outranking user.getAuthUser, whose
user.email names the account probing. Only singular paths count toward
the tier; reproduced against the full 9MB Vercel spec where
user.getAuthUser now ranks first, and unit-covered with that shape.
- The Add Connection dialog is now non-modal (with an explicit dim
overlay, since Radix renders none in non-modal mode): a modal dialog's
react-remove-scroll confined wheel events to the dialog subtree, so the
PORTALED combobox popup, and the modal body while it was open, could
not wheel-scroll at all. Same fix and rationale as the health-check
editor sheet. Outside-click still dismisses; option clicks are still
guarded by the portaled-popup check.
The scroll e2e now also opens the operation popup in a 420px viewport
and asserts the dialog still wheel-scrolls underneath it.
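
The array-segment rule can be illustrated with a small tier function. This is a sketch under assumptions: response fields are given as dot-paths with numeric segments marking array elements (as in aliases.0.creator.email), and the key list and function name are hypothetical.

```typescript
const IDENTITY_KEYS: ReadonlyArray<string> = ["email", "login", "name", "id"];
const NO_IDENTITY_TIER = IDENTITY_KEYS.length;

// Best (lowest) identity tier a candidate's response can offer.
// Paths that pass through an array describe items in a collection,
// not the caller, so they do not count toward the tier.
function candidateTierFromPaths(responsePaths: ReadonlyArray<string>): number {
  let best = NO_IDENTITY_TIER;
  for (const path of responsePaths) {
    const segments = path.split(".");
    if (segments.some((segment) => /^\d+$/.test(segment))) continue;
    const leaf = segments[segments.length - 1] ?? "";
    const tier = IDENTITY_KEYS.indexOf(leaf);
    if (tier !== -1 && tier < best) best = tier;
  }
  return best;
}
```

With this rule a list endpoint whose only email sits under an array element lands in the no-identity tier, while a whoami-style call with a singular user.email lands in the top tier.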
* Automatic health checks: stale-while-revalidate on page load
Manual Check now undermined the at-a-glance promise. The connections list
now revalidates itself:
- Core checkHealth gains ifStaleMs: return the persisted verdict when
younger than the window, probe otherwise. The SERVER owns freshness, so
N open tabs revalidating on load collapse to one probe per window
instead of stampeding the upstream. Exposed as an optional query param
on the checkHealth endpoint (bounded to a day).
- AccountRow revalidates on mount: a healthy verdict younger than 5
minutes renders as-is (the cache); anything else probes in the
background and corrects the dot in place. Non-healthy verdicts ALWAYS
revalidate — an expired dot is exactly the state the user is waiting to
see change, so recovery shows on the next load, not after the window.
Check now stays as the force-refresh.
- e2e: an API scenario pins the SWR contract (fresh window returns the
seeded verdict verbatim, zero window probes and sees the rotated key),
and the at-a-glance browser scenario now also restores the key and
asserts a reload flips expired back to healthy with no clicks.
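
The freshness decision can be sketched as two pure functions, one for the server and one for the row. The `PersistedVerdict` shape, the `checkedAt` field name, and the choice to clamp (rather than reject) windows longer than a day are assumptions for illustration.

```typescript
type HealthStatus = "healthy" | "expired" | "degraded" | "unknown";

interface PersistedVerdict {
  readonly status: HealthStatus;
  readonly checkedAt: number; // epoch milliseconds; field name is an assumption
}

const ONE_DAY_MS = 24 * 60 * 60 * 1000;
const HEALTHY_WINDOW_MS = 5 * 60 * 1000;

// Server side: probe unless a persisted verdict is younger than the window.
// A zero window therefore always probes.
function shouldProbe(
  last: PersistedVerdict | undefined,
  ifStaleMs: number | undefined,
  now: number,
): boolean {
  if (last === undefined || ifStaleMs === undefined) return true;
  const windowMs = Math.min(Math.max(ifStaleMs, 0), ONE_DAY_MS);
  return now - last.checkedAt >= windowMs;
}

// Client side: healthy verdicts may be served from cache for five minutes,
// anything else asks for an immediate revalidation.
function staleWindowForRow(last: PersistedVerdict | undefined): number {
  return last !== undefined && last.status === "healthy" ? HEALTHY_WINDOW_MS : 0;
}
```

Keeping the comparison on the server is what lets several tabs that load at once share a single probe per window.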
* Simplify pass over the health-check diff
From a four-angle cleanup review (reuse / simplification / efficiency /
altitude), applied:
- Removed pickIdentitySample and its tests: the auto-pick era ended when
the identity moved to step 2's name options; rankResponseSample +
identityPathTier cover every live consumer. IDENTITY_KEY_TIERS goes
module-private (no external consumer).
- Candidate ranking sorts via decorate-sort-undecorate
(sortHealthCheckCandidatesByIdentity): tiers computed once per
candidate instead of inside the comparator, which re-walked response
fields O(n log n) times on Graph-sized specs.
- Retired the "pick mode" vocabulary (handleCandidateProbe,
hcCandidateReady) and rewrote comments narrating deleted UI
iterations; removed the write-only hcPickedPath state and the dead
KeyValidationStatus.validating prop.
- health-check-editor's local STATUS_CLASS map replaced by a shared
HEALTH_TEXT_CLASS in health-display.ts, next to the dot/ring maps it
duplicated.
- The six repeated (!wizardActive || wizardStep === X) conditions
collapsed into showValidateStep/showPlaceStep.
Noted but deliberately skipped: the RequestCheckPanel vs
HealthCheckConfigFields overlap (intentional UX divergence: request-line
panel vs form fields — unifying them would couple two surfaces that are
diverging on purpose), a batched checkHealth endpoint for N-row lists
(real but a follow-up: needs a new API shape), candidates-list spec
recompile caching (pre-existing known trade-off, same follow-up bucket),
StepHeader's four-variant scaffold and accounts-section trackEvent
repetition (pre-existing on main, not this diff's debt).
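
The decorate-sort-undecorate change mentioned above follows a general pattern that can be sketched independently of the health-check types. The generic helper name `sortByComputedKey` is hypothetical; the point is that the expensive key runs once per item rather than on every comparison.

```typescript
// Decorate each item with its key and original index, sort on the cached
// key, then strip the decoration. The index tiebreak keeps equal keys in
// their input order.
function sortByComputedKey<T>(items: ReadonlyArray<T>, key: (item: T) => number): T[] {
  return items
    .map((item, index) => ({ item, index, key: key(item) }))
    .sort((a, b) => a.key - b.key || a.index - b.index)
    .map((decorated) => decorated.item);
}
```

For a candidate list, `key` would be the identity tier computation, so a spec with thousands of operations walks each response schema once.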
* Fold the remaining reuse findings
- One HealthStatusLine renders every verdict (step-2 recap and the
request panel's response status) instead of two hand-copied dot+label
rows; the panel passes variant=response for mono + http status.
- mcpLivenessFailureStatus delegates its status-code branch to the shared
classifyHttpStatus, so "which HTTP statuses mean expired" has exactly
one definition (the message-substring fallback stays local — it exists
for causes with no status at all).
* Show connection health at a glance on the integrations list
Every row on the integrations list now carries a worst-of health summary:
loading the page auto-checks each of the integration's connections (both
owners) with the same stale-while-revalidate guard the detail page uses,
and paints one status dot per row, with a mono EXPIRED/DEGRADED label when
something needs attention. Rows with no connections, or nothing but
never-probed ones, render nothing.
The revalidation logic moves out of AccountRow into a shared
use-connection-health hook (single- and multi-connection variants) so the
two surfaces cannot drift; AccountRow behavior is unchanged. Adds a
worst-of aggregation helper next to the health display maps, plus a
browser scenario seeding a dead-token MCP server and asserting the list
row reads Expired with no clicks.
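
A sketch of what a worst-of aggregation could look like. The severity order below (healthy, unknown, degraded, expired) is an assumption; the commit message only says the row shows the worst status and renders nothing when no connection has been probed.

```typescript
type HealthStatus = "healthy" | "expired" | "degraded" | "unknown";

// Assumed severity ranking: higher means more attention needed.
const SEVERITY: Record<HealthStatus, number> = {
  healthy: 0,
  unknown: 1,
  degraded: 2,
  expired: 3,
};

// Never-probed connections are passed as undefined and ignored. The result
// is undefined when there is nothing to show for the row.
function worstOf(statuses: ReadonlyArray<HealthStatus | undefined>): HealthStatus | undefined {
  let worst: HealthStatus | undefined;
  for (const status of statuses) {
    if (status === undefined) continue;
    if (worst === undefined || SEVERITY[status] > SEVERITY[worst]) worst = status;
  }
  return worst;
}
```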
* Fix health-check follow-up failures and stale connection cache
Do not block the OpenAPI add flow when saving the drafted health check
fails: the integration already exists server-side, so return through
onComplete and let the user fix the check from the detail page.
Forward the http client layer through the MCP checkHealth hook so it
dials the connector the same way resolveTools and invokeTool do.
Invalidate the connections cache after a health check so the persisted
last_health verdict is not served stale within the atom TTL. The manual
Check now path invalidates unconditionally; the automatic mount-time
probe invalidates only when the verdict actually changed, so an
unchanged reconfirm never churns the cache (the per-mount guard already
blocks a re-probe on the resulting refetch).
Drop the incidental definition.name varchar(255) narrowing from the
health-check migration: it is pre-existing schema drift, not part of
this feature, so the migration now adds only connection.last_health and
integration.health_check.
* Give credential inputs accessible names, fix e2e locators after placeholder removal
The add-account-modal rewrite dropped placeholder text from credential
inputs (the affix already carries the instruction visually), which left
the single-input cases with no accessible name. Add aria-label={input.label}
to both the affixed and non-affixed single-input branches; the multi-input
grid already had a proper Label htmlFor pairing.
Repoint the e2e tests that used to find these fields by placeholder to
role-based textbox locators scoped to the dialog, matching the pattern
used elsewhere. Also thread through the "Continue" step of the credential
wizard's two-step flow (validate, then place) that these tests hadn't
been updated for, and scope the "Add connection" submit click to the
dialog since the page has its own same-named trigger button.
* Replace em-dashes in new comments and docs
* Harden flaky e2e waits under CI shard load
* Bound MCP discovery with a probe deadline
discoverTools (the shared connect+listTools path behind resolveTools,
detect, probeEndpoint, and checkHealth) had no timeout of its own, and
neither the MCP SDK's connect handshake nor its listTools call applies one.
An unresponsive endpoint (a closed loopback port after its e2e scenario's
scope exits, a server wedged mid-handshake) hung the calling fiber, and
with it the server-side request handling it ran under, indefinitely.
Under CI shard load this showed up as auth-methods-ui.test.ts sitting on
its 90s auto-probe wait and then hitting the full 120s vitest timeout,
followed by every later test failing "login did not redirect to AuthKit
(500)": the dev server's request-handling capacity was pinned by fibers
stuck in an unbounded MCP connect, starving unrelated routes.
Give discoverTools a default 15s deadline (Effect.timeoutOrElse), mapping
a timeout to the same McpToolDiscoveryError("connect") shape a failed
connect already produces, so callers' existing handling (auth
classification, incomplete-tools fallback, health "degraded" status) all
keeps working unchanged. Connections that DID get established before the
deadline still close via the existing Effect.onExit handler, which fires
on interruption too.
Note: discoverTools already closed its connection deterministically via
Effect.onExit, so there was no connector/session leak. The bug was purely
a missing deadline, not missing cleanup.
* Cache compiled specs for request-path OpenAPI fallbacks
The health-check endpoints, candidate listing, and the tools/invoke
fallbacks recompiled the full OpenAPI document on every request. The UI
now auto-fires health checks on page mounts, so a large spec was parsed
into a fresh multi-MB object graph over and over, growing the dev server
heap until the process hit the V8 limit and every subsequent request
failed (the CI shard wedge behind the login 500 cascade).
Compile through a small module-level LRU instead, keyed by the config's
content-addressed specHash: same hash means byte-identical text, and a
spec update writes a new hash so stale entries age out. Capacity is four
compiled documents; legacy configs without a hash bypass the cache.
Add and update paths keep compiling fresh input directly.
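
A minimal sketch of a small LRU keyed by specHash, assuming a Map's insertion order is used for recency. The class name and method are hypothetical; the default capacity of four and the bypass for configs without a hash follow the description above.

```typescript
class CompiledSpecCache<T> {
  private readonly entries = new Map<string, T>();
  private readonly capacity: number;

  constructor(capacity: number = 4) {
    this.capacity = capacity;
  }

  // Returns the cached compile for this hash, or compiles and stores it.
  // Legacy configs without a hash always compile fresh.
  getOrCompile(specHash: string | undefined, compile: () => T): T {
    if (specHash === undefined) return compile();
    if (this.entries.has(specHash)) {
      const hit = this.entries.get(specHash) as T;
      // Re-insert so this entry becomes the most recently used.
      this.entries.delete(specHash);
      this.entries.set(specHash, hit);
      return hit;
    }
    const compiled = compile();
    this.entries.set(specHash, compiled);
    if (this.entries.size > this.capacity) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return compiled;
  }
}
```

Because the key is content-addressed, no explicit invalidation is needed: an updated spec arrives under a new hash and the old entry is evicted once four newer ones have been used.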
* Capture dev-stack logs in e2e runs and raise CI heap headroom
* Use a non-hidden dir for captured server logs
actions/upload-artifact skips hidden files by default, so the .server
directory never made it into the CI artifact. Rename to server-logs so
the boot log actually ships with failed runs.
* Pre-bundle late-discovered deps so the dev worker never program-reloads mid-suite
vite was discovering effect/Match, effect/Predicate, and js-yaml during test
runs instead of at boot, forcing a re-optimize and full program reload on
both the client and SSR (workerd) environments. Each reload strands the
previous worker program's heap inside workerd, and a handful of them exhausts
its heap limit and kills the dev server mid-shard. Adding these to
optimizeDeps.include (client) and environments.ssr.optimizeDeps.include (SSR)
in apps/cloud and apps/host-selfhost's vite configs makes vite bundle them at
boot instead, so no mid-run discovery happens.
js-yaml is a transitive dependency via @executor-js/plugin-openapi that bun's
isolated install doesn't hoist into either app's node_modules, so a bare
"js-yaml" specifier silently fails to resolve in optimizeDeps.include. Used
vite's "<pkg> > <dep>" nested resolution syntax to resolve it from the
owning package instead.
1 parent a380e34 commit 379e95e
66 files changed
Lines changed: 7496 additions & 188 deletions