feat: Add an expect query parameter to the status endpoints for server-side health checks #710
Draft
keelerm84 wants to merge 3 commits into
Customers used /status as a health gate by fetching the JSON body, extracting fields with jq, and comparing values in a shell. They asked us to ship jq in the relay Docker image; a vendored binary is an attack surface we do not control and cannot go in the distroless image.

Instead, all status endpoints (/status and the per-environment routes) now accept a repeatable, AND-ed expect=<path><op><value> query parameter and answer with an HTTP status code: 200 when every clause holds, 412 when a well-formed clause does not (including an absent field), and 400 when a clause is malformed. The verdict body summarizes each clause for debugging, but the status code is the contract, so a probe or monitor needs no body parsing and nothing installed.

The clause grammar is a small bounded path evaluator -- not an embedded jq engine -- since the endpoints are unauthenticated. It supports map keys, bracket-quoted keys, and array index/field-filter access, the last of which is defined now for forward compatibility with the upcoming concurrent-keys arrays.
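As an illustration of the "status code is the contract" point, here is a minimal client-side sketch in Go. It is not part of this PR: the host, port, and the clause text (`status=healthy`) are placeholders, and `buildProbeURL` and `verdict` are hypothetical helper names. It only shows how a repeatable expect parameter can be URL-encoded and how the documented codes might be mapped to a probe outcome.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// buildProbeURL appends one expect= parameter per clause.
// The server AND-s the clauses, so their order does not affect the verdict.
func buildProbeURL(base string, clauses ...string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query()
	for _, c := range clauses {
		q.Add("expect", c) // repeatable parameter; Encode escapes each value
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// verdict maps the status codes named in the description to a probe outcome.
func verdict(code int) string {
	switch code {
	case http.StatusOK: // 200: every clause holds
		return "pass"
	case http.StatusPreconditionFailed: // 412: a well-formed clause does not hold
		return "fail"
	case http.StatusBadRequest: // 400: a clause is malformed
		return "malformed"
	default:
		return "other"
	}
}

func main() {
	// Assumed endpoint and clause; substitute real field paths and values.
	u, err := buildProbeURL("http://localhost:8030/status", "status=healthy")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
	fmt.Println(verdict(http.StatusPreconditionFailed))
}
```

A real probe would issue `http.Get` against the built URL and pass `resp.StatusCode` to `verdict`; the response body can be ignored unless a failing clause needs debugging.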
Review of the expect-query feature surfaced an unauthenticated panic and a parser robustness gap:

- parseBracket sliced inner[1:0] and panicked when a bracket contained a lone quote character (e.g. ?expect=a["]=1). On the public /status endpoint this aborted the connection and logged a stack trace per request -- a log-flood vector. Guard the quoted-key branch with a length check so a lone quote falls through to a normal 400.
- parseClause decremented bracket depth with no floor, so a stray ']' could drive depth negative and hide the real top-level operator. Clamp at zero.
- Document the first-']' limitation for bracketed keys/filter values.

Also corrects the docs example for the environments map key (it is the display name, which contains spaces/parentheses and needs bracket-quoting and URL-encoding) and notes that an empty expect= value returns 400.

Adds regression tests: the lone-quote panic inputs, the stray-']' case, explicit-null rendering, the != operator through the handler, and a not-ready 503-before-evaluation ordering test.
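The two parser guards described above can be sketched as follows. This is an illustrative reconstruction, not the code in this PR: `unquoteKey` and `topLevelIndex` are hypothetical stand-ins for the relevant parts of parseBracket and parseClause, and the assumption that keys are quoted with double quotes comes only from the `a["]=1` example.

```go
package main

import "fmt"

// unquoteKey strips the surrounding double quotes from the text found
// between '[' and ']'. The length check is the guard: without it, a lone
// quote (length 1) would be sliced as inner[1:0], which panics in Go.
func unquoteKey(inner string) (string, bool) {
	if len(inner) >= 2 && inner[0] == '"' && inner[len(inner)-1] == '"' {
		return inner[1 : len(inner)-1], true
	}
	return "", false // not a quoted key; the caller can report a malformed clause
}

// topLevelIndex returns the index of the first op byte that sits outside
// any brackets, or -1 if there is none.
func topLevelIndex(clause string, op byte) int {
	depth := 0
	for i := 0; i < len(clause); i++ {
		c := clause[i]
		switch {
		case c == '[':
			depth++
		case c == ']':
			if depth > 0 { // clamp at zero so a stray ']' cannot go negative
				depth--
			}
		case c == op && depth == 0:
			return i
		}
	}
	return -1
}

func main() {
	_, ok := unquoteKey(`"`) // lone quote: rejected instead of panicking
	fmt.Println(ok)
	fmt.Println(topLevelIndex("a]=1", '=')) // stray ']' no longer hides '='
}
```

Without the clamp, the stray ']' in `a]=1` would leave depth at -1, so the `depth == 0` test would never match and the operator would be missed.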
Locks down the documented behavior that a present-but-empty expect= value returns 400 end-to-end, not just at the evaluator level.