Don't abort the whole index batch when a relationship link isn't a card #5073
Open
jurgenwerk wants to merge 6 commits into
… error writes

When a card relationship's links.self points at a non-card URL (e.g. an image), following the link returned binary content that was handed to JSON.parse. The resulting error message embedded raw bytes (a NUL byte and/or unpaired UTF-16 surrogates) that Postgres rejects inside a jsonb column, aborting the entire index batch transaction, so none of the batch's rows ever committed, even the unrelated successful renders.

Two independent fixes:

1. Gate relationship-link fetches on Content-Type before parsing, in both the host render path (card-service / store) and the realm query engine (fetchCrossRealmLinks), raising a clean error that names the field, URL, and actual content type instead of feeding binary to JSON.parse.
2. Sanitize jsonb-illegal code points (NUL, unpaired surrogates) from the error_doc and diagnostics payloads in IndexWriter.updateEntry so a single bad byte can no longer abort the whole batch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ts-whole-batch-when-a-relationship-link-returns
Pull request overview
This PR hardens relationship-link fetching and index error persistence to prevent binary/non-JSON responses from producing poisoned error messages that can’t be written to Postgres jsonb, which previously could abort entire index batch transactions.
Changes:
- Add `isJsonContentType()` and use it to gate `response.json()` for relationship-following fetch paths, producing clearer, structured failures for non-card/binary targets.
- Add `sanitizeForJsonb()` and apply it to `error_doc` and `diagnostics` writes so jsonb-illegal code points (NUL, unpaired surrogates) can't abort index upserts.
- Add targeted tests in realm-server and host to lock in the new sanitization + content-type behavior.
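
As a rough illustration of the first bullet, a content-type check along these lines would be enough to keep an image response away from the JSON parser. This is a minimal sketch, not the helper from `packages/runtime-common/supported-mime-type.ts`: the exact matching rules (plain `application/json` plus any `+json` suffix, parameters ignored, case-insensitive) are assumptions.

```typescript
// Hypothetical sketch of a JSON-family Content-Type check.
// Assumption: "application/json" and any "+json" structured-syntax suffix
// (e.g. application/vnd.api+json) count as JSON; the real helper may differ.
function isJsonContentType(contentType: string | null | undefined): boolean {
  if (!contentType) {
    // A missing header gives no evidence that the body is JSON.
    return false;
  }
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const [head = ''] = contentType.split(';');
  const mediaType = head.trim().toLowerCase();
  return mediaType === 'application/json' || mediaType.endsWith('+json');
}
```

With rules like these, `image/png` and a missing header are both rejected before any parsing is attempted, while `application/json; charset=utf-8` passes.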
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| packages/runtime-common/supported-mime-type.ts | Adds isJsonContentType() helper for JSON-family Content-Type detection. |
| packages/runtime-common/realm-index-query-engine.ts | Fails fast on non-JSON Content-Type for cross-realm relationship link resolution and plumbs relationship field context for better errors. |
| packages/runtime-common/index.ts | Re-exports isJsonContentType and sanitizeForJsonb from runtime-common. |
| packages/runtime-common/index-writer.ts | Sanitizes diagnostics and error_doc before persisting index rows to avoid jsonb write failures. |
| packages/runtime-common/error.ts | Introduces sanitizeForJsonb() to replace jsonb-illegal code points across JSON-shaped values. |
| packages/realm-server/tests/sanitize-for-jsonb-test.ts | New tests covering sanitization behavior, including nested structures and key sanitization. |
| packages/realm-server/tests/is-json-content-type-test.ts | New tests for JSON content-type recognition rules. |
| packages/realm-server/tests/index.ts | Registers the newly added realm-server tests. |
| packages/host/tests/unit/index-writer-test.ts | Regression test ensuring index writer persists error rows with illegal code points after sanitization. |
| packages/host/tests/integration/components/formatted-aibot-message-test.gts | Updates mocked cardService.getSource() to include contentType. |
| packages/host/tests/integration/components/ai-assistant-panel/codeblocks-test.gts | Updates mocked cardService.getSource() responses to include contentType (including null for 404). |
| packages/host/app/services/store.ts | Wraps render-context JSON parsing to throw a clean error when non-JSON/binary content is encountered. |
| packages/host/app/services/card-service.ts | Gates fetchJSON() on Content-Type before parsing and returns contentType from getSource(). |
Gate the render-context getSource path on Content-Type before JSON.parse, matching the other relationship-link fetch paths and avoiding handing a large binary body to the parser. A try/catch remains as a safety net for a body that claims a JSON content type but doesn't parse. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
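
The order of operations this commit describes (check the Content-Type first, parse second, keep a try/catch as a safety net) might look roughly like the sketch below. `SourceResult` and `parseLinkedCard` are hypothetical names invented for illustration, and the inline media-type test stands in for `isJsonContentType()`; none of this is the actual `store` code.

```typescript
// Hypothetical shape of a getSource()-style result; the real type may differ.
interface SourceResult {
  url: string;
  contentType: string | null;
  content: string;
}

// Hypothetical helper: gate on Content-Type, then parse with a safety net.
function parseLinkedCard(source: SourceResult, fieldName: string): unknown {
  // Stand-in for isJsonContentType(); the matching rules are an assumption.
  const [head = ''] = (source.contentType ?? '').split(';');
  const mediaType = head.trim().toLowerCase();
  const looksLikeJson =
    mediaType === 'application/json' || mediaType.endsWith('+json');

  if (!looksLikeJson) {
    // Name the field, URL, and content type; never echo the body itself.
    throw new Error(
      `Relationship "${fieldName}" links to ${source.url}, which did not ` +
        `return JSON (Content-Type: ${source.contentType ?? 'none'})`,
    );
  }

  try {
    return JSON.parse(source.content);
  } catch {
    // Safety net: the body claimed to be JSON but is not. The parser's own
    // message is dropped because it can quote raw bytes from the body.
    throw new Error(
      `Relationship "${fieldName}" links to ${source.url}, which declares a ` +
        `JSON Content-Type but its body does not parse as JSON`,
    );
  }
}
```

Neither error message includes any part of the response body, so a binary payload cannot leak into whatever is later persisted as the error document.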
Condense the explanatory comments added for the relationship-link content-type gates and the jsonb sanitizer; the rationale lives once at the helper definitions rather than being repeated at each call site. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Document the authoring scenario (typically AI-generated card JSON conflating an image-URL field with a card relationship) at the indexer's relationship-following gate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment on lines +81 to +87

    if (value !== null && typeof value === 'object') {
      let result: Record<string, unknown> = {};
      for (let [key, val] of Object.entries(value)) {
        result[sanitizeForJsonb(key)] = sanitizeForJsonb(val);
      }
      return result as T;
    }
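
For context on the excerpt above, a complete recursive sanitizer in the same spirit could be sketched as follows. This is an illustration under assumptions, not the code in `packages/runtime-common/error.ts`: the replacement character (U+FFFD), the `sanitizeString` helper name, and the array handling are all guesses; only the object branch mirrors the lines under review.

```typescript
// Assumption: U+FFFD is the replacement; the PR does not state which one it uses.
const REPLACEMENT = '\uFFFD';

// Hypothetical helper: replace NUL and unpaired UTF-16 surrogates,
// leaving valid high+low surrogate pairs untouched.
function sanitizeString(input: string): string {
  let out = '';
  for (let i = 0; i < input.length; i++) {
    const code = input.charCodeAt(i);
    if (code === 0) {
      out += REPLACEMENT; // NUL byte
    } else if (code >= 0xd800 && code <= 0xdbff) {
      const next = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        out += input.charAt(i) + input.charAt(i + 1); // valid pair: keep both
        i++;
      } else {
        out += REPLACEMENT; // high surrogate with no low surrogate after it
      }
    } else if (code >= 0xdc00 && code <= 0xdfff) {
      out += REPLACEMENT; // low surrogate with no high surrogate before it
    } else {
      out += input.charAt(i);
    }
  }
  return out;
}

// Sketch of the recursive walk over a JSON-shaped value.
function sanitizeForJsonb<T>(value: T): T {
  if (typeof value === 'string') {
    return sanitizeString(value) as unknown as T;
  }
  if (Array.isArray(value)) {
    return value.map((item) => sanitizeForJsonb(item)) as unknown as T;
  }
  if (value !== null && typeof value === 'object') {
    const result: Record<string, unknown> = {};
    for (const [key, val] of Object.entries(value)) {
      // Keys are sanitized too, matching the excerpt above.
      result[sanitizeForJsonb(key)] = sanitizeForJsonb(val);
    }
    return result as unknown as T;
  }
  return value; // numbers, booleans, null, undefined pass through
}
```

Numbers, booleans, and `null` pass through unchanged, so the walk only touches strings and object keys.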
habdelra approved these changes Jun 3, 2026
(Written by Claude on Matic's behalf.)
Problem
When a card relationship's `links.self` points at a non-card URL (e.g. an image), the indexer follows the link expecting a card document, receives binary content, and runs `JSON.parse` on it. The resulting "Unexpected token … is not valid JSON" error message embeds raw bytes: a NUL byte and/or unpaired UTF-16 surrogates. Those code points are illegal inside a Postgres `jsonb` value (22P05 unsupported Unicode escape sequence), so writing the `error_doc` row rejects the entire index batch transaction, including every successfully-rendered sibling in the same batch. Each retry hits the same poisoned card, so the realm can never commit new index state.

See CS-11243 for the affected realm and reproduction detail.
Fixes
1. Fail fast when a relationship link doesn't resolve to a card
Gate the relationship-link fetch paths on `Content-Type` before parsing, so a link to a non-card resource raises a clean, structured error instead of feeding binary to `JSON.parse`:
- `isJsonContentType()` helper.
- Host render path (`card-service.fetchJSON`, and the `store` `getSource` + parse used during indexing).
- Realm query engine (`fetchCrossRealmLinks`): plumbs the relationship field name through so the error names the offending field, URL, and actual content type.

2. Defense against binary in error_doc jsonb writes
New `sanitizeForJsonb()` replaces jsonb-illegal code points (NUL, unpaired surrogates; valid surrogate pairs are preserved). Applied to both the `error_doc` and `diagnostics` payloads in `IndexWriter.updateEntry`, so a single bad byte can no longer abort the batch, even for an error class that fix 1 doesn't anticipate.

🤖 Generated with Claude Code