Don't abort the whole index batch when a relationship link isn't a card #5073

Open
jurgenwerk wants to merge 6 commits into main from cs-11243-indexer-aborts-whole-batch-when-a-relationship-link-returns

@jurgenwerk
Contributor

@jurgenwerk jurgenwerk commented Jun 2, 2026

(Written by Claude on Matic's behalf.)

Problem

When a card relationship's links.self points at a non-card URL (e.g. an image), the indexer follows the link expecting a card document, receives binary content, and runs JSON.parse on it. The resulting "Unexpected token … is not valid JSON" error message embeds raw bytes: a NUL byte and/or unpaired UTF-16 surrogates. Those code points are illegal inside a Postgres jsonb value (22P05 unsupported Unicode escape sequence), so the error_doc write fails and aborts the entire index batch transaction, including every successfully rendered sibling in the same batch. Each retry hits the same poisoned card, so the realm can never commit new index state.
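
To make the failure mode concrete, the following sketch shows how a NUL and a lone surrogate in an error message end up as escape sequences in the serialized error document. The message text is made up for illustration (real parser messages vary by JavaScript engine), and the variable names are not taken from the PR.

```typescript
// Made-up message for illustration; real parser messages vary by engine.
const poisonedMessage = "Unexpected token \u0000\uD800 is not valid JSON";

// JSON.stringify writes both code points as \u escapes, not raw characters.
const serializedErrorDoc: string = JSON.stringify({ message: poisonedMessage });

// These are the escapes that, per the description above, jsonb refuses.
const hasNulEscape: boolean = serializedErrorDoc.indexOf("\\u0000") !== -1;
const hasLoneSurrogateEscape: boolean =
  serializedErrorDoc.indexOf("\\ud800") !== -1;

console.log(serializedErrorDoc, hasNulEscape, hasLoneSurrogateEscape);
```

The serialized text is valid JSON as far as JavaScript is concerned, which is why the problem only surfaces at the database write.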

See CS-11243 for the affected realm and reproduction detail.

Fixes

1. Fail fast when a relationship link doesn't resolve to a card

Gate the relationship-link fetch paths on Content-Type before parsing, so a link to a non-card resource raises a clean, structured error instead of feeding binary to JSON.parse:

  • New isJsonContentType() helper.
  • Host render path (card-service.fetchJSON, and the store getSource + parse used during indexing).
  • Realm query engine (fetchCrossRealmLinks) — plumbs the relationship field name through so the error names the offending field, URL, and actual content type.
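
A minimal sketch of what such a gate could look like follows. It is not the PR's implementation: the function names carry a `Sketch` suffix or are invented here, and the accepted set of media types (application/json plus any `+json` suffix) is an assumption.

```typescript
// Hypothetical JSON-family Content-Type check; the PR's helper may differ.
function isJsonContentTypeSketch(
  contentType: string | null | undefined
): boolean {
  if (!contentType) {
    return false;
  }
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const semicolon = contentType.indexOf(";");
  const mediaType = (
    semicolon === -1 ? contentType : contentType.slice(0, semicolon)
  )
    .trim()
    .toLowerCase();
  // Assumption: accept application/json and any "+json" structured suffix.
  return mediaType === "application/json" || /\+json$/.test(mediaType);
}

// Hypothetical gate: throws a text-only error before any body is parsed.
function assertCardLinkIsJson(
  fieldName: string,
  url: string,
  contentType: string | null
): void {
  if (!isJsonContentTypeSketch(contentType)) {
    throw new Error(
      `Relationship "${fieldName}" links to ${url}, which returned ` +
        `content type "${contentType ?? "none"}" rather than a card document`
    );
  }
}
```

A caller would run the gate on the response's Content-Type header and only then call `response.json()`, so the error message is built from the field name, URL, and header value rather than from the response body.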

2. Defense against binary in error_doc jsonb writes

New sanitizeForJsonb() replaces jsonb-illegal code points (NUL and unpaired surrogates; valid surrogate pairs are preserved). It is applied to both the error_doc and diagnostics payloads in IndexWriter.updateEntry, so a single bad byte can no longer abort the batch, even for an error class that fix 1 doesn't anticipate.
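
The sanitizing behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PR: the function names are hypothetical, and the choice of U+FFFD as the replacement character is assumed because the description does not say what illegal code points are replaced with.

```typescript
// Assumption: replace illegal code points with U+FFFD.
const REPLACEMENT_CHAR = "\uFFFD";

// Replaces NUL and unpaired surrogates; keeps valid surrogate pairs intact.
function sanitizeStringSketch(text: string): string {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const isLow = unit >= 0xdc00 && unit <= 0xdfff;
    if (isHigh) {
      // charCodeAt returns NaN past the end, so the range check fails there.
      const next = text.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        out += text.slice(i, i + 2); // valid pair: copy both code units
        i++;
      } else {
        out += REPLACEMENT_CHAR; // high surrogate with no partner
      }
    } else if (isLow || unit === 0) {
      out += REPLACEMENT_CHAR; // stray low surrogate or NUL
    } else {
      out += text.charAt(i);
    }
  }
  return out;
}

// Walks JSON-shaped values, sanitizing strings and object keys.
function sanitizeForJsonbSketch(value: unknown): unknown {
  if (typeof value === "string") {
    return sanitizeStringSketch(value);
  }
  if (Array.isArray(value)) {
    return value.map((item) => sanitizeForJsonbSketch(item));
  }
  if (value !== null && typeof value === "object") {
    const result: Record<string, unknown> = {};
    for (const [key, val] of Object.entries(value)) {
      result[sanitizeStringSketch(key)] = sanitizeForJsonbSketch(val);
    }
    return result;
  }
  return value; // numbers, booleans, null, undefined pass through
}
```

Replacing rather than deleting the offending code units keeps the sanitized message the same length in characters, which may make it easier to see where the original bytes were.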

🤖 Generated with Claude Code

… error writes

When a card relationship's links.self points at a non-card URL (e.g. an
image), following the link returned binary content that was handed to
JSON.parse. The resulting error message embedded raw bytes (a NUL byte
and/or unpaired UTF-16 surrogates) that Postgres rejects inside a jsonb
column, aborting the entire index batch transaction — so none of the
batch's rows ever committed, even the unrelated successful renders.

Two independent fixes:

1. Gate relationship-link fetches on Content-Type before parsing, in both
   the host render path (card-service / store) and the realm query engine
   (fetchCrossRealmLinks), raising a clean error that names the field,
   URL, and actual content type instead of feeding binary to JSON.parse.

2. Sanitize jsonb-illegal code points (NUL, unpaired surrogates) from the
   error_doc and diagnostics payloads in IndexWriter.updateEntry so a
   single bad byte can no longer abort the whole batch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

github-actions Bot commented Jun 2, 2026

Preview deployments

Host Test Results

    1 files      1 suites   1h 51m 38s ⏱️
2 918 tests 2 903 ✅ 15 💤 0 ❌
2 937 runs  2 922 ✅ 15 💤 0 ❌

Results for commit 13c799c.

Realm Server Test Results

    1 files      1 suites   13m 25s ⏱️
1 556 tests 1 555 ✅ 1 💤 0 ❌
1 647 runs  1 646 ✅ 1 💤 0 ❌

Results for commit 13c799c.

…ts-whole-batch-when-a-relationship-link-returns
Contributor

Copilot AI left a comment

Pull request overview

This PR hardens relationship-link fetching and index error persistence to prevent binary/non-JSON responses from producing poisoned error messages that can’t be written to Postgres jsonb, which previously could abort entire index batch transactions.

Changes:

  • Add isJsonContentType() and use it to gate response.json() for relationship-following fetch paths, producing clearer, structured failures for non-card/binary targets.
  • Add sanitizeForJsonb() and apply it to error_doc and diagnostics writes so jsonb-illegal code points (NUL, unpaired surrogates) can’t abort index upserts.
  • Add targeted tests in realm-server and host to lock in the new sanitization + content-type behavior.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| packages/runtime-common/supported-mime-type.ts | Adds isJsonContentType() helper for JSON-family Content-Type detection. |
| packages/runtime-common/realm-index-query-engine.ts | Fails fast on non-JSON Content-Type for cross-realm relationship link resolution and plumbs relationship field context for better errors. |
| packages/runtime-common/index.ts | Re-exports isJsonContentType and sanitizeForJsonb from runtime-common. |
| packages/runtime-common/index-writer.ts | Sanitizes diagnostics and error_doc before persisting index rows to avoid jsonb write failures. |
| packages/runtime-common/error.ts | Introduces sanitizeForJsonb() to replace jsonb-illegal code points across JSON-shaped values. |
| packages/realm-server/tests/sanitize-for-jsonb-test.ts | New tests covering sanitization behavior, including nested structures and key sanitization. |
| packages/realm-server/tests/is-json-content-type-test.ts | New tests for JSON content-type recognition rules. |
| packages/realm-server/tests/index.ts | Registers the newly added realm-server tests. |
| packages/host/tests/unit/index-writer-test.ts | Regression test ensuring index writer persists error rows with illegal code points after sanitization. |
| packages/host/tests/integration/components/formatted-aibot-message-test.gts | Updates mocked cardService.getSource() to include contentType. |
| packages/host/tests/integration/components/ai-assistant-panel/codeblocks-test.gts | Updates mocked cardService.getSource() responses to include contentType (including null for 404). |
| packages/host/app/services/store.ts | Wraps render-context JSON parsing to throw a clean error when non-JSON/binary content is encountered. |
| packages/host/app/services/card-service.ts | Gates fetchJSON() on Content-Type before parsing and returns contentType from getSource(). |

Comment thread packages/host/app/services/store.ts
jurgenwerk and others added 4 commits June 3, 2026 11:46
Gate the render-context getSource path on Content-Type before
JSON.parse, matching the other relationship-link fetch paths and
avoiding handing a large binary body to the parser. A try/catch remains
as a safety net for a body that claims a JSON content type but doesn't
parse.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
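
The gate-then-parse pattern with a try/catch safety net, as described in the commit message above, could look roughly like this. The `SourceResultSketch` shape and the function name are assumptions for illustration; the PR summary only confirms that getSource() now returns a contentType, and the JSON-family check here is a simplified stand-in for the PR's helper.

```typescript
// Assumed result shape; only the contentType field is confirmed by the PR.
interface SourceResultSketch {
  content: string;
  contentType: string | null;
}

// Hypothetical parser for the render-context path.
function parseCardSourceSketch(
  url: string,
  source: SourceResultSketch
): unknown {
  const declared = source.contentType ?? "";
  const semicolon = declared.indexOf(";");
  const mediaType = (semicolon === -1 ? declared : declared.slice(0, semicolon))
    .trim()
    .toLowerCase();
  // Simplified JSON-family check for this sketch.
  const looksJson =
    mediaType === "application/json" || /\+json$/.test(mediaType);
  if (!looksJson) {
    // Gate first: a binary body never reaches the parser.
    throw new Error(
      `${url} returned content type "${source.contentType ?? "none"}", ` +
        `expected a card JSON document`
    );
  }
  try {
    return JSON.parse(source.content);
  } catch {
    // Safety net: omit the parser's own message, which may quote raw bytes.
    throw new Error(
      `${url} declared a JSON content type but its body did not parse as JSON`
    );
  }
}
```

Discarding the parser's message in the catch branch is a design choice of this sketch: it keeps body bytes out of the error text entirely, at the cost of losing the position information the parser would have reported.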
Condense the explanatory comments added for the relationship-link
content-type gates and the jsonb sanitizer; the rationale lives once at
the helper definitions rather than being repeated at each call site.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Document the authoring scenario (typically AI-generated card JSON
conflating an image-URL field with a card relationship) at the indexer's
relationship-following gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jurgenwerk jurgenwerk changed the title from "Fail fast on non-card relationship links and sanitize binary in index error writes" to "Don't abort the whole index batch when a relationship link isn't a card" Jun 3, 2026
@jurgenwerk jurgenwerk marked this pull request as ready for review June 3, 2026 11:18
@jurgenwerk jurgenwerk requested a review from Copilot June 3, 2026 11:38
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Comment thread packages/runtime-common/supported-mime-type.ts
Comment on lines +81 to +87
if (value !== null && typeof value === 'object') {
  let result: Record<string, unknown> = {};
  for (let [key, val] of Object.entries(value)) {
    result[sanitizeForJsonb(key)] = sanitizeForJsonb(val);
  }
  return result as T;
}
@jurgenwerk jurgenwerk requested a review from a team June 3, 2026 11:49
3 participants