Skip to content

🛡️ fix: Strip null content parts on message load to prevent formatAgentMessages crash#64

Merged
bensi94 merged 1 commit into
fix-per-user-oidc-model-list-v0.8.6from
fix/agents-null-content-part
Jun 2, 2026
Merged

🛡️ fix: Strip null content parts on message load to prevent formatAgentMessages crash#64
bensi94 merged 1 commit into
fix-per-user-oidc-model-list-v0.8.6from
fix/agents-null-content-part

Conversation

@TomasPalsson
Copy link
Copy Markdown

@TomasPalsson TomasPalsson commented Jun 2, 2026

Summary

Fixes the recurring crash:

TypeError: Cannot read properties of null (reading 'type')
    at formatAgentMessages (@librechat/agents/dist/cjs/messages/format.cjs)
    at AgentClient.chatCompletion (api/server/controllers/agents/client.js:894)

The crash fires on a follow-up turn in a conversation — most reliably after using tool calls or after attaching PDF/PNG files. One small change in getMessages (the single DB read path for conversation history) sanitizes message content on load, which neutralizes both already-corrupted conversations and any future ones.


What actually happens (root cause)

The error is thrown inside @librechat/agents, but the bad data is created and persisted by our side. The chain:

  1. The streaming content aggregator builds content by index and leaves holes.
    createContentAggregator() (api/server/services/Endpoints/agents/initialize.js:129) produces the contentParts array that the run fills by index. Our own code documents this at api/server/controllers/agents/client.js:162:

    "createContentAggregator returns a sparse array (undefined slots for indices that never received content)."

    Tool-call parts in particular are placed at sparse offsets (see the comment at client.js:1090). A sparse hole in a JS array serializes to null when written to MongoDB.

  2. The happy path cleans it — so a completed response is safe.
    On normal completion, BaseClient.sendMessage sets responseMessage.content = completion (api/app/clients/BaseClient.js:602), where completion = filterMalformedContentParts(this.contentParts) (client.js:752). Holes are stripped → nothing bad persists.

  3. An interrupted run persists the raw, un-filtered array.
    When a stream is cut (Stop button, page refresh/navigation, network drop, or an upstream litellm/OpenAI-compatible proxy timeout closing the SSE connection), the resumable/partial save writes the live aggregator array without filtering:

    // api/server/controllers/agents/request.js:166
    content: aggregatedContent,   // raw sparse array
    unfinished: true,

    The sparse hole → null → persisted to the messages collection.

  4. The next turn replays it and crashes.
    On the following message, history is loaded and formatMessages.js:59 passes the stored array through verbatim into the payload. formatAgentMessages iterates content and reads part.type with no null guard → 💥.

Why files reproduce it deterministically, tool calls only sometimes

Attaching a PDF/PNG reliably triggers a tool call (extraction / OCR / file_search) and makes the response stream longer — a much bigger window for a disconnect/refresh/timeout to land mid-tool-call and hit the un-filtered partial save. A plain chat only produces a hole when a tool-call slot happens to never finalize, so it's sporadic. The deterministic ingredients are the sparse aggregator + the un-filtered partial save; the trigger is the interruption (which litellm timeouts make common).

Note on litellm

litellm does not create the null itself. As an OpenAI-compatible proxy it (a) routes through generic provider handling and (b) drops/early-closes streams on upstream timeouts — which is exactly what fires the interrupted-save path in step 3. So it acts as a frequent trigger of a latent bug, not the cause.


The fix

Sanitize content holes at the single universal read chokepoint — getMessages in packages/data-schemas/src/methods/message.ts. Every history load goes through it (BaseClient.loadHistory, the Responses-API controllers), so cleaning here protects every downstream consumer (formatAgentMessages, the content.find(part => part.type) at client.js:325, the edit path at BaseClient.js:447, token counting) in one place.

const messages = await (select ? query.select(select) : query).lean<IMessage[]>();
for (const message of messages) {
  if (Array.isArray(message.content)) {
    message.content = message.content.filter((part) => part != null);
  }
}
return messages;

Why read-side rather than the save path: filtering at save (request.js:166) would stop new corruption but leave every already-poisoned conversation crashing. Cleaning on read fixes existing rows and future ones, and makes any null that slips through harmless because no consumer ever sees it. It's a single-pass mutation over already-lean()'d objects — negligible cost, no new types.

Scope: intentionally minimal. No client.js / request.js changes. Stripping null/undefined array elements is always safe — those entries are never meaningful — and nothing relies on content-part holes once a message is read back.


How to reproduce

Deterministic (no timing luck):

  1. In mongosh, poison an existing assistant message:
    db.messages.updateOne(
      { messageId: "<an-assistant-messageId>" },
      { $set: { content: [ { type: "text", text: "hi" }, null ] } }
    )
  2. Open that conversation and send any new message → before this fix: crash; after: works.

Natural (the real path):

  1. Use an agent with a tool enabled (or attach a PDF/PNG).
  2. Send a prompt that starts a tool call.
  3. Cut the stream mid-tool-call — click Stop, refresh, or let an upstream/litellm timeout drop it. This persists an unfinished message with a sparse hole.
  4. Reply again in that conversation → crash (pre-fix).

Library-level sanity check (inside the container):

node -e "const {formatAgentMessages}=require('@librechat/agents'); \
formatAgentMessages([{role:'assistant',content:[{type:'text',text:'x'},null]}], {}, new Set())"

Find already-corrupted rows in your DB:

db.messages.find(
  { $or: [ { content: null }, { content: { $elemMatch: { $eq: null } } } ] },
  { messageId: 1, conversationId: 1, unfinished: 1 }
)

Optional follow-ups (not in this PR)

These are independent hardening items, none required to fix the reported crash:

  • Data hygiene at the source: apply filterMalformedContentParts to aggregatedContent at the interrupted save (request.js:166) so the DB stops storing holes at all. The read-side fix already makes the crash impossible; this just keeps stored data clean.
  • Client-input path (different crash signature): convertMessages (packages/api/src/agents/openai/service.ts:205) maps over msg.content from the HTTP request body and reads part.type with no null guard. This is reachable via the Responses/OpenAI-compatible passthrough (openai.js:463) and crashes on malformed client-supplied input — not on persisted rows — so it's a separate input-validation concern. One-line guard if desired: msg.content.filter((part) => part != null).map(...).
  • True origin: the sparse hole is created by the index-based aggregator in @librechat/agents; a fix there (or a version bump if upstream has patched it) is the only thing that stops the null being created.

🤖 Generated with Claude Code

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

The chatCompletion function in the agents controller now sanitizes incoming message payloads by filtering null or malformed content parts via filterMalformedContentParts before formatting. The sanitized payload replaces the original payload passed to formatAgentMessages, preventing crashes from corrupted historical message data.

Changes

Malformed content filtering

Layer / File(s) Summary
Malformed content filtering
api/server/controllers/agents/client.js
chatCompletion constructs a sanitizedPayload that removes null/malformed content parts from array-type message content, then passes this cleaned payload to formatAgentMessages instead of the raw payload.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • aproorg/LibreChat#54: Both PRs modify message payload handling in api/server/controllers/agents/client.js—this PR adds upstream content sanitization while the retrieved PR adjusts system-message insertion in the same flow.

Poem

A guardian stands at the gate,
Filtering shapes both bent and late,
Malformed whispers meet their end,
Before the formatter's careful hand,
Safe passage now, the path is clear. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title accurately reflects the main change: sanitizing null content parts before calling formatAgentMessages to prevent crashes. It is specific and directly addresses the core fix.
Description check ✅ Passed The description is comprehensive and well-structured. It includes a clear summary of the bug, detailed root cause analysis, reproduction steps, and implementation rationale. It addresses the template's Summary and Change Type sections thoroughly.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/agents-null-content-part

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

…ntMessages crash

The streaming content aggregator builds message content by index and yields a
sparse array; an interrupted/partial save persists a hole that serializes to
null in MongoDB. On replay, @librechat/agents formatAgentMessages reads
part.type with no null guard and crashes.

Sanitize content holes at getMessages, the single DB read chokepoint for
conversation history, so both already-corrupted and future rows are
neutralized for every consumer (formatAgentMessages, token counting, edit path).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@TomasPalsson TomasPalsson force-pushed the fix/agents-null-content-part branch from 8d62d48 to 6d37259 Compare June 2, 2026 14:27
@TomasPalsson TomasPalsson changed the title 🛡️ fix: Guard against null content parts in formatAgentMessages payload 🛡️ fix: Strip null content parts on message load to prevent formatAgentMessages crash Jun 2, 2026
@bensi94 bensi94 changed the base branch from main to fix-per-user-oidc-model-list-v0.8.6 June 2, 2026 15:55
@bensi94 bensi94 merged commit c14d070 into fix-per-user-oidc-model-list-v0.8.6 Jun 2, 2026
12 checks passed
@bensi94 bensi94 deleted the fix/agents-null-content-part branch June 2, 2026 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants