Skip to content

fix: bound agent communication memory growth#369

Merged
claude-code-best merged 3 commits intomainfrom
codex/memory-overflow-bounds
Apr 27, 2026
Merged

fix: bound agent communication memory growth#369
claude-code-best merged 3 commits intomainfrom
codex/memory-overflow-bounds

Conversation

@amDosion
Copy link
Copy Markdown
Collaborator

@amDosion amDosion commented Apr 26, 2026

Summary

  • Bound agent communication memory growth across UDS messaging, teammate mailboxes, and agent summary context.
  • Removed UDS auth token exposure from SDK/env/session registry/peer listing paths and rejected inline uds:...#token= targets instead of supporting a token fallback address form.
  • Hardened local UDS capability storage with private directory checks, exclusive temp-file creation, owner-only socket permissions, bounded active clients, and real boundary tests.

Verification

  • bun test src/utils/__tests__/ndjsonFramer.test.ts src/utils/__tests__/udsMessaging.test.ts packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts
  • bunx tsc --noEmit --pretty false
  • bun run lint
  • bunx biome lint on all modified src/package files
  • bun run test:all -> 3704 pass, 0 fail, 6734 expects
  • bun audit -> No vulnerabilities found
  • bun run build
  • bun run build:vite
  • git diff --check

Review Notes

  • Based on upstream main after PR fix: resolve dependency audit findings precisely #361 was merged (c2ac9a74).
  • Codex code/security review found and fixed the UDS inline-token reflection path.
  • Claude simplify review: no actionable simplification findings.
  • Claude security review initially flagged boundary-test and local capability hardening gaps; those were fixed and the final Claude security re-review returned NO ACTIONABLE FINDINGS.

Residual Risk

  • Same-OS-user processes that can read the owner-only capability file remain within the trusted local capability model.
  • Full external end-to-end UDS client driving a production headless model turn was not manually exercised; the UDS/socket paths are covered by real local socket tests.

Summary by CodeRabbit

  • New Features

    • Added bridge peer discovery for expanded peer enumeration
    • Added inline token detection and redaction for secure messaging
    • Improved peer address formatting in listings
  • Bug Fixes

    • Strengthened validation of UDS addresses, rejecting inline tokens
    • Enhanced error handling for oversized frames and malformed messages
    • Improved mailbox storage with size constraints and deduplication
  • Refactor

    • Optimized agent summarization with fingerprinting to reduce redundant processing
    • Refined message filtering for tool interaction validation
    • Enhanced message response reading with frame-size enforcement

UDS messaging now uses private local capabilities instead of exposing auth tokens through SDK metadata, environment variables, session registry, peer listing, or tool output. The receive path bounds NDJSON frames, response buffers, active clients, and pending inbox bytes, and strips auth metadata before messages enter the prompt queue.

Teammate mailboxes now validate file and message sizes, fail closed on corrupt mutation inputs, compact by count and retained bytes, and use stable message identity for in-process acknowledgements. Agent summaries now fork only a bounded recent context using lazy size estimation and content fingerprints instead of retaining or serializing unbounded histories.

Constraint: PR #361 was already merged; this branch is based on upstream/main@c2ac9a74.
Rejected: Default-disabling COORDINATOR_MODE/TEAMMEM only | explicit feature enablement still hit unbounded paths.
Rejected: Persisting UDS auth in SDK/env/session registry | bridge/remote metadata can leak local capability secrets.
Rejected: Inline uds #token addresses | observable/tool/classifier paths can reflect raw addresses outside the UDS request frame.
Rejected: Positional mailbox marking after compaction | compaction can shift indices across the lock boundary.
Confidence: high
Scope-risk: moderate
Directive: Do not expose UDS capability tokens through SDK messages, environment variables, session registry, peer-list output, or SendMessage result/classifier surfaces.
Directive: Do not reintroduce positional mailbox acknowledgements unless compaction is removed or read+mark is atomic under one lock.
Tested: bun test src/utils/__tests__/ndjsonFramer.test.ts src/utils/__tests__/udsMessaging.test.ts packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts
Tested: bunx tsc --noEmit --pretty false
Tested: bun run lint
Tested: bunx biome lint modified src/package files
Tested: bun run test:all (3704 pass, 0 fail, 6734 expects)
Tested: bun audit (No vulnerabilities found)
Tested: bun run build
Tested: bun run build:vite
Tested: git diff --check
Not-tested: End-to-end external UDS client driving a full production headless model turn.
@mintlify
Copy link
Copy Markdown

mintlify Bot commented Apr 26, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
ccb-863780bf 🟢 Ready View Preview Apr 26, 2026, 1:46 PM

💡 Tip: Enable Workflows to automatically generate PRs for you.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This pull request extends UDS peer messaging with authentication tokens and capability-based validation, refactors agent summarization to use bounded message context with fingerprinting to avoid repeated summarization, upgrades teammate mailbox storage with atomic writes and compaction under size limits, and adds comprehensive test coverage for these systems along with new message filtering and response-reading utilities.

Changes

Cohort / File(s) Summary
UDS Messaging & Authentication
src/utils/udsMessaging.ts, src/utils/udsClient.ts, src/utils/udsResponseReader.ts
Adds per-process auth tokens, capability token persistence, frame size enforcement (MAX_UDS_FRAME_BYTES), and inbox buffering bounds. New response reader utility handles UTF-8 newline-delimited frame parsing with size limits and error settlement.
Peer Discovery & Addressing
packages/builtin-tools/src/tools/ListPeersTool/ListPeersTool.ts, src/bridge/peerSessions.ts, src/commands/peers/peers.ts
Refactors peer discovery to dynamically load UDS client, enumerate peers via udsClient.listPeers(), deduplicate by address, and append bridge peers. Updates peer display to use formatUdsAddress for standardized UDS address formatting.
UDS Message Sending & Token Redaction
packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts, packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts
Detects and redacts inline auth tokens (#token=...) in UDS recipient addresses throughout observable input backfill, classifier generation, and validation. Rejects inline tokens with dedicated error code 9. Test suite validates redaction across observable paths and classifier formats.
Agent Summarization with Context Selection
src/services/AgentSummary/summaryContext.ts, src/services/AgentSummary/agentSummary.ts, src/services/AgentSummary/__tests__/summaryContext.test.ts, src/services/AgentSummary/__tests__/agentSummary.test.ts
New module implements bounded message selection (selectSummaryContextMessages), character/cycle-aware estimation, and deterministic SHA-256 fingerprinting. Agent summarization now skips forking when transcript fingerprint unchanged and uses bounded context instead of full transcript.
Mailbox Storage & Compaction
src/utils/teammateMailbox.ts, src/utils/swarm/inProcessRunner.ts, src/utils/__tests__/teammateMailbox.test.ts
Upgrades mailbox with atomic writes, per-message text byte limits, validated parsing, and compactMailboxMessages that prioritizes unread protocol messages while respecting count and total-bytes budgets. Adds identity-based message marking. Updates in-process runner to use identity-based rather than index-based marking.
Tool Filtering & Validation
packages/builtin-tools/src/tools/AgentTool/filterIncompleteToolCalls.ts, packages/builtin-tools/src/tools/AgentTool/runAgent.ts, packages/builtin-tools/src/tools/AgentTool/__tests__/filterIncompleteToolCalls.test.ts
Extracts tool-call filtering logic into dedicated module. Removes orphaned tool_use/tool_result pairs by collecting valid result IDs and filtering blocks that lack required identifiers or matching pairs.
NDJSON Framing with Error Handling
src/utils/ndjsonFramer.ts, src/utils/__tests__/ndjsonFramer.test.ts
Adds NdjsonFramerOptions to configure frame-size limits and per-frame error callbacks. Rewritten parsing tracks byte positions; oversized frames trigger onFrameError, malformed frames trigger onInvalidFrame, with optional socket destruction on either error.
UDS Inbox & Response Reader Utilities
src/utils/__tests__/udsResponseReader.test.ts, src/cli/print.ts
New test suite for attachUdsResponseReader validates UTF-8 multibyte handling, malformed JSON rejection, blank frame ignoring, and receiver error frames. CLI print updates drain buffered UDS inbox entries and conditionally trigger run() only when work is enqueued.
Testing & Mocking Infrastructure
src/utils/__tests__/udsMessaging.test.ts, src/commands/poor/__tests__/poorMode.test.ts
Adds comprehensive UDS messaging test suite covering inbox retention, authentication, frame limits, and permission hardening on non-Windows. Updates poorMode test with broader mock coverage and module-query isolation to prevent test leakage.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Tool as SendMessageTool
    participant Sanitizer as Redaction Handler
    participant Transport as UDS Client
    participant Receiver as Remote Peer

    Client->>Tool: send message to uds:socket#token=secret
    Tool->>Sanitizer: backfillObservableInput(input)
    Sanitizer->>Sanitizer: detect inline token in 'to'
    Sanitizer->>Sanitizer: redact observable.to display
    Sanitizer->>Sanitizer: derive recipient from token-stripped address
    Tool->>Tool: validateInput (reject inline tokens)
    alt Token detected
        Tool-->>Client: error 9 - inline auth tokens rejected
    else Valid
        Tool->>Transport: sendToUdsSocket(recipient, message, authToken)
        Transport->>Receiver: send with capability token in meta
        Receiver-->>Transport: response
        Transport-->>Tool: success (with token-stripped address)
        Tool-->>Client: result with redacted recipient
    end
Loading
sequenceDiagram
    participant Agent
    participant SummaryService as AgentSummary
    participant Selector as selectSummaryContextMessages
    participant Fingerprint as getSummaryContextFingerprint
    participant Fork as ForkedAgent

    Agent->>SummaryService: tick() with full transcript
    SummaryService->>Fingerprint: compute hash of full transcript
    alt Fingerprint unchanged
        SummaryService-->>Agent: skip forking
    else Changed or new
        SummaryService->>Selector: boundedContext = select(maxMessages, maxChars)
        Selector-->>SummaryService: filtered recent messages
        alt Context too small
            SummaryService-->>Agent: skip (< 3 messages)
        else Sufficient
            SummaryService->>Fork: run with boundedContext messages
            Fork-->>SummaryService: summary text
            SummaryService->>SummaryService: record fingerprint
            SummaryService-->>Agent: emit summary update
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 The UDS paths now sing with tokens bright,
Peers discover each other, securely in sight,
Summaries bounded, fingerprints true,
Mailboxes compact—old and new!
Test suites abound, so thorough and deep,

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 22.45% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: bound agent communication memory growth' directly and accurately summarizes the main objective of this changeset—bounding memory growth in agent communication systems across UDS messaging, teammate mailboxes, and agent summaries.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/memory-overflow-bounds

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 26, 2026

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/builtin-tools/src/tools/ListPeersTool/ListPeersTool.ts (1)

82-115: ⚠️ Potential issue | 🟠 Major

Don't drop bridge peers from ListPeers.

This implementation now only returns self + udsClient.listPeers(), so Remote Control peers no longer show up here even though the prompt still advertises bridge:session_... targets and SendMessage still accepts them. That breaks peer discovery for cross-session messaging.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/builtin-tools/src/tools/ListPeersTool/ListPeersTool.ts` around lines
82 - 115, ListPeersTool.call currently only adds self and udsClient.listPeers
results, which omits Remote Control bridge peers; modify the method to also
import and call the Remote Control bridge peer listing (e.g. require the bridge
module that exposes a listPeers/listBridgePeers function), merge those bridge
peers into the peers array (preserving fields address/name/pid/cwd and
deduplicating by address), and ensure bridge-style addresses (like
"bridge:session_...") remain included so SendMessage and advertised targets
continue to work.
🧹 Nitpick comments (6)
src/services/AgentSummary/__tests__/summaryContext.test.ts (1)

103-132: Consider asserting the empty-input contract for getSummaryContextFingerprint.

agentSummary.ts relies on getSummaryContextFingerprint(summaryContext) returning null for empty bounded contexts (the transcriptFingerprint && … guard). A small test pinning that contract would prevent a regression where, e.g., switching to '' would silently make every empty tick "match" the previously stored fingerprint of ''.

Suggested addition
   test('changes when message content changes under the same uuid', () => {
     ...
   })
+
+  test('returns null for an empty transcript', () => {
+    expect(getSummaryContextFingerprint([])).toBeNull()
+  })
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/AgentSummary/__tests__/summaryContext.test.ts` around lines 103
- 132, Add a test asserting the empty-input contract for
getSummaryContextFingerprint: call getSummaryContextFingerprint with an
empty/zero-length bounded context (e.g., [] or the minimal empty summaryContext
shape used by agentSummary.ts) and assert it returns null; this ensures
agentSummary.ts's guard (transcriptFingerprint && …) keeps working and prevents
regressions where an empty fingerprint becomes '' or another non-null value.
Locate the tests in summaryContext.test.ts and add the assertion alongside the
existing cases referencing getSummaryContextFingerprint so it runs with the
other fingerprint tests.
src/services/AgentSummary/summaryContext.ts (1)

55-62: Truncated string contributions can mask changes to the tail of a large value.

hash.update(text.slice(0, limit.remaining)) only feeds the first remaining chars of a large primitive into the digest, while limit.remaining -= text.length accounts for the full length. If a single string value at the very edge of the budget changes only past the truncation point, the fingerprint won't reflect it (the rest of the walk is also short-circuited because remaining is already ≤ 0). With MAX_SUMMARY_CONTEXT_CHARS = 200_000 and lastMessage.uuid baked into the returned key this is unlikely to matter in practice — calling it out as a minor robustness consideration only.

Possible tweak to deduct only what was hashed (preserves total walk continuation behavior for callers)
-  if (value === null || typeof value !== 'object') {
-    const text = String(value)
-    hash.update(typeof value)
-    hash.update(':')
-    hash.update(text.slice(0, limit.remaining))
-    limit.remaining -= text.length
-    return
-  }
+  if (value === null || typeof value !== 'object') {
+    const text = String(value)
+    const consumed = Math.min(text.length, limit.remaining)
+    hash.update(typeof value)
+    hash.update(':')
+    hash.update(text.slice(0, consumed))
+    // Mark whether the value was truncated so siblings can't collide by content alone.
+    if (consumed < text.length) hash.update(`#${text.length}`)
+    limit.remaining -= text.length
+    return
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/AgentSummary/summaryContext.ts` around lines 55 - 62, The code
truncates large primitive strings but subtracts the full text length from
limit.remaining, which hides tail changes; in the block handling primitives
(check for value === null || typeof value !== 'object'), compute the actual
hashed length first (e.g., sliceLen = Math.min(text.length, limit.remaining)),
call hash.update(text.slice(0, sliceLen)), then decrement limit.remaining by
sliceLen (not text.length), and bail early if sliceLen is 0; refer to the
variables/identifiers limit.remaining, value, text, and hash.update to locate
and fix the logic.
src/utils/teammateMailbox.ts (1)

152-165: Clean up the temp file when atomic write fails.

If writeFile partially writes or rename fails (e.g., EXDEV on a different filesystem, ENOSPC, EACCES on the target directory), the .tmp file is leaked. Over time these can accumulate next to active inboxes and confuse cleanup tooling. Wrap the rename in try/catch and unlink the temp on failure.

♻️ Suggested fix
-  const tempPath = `${inboxPath}.${process.pid}.${randomBytes(8).toString('hex')}.tmp`
-  await writeFile(tempPath, content, 'utf-8')
-  await rename(tempPath, inboxPath)
+  const tempPath = `${inboxPath}.${process.pid}.${randomBytes(8).toString('hex')}.tmp`
+  try {
+    await writeFile(tempPath, content, 'utf-8')
+    await rename(tempPath, inboxPath)
+  } catch (error) {
+    await unlink(tempPath).catch(() => {})
+    throw error
+  }

Add the import:

-import { mkdir, readFile, rename, stat, writeFile } from 'fs/promises'
+import { mkdir, readFile, rename, stat, unlink, writeFile } from 'fs/promises'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/teammateMailbox.ts` around lines 152 - 165, The writeMailboxAtomic
function can leak the temp file if writeFile/rename fails; update
writeMailboxAtomic to wrap the rename (and the write+rename sequence) in a
try/catch and, on any failure after creating tempPath, attempt to
unlink(tempPath) (using the existing fs promises API or importing unlink) before
rethrowing the original error; reference the tempPath variable and the
rename/writeFile calls and ensure the unlink call itself is swallowed or logged
so cleanup failures don't mask the original error.
packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts (1)

3-5: Use the shared bun:bundle mock instead of redefining it inline.

This adds another ad-hoc feature-flag mock surface. Please wire this test through the shared mock in tests/mocks/ so bun:bundle behavior stays consistent across suites.

Based on learnings: Applies to /tests//*.test.ts : Mock only modules with side effects (dependencies with I/O, state, or network access). Do not mock pure functions or pure data modules. Use shared mocks from tests/mocks/ for log.ts, debug.ts, bun:bundle, settings.js, config.ts, and auth.ts

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts`
around lines 3 - 5, The test currently redefines the bun:bundle feature-flag
inline via mock.module('bun:bundle', ...) which creates an ad-hoc mock; remove
that inline mock and instead wire the test to use the shared bun:bundle mock
from the tests/mocks/ directory so feature flags remain consistent (ensure the
test imports or invokes the shared mock setup before running and configures the
UDS_INBOX flag via that shared mock); update the test to rely on the shared mock
entry (rather than mock.module) and remove any local feature-flag logic so other
suites use the same bun:bundle behavior.
src/utils/udsMessaging.ts (1)

311-318: Use a constant-time comparison for the auth token.

getMessageAuthToken(message) === authToken short-circuits on first differing byte. The local-UDS attack surface limits the practical impact (the capability file is already 0o600 and the comparison runs in the recipient's own process), but a constant-time check is trivial here and avoids the question entirely.

🛡️ Suggested fix using crypto.timingSafeEqual
 function isAuthorizedMessage(message: UdsMessage): boolean {
-  return getMessageAuthToken(message) === authToken
+  const provided = getMessageAuthToken(message)
+  if (!provided || !authToken) return false
+  const a = Buffer.from(provided, 'utf8')
+  const b = Buffer.from(authToken, 'utf8')
+  if (a.length !== b.length) return false
+  return timingSafeEqual(a, b)
 }

Add timingSafeEqual to the existing crypto import on line 11.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/udsMessaging.ts` around lines 311 - 318, Replace the direct
equality check in isAuthorizedMessage by performing a constant-time comparison
using crypto.timingSafeEqual: import timingSafeEqual (or crypto) into the
module, retrieve the token via getMessageAuthToken(message), and if both token
and authToken are defined and have the same byte length, compare
Buffer.from(token) and Buffer.from(authToken) with timingSafeEqual; return false
for missing tokens or length mismatch. Ensure you reference getMessageAuthToken
and isAuthorizedMessage when making the change.
src/utils/udsClient.ts (1)

46-50: Consider extracting a shared NDJSON response reader.

getChunkBytes is duplicated verbatim in src/utils/udsMessaging.ts (line 565), and the read/settle/parse loop in sendToUdsSocket (lines 220-275) is nearly identical to the one in sendUdsMessage (lines 594-639 of udsMessaging.ts): same buffer/settled/finish pattern, same per-frame byte cap, same NDJSON splitting, and same response/error/pong settling rules. Extracting a helper would prevent the two implementations from drifting on future changes (e.g., when the framing or termination rules evolve).

Also applies to: 220-275

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/udsClient.ts` around lines 46 - 50, The duplicated NDJSON
response-reading logic (including getChunkBytes) across getChunkBytes,
sendToUdsSocket, and sendUdsMessage should be extracted into a shared helper to
avoid drift; create a new function (e.g., readNdjsonResponses or
parseNdjsonStream) that encapsulates the buffer/settled/finish loop, per-frame
byte cap, NDJSON splitting, and the response/error/pong settling rules, replace
the duplicated implementations in sendToUdsSocket and sendUdsMessage to call
this helper, and keep the existing getChunkBytes either moved into the helper or
exported for reuse so both modules use the same utility.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cli/print.ts`:
- Around line 2773-2776: The bug is that the object branch stringifies the whole
transport envelope (jsonStringify(entry.message)) instead of just the inbox
payload, causing the wrapper to be re-enqueued; update the assignment for value
so both branches serialize only entry.message.data (i.e., keep the string branch
as-is and change jsonStringify(entry.message) to
jsonStringify(entry.message.data)), ensuring you still handle non-string data
safely via jsonStringify.

In `@src/utils/ndjsonFramer.ts`:
- Around line 41-60: The current pre-check only prevents oversized frames when
chunk has no newline, so a single read containing a huge frame plus its
terminating '\n' bypasses the cap; modify the framing logic in ndjsonFramer.ts
to avoid concatenating the entire chunk before size-checking: when
chunk.includes('\n'), iterate the split pieces (or scan the chunk) and for each
piece compute Buffer.byteLength(buffer + piece, 'utf8') (or accumulate sizes)
and call rejectOversizedFrame if any intermediate frame exceeds maxFrameBytes;
only append/commit the piece to buffer and emit lines after verifying it's
within the limit. Ensure you still handle the existing no-newline path and keep
references to maxFrameBytes, buffer, chunk, and rejectOversizedFrame.

In `@src/utils/teammateMailbox.ts`:
- Around line 65-82: The fallback in shouldRetainUnreadAsProtocolMessage
currently treats unread messages with unparseable JSON-like text as protocol
messages (catch returns true); change the catch to return false so that messages
which fail jsonParse are not considered protocol-priority. Update the catch
block in shouldRetainUnreadAsProtocolMessage (which uses isJsonLikeMessage and
jsonParse) to return false, relying on isStructuredProtocolMessage to already
require successful parsing and a recognized type, preventing malformed JSON-like
blobs from consuming the retained-byte budget.

In `@src/utils/udsMessaging.ts`:
- Around line 198-214: The existing ensureSocketParent function only checks that
the parent is a directory but ignores its permissions and ownership; update
ensureSocketParent to mirror the invariants in assertPrivateCapabilityDir by,
after lstat confirms a directory, checking that (stat.mode & 0o077) === 0 and
stat.uid === process.getuid(), and throw an error if those checks fail; keep the
existing mkdir/chmod path for creating a missing directory, and reference
assertPrivateCapabilityDir as the source of the expected checks to implement.

---

Outside diff comments:
In `@packages/builtin-tools/src/tools/ListPeersTool/ListPeersTool.ts`:
- Around line 82-115: ListPeersTool.call currently only adds self and
udsClient.listPeers results, which omits Remote Control bridge peers; modify the
method to also import and call the Remote Control bridge peer listing (e.g.
require the bridge module that exposes a listPeers/listBridgePeers function),
merge those bridge peers into the peers array (preserving fields
address/name/pid/cwd and deduplicating by address), and ensure bridge-style
addresses (like "bridge:session_...") remain included so SendMessage and
advertised targets continue to work.

---

Nitpick comments:
In
`@packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts`:
- Around line 3-5: The test currently redefines the bun:bundle feature-flag
inline via mock.module('bun:bundle', ...) which creates an ad-hoc mock; remove
that inline mock and instead wire the test to use the shared bun:bundle mock
from the tests/mocks/ directory so feature flags remain consistent (ensure the
test imports or invokes the shared mock setup before running and configures the
UDS_INBOX flag via that shared mock); update the test to rely on the shared mock
entry (rather than mock.module) and remove any local feature-flag logic so other
suites use the same bun:bundle behavior.

In `@src/services/AgentSummary/__tests__/summaryContext.test.ts`:
- Around line 103-132: Add a test asserting the empty-input contract for
getSummaryContextFingerprint: call getSummaryContextFingerprint with an
empty/zero-length bounded context (e.g., [] or the minimal empty summaryContext
shape used by agentSummary.ts) and assert it returns null; this ensures
agentSummary.ts's guard (transcriptFingerprint && …) keeps working and prevents
regressions where an empty fingerprint becomes '' or another non-null value.
Locate the tests in summaryContext.test.ts and add the assertion alongside the
existing cases referencing getSummaryContextFingerprint so it runs with the
other fingerprint tests.

In `@src/services/AgentSummary/summaryContext.ts`:
- Around line 55-62: The code truncates large primitive strings but subtracts
the full text length from limit.remaining, which hides tail changes; in the
block handling primitives (check for value === null || typeof value !==
'object'), compute the actual hashed length first (e.g., sliceLen =
Math.min(text.length, limit.remaining)), call hash.update(text.slice(0,
sliceLen)), then decrement limit.remaining by sliceLen (not text.length), and
bail early if sliceLen is 0; refer to the variables/identifiers limit.remaining,
value, text, and hash.update to locate and fix the logic.

In `@src/utils/teammateMailbox.ts`:
- Around line 152-165: The writeMailboxAtomic function can leak the temp file if
writeFile/rename fails; update writeMailboxAtomic to wrap the rename (and the
write+rename sequence) in a try/catch and, on any failure after creating
tempPath, attempt to unlink(tempPath) (using the existing fs promises API or
importing unlink) before rethrowing the original error; reference the tempPath
variable and the rename/writeFile calls and ensure the unlink call itself is
swallowed or logged so cleanup failures don't mask the original error.

In `@src/utils/udsClient.ts`:
- Around line 46-50: The duplicated NDJSON response-reading logic (including
getChunkBytes) across getChunkBytes, sendToUdsSocket, and sendUdsMessage should
be extracted into a shared helper to avoid drift; create a new function (e.g.,
readNdjsonResponses or parseNdjsonStream) that encapsulates the
buffer/settled/finish loop, per-frame byte cap, NDJSON splitting, and the
response/error/pong settling rules, replace the duplicated implementations in
sendToUdsSocket and sendUdsMessage to call this helper, and keep the existing
getChunkBytes either moved into the helper or exported for reuse so both modules
use the same utility.

In `@src/utils/udsMessaging.ts`:
- Around line 311-318: Replace the direct equality check in isAuthorizedMessage
by performing a constant-time comparison using crypto.timingSafeEqual: import
timingSafeEqual (or crypto) into the module, retrieve the token via
getMessageAuthToken(message), and if both token and authToken are defined and
have the same byte length, compare Buffer.from(token) and Buffer.from(authToken)
with timingSafeEqual; return false for missing tokens or length mismatch. Ensure
you reference getMessageAuthToken and isAuthorizedMessage when making the
change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bb598331-6192-4ca8-b170-4a2437972ed5

📥 Commits

Reviewing files that changed from the base of the PR and between 3cb4828 and f353eb0.

📒 Files selected for processing (17)
  • packages/builtin-tools/src/tools/ListPeersTool/ListPeersTool.ts
  • packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts
  • packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts
  • src/cli/print.ts
  • src/commands/peers/peers.ts
  • src/services/AgentSummary/__tests__/summaryContext.test.ts
  • src/services/AgentSummary/agentSummary.ts
  • src/services/AgentSummary/summaryContext.ts
  • src/utils/__tests__/ndjsonFramer.test.ts
  • src/utils/__tests__/teammateMailbox.test.ts
  • src/utils/__tests__/udsMessaging.test.ts
  • src/utils/messages/systemInit.ts
  • src/utils/ndjsonFramer.ts
  • src/utils/swarm/inProcessRunner.ts
  • src/utils/teammateMailbox.ts
  • src/utils/udsClient.ts
  • src/utils/udsMessaging.ts

Comment thread src/cli/print.ts Outdated
Comment thread src/utils/ndjsonFramer.ts Outdated
Comment thread src/utils/teammateMailbox.ts
Comment thread src/utils/udsMessaging.ts
CodeRabbit and Codecov surfaced real gaps in UDS framing, peer discovery, mailbox retention, and summary context coverage. This tightens those paths without suppressing review or coverage signals.

Constraint: PR #369 must address CodeRabbit and Codecov findings without warning suppression or fake fallbacks

Rejected: Suppress Codecov or CodeRabbit warnings | leaves real receive-path and test-isolation gaps

Rejected: Add unreachable feature-gated tests | bun:bundle keeps those branches compile-time gated in local tests

Confidence: high

Scope-risk: moderate

Directive: Keep UDS auth-token rejection outside feature flags; do not reintroduce inline token fallbacks

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage; bun run test:all; bun run lint; bun run build; bun run build:vite; bun audit; git diff --cached --check

Not-tested: Remote Codecov/CodeRabbit refreshed reports until pushed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts (1)

583-610: ⚠️ Potential issue | 🟡 Minor

Marker-based token-rejection state is fragile across copies.

markAndRedactInlineUdsToken mutates input.to and stamps a non-schema marker key on the same object. The downstream guards in validateInput (Lines 675-685) and call (Lines 809-823, 860-868) depend on either hasInlineUdsToken(input.to) (already redacted by this point) or wasInlineUdsTokenRejected(input) (the marker). If any framework path passes a fresh/cloned input to validateInput/call (e.g., after schema parsing or sanitization that drops unknown keys), the marker is lost and the redacted input.to no longer contains #token=, so the request would slip past the rejection. The current tests cover the same-object path; consider a more robust signal (e.g., storing the rejection on a WeakSet keyed by the input object, or relying purely on a pre-redaction check before backfill mutates to).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts` around
lines 583 - 610, markAndRedactInlineUdsToken currently mutates input.to and
writes a non-schema marker onto the same object, which can be lost if callers
clone/sanitize input before validateInput or call; update the logic so rejection
survives copies by (a) adding a WeakSet (e.g., inlineUdsTokenRejectedSet) keyed
by the original input object and mark the object in markAndRedactInlineUdsToken,
and then change wasInlineUdsTokenRejected to consult that WeakSet, or (b)
alternatively perform the pre-redaction rejection check before mutating input.to
and rely solely on hasInlineUdsToken for decision; locate usages in
markAndRedactInlineUdsToken, hasInlineUdsToken, wasInlineUdsTokenRejected,
validateInput and call and ensure the chosen approach consistently
detects/rejects token-bearing requests even when input objects are cloned or
unknown keys are stripped.
🧹 Nitpick comments (9)
src/utils/__tests__/teammateMailbox.test.ts (2)

58-71: Hooks are file-scoped and unnecessarily applied to getLastPeerDmSummary.

beforeEach/afterEach are declared at the file level, so the temp directory creation and CLAUDE_CONFIG_DIR mutation also run for every getLastPeerDmSummary test even though those tests don't touch the filesystem. Consider scoping these hooks inside the two describe blocks that need them, to keep tests independent and slightly faster.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/__tests__/teammateMailbox.test.ts` around lines 58 - 71, The
file-level beforeEach/afterEach that create tempHome and mutate
process.env.CLAUDE_CONFIG_DIR are running for all tests (including ones for
getLastPeerDmSummary) and should be moved into the two describe blocks that
actually need filesystem setup; locate the beforeEach/afterEach functions that
reference previousConfigDir, tempHome, mkdtempSync,
process.env.CLAUDE_CONFIG_DIR and rm and cut them out of the file scope and
paste them inside the specific describe(...) blocks that exercise filesystem
behavior, leaving tests for getLastPeerDmSummary unaffected.

1-22: Consider using the src/* path alias for imports.

Per coding guidelines, the src/* path alias should be preferred over deep relative imports. This is consistent with the rest of the codebase and easier to refactor.

♻️ Proposed change
-import type { Message } from '../../types/message.js'
+import type { Message } from 'src/types/message'
 import {
   compactMailboxMessages,
   getLastPeerDmSummary,
   ...
-} from '../teammateMailbox.js'
+} from 'src/utils/teammateMailbox'

As per coding guidelines: "Path alias src/* is available via tsconfig and maps to ./src/*. Use this for imports throughout the codebase".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/__tests__/teammateMailbox.test.ts` around lines 1 - 22, Replace
deep relative imports in this test with the src/* path alias: update imports for
types (Message), all functions and types from ../teammateMailbox.js
(compactMailboxMessages, getLastPeerDmSummary, getInboxPath,
markMessageAsReadByIndex, markMessageAsReadByIdentity, markMessagesAsRead,
markMessagesAsReadByPredicate, MAX_MAILBOX_MESSAGE_TEXT_BYTES,
MAX_MAILBOX_MESSAGES, MAX_READ_MAILBOX_MESSAGES,
MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES, readMailbox, TeammateMessage,
writeToMailbox) and any node:fs/node:path/node:os imports to use the project
alias where appropriate (e.g., import from "src/utils/teammateMailbox" or
"src/types/message" instead of relative paths) so the file follows the src/*
tsconfig path-alias convention.
src/services/AgentSummary/__tests__/agentSummary.test.ts (3)

83-85: Simplify the scheduled type — TimerHandler & (...) => void is meaningless.

TimerHandler is string | Function, so intersecting it with a function type doesn’t add useful constraints and makes Parameters<...> resolve in a confusing way. The captured callback is always invoked with no args (Line 119, 126), so a simple signature is clearer and equivalent.

♻️ Proposed simplification
-  let scheduled:
-    | ((...args: Parameters<TimerHandler & ((...args: unknown[]) => void)>) => void)
-    | undefined
+  let scheduled: ((...args: unknown[]) => void | Promise<void>) | undefined
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/AgentSummary/__tests__/agentSummary.test.ts` around lines 83 -
85, The variable `scheduled` currently uses an overcomplicated type involving
`Parameters<TimerHandler & ((...args: unknown[]) => void)>`; replace its
declaration with a simple function signature `(() => void) | undefined` (i.e.,
make `scheduled` type `() => void` or `undefined`) and update any places that
assign or call it (the setTimeout/mock callback assignments and invocations
around `scheduled` at lines where `scheduled` is assigned and invoked) to use
zero-argument calls (e.g., `scheduled && scheduled()` or optional chaining
`scheduled?.()`). This removes the meaningless intersection with `TimerHandler`
and clarifies that the captured callback takes no arguments.

43-48: Avoid mocking the pure filterIncompleteToolCalls helper.

This factory replaces filterIncompleteToolCalls with an identity function. Per the test guidelines, only modules with side effects (I/O, state, network) should be mocked — not pure helpers. The existing transcriptMessages are already “clean” (no orphan tool calls), so the real implementation would behave identically and would also catch regressions if the helper’s contract changes.

Consider dropping this mock and letting the real implementation run.

As per coding guidelines: "Mock only modules with side effects (dependencies with I/O, state, or network access). Do not mock pure functions or pure data modules."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/AgentSummary/__tests__/agentSummary.test.ts` around lines 43 -
48, The test currently replaces the pure helper filterIncompleteToolCalls with
an identity function via
mock.module('@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js'),
which hides regressions; remove that mock so the real filterIncompleteToolCalls
implementation runs for agentSummary tests (delete or comment out the
mock.module block that returns filterIncompleteToolCalls: <T>(messages: T) =>
messages) and re-run the tests—this keeps the pure helper unmocked while still
allowing you to mock only side‑effecting modules.

29-31: Replace require() with ESM import for the sessionStorage snapshot and restoration.

The test file uses proper ESM imports throughout (lines 1–13), but lines 29–31 and 74–78 use require('../../../utils/sessionStorage.ts'), which bypasses Bun's ESM resolution and relies on CommonJS interop. Additionally, there's a path inconsistency: require() uses .ts while mock.module() expects .js.

Replace with:

import * as sessionStorageModule from '../../../utils/sessionStorage.js'

const sessionStorageSnapshot = {
  ...sessionStorageModule,
}

Then update the afterAll factory (lines 74–78) to:

afterAll(() => {
  mock.module('src/utils/sessionStorage.js', () => sessionStorageModule)
})

This aligns with the coding guideline: "All imports, builds, and execution must use Bun APIs, not Node.js APIs. Module system must be ESM."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/AgentSummary/__tests__/agentSummary.test.ts` around lines 29 -
31, Replace the CommonJS require call that builds sessionStorageSnapshot with an
ESM import: import the sessionStorage module as sessionStorageModule from
'../../../utils/sessionStorage.js' and spread that into sessionStorageSnapshot
(replace the require('../../../utils/sessionStorage.ts') usage). Then update the
afterAll factory that registers the mock to call mock.module with the '.js' path
('src/utils/sessionStorage.js') and return the sessionStorageModule so the mock
restores the ESM module correctly; update references to the
sessionStorageSnapshot and afterAll factory accordingly.
src/services/AgentSummary/summaryContext.ts (1)

1-1: Use the node: prefix for built-in modules to align with Bun best practices.

Bun's documentation recommends the node: prefix for built-in modules to ensure reliable resolution in strict ESM contexts and avoid potential conflicts with userland packages. While the codebase is currently inconsistent (mostly using 'crypto', but increasingly preferring 'node:crypto' in packages), adopting the prefix improves clarity and future compatibility.

♻️ Proposed change
-import { createHash } from 'crypto'
+import { createHash } from 'node:crypto'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/AgentSummary/summaryContext.ts` at line 1, Replace the built-in
module import to use the node: prefix so Bun resolves it reliably: update the
import that currently pulls createHash (symbol createHash) from 'crypto' to
import createHash from 'node:crypto' across this module (summaryContext.ts) so
the built-in crypto module uses the node: namespace consistently with other
packages.
packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts (2)

858-890: Duplicate inline-token rejection inside the UDS branch is unreachable.

The earlier guard at Lines 809-823 already returns before this branch is entered for any plain-text + uds: + inline-token combination, and validateInput rejects the same case upstream. The check on Lines 860-868 is dead defensive code; consider removing it or, if you prefer to keep the defense-in-depth, drop the earlier 809-823 block and consolidate here so reviewers can see one authoritative gate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts` around
lines 858 - 890, The duplicate inline-token rejection inside the uds branch is
unreachable because validateInput and the earlier guard already return for
plain-text uds addresses with inline tokens; remove the inner defensive check
that calls hasInlineUdsToken and wasInlineUdsTokenRejected (the if-block that
returns the "uds addresses must not include inline auth tokens" error) from the
SendMessageTool uds handling, leaving the recipientForDisplay,
require('src/utils/udsClient.js') import, sendToUdsSocket call, and try/catch
intact; alternatively, if you prefer one authoritative gate, delete the earlier
guard in validateInput and keep this check here — but do not keep both.

593-604: Redundant recipientForDisplay after markAndRedactInlineUdsToken.

By Line 593/604, input.to has already been redacted at Line 583-585 (markAndRedactInlineUdsToken mutates input.to), so recipientForDisplay(input.to) is effectively input.to. Either drop the redundant call or skip the in-place mutation in markAndRedactInlineUdsToken — currently both layers run and the intent (which one is authoritative) is muddled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts` around
lines 593 - 604, The code calls recipientForDisplay(input.to) after
markAndRedactInlineUdsToken has already mutated input.to, making the
recipientForDisplay call redundant; either remove the redundant
recipientForDisplay call in SendMessageTool (the lines setting input.recipient =
recipientForDisplay(input.to)) or change markAndRedactInlineUdsToken to return a
redacted value instead of mutating input.to and then use that return value when
setting input.recipient — pick one approach and update usages (references:
recipientForDisplay, markAndRedactInlineUdsToken, input.to, input.recipient) so
only one authoritative redaction/display conversion is performed.
src/utils/udsResponseReader.ts (1)

45-54: Frame-size check uses O(n) buffer measurement per chunk.

Buffer.byteLength(buffer, 'utf8') rescans the entire pending buffer on every 'data' event. With many small chunks this is O(n²) over the response stream. Since maxFrameBytes is bounded at 64 KiB this is not a real problem, but a tracked counter (bufferBytes += getChunkBytes(chunk), decremented when lines are extracted) would be cheaper and easier to reason about.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/udsResponseReader.ts` around lines 45 - 54, Replace the repeated
full-buffer size calls with a tracked byte counter: introduce a variable (e.g.,
bufferBytes) in the scope of the socket data handler, increment it by
getChunkBytes(chunk) when a chunk arrives, use that counter to compare against
options.maxFrameBytes and call finish(new Error('UDS response frame exceeded
size limit')) if exceeded, and when you remove/extract lines from buffer
decrement bufferBytes by the number of bytes removed; update logic around
socket.on('data', chunk => { ... }) and any line-extraction code so the counter
stays consistent (referencing bufferBytes, getChunkBytes, options.maxFrameBytes
and finish).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/utils/udsMessaging.ts`:
- Around line 412-475: The server currently replies with an 'error' frame but
leaves the client connected in attachNdjsonFramer's handler; update the handlers
for isAuthorizedMessage failure, enqueueInboxEntry rejection, and the
onFrameError callback to call socket.destroy() (after sending the error response
and only if !socket.destroyed) so misbehaving clients can't hold UDS client
slots; reference the attachNdjsonFramer callback, isAuthorizedMessage,
enqueueInboxEntry, onFrameError, and writeSocketMessage when making this change.
- Around line 393-396: The capability file is written before the server begins
listening, creating a race where the token is visible while the socket isn't
ready; change the sequence so the server starts listening first and only then
write the capability file. Specifically, in the code paths using
ensureAuthToken(), the Promise that wraps socket.listen() (the server.listen()
flow) should be awaited and verified successful before calling
writeCapabilityFile(path, token) and assigning socketPath; also add symmetric
cleanup: if writeCapabilityFile fails after listen, close the server and remove
any socket file, and if listen fails do not write the capability file. Update
the logic around ensureAuthToken, writeCapabilityFile and the listen Promise to
reflect this order and error-handling.
- Around line 99-112: getDefaultUdsSocketPath() caches defaultSocketPath with a
per-process nonce but stopUdsMessaging and the failure catch in
startUdsMessaging don't clear that cache, causing restarts to reuse a stale
path; update stopUdsMessaging() to set defaultSocketPath = null and also clear
defaultSocketPath in the catch block inside startUdsMessaging() (the start
failure path) so a subsequent call to getDefaultUdsSocketPath() generates a
fresh nonce and new socket path.
- Around line 487-507: The current srv.on('error', reject) listener can swallow
runtime server errors after the Promise resolves; change this to a guarded
handler that calls reject only if the Promise hasn't yet settled and otherwise
logs runtime errors. Concretely, attach an error handler (referencing srv) that
checks a local "settled" flag (set when resolve() runs) or removes itself after
successful listen, so that when server = srv and
process.env.CLAUDE_CODE_MESSAGING_SOCKET is set you detach the reject handler
and replace it with a runtime logger that calls logForDebugging (including the
path and opts?.isExplicit context) and closes srv as appropriate; ensure
chmod(path, 0o600) errors still reject the Promise and that any listeners are
cleaned up to avoid swallowing later EMFILE/EPIPE errors.

In `@src/utils/udsResponseReader.ts`:
- Around line 27-81: attachUdsResponseReader currently only listens for 'data'
and 'error' so peer half-close/close can leave finish() (and options.onSettled)
uncalled; add socket 'end' and 'close' handlers that invoke finish(...) to fail
fast when the socket is closed without a complete response (e.g., finish(new
Error('UDS socket closed before response'))), and ensure the handlers also clear
any partial buffer/cleanup so options.onSettled is always called; reference
attachUdsResponseReader and finish to locate where to add these listeners.

---

Outside diff comments:
In `@packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts`:
- Around line 583-610: markAndRedactInlineUdsToken currently mutates input.to
and writes a non-schema marker onto the same object, which can be lost if
callers clone/sanitize input before validateInput or call; update the logic so
rejection survives copies by (a) adding a WeakSet (e.g.,
inlineUdsTokenRejectedSet) keyed by the original input object and mark the
object in markAndRedactInlineUdsToken, and then change wasInlineUdsTokenRejected
to consult that WeakSet, or (b) alternatively perform the pre-redaction
rejection check before mutating input.to and rely solely on hasInlineUdsToken
for decision; locate usages in markAndRedactInlineUdsToken, hasInlineUdsToken,
wasInlineUdsTokenRejected, validateInput and call and ensure the chosen approach
consistently detects/rejects token-bearing requests even when input objects are
cloned or unknown keys are stripped.

---

Nitpick comments:
In `@packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts`:
- Around line 858-890: The duplicate inline-token rejection inside the uds
branch is unreachable because validateInput and the earlier guard already return
for plain-text uds addresses with inline tokens; remove the inner defensive
check that calls hasInlineUdsToken and wasInlineUdsTokenRejected (the if-block
that returns the "uds addresses must not include inline auth tokens" error) from
the SendMessageTool uds handling, leaving the recipientForDisplay,
require('src/utils/udsClient.js') import, sendToUdsSocket call, and try/catch
intact; alternatively, if you prefer one authoritative gate, delete the earlier
guard in validateInput and keep this check here — but do not keep both.
- Around line 593-604: The code calls recipientForDisplay(input.to) after
markAndRedactInlineUdsToken has already mutated input.to, making the
recipientForDisplay call redundant; either remove the redundant
recipientForDisplay call in SendMessageTool (the lines setting input.recipient =
recipientForDisplay(input.to)) or change markAndRedactInlineUdsToken to return a
redacted value instead of mutating input.to and then use that return value when
setting input.recipient — pick one approach and update usages (references:
recipientForDisplay, markAndRedactInlineUdsToken, input.to, input.recipient) so
only one authoritative redaction/display conversion is performed.

In `@src/services/AgentSummary/__tests__/agentSummary.test.ts`:
- Around line 83-85: The variable `scheduled` currently uses an overcomplicated
type involving `Parameters<TimerHandler & ((...args: unknown[]) => void)>`;
replace its declaration with a simple function signature `(() => void) |
undefined` (i.e., make `scheduled` type `() => void` or `undefined`) and update
any places that assign or call it (the setTimeout/mock callback assignments and
invocations around `scheduled` at lines where `scheduled` is assigned and
invoked) to use zero-argument calls (e.g., `scheduled && scheduled()` or
optional chaining `scheduled?.()`). This removes the meaningless intersection
with `TimerHandler` and clarifies that the captured callback takes no arguments.
- Around line 43-48: The test currently replaces the pure helper
filterIncompleteToolCalls with an identity function via
mock.module('@claude-code-best/builtin-tools/tools/AgentTool/runAgent.js'),
which hides regressions; remove that mock so the real filterIncompleteToolCalls
implementation runs for agentSummary tests (delete or comment out the
mock.module block that returns filterIncompleteToolCalls: <T>(messages: T) =>
messages) and re-run the tests—this keeps the pure helper unmocked while still
allowing you to mock only side‑effecting modules.
- Around line 29-31: Replace the CommonJS require call that builds
sessionStorageSnapshot with an ESM import: import the sessionStorage module as
sessionStorageModule from '../../../utils/sessionStorage.js' and spread that
into sessionStorageSnapshot (replace the
require('../../../utils/sessionStorage.ts') usage). Then update the afterAll
factory that registers the mock to call mock.module with the '.js' path
('src/utils/sessionStorage.js') and return the sessionStorageModule so the mock
restores the ESM module correctly; update references to the
sessionStorageSnapshot and afterAll factory accordingly.

In `@src/services/AgentSummary/summaryContext.ts`:
- Line 1: Replace the built-in module import to use the node: prefix so Bun
resolves it reliably: update the import that currently pulls createHash (symbol
createHash) from 'crypto' to import createHash from 'node:crypto' across this
module (summaryContext.ts) so the built-in crypto module uses the node:
namespace consistently with other packages.

In `@src/utils/__tests__/teammateMailbox.test.ts`:
- Around line 58-71: The file-level beforeEach/afterEach that create tempHome
and mutate process.env.CLAUDE_CONFIG_DIR are running for all tests (including
ones for getLastPeerDmSummary) and should be moved into the two describe blocks
that actually need filesystem setup; locate the beforeEach/afterEach functions
that reference previousConfigDir, tempHome, mkdtempSync,
process.env.CLAUDE_CONFIG_DIR and rm and cut them out of the file scope and
paste them inside the specific describe(...) blocks that exercise filesystem
behavior, leaving tests for getLastPeerDmSummary unaffected.
- Around line 1-22: Replace deep relative imports in this test with the src/*
path alias: update imports for types (Message), all functions and types from
../teammateMailbox.js (compactMailboxMessages, getLastPeerDmSummary,
getInboxPath, markMessageAsReadByIndex, markMessageAsReadByIdentity,
markMessagesAsRead, markMessagesAsReadByPredicate,
MAX_MAILBOX_MESSAGE_TEXT_BYTES, MAX_MAILBOX_MESSAGES, MAX_READ_MAILBOX_MESSAGES,
MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES, readMailbox, TeammateMessage,
writeToMailbox) and any node:fs/node:path/node:os imports to use the project
alias where appropriate (e.g., import from "src/utils/teammateMailbox" or
"src/types/message" instead of relative paths) so the file follows the src/*
tsconfig path-alias convention.

In `@src/utils/udsResponseReader.ts`:
- Around line 45-54: Replace the repeated full-buffer size calls with a tracked
byte counter: introduce a variable (e.g., bufferBytes) in the scope of the
socket data handler, increment it by getChunkBytes(chunk) when a chunk arrives,
use that counter to compare against options.maxFrameBytes and call finish(new
Error('UDS response frame exceeded size limit')) if exceeded, and when you
remove/extract lines from buffer decrement bufferBytes by the number of bytes
removed; update logic around socket.on('data', chunk => { ... }) and any
line-extraction code so the counter stays consistent (referencing bufferBytes,
getChunkBytes, options.maxFrameBytes and finish).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e85d8030-5133-4048-972f-e05ad93c0f9d

📥 Commits

Reviewing files that changed from the base of the PR and between f353eb0 and ee0d788.

📒 Files selected for processing (15)
  • packages/builtin-tools/src/tools/ListPeersTool/ListPeersTool.ts
  • packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts
  • packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts
  • src/bridge/peerSessions.ts
  • src/cli/print.ts
  • src/services/AgentSummary/__tests__/agentSummary.test.ts
  • src/services/AgentSummary/__tests__/summaryContext.test.ts
  • src/services/AgentSummary/summaryContext.ts
  • src/utils/__tests__/teammateMailbox.test.ts
  • src/utils/__tests__/udsMessaging.test.ts
  • src/utils/ndjsonFramer.ts
  • src/utils/teammateMailbox.ts
  • src/utils/udsClient.ts
  • src/utils/udsMessaging.ts
  • src/utils/udsResponseReader.ts
✅ Files skipped from review due to trivial changes (1)
  • src/utils/udsClient.ts
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/utils/tests/udsMessaging.test.ts
  • src/cli/print.ts
  • src/utils/ndjsonFramer.ts
  • src/utils/teammateMailbox.ts

Comment thread src/utils/udsMessaging.ts
Comment thread src/utils/udsMessaging.ts Outdated
Comment thread src/utils/udsMessaging.ts
Comment thread src/utils/udsMessaging.ts Outdated
Comment thread src/utils/udsResponseReader.ts
amDosion pushed a commit that referenced this pull request Apr 27, 2026
Tighten the UDS auth, framing, and response-reader boundaries while keeping the AgentSummary lifecycle covered so Codecov and CI fail on real regressions instead of missing coverage. The poorMode settings mock mirrors unrelated real settings defaults to avoid Bun mock retention changing later permission tests.

Constraint: PR #369 must fix Codecov/CI precisely without warning suppression, fallback masking, or mock pollution

Rejected: Delete AgentSummary lifecycle coverage | would hide Codecov loss and stale-summary behavior

Rejected: Store inline UDS rejection in a hidden input sentinel | cloned observable inputs can drop it and bypass rejection

Rejected: Ignore malformed UDS frames until timeout | leaves client slots and SendMessage calls open to exhaustion

Confidence: high

Scope-risk: moderate

Directive: Keep empty #token= markers rejected; do not require a non-empty token value in hasInlineUdsToken

Tested: bun test packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts src/utils/__tests__/udsMessaging.test.ts src/utils/__tests__/udsResponseReader.test.ts src/utils/__tests__/ndjsonFramer.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run test:all

Tested: bun audit

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted Codecov upload until pushed PR checks rerun
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (9)
packages/builtin-tools/src/tools/AgentTool/filterIncompleteToolCalls.ts (1)

17-17: Remove redundant type assertions; type narrowing already applies.

Since Message is a discriminated union on type (from @ant/model-provider), the checks message?.type === 'user' and message?.type === 'assistant' automatically narrow the type to UserMessage and AssistantMessage respectively. The as UserMessage and as AssistantMessage assertions are redundant and can be removed, simplifying the code to align with the guideline: "Use type guards (type narrowing) with union types instead of forced type casting".

♻️ Proposed simplification
   for (const message of messages) {
     if (message?.type === 'user') {
-      const userMessage = message as UserMessage
-      const content = userMessage.message.content
+      const content = message.message.content
       if (Array.isArray(content)) {
   for (const message of messages) {
     if (message?.type === 'assistant') {
-      const assistantMessage = message as AssistantMessage
-      const content = assistantMessage.message.content
+      const content = message.message.content
       if (Array.isArray(content)) {
         let changed = false
@@
         if (filteredContent.length > 0) {
           withoutOrphanToolUses.push({
-            ...assistantMessage,
+            ...message,
             message: {
-              ...assistantMessage.message,
+              ...message.message,
               content: filteredContent,
             },
           })
     if (message?.type !== 'user') {
       filteredMessages.push(message)
       continue
     }
-    const userMessage = message as UserMessage
-    const content = userMessage.message.content
+    const content = message.message.content
     if (!Array.isArray(content)) {
@@
     if (filteredContent.length > 0) {
       filteredMessages.push({
-        ...userMessage,
+        ...message,
         message: {
-          ...userMessage.message,
+          ...message.message,
           content: filteredContent,
         },
       })
     }

Applies to lines 17, 34, and 77.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/builtin-tools/src/tools/AgentTool/filterIncompleteToolCalls.ts` at
line 17, Remove the unnecessary type assertions in filterIncompleteToolCalls.ts:
where the code uses "const userMessage = message as UserMessage" and similar "as
AssistantMessage" casts (occurring around the variables assigned at the checks
for message?.type === 'user' and message?.type === 'assistant'), rely on the
discriminated-union type narrowing instead and change those assignments to use
the already-narrowed type (e.g., const userMessage = message) so you eliminate
the redundant casts in the function(s) handling message type checks.
src/commands/poor/__tests__/poorMode.test.ts (2)

16-45: Extract the expanded settings mock into tests/mocks/settings.ts instead of inlining a full-surface mock.

This change grows the inline mock from a few functions to ~16 exports — exactly the situation the shared-mocks guideline is meant to prevent. The author's own comment on lines 34–35 hints at the leakage risk that motivates the rule. Putting the surface in tests/mocks/settings.ts keeps it in one place, lets every settings consumer test reuse it, and avoids drift each time settings.ts adds an export.

As per coding guidelines: "Use shared mocks from tests/mocks/ for log.ts, debug.ts, bun:bundle, settings.js, config.ts, and auth.ts" and "Never inline mock definitions; update shared mock files instead".

♻️ Suggested shape

tests/mocks/settings.ts (new or extended) exports a factory:

// tests/mocks/settings.ts
export function makeSettingsMock(initial: Record<string, unknown> = {}) {
  let state = initial
  let lastUpdate: { source: string; patch: Record<string, unknown> } | null = null
  return {
    state: () => state,
    setState: (s: Record<string, unknown>) => { state = s },
    lastUpdate: () => lastUpdate,
    resetLastUpdate: () => { lastUpdate = null },
    exports: {
      loadManagedFileSettings: () => ({ settings: null, errors: [] }),
      // ...full surface...
      getInitialSettings: () => state,
      updateSettingsForSource: (source: string, patch: Record<string, unknown>) => {
        lastUpdate = { source, patch }
        state = { ...state, ...patch }
      },
    },
  }
}

Then in this test file:

-mock.module('src/utils/settings/settings.js', () => ({
-  loadManagedFileSettings: () => ({ settings: null, errors: [] }),
-  ...
-}))
+import { makeSettingsMock } from '../../../../tests/mocks/settings'
+const settingsMock = makeSettingsMock()
+mock.module('src/utils/settings/settings.js', () => settingsMock.exports)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/poor/__tests__/poorMode.test.ts` around lines 16 - 45, Extract
the inline full-surface settings mock into a shared factory in
tests/mocks/settings.ts (e.g. export function makeSettingsMock(initial = {}) {
... }) that maintains internal state and lastUpdate and returns an object with
.exports containing the same exported functions used here
(loadManagedFileSettings, getManagedFileSettingsPresence, parseSettingsFile,
getInitialSettings, getSettingsForSource, getSettingsWithErrors,
getSettingsWithSources, getSettings_DEPRECATED, settingsMergeCustomizer,
getManagedSettingsKeysForLogging, hasAutoModeOptIn,
hasSkipDangerousModePermissionPrompt, getAutoModeConfig,
getUseAutoModeDuringPlan, rawSettingsContainsKey, updateSettingsForSource,
getSettingsRootPathForSource, getSettingsFilePathForSource,
getRelativeSettingsFilePathForSource, getPolicySettingsOrigin). Then update this
test (poorMode.test.ts) to import makeSettingsMock(), call it to get the mock
instance, and pass mockInstance.exports into mock.module instead of the large
inline object; ensure the factory exposes helpers (state, setState, lastUpdate,
resetLastUpdate) if the tests rely on reading/updating mockSettings.

47-50: Re-mocking with the real namespace after mock.restore() is the right guard against leakage into sibling suites. However, import * as settingsModule returns a frozen module namespace; passing it directly as the factory result works in Bun today but will throw if any consumer attempts to mutate exports. When moving this to the recommended shared tests/mocks/settings.ts factory, use a plain object spread (() => ({ ...settingsModule })) to avoid that footgun. Additionally, per the coding guidelines, all tests mocking settings should consolidate to a single shared mock in tests/mocks/settings.ts rather than maintaining divergent inline mocks across providers.test.ts, effort.test.ts, and others—this eliminates the leakage risk entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/poor/__tests__/poorMode.test.ts` around lines 47 - 50, The test
re-mocks the real settings namespace after mock.restore() using the frozen
imported namespace (settingsModule), which can throw if consumers mutate
exports; change the factory passed to
mock.module('src/utils/settings/settings.js', ...) to return a plain mutable
object copy instead of the frozen namespace (e.g., use a factory that spreads
settingsModule into a new object), and move/consolidate all inline settings
mocks from this and other tests (providers.test.ts, effort.test.ts, etc.) into
the shared tests/mocks/settings.ts mock factory so every test uses the same
mutable mock implementation and avoids leakage.
src/utils/__tests__/teammateMailbox.test.ts (1)

126-143: Tighten the req-1 substring assertion to avoid masking off-by-one regressions.

'req-1' is also a substring of 'req-10', 'req-100', 'req-1000', etc., so this assertion would still pass if compaction kept [req-10..req-2000] instead of the intended [req-1..req-2000]. Match against the JSON-quoted form (or check both bookends) so the cap boundary is genuinely exercised.

Proposed refinement
     expect(compacted).toHaveLength(MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES)
-    expect(compacted[0]?.text).toContain('req-1')
+    expect(compacted[0]?.text).toContain('"req-1"')
+    expect(compacted.at(-1)?.text).toContain(
+      `"req-${MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES}"`,
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/__tests__/teammateMailbox.test.ts` around lines 126 - 143, The test
for compactMailboxMessages is using a loose substring 'req-1' which can match
'req-10' etc.; update the assertion to assert the exact JSON-quoted request_id
(e.g. expect(compacted[0]?.text).toContain('"request_id":"req-1"') or otherwise
check both the first and last items to verify the true cap boundary) so the cap
at MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES is exercised; locate the test that
constructs messages via message(...) and the compactMailboxMessages(...) call
and tighten the assertion accordingly.
src/utils/udsMessaging.ts (3)

81-86: Exported limits should be documented as part of the public surface.

MAX_UDS_INBOX_ENTRIES, MAX_UDS_FRAME_BYTES, MAX_UDS_INBOX_BYTES, MAX_UDS_CLIENTS, UDS_AUTH_TIMEOUT_MS, UDS_IDLE_TIMEOUT_MS are now exported and consumed by the test suite (and by udsClient/udsResponseReader indirectly via MAX_UDS_FRAME_BYTES). Adding a one-line JSDoc to each — particularly the rationale for the byte/entry caps — would make future tuning safer; right now the only documentation is the bare numeric literal.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/udsMessaging.ts` around lines 81 - 86, Add one-line JSDoc comments
above each exported constant (MAX_UDS_INBOX_ENTRIES, MAX_UDS_FRAME_BYTES,
MAX_UDS_INBOX_BYTES, MAX_UDS_CLIENTS, UDS_AUTH_TIMEOUT_MS, UDS_IDLE_TIMEOUT_MS)
describing what the limit controls and the rationale (e.g., memory/throughput
protection, preventing oversized frames, expected client concurrency, or
auth/idle timeout behavior) so consumers and tests understand why the numeric
value was chosen; keep each JSDoc short (one sentence) and include units for
time/bytes where applicable to make future tuning safe and self-documenting.

211-228: Redundant pre-check duplicates assertPrivateDirectory logic.

Lines 215-219 already check !stat.isDirectory() || stat.isSymbolicLink() and throw, but the very next line calls assertPrivateDirectory (Line 175) which performs the same check with a slightly different message. The pre-check is dead code; just call assertPrivateDirectory directly.

♻️ Optional cleanup
   try {
     const stat = await lstat(dir)
-    if (!stat.isDirectory() || stat.isSymbolicLink()) {
-      throw new Error(
-        `[udsMessaging] socket parent is not a directory: ${dir}`,
-      )
-    }
     assertPrivateDirectory(stat, dir, 'socket parent')
     return
   } catch (error) {
     if (!isNotFound(error)) throw error
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/udsMessaging.ts` around lines 211 - 228, In
ensureSocketParent(path) remove the redundant pre-check that inspects
stat.isDirectory() and stat.isSymbolicLink() and throws, and instead call
assertPrivateDirectory(stat, dir, 'socket parent') directly after obtaining stat
from lstat; keep the existing try/catch around lstat and the subsequent
mkdir/chmod behavior unchanged so the same validation and error messages come
from assertPrivateDirectory rather than the duplicated manual check.

299-316: inbox full error message conflates three distinct rejection reasons.

enqueueInboxEntry returns false for (a) per-message frame oversize, (b) entry-count overflow, and (c) byte-budget overflow, but logs all three as inbox full. The framer should already reject (a) before this path, so it's mostly defensive — yet operators reading the log can't distinguish "we're being flooded" from "a giant single message slipped past the framer cap" (which would indicate a bug). Splitting the log message would make production triage cheaper without changing behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/udsMessaging.ts` around lines 299 - 316, The log in
enqueueInboxEntry currently lumps three different rejection causes into one
"inbox full" error; update enqueueInboxEntry to detect which condition failed
(entryBytes > MAX_UDS_FRAME_BYTES, inbox.length >= MAX_UDS_INBOX_ENTRIES, or
inboxBytes + entryBytes > MAX_UDS_INBOX_BYTES) and log a distinct, descriptive
message for each case (e.g., "message too large (exceeds MAX_UDS_FRAME_BYTES)",
"inbox entry count limit reached", or "inbox byte quota exceeded") while keeping
the same boolean return behavior; use the existing variables (entryBytes,
MAX_UDS_FRAME_BYTES, inbox.length, MAX_UDS_INBOX_ENTRIES, inboxBytes,
MAX_UDS_INBOX_BYTES) and the same logError call site so only the message string
changes.
src/utils/__tests__/udsMessaging.test.ts (1)

137-149: Sentinel error path swallows the diagnostic message.

If MAX_UDS_INBOX_BYTES enforcement ever regresses, the inner throw new Error('byte cap was not enforced') is caught by the same catch and tested against 'inbox full', so the failure surfaces as a confusing assertion failure on the wrong message rather than the intended diagnostic. Re-throwing the sentinel out of the try (or asserting outside the loop) would make the regression mode obvious.

♻️ Optional restructure
-    for (;;) {
-      try {
-        await sendUdsMessage(path, { type: 'text', data: payload })
-        accepted++
-        if (accepted > MAX_UDS_INBOX_BYTES / payload.length + 20) {
-          throw new Error('byte cap was not enforced')
-        }
-      } catch (error) {
-        expect(error).toBeInstanceOf(Error)
-        expect((error as Error).message).toContain('inbox full')
-        break
-      }
-    }
+    const limit = MAX_UDS_INBOX_BYTES / payload.length + 20
+    let lastError: Error | undefined
+    while (accepted <= limit) {
+      try {
+        await sendUdsMessage(path, { type: 'text', data: payload })
+        accepted++
+      } catch (error) {
+        lastError = error as Error
+        break
+      }
+    }
+    expect(lastError).toBeInstanceOf(Error)
+    expect(lastError?.message).toContain('inbox full')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/__tests__/udsMessaging.test.ts` around lines 137 - 149, The test
loop around sendUdsMessage and MAX_UDS_INBOX_BYTES currently throws a sentinel
Error('byte cap was not enforced') inside the try which is then caught and
asserted as an 'inbox full' error; change the flow so the sentinel cannot be
mistaken for the expected inbox-full error: either move the assertion that the
cap was exceeded outside the try/catch (count accepted and after the loop assert
accepted did not exceed the threshold) or throw a distinct Error subclass or
unique sentinel that you re-throw from the catch when it originates from your
own check; update the loop in udsMessaging.test.ts referencing sendUdsMessage
and MAX_UDS_INBOX_BYTES so only genuine inbox-full errors are asserted inside
the catch and the sentinel failure surfaces separately.
packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts (1)

671-681: Redundant scheme check.

hasInlineUdsToken already calls parseAddress and gates on addr.scheme === 'uds', so the outer addr.scheme === 'uds' && is dead-code-ish. Also worth noting the third parseAddress(input.to) invocation in this function (Line 658, this block, and Line 690) — three parses of the same string in a single validation call.

♻️ Optional simplification
-      if (
-        addr.scheme === 'uds' &&
-        hasInlineUdsToken(input.to)
-      ) {
+      if (hasInlineUdsToken(input.to)) {
         return {
           result: false,
           message:
             'uds addresses must not include inline auth tokens; use the ListPeers address',
           errorCode: 9,
         }
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts` around
lines 671 - 681, The redundant outer scheme check should be removed because
hasInlineUdsToken already calls parseAddress and checks addr.scheme; instead
call hasInlineUdsToken(input.to) directly and rely on its internal parseAddress
logic, and avoid multiple parseAddress calls by parsing input.to once (reuse the
resulting addr variable) or by using the existing addr from parseAddress where
present; update the validation in SendMessageTool (the block referencing addr,
hasInlineUdsToken, and parseAddress) to use the single parsed address or the
direct hasInlineUdsToken call to eliminate the third parseAddress invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts`:
- Around line 805-816: The inline UDS-token rejection is only executed inside
the typeof input.message === 'string' branch in call, so structured messages can
bypass it; move the parseAddress(input.to) and hasInlineUdsToken(input.to) check
out of that string-only branch and run it for all messages at the start of
SendMessageTool.call (mirroring validateInput), returning the same { data: {
success: false, message: ... } } for any uds address with an inline token;
reference the call method, input.message, parseAddress, hasInlineUdsToken, and
the ListPeers message text to locate and update the logic.

In `@src/commands/poor/__tests__/poorMode.test.ts`:
- Around line 36-40: Extract the inline mock for settings into a shared
tests/mocks/settings.ts file and implement the exported mock functions so they
derive their return values from a single mockSettings object (not hardcoded
constants); specifically provide functions matching hasAutoModeOptIn,
hasSkipDangerousModePermissionPrompt, getAutoModeConfig,
getUseAutoModeDuringPlan, and rawSettingsContainsKey that read from mockSettings
(e.g., hasUseAutoModeDuringPlan should return the actual boolean present in
mockSettings, hasSkipDangerousModePermissionPrompt should reflect the stored
setting, getAutoModeConfig should return the stored object or undefined, and
rawSettingsContainsKey should check key in mockSettings), then replace the
inline mock in src/commands/poor/__tests__/poorMode.test.ts with an import of
this shared mock.

In `@src/utils/__tests__/udsMessaging.test.ts`:
- Around line 31-37: The test socket parent directory can remain with
overly-broad permissions across runs because socketPath() always returns
tmpdir()/claude-code-test/..., so fix by creating a unique per-test parent
directory and using that in socketPath/startUdsMessaging (or register
claude-code-test for guaranteed cleanup in afterEach); specifically change
socketPath to accept or derive a per-test parent (use fs.mkdtemp to create a
unique parent dir) and update tests that call
startUdsMessaging/ensureSocketParent to pass the per-test parent (or add an
afterEach that recursively removes the claude-code-test parent), ensuring tests
remove the parent directory after each run to avoid stale-permission failures.
- Line 99: The assertion referencing process.env.CLAUDE_CODE_MESSAGING_TOKEN is
incorrect/trivial — either change the asserted env var to the real sentinel
process.env.CLAUDE_CODE_MESSAGING_SOCKET if you intended to verify the
socket-based mechanism, or remove the assertion entirely and add a one-line
comment explaining this test is a forward-looking sentinel (or why no env var is
expected). Update the test in udsMessaging.test.ts to use
CLAUDE_CODE_MESSAGING_SOCKET (or remove the expect(...) line) and include a
brief clarifying comment if keeping the check.

In `@src/utils/ndjsonFramer.ts`:
- Around line 79-98: The tail bytes are decoded with
chunk.subarray(...).toString('utf8') which can split multibyte UTF-8 sequences
across chunks; replace this ad-hoc decoding with Node's StringDecoder to
preserve partial characters. Import and instantiate a StringDecoder('utf8')
(e.g., decoder), use decoder.write(...) instead of .toString('utf8') when
appending the tail and when decoding any partial pieces, and on final stream
close call decoder.end() to flush any remaining bytes; keep existing overflow
checks (bufferBytes/tailBytes and rejectOversizedFrame) and continue to use
emitLine(buffer) and buffer/reset logic as before.

In `@src/utils/udsMessaging.ts`:
- Around line 649-667: In stopUdsMessaging, ensure authToken is always cleared
regardless of socketPath presence: move the assignment that sets authToken =
null out of the if (socketPath) block so that after attempting to remove the
socket (via removeSocketPath) and logging, authToken is set to null
unconditionally; also keep the existing unconditional capabilityFilePath
unlink/cleanup as-is so both in-memory token (authToken) and capabilityFilePath
are wiped on any stopUdsMessaging path.
- Around line 692-721: The connection error path is exposed because
createConnection(targetSocketPath, ...) returns conn synchronously but
attachUdsResponseReader(conn, ...) (which registers error/end/close handlers) is
called later; move the attachUdsResponseReader call to immediately after const
conn = createConnection(...) (before any writes, setTimeout, or other async
actions) so the socket has its error/close handlers attached as soon as it is
created; ensure you still call conn.write(...) and conn.setTimeout(...)
afterward and reuse the existing finish(error?) callback; apply the same change
pattern to the analogous usages in udsClient.ts (around the code referenced at
lines ~230 and ~256).

---

Nitpick comments:
In `@packages/builtin-tools/src/tools/AgentTool/filterIncompleteToolCalls.ts`:
- Line 17: Remove the unnecessary type assertions in
filterIncompleteToolCalls.ts: where the code uses "const userMessage = message
as UserMessage" and similar "as AssistantMessage" casts (occurring around the
variables assigned at the checks for message?.type === 'user' and message?.type
=== 'assistant'), rely on the discriminated-union type narrowing instead and
change those assignments to use the already-narrowed type (e.g., const
userMessage = message) so you eliminate the redundant casts in the function(s)
handling message type checks.

In `@packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts`:
- Around line 671-681: The redundant outer scheme check should be removed
because hasInlineUdsToken already calls parseAddress and checks addr.scheme;
instead call hasInlineUdsToken(input.to) directly and rely on its internal
parseAddress logic, and avoid multiple parseAddress calls by parsing input.to
once (reuse the resulting addr variable) or by using the existing addr from
parseAddress where present; update the validation in SendMessageTool (the block
referencing addr, hasInlineUdsToken, and parseAddress) to use the single parsed
address or the direct hasInlineUdsToken call to eliminate the third parseAddress
invocation.

In `@src/commands/poor/__tests__/poorMode.test.ts`:
- Around line 16-45: Extract the inline full-surface settings mock into a shared
factory in tests/mocks/settings.ts (e.g. export function
makeSettingsMock(initial = {}) { ... }) that maintains internal state and
lastUpdate and returns an object with .exports containing the same exported
functions used here (loadManagedFileSettings, getManagedFileSettingsPresence,
parseSettingsFile, getInitialSettings, getSettingsForSource,
getSettingsWithErrors, getSettingsWithSources, getSettings_DEPRECATED,
settingsMergeCustomizer, getManagedSettingsKeysForLogging, hasAutoModeOptIn,
hasSkipDangerousModePermissionPrompt, getAutoModeConfig,
getUseAutoModeDuringPlan, rawSettingsContainsKey, updateSettingsForSource,
getSettingsRootPathForSource, getSettingsFilePathForSource,
getRelativeSettingsFilePathForSource, getPolicySettingsOrigin). Then update this
test (poorMode.test.ts) to import makeSettingsMock(), call it to get the mock
instance, and pass mockInstance.exports into mock.module instead of the large
inline object; ensure the factory exposes helpers (state, setState, lastUpdate,
resetLastUpdate) if the tests rely on reading/updating mockSettings.
- Around line 47-50: The test re-mocks the real settings namespace after
mock.restore() using the frozen imported namespace (settingsModule), which can
throw if consumers mutate exports; change the factory passed to
mock.module('src/utils/settings/settings.js', ...) to return a plain mutable
object copy instead of the frozen namespace (e.g., use a factory that spreads
settingsModule into a new object), and move/consolidate all inline settings
mocks from this and other tests (providers.test.ts, effort.test.ts, etc.) into
the shared tests/mocks/settings.ts mock factory so every test uses the same
mutable mock implementation and avoids leakage.

In `@src/utils/__tests__/teammateMailbox.test.ts`:
- Around line 126-143: The test for compactMailboxMessages is using a loose
substring 'req-1' which can match 'req-10' etc.; update the assertion to assert
the exact JSON-quoted request_id (e.g.
expect(compacted[0]?.text).toContain('"request_id":"req-1"') or otherwise check
both the first and last items to verify the true cap boundary) so the cap at
MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES is exercised; locate the test that
constructs messages via message(...) and the compactMailboxMessages(...) call
and tighten the assertion accordingly.

In `@src/utils/__tests__/udsMessaging.test.ts`:
- Around line 137-149: The test loop around sendUdsMessage and
MAX_UDS_INBOX_BYTES currently throws a sentinel Error('byte cap was not
enforced') inside the try which is then caught and asserted as an 'inbox full'
error; change the flow so the sentinel cannot be mistaken for the expected
inbox-full error: either move the assertion that the cap was exceeded outside
the try/catch (count accepted and after the loop assert accepted did not exceed
the threshold) or throw a distinct Error subclass or unique sentinel that you
re-throw from the catch when it originates from your own check; update the loop
in udsMessaging.test.ts referencing sendUdsMessage and MAX_UDS_INBOX_BYTES so
only genuine inbox-full errors are asserted inside the catch and the sentinel
failure surfaces separately.

In `@src/utils/udsMessaging.ts`:
- Around line 81-86: Add one-line JSDoc comments above each exported constant
(MAX_UDS_INBOX_ENTRIES, MAX_UDS_FRAME_BYTES, MAX_UDS_INBOX_BYTES,
MAX_UDS_CLIENTS, UDS_AUTH_TIMEOUT_MS, UDS_IDLE_TIMEOUT_MS) describing what the
limit controls and the rationale (e.g., memory/throughput protection, preventing
oversized frames, expected client concurrency, or auth/idle timeout behavior) so
consumers and tests understand why the numeric value was chosen; keep each JSDoc
short (one sentence) and include units for time/bytes where applicable to make
future tuning safe and self-documenting.
- Around line 211-228: In ensureSocketParent(path) remove the redundant
pre-check that inspects stat.isDirectory() and stat.isSymbolicLink() and throws,
and instead call assertPrivateDirectory(stat, dir, 'socket parent') directly
after obtaining stat from lstat; keep the existing try/catch around lstat and
the subsequent mkdir/chmod behavior unchanged so the same validation and error
messages come from assertPrivateDirectory rather than the duplicated manual
check.
- Around line 299-316: The log in enqueueInboxEntry currently lumps three
different rejection causes into one "inbox full" error; update enqueueInboxEntry
to detect which condition failed (entryBytes > MAX_UDS_FRAME_BYTES, inbox.length
>= MAX_UDS_INBOX_ENTRIES, or inboxBytes + entryBytes > MAX_UDS_INBOX_BYTES) and
log a distinct, descriptive message for each case (e.g., "message too large
(exceeds MAX_UDS_FRAME_BYTES)", "inbox entry count limit reached", or "inbox
byte quota exceeded") while keeping the same boolean return behavior; use the
existing variables (entryBytes, MAX_UDS_FRAME_BYTES, inbox.length,
MAX_UDS_INBOX_ENTRIES, inboxBytes, MAX_UDS_INBOX_BYTES) and the same logError
call site so only the message string changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ede0317f-ec34-459b-9be9-771edaa17139

📥 Commits

Reviewing files that changed from the base of the PR and between ee0d788 and 3f2b845.

📒 Files selected for processing (16)
  • packages/builtin-tools/src/tools/AgentTool/__tests__/filterIncompleteToolCalls.test.ts
  • packages/builtin-tools/src/tools/AgentTool/filterIncompleteToolCalls.ts
  • packages/builtin-tools/src/tools/AgentTool/runAgent.ts
  • packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts
  • packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts
  • src/commands/poor/__tests__/poorMode.test.ts
  • src/services/AgentSummary/__tests__/agentSummary.test.ts
  • src/services/AgentSummary/agentSummary.ts
  • src/services/AgentSummary/summaryContext.ts
  • src/utils/__tests__/ndjsonFramer.test.ts
  • src/utils/__tests__/teammateMailbox.test.ts
  • src/utils/__tests__/udsMessaging.test.ts
  • src/utils/__tests__/udsResponseReader.test.ts
  • src/utils/ndjsonFramer.ts
  • src/utils/udsMessaging.ts
  • src/utils/udsResponseReader.ts
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/utils/tests/ndjsonFramer.test.ts
  • src/services/AgentSummary/tests/agentSummary.test.ts
  • src/services/AgentSummary/agentSummary.ts
  • src/services/AgentSummary/summaryContext.ts

Comment thread packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts
Comment on lines +36 to +40
hasAutoModeOptIn: () => true,
hasSkipDangerousModePermissionPrompt: () => false,
getAutoModeConfig: () => undefined,
getUseAutoModeDuringPlan: () => true,
rawSettingsContainsKey: (key: string) => key in mockSettings,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm real defaults of the four functions vs. the mock returns.
fd -t f 'settings.ts' src/utils/settings | xargs rg -nP -C3 '\b(hasAutoModeOptIn|getUseAutoModeDuringPlan|hasSkipDangerousModePermissionPrompt|getAutoModeConfig)\s*\('

Repository: claude-code-best/claude-code

Length of output: 2360


🏁 Script executed:

cat -n src/commands/poor/__tests__/poorMode.test.ts

Repository: claude-code-best/claude-code

Length of output: 5332


🏁 Script executed:

find tests/mocks -name '*settings*' -type f 2>/dev/null || echo "No settings mock found in tests/mocks"

Repository: claude-code-best/claude-code

Length of output: 54


🏁 Script executed:

# Check if feature() is mocked in the test file
rg -nP 'feature\(' src/commands/poor/__tests__/poorMode.test.ts | head -20

Repository: claude-code-best/claude-code

Length of output: 54


🏁 Script executed:

ls -la tests/mocks/settings.* 2>/dev/null || echo "No settings mock file exists in tests/mocks/"

Repository: claude-code-best/claude-code

Length of output: 117


🏁 Script executed:

# Check if there are any shared mocks in tests/mocks/ that are used by tests
fd -t f '\.ts$' tests/mocks/ | head -20

Repository: claude-code-best/claude-code

Length of output: 168


🏁 Script executed:

# Check if other test files import from tests/mocks/
rg -l 'from.*tests/mocks' --type ts | head -10

Repository: claude-code-best/claude-code

Length of output: 506


🏁 Script executed:

# Check if poorMode.ts calls any of those four functions
rg -nP '(hasAutoModeOptIn|getUseAutoModeDuringPlan|hasSkipDangerousModePermissionPrompt|getAutoModeConfig)' src/commands/poor/poorMode.ts

Repository: claude-code-best/claude-code

Length of output: 54


🏁 Script executed:

# Check one of the shared mock files to see the pattern
cat tests/mocks/log.ts

Repository: claude-code-best/claude-code

Length of output: 915


Extract inline settings mock to tests/mocks/settings.ts and use shared mock pattern.

The hardcoded values for hasAutoModeOptIn, getUseAutoModeDuringPlan, hasSkipDangerousModePermissionPrompt, and getAutoModeConfig do not align with their real implementations:

  • hasAutoModeOptIn() returns hardcoded true — correct ✓
  • getUseAutoModeDuringPlan() has conditional logic that can return false — mock always returns true
  • hasSkipDangerousModePermissionPrompt() reads settings and can return true — mock always returns false
  • getAutoModeConfig() reads settings and returns an object or undefined — mock always returns undefined

Although this test file never calls these functions, following the established pattern in tests/mocks/ (shared factory functions for log.ts, debug.ts, etc.), create tests/mocks/settings.ts with derived defaults from mockSettings rather than hardcoded constants. This prevents future test refactors or accidental function calls from silent behavior divergence and aligns with the guideline to use shared mocks for settings.js.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/poor/__tests__/poorMode.test.ts` around lines 36 - 40, Extract
the inline mock for settings into a shared tests/mocks/settings.ts file and
implement the exported mock functions so they derive their return values from a
single mockSettings object (not hardcoded constants); specifically provide
functions matching hasAutoModeOptIn, hasSkipDangerousModePermissionPrompt,
getAutoModeConfig, getUseAutoModeDuringPlan, and rawSettingsContainsKey that
read from mockSettings (e.g., hasUseAutoModeDuringPlan should return the actual
boolean present in mockSettings, hasSkipDangerousModePermissionPrompt should
reflect the stored setting, getAutoModeConfig should return the stored object or
undefined, and rawSettingsContainsKey should check key in mockSettings), then
replace the inline mock in src/commands/poor/__tests__/poorMode.test.ts with an
import of this shared mock.

Comment on lines +31 to +37
function socketPath(label: string): string {
const suffix = `${process.pid}-${Date.now()}-${Math.random().toString(16).slice(2)}-${label}`
if (process.platform === 'win32') {
return `\\\\.\\pipe\\claude-code-test-${suffix}`
}
return join(tmpdir(), 'claude-code-test', `${suffix}.sock`)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Test sockets share an unscrubbed parent directory across test runs.

socketPath() always points at tmpdir()/claude-code-test/<suffix>.sock; the per-test cleanup (unlink(path) at Lines 201/362/391/422) only removes the socket file, never the claude-code-test parent. On the very first run ensureSocketParent creates it at 0o700, but if any prior process (different umask, manual mkdir, an aborted prior test) ever created claude-code-test with broader perms, every subsequent run of startUdsMessaging will fail-closed with socket parent permissions are too broad, and the failure persists until someone manually deletes the directory. Consider scoping each test's parent dir under a per-test mkdtemp (as the explicit-parent test on Lines 506-522 already does) or registering claude-code-test for cleanup in afterEach.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/__tests__/udsMessaging.test.ts` around lines 31 - 37, The test
socket parent directory can remain with overly-broad permissions across runs
because socketPath() always returns tmpdir()/claude-code-test/..., so fix by
creating a unique per-test parent directory and using that in
socketPath/startUdsMessaging (or register claude-code-test for guaranteed
cleanup in afterEach); specifically change socketPath to accept or derive a
per-test parent (use fs.mkdtemp to create a unique parent dir) and update tests
that call startUdsMessaging/ensureSocketParent to pass the per-test parent (or
add an afterEach that recursively removes the claude-code-test parent), ensuring
tests remove the parent directory after each run to avoid stale-permission
failures.

Comment thread src/utils/__tests__/udsMessaging.test.ts
Comment thread src/utils/ndjsonFramer.ts
Comment thread src/utils/udsMessaging.ts
Comment on lines 649 to 667
// Remove socket file (skip on Windows — pipe paths aren't files)
if (socketPath) {
if (process.platform !== 'win32') {
try {
await unlink(socketPath)
} catch {
// Already gone
}
}
await removeSocketPath(socketPath)
delete process.env.CLAUDE_CODE_MESSAGING_SOCKET
logForDebugging(
`[udsMessaging] server stopped, socket removed: ${socketPath}`,
)
socketPath = null
authToken = null
}
if (capabilityFilePath) {
try {
await unlink(capabilityFilePath)
} catch {
// Already gone
}
capabilityFilePath = null
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

authToken only cleared inside the socketPath branch.

If stopUdsMessaging is ever called with server set but socketPath somehow null (e.g., a future refactor reorders nullification, or a partial-failure path), authToken and capabilityFilePath may also be in inconsistent states. The cleanup is currently nested under if (socketPath) for the authToken reset (Line 657), but the capabilityFilePath reset on Lines 659-665 is unconditional. Move authToken = null outside the if (socketPath) block so termination always wipes the in-memory token.

🛡️ Proposed reorder
   // Remove socket file (skip on Windows — pipe paths aren't files)
   if (socketPath) {
     await removeSocketPath(socketPath)
     delete process.env.CLAUDE_CODE_MESSAGING_SOCKET
     logForDebugging(
       `[udsMessaging] server stopped, socket removed: ${socketPath}`,
     )
     socketPath = null
-    authToken = null
   }
+  authToken = null
   if (capabilityFilePath) {
     try {
       await unlink(capabilityFilePath)
     } catch {
       // Already gone
     }
     capabilityFilePath = null
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Remove socket file (skip on Windows — pipe paths aren't files)
if (socketPath) {
if (process.platform !== 'win32') {
try {
await unlink(socketPath)
} catch {
// Already gone
}
}
await removeSocketPath(socketPath)
delete process.env.CLAUDE_CODE_MESSAGING_SOCKET
logForDebugging(
`[udsMessaging] server stopped, socket removed: ${socketPath}`,
)
socketPath = null
authToken = null
}
if (capabilityFilePath) {
try {
await unlink(capabilityFilePath)
} catch {
// Already gone
}
capabilityFilePath = null
}
}
// Remove socket file (skip on Windows — pipe paths aren't files)
if (socketPath) {
await removeSocketPath(socketPath)
delete process.env.CLAUDE_CODE_MESSAGING_SOCKET
logForDebugging(
`[udsMessaging] server stopped, socket removed: ${socketPath}`,
)
socketPath = null
}
authToken = null
if (capabilityFilePath) {
try {
await unlink(capabilityFilePath)
} catch {
// Already gone
}
capabilityFilePath = null
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/udsMessaging.ts` around lines 649 - 667, In stopUdsMessaging,
ensure authToken is always cleared regardless of socketPath presence: move the
assignment that sets authToken = null out of the if (socketPath) block so that
after attempting to remove the socket (via removeSocketPath) and logging,
authToken is set to null unconditionally; also keep the existing unconditional
capabilityFilePath unlink/cleanup as-is so both in-memory token (authToken) and
capabilityFilePath are wiped on any stopUdsMessaging path.

Comment thread src/utils/udsMessaging.ts
Comment on lines 692 to 721
return new Promise<void>((resolve, reject) => {
const conn = createConnection(targetSocketPath, () => {
conn.write(jsonStringify(message) + '\n', err => {
let settled = false
let conn: ReturnType<typeof createConnection>
const finish = (error?: Error): void => {
if (settled) return
settled = true
if (error) {
conn.destroy(error)
reject(error)
} else {
conn.end()
if (err) reject(err)
else resolve()
resolve()
}
}

conn = createConnection(targetSocketPath, () => {
conn.write(jsonStringify(outbound) + '\n', err => {
if (err) finish(err)
})
})
conn.on('error', reject)
attachUdsResponseReader(conn, {
maxFrameBytes: MAX_UDS_FRAME_BYTES,
acceptPong: true,
onSettled: finish,
})
// Timeout so we don't hang on unreachable sockets
conn.setTimeout(5000, () => {
conn.destroy(new Error('Connection timed out'))
finish(new Error('Connection timed out'))
})
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm Node's createConnection error-emission ordering and that no other call site relies on synchronous error.
rg -nP "createConnection\(" -C5 src/utils/udsMessaging.ts src/utils/udsClient.ts

Repository: claude-code-best/claude-code

Length of output: 2636


🏁 Script executed:

#!/bin/bash
# Get the full function context around the flagged code in udsMessaging.ts
sed -n '685,725p' src/utils/udsMessaging.ts

Repository: claude-code-best/claude-code

Length of output: 1011


🏁 Script executed:

#!/bin/bash
# Find and inspect attachUdsResponseReader function
rg -n "attachUdsResponseReader" src/utils/udsMessaging.ts -B2 -A20

Repository: claude-code-best/claude-code

Length of output: 1171


🏁 Script executed:

#!/bin/bash
# Get full context of connectToPeer in udsClient.ts to see error handler ordering
sed -n '254,265p' src/utils/udsClient.ts

Repository: claude-code-best/claude-code

Length of output: 406


🏁 Script executed:

#!/bin/bash
# Get full context around line 135 in udsClient.ts
sed -n '130,160p' src/utils/udsClient.ts

Repository: claude-code-best/claude-code

Length of output: 860


🏁 Script executed:

#!/bin/bash
# Get full context around line 230 in udsClient.ts  
sed -n '215,245p' src/utils/udsClient.ts

Repository: claude-code-best/claude-code

Length of output: 970


🏁 Script executed:

#!/bin/bash
# Find and inspect the full attachUdsResponseReader implementation
rg -n "function attachUdsResponseReader|export.*attachUdsResponseReader" -A50 src/utils/udsResponseReader.ts

Repository: claude-code-best/claude-code

Length of output: 1619


🏁 Script executed:

#!/bin/bash
# Also check if there's attachUdsResponseReader in .js files
fd "udsResponseReader" src/

Repository: claude-code-best/claude-code

Length of output: 149


🏁 Script executed:

#!/bin/bash
# Get the full attachUdsResponseReader function implementation
sed -n '28,120p' src/utils/udsResponseReader.ts

Repository: claude-code-best/claude-code

Length of output: 2530


Connection-error path before attachUdsResponseReader is wired requires defensive fix.

createConnection(targetSocketPath, ...) returns synchronously at line 707, but the error/end/close listeners are not attached until attachUdsResponseReader(conn, ...) runs at line 712. While Node.js defers connection errors via process.nextTick (making this safe today), any future refactor that introduces asynchronous code between lines 707–712 would silently lose early-error events.

Recommended fixes:

  • Move attachUdsResponseReader() to immediately after const conn = createConnection(...) before any other calls, or
  • Wrap the entire connect-and-respond flow in a timeout rather than using only conn.setTimeout(), to catch connect stalls that never trigger the socket idle timer.

The same pattern appears in udsClient.ts:230 and udsClient.ts:256, which should also be reviewed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/udsMessaging.ts` around lines 692 - 721, The connection error path
is exposed because createConnection(targetSocketPath, ...) returns conn
synchronously but attachUdsResponseReader(conn, ...) (which registers
error/end/close handlers) is called later; move the attachUdsResponseReader call
to immediately after const conn = createConnection(...) (before any writes,
setTimeout, or other async actions) so the socket has its error/close handlers
attached as soon as it is created; ensure you still call conn.write(...) and
conn.setTimeout(...) afterward and reuse the existing finish(error?) callback;
apply the same change pattern to the analogous usages in udsClient.ts (around
the code referenced at lines ~230 and ~256).

amDosion pushed a commit that referenced this pull request Apr 27, 2026
Tighten the UDS auth, framing, and response-reader boundaries while keeping the AgentSummary lifecycle covered so Codecov and CI fail on real regressions instead of missing coverage. The poorMode settings mock mirrors unrelated real settings defaults to avoid Bun mock retention changing later permission tests.

Constraint: PR #369 must fix Codecov/CI precisely without warning suppression, fallback masking, or mock pollution

Rejected: Delete AgentSummary lifecycle coverage | would hide Codecov loss and stale-summary behavior

Rejected: Store inline UDS rejection in a hidden input sentinel | cloned observable inputs can drop it and bypass rejection

Rejected: Ignore malformed UDS frames until timeout | leaves client slots and SendMessage calls open to exhaustion

Confidence: high

Scope-risk: moderate

Directive: Keep empty #token= markers rejected; do not require a non-empty token value in hasInlineUdsToken

Tested: bun test packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts src/utils/__tests__/udsMessaging.test.ts src/utils/__tests__/udsResponseReader.test.ts src/utils/__tests__/ndjsonFramer.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run test:all

Tested: bun audit

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted Codecov upload until pushed PR checks rerun
@amDosion amDosion force-pushed the codex/memory-overflow-bounds branch from 3f2b845 to c9ddecc Compare April 27, 2026 05:13
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
src/utils/__tests__/teammateMailbox.test.ts (1)

126-143: Tighten the protocol-cap assertion to avoid prefix collisions.

toContain('req-1') is also satisfied by req-10, req-11, …, so this assertion does not actually prove that the dropped element is req-0 (i.e. the oldest). With MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES + 1 inputs you typically generate req-0req-N where N >= 10, so the substring match is ambiguous for any cap ≥ 11.

♻️ Proposed stricter assertion
-    expect(compacted).toHaveLength(MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES)
-    expect(compacted[0]?.text).toContain('req-1')
+    expect(compacted).toHaveLength(MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES)
+    const firstId = JSON.parse(compacted[0]?.text ?? '{}').request_id
+    expect(firstId).toBe('req-1')
+    expect(compacted.some(m => m.text.includes('"request_id":"req-0"'))).toBe(
+      false,
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/__tests__/teammateMailbox.test.ts` around lines 126 - 143, The
assertion is too loose because expect(compacted[0]?.text).toContain('req-1') can
match 'req-10' etc.; update the test in the "caps unread protocol messages with
an independent bound" case to assert the exact dropped item by parsing or
exact-matching the protocol payload: use compactMailboxMessages, the generated
messages via message and MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES to build
req-0…req-N, then decode or parse compacted[0].text (or use a strict string
equality / regex with word-boundaries) and assert the request_id equals "req-1"
(or that the serialized JSON contains `"request_id":"req-1"`), ensuring the test
verifies the precise ID rather than a substring.
src/services/AgentSummary/__tests__/agentSummary.test.ts (1)

84-92: Consider properly restoring all mock.module() registrations; Bun's mock.restore() does not reset them.

mock.restore() does not reset modules registered via mock.module(), per Bun documentation. This leaves poorMode.js, LocalAgentTask.js, and forkedAgent.js mocked for the remainder of the in-process test runtime. While no other tests in this suite currently import these modules (verified via codebase search), proper cleanup via explicit mock restoration (e.g., mock.restoreModule() or re-importing the real modules) is better practice and protects against future test additions. Alternatively, move these mocks to per-test setup via beforeEach() to scope their lifetime.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/AgentSummary/__tests__/agentSummary.test.ts` around lines 84 -
92, The afterEach cleanup currently calls mock.restore() but leaves mock.module
registrations in place; update the afterEach in the test to explicitly undo
those module mocks (e.g., call mock.restoreModule() for each mocked path or
re-import the real modules) for the modules you registered earlier (poorMode.js,
LocalAgentTask.js, forkedAgent.js and the sessionStorage.js registration), or
move the mock.module(...) calls into beforeEach so their scope is per-test;
ensure you reference the existing afterEach block and the mock.module(...) calls
when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/services/AgentSummary/__tests__/agentSummary.test.ts`:
- Around line 84-92: The afterEach cleanup currently calls mock.restore() but
leaves mock.module registrations in place; update the afterEach in the test to
explicitly undo those module mocks (e.g., call mock.restoreModule() for each
mocked path or re-import the real modules) for the modules you registered
earlier (poorMode.js, LocalAgentTask.js, forkedAgent.js and the
sessionStorage.js registration), or move the mock.module(...) calls into
beforeEach so their scope is per-test; ensure you reference the existing
afterEach block and the mock.module(...) calls when making the change.

In `@src/utils/__tests__/teammateMailbox.test.ts`:
- Around line 126-143: The assertion is too loose because
expect(compacted[0]?.text).toContain('req-1') can match 'req-10' etc.; update
the test in the "caps unread protocol messages with an independent bound" case
to assert the exact dropped item by parsing or exact-matching the protocol
payload: use compactMailboxMessages, the generated messages via message and
MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES to build req-0…req-N, then decode or parse
compacted[0].text (or use a strict string equality / regex with word-boundaries)
and assert the request_id equals "req-1" (or that the serialized JSON contains
`"request_id":"req-1"`), ensuring the test verifies the precise ID rather than a
substring.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 00b9d81e-128f-4345-89f7-89f7486fd97e

📥 Commits

Reviewing files that changed from the base of the PR and between 3f2b845 and c9ddecc.

📒 Files selected for processing (16)
  • packages/builtin-tools/src/tools/AgentTool/__tests__/filterIncompleteToolCalls.test.ts
  • packages/builtin-tools/src/tools/AgentTool/filterIncompleteToolCalls.ts
  • packages/builtin-tools/src/tools/AgentTool/runAgent.ts
  • packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts
  • packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts
  • src/commands/poor/__tests__/poorMode.test.ts
  • src/services/AgentSummary/__tests__/agentSummary.test.ts
  • src/services/AgentSummary/agentSummary.ts
  • src/services/AgentSummary/summaryContext.ts
  • src/utils/__tests__/ndjsonFramer.test.ts
  • src/utils/__tests__/teammateMailbox.test.ts
  • src/utils/__tests__/udsMessaging.test.ts
  • src/utils/__tests__/udsResponseReader.test.ts
  • src/utils/ndjsonFramer.ts
  • src/utils/udsMessaging.ts
  • src/utils/udsResponseReader.ts
✅ Files skipped from review due to trivial changes (1)
  • src/utils/udsMessaging.ts
🚧 Files skipped from review as they are similar to previous changes (5)
  • packages/builtin-tools/src/tools/AgentTool/filterIncompleteToolCalls.ts
  • packages/builtin-tools/src/tools/AgentTool/tests/filterIncompleteToolCalls.test.ts
  • src/services/AgentSummary/agentSummary.ts
  • src/utils/tests/udsMessaging.test.ts
  • src/commands/poor/tests/poorMode.test.ts

amDosion pushed a commit that referenced this pull request Apr 27, 2026
Tighten the UDS auth, framing, and response-reader boundaries while keeping the AgentSummary lifecycle covered so Codecov and CI fail on real regressions instead of missing coverage. The poorMode settings mock mirrors unrelated real settings defaults to avoid Bun mock retention changing later permission tests.

Constraint: PR #369 must fix Codecov/CI precisely without warning suppression, fallback masking, or mock pollution

Rejected: Delete AgentSummary lifecycle coverage | would hide Codecov loss and stale-summary behavior

Rejected: Store inline UDS rejection in a hidden input sentinel | cloned observable inputs can drop it and bypass rejection

Rejected: Ignore malformed UDS frames until timeout | leaves client slots and SendMessage calls open to exhaustion

Confidence: high

Scope-risk: moderate

Directive: Keep empty #token= markers rejected; do not require a non-empty token value in hasInlineUdsToken

Tested: bun test packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts src/utils/__tests__/udsMessaging.test.ts src/utils/__tests__/udsResponseReader.test.ts src/utils/__tests__/ndjsonFramer.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run test:all

Tested: bun audit

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted Codecov upload until pushed PR checks rerun
@amDosion amDosion force-pushed the codex/memory-overflow-bounds branch from c9ddecc to 550f5de Compare April 27, 2026 06:09
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
src/utils/ndjsonFramer.ts (1)

86-98: ⚠️ Potential issue | 🟠 Major

Multibyte UTF-8 sequences are still split at chunk boundaries on the tail path.

A previous review flagged this and was marked "Addressed in commit c9ddecc", but line 96 still calls chunk.subarray(start).toString('utf8') and the file does not import StringDecoder. When a chunk ends mid-multibyte-sequence, the trailing bytes are decoded to U+FFFD and the next chunk's leading continuation bytes decode to another U+FFFD. The corrupted buffer then fails JSON.parse and triggers a spurious invalid-frame rejection (or, with destroyOnInvalidFrame: true, kills the socket). Line 79 is safe because LF (0x0A) cannot appear inside a multibyte sequence, so only the tail accumulation needs the fix. udsResponseReader.ts already uses StringDecoder('utf8') for exactly this reason — adopt the same pattern here.

♻️ Suggested fix using `StringDecoder`
 import type { Socket } from 'net'
+import { StringDecoder } from 'node:string_decoder'
@@
 export function attachNdjsonFramer<T = unknown>(
   socket: Socket,
   onMessage: (msg: T) => void,
   parse: (text: string) => T = text => JSON.parse(text) as T,
   options: NdjsonFramerOptions = {},
 ): void {
   let buffer = ''
   let bufferBytes = 0
+  const decoder = new StringDecoder('utf8')
   const maxFrameBytes = options.maxFrameBytes ?? Number.POSITIVE_INFINITY
@@
-      buffer += chunk.subarray(start, index).toString('utf8')
+      buffer += decoder.write(chunk.subarray(start, index))
       emitLine(buffer)
       buffer = ''
       bufferBytes = 0
       start = index + 1
     }
@@
     if (tailBytes > 0) {
-      buffer += chunk.subarray(start).toString('utf8')
+      buffer += decoder.write(chunk.subarray(start))
       bufferBytes += tailBytes
     }
   })
 }

You may also want to call decoder.end() on socket 'end'/'close' to flush any trailing bytes through the framer (and reject if it produces a non-empty residual without a newline).

Verification — confirm the issue is still live and that udsResponseReader already handles it:

#!/bin/bash
# Expect: ndjsonFramer has no StringDecoder; udsResponseReader does.
rg -nP "StringDecoder|string_decoder|toString\(['\"]utf8['\"]\)" src/utils/ndjsonFramer.ts src/utils/udsResponseReader.ts
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/ndjsonFramer.ts` around lines 86 - 98, The tail path in
ndjsonFramer still decodes trailing bytes with
chunk.subarray(start).toString('utf8'), which corrupts multibyte UTF‑8
sequences; fix by importing and instantiating a StringDecoder('utf8') (same
pattern used in udsResponseReader) and replace the toString call with
decoder.write(...) when appending to buffer (update the code around buffer +=
chunk.subarray(start)... and bufferBytes handling); also ensure decoder.end() is
called on socket 'end'/'close' to flush any remaining bytes and reject if the
flushed residual produces a non-empty leftover without a newline (so
rejectOversizedFrame/invalid-frame handling remains correct).
src/utils/udsMessaging.ts (1)

650-666: ⚠️ Potential issue | 🟡 Minor

authToken reset still nested inside if (socketPath) — fix not yet applied.

Per the prior review on this hunk, authToken = null (Line 657) should be hoisted out of the if (socketPath) block so a partial-failure stop path that leaves server set but socketPath null still wipes the in-memory token. capabilityFilePath and defaultSocketPath are already reset unconditionally; this completes that symmetry.

🛡️ Proposed reorder
   if (socketPath) {
     await removeSocketPath(socketPath)
     delete process.env.CLAUDE_CODE_MESSAGING_SOCKET
     logForDebugging(
       `[udsMessaging] server stopped, socket removed: ${socketPath}`,
     )
     socketPath = null
-    authToken = null
   }
+  authToken = null
   if (capabilityFilePath) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/udsMessaging.ts` around lines 650 - 666, The authToken reset is
incorrectly placed inside the if (socketPath) block; move the statement
authToken = null so it executes unconditionally after the socketPath conditional
(matching how capabilityFilePath and defaultSocketPath are cleared), i.e., keep
removing the socket and nulling socketPath inside the if (socketPath) block but
hoist authToken = null out of that block so any partial-stop path still wipes
the in-memory token.
🧹 Nitpick comments (4)
packages/builtin-tools/src/tools/AgentTool/runAgent.ts (1)

89-93: LGTM — clean extraction with backwards-compatible re-export.

The inline filterIncompleteToolCalls logic has been moved to its own module and re-exported here, preserving the public surface for any existing consumers that import it from runAgent.ts. The behavior at line 380 (filtering forkContextMessages before forwarding) is unchanged, and agentSummary.ts is already wired to import directly from the new file (per context snippet 4).

One minor follow-up worth verifying: the rationale for filtering here directly contradicts the explicit warning in src/utils/forkedAgent.ts (context snippet 2), which deliberately skips this call to avoid orphaning paired tool_results and triggering API 400s. This pre-dates the PR, so it's not a regression — but if both code paths can produce the subagent's initialMessages, the divergence may be worth a comment explaining why runAgent is safe to filter while forkedAgent is not.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/builtin-tools/src/tools/AgentTool/runAgent.ts` around lines 89 - 93,
Add a short explanatory comment near the use of filterIncompleteToolCalls in
runAgent (the call handling forkContextMessages before forwarding) stating why
it is safe here but must be skipped in src/utils/forkedAgent.ts to avoid
orphaning paired tool_results and 400s; reference the symbols
filterIncompleteToolCalls, forkContextMessages, runAgent and forkedAgent to
locate the logic and ensure the comment explains the differing input
shapes/source of subagent initialMessages that justify the divergence.
packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts (1)

5-5: Optional: hoist the SendMessageTool import to module scope.

Each of the 7 tests repeats const { SendMessageTool } = await import('../SendMessageTool.js'). There are no module-level mocks here that require lazy loading, so a single top-level import { SendMessageTool } from '../SendMessageTool.js' would remove the boilerplate and let each test callback be synchronous when no other awaits are needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts`
at line 5, Hoist the dynamic import so tests stop repeating it: remove the
repeated "const { SendMessageTool } = await import('../SendMessageTool.js')"
lines from each of the seven tests and add a single top-level import like
"import { SendMessageTool } from '../SendMessageTool.js'" at module scope in
udsRecipientSanitization.test.ts; update the individual test callbacks to be
synchronous where no other awaits are used so they no longer need to await the
module load.
src/utils/__tests__/teammateMailbox.test.ts (1)

126-143: Tighten the protocol-cap assertion to avoid loose substring matches.

toContain('req-1') also matches req-10, req-11, …, req-19, req-100, etc. The test would still pass if compaction incorrectly kept req-10 (or any req-1X) as the first retained entry instead of req-1, masking an off-by-one in the protocol cap. Pin the exact identifier.

♻️ Proposed refactor
-    expect(compacted).toHaveLength(MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES)
-    expect(compacted[0]?.text).toContain('req-1')
+    expect(compacted).toHaveLength(MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES)
+    // Oldest (req-0) must be dropped; first retained should be exactly req-1.
+    expect(JSON.parse(compacted[0]!.text)).toMatchObject({ request_id: 'req-1' })
+    expect(JSON.parse(compacted.at(-1)!.text)).toMatchObject({
+      request_id: `req-${MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES}`,
+    })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/__tests__/teammateMailbox.test.ts` around lines 126 - 143, The test
uses a loose substring assertion that can match req-10 etc.; update the
assertion to pin the exact retained identifier: locate the test "caps unread
protocol messages with an independent bound" and change the final assertion on
compactMailboxMessages' result (function compactMailboxMessages, constant
MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES, helper message) to verify the exact
request_id for the first retained message—either by parsing compacted[0].text as
JSON and asserting request_id === 'req-1' or by checking for the exact key/value
string (e.g. '"request_id":"req-1"') so off-by-one matches like req-10 cannot
satisfy the test.
src/services/AgentSummary/summaryContext.ts (1)

33-41: Optional: prefer Object.keys over for...in + Object.hasOwn filter.

for (const key in record) { if (!Object.hasOwn(record, key)) continue; ... } is functionally equivalent to for (const key of Object.keys(record)) and avoids walking the prototype chain. Same nit applies at lines 84–86 inside updateFingerprintHash. Purely stylistic.

♻️ Proposed minor refactor
       } else {
         const record = value as Record<string, unknown>
-        for (const key in record) {
-          if (!Object.hasOwn(record, key)) continue
+        for (const key of Object.keys(record)) {
           total += key.length + 3
           total += estimateJsonChars(record[key], limit - total, seen)
           if (total > limit) return total
         }
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/AgentSummary/summaryContext.ts` around lines 33 - 41, Replace
the for...in + Object.hasOwn check with a direct Object.keys iteration to avoid
prototype-chain traversal: in summaryContext.ts update the loop inside
estimateJsonChars that currently uses "for (const key in record) { if
(!Object.hasOwn(record, key)) continue; ... }" to "for (const key of
Object.keys(record)) { ... }", and make the same change in updateFingerprintHash
where a similar for...in + Object.hasOwn pattern is used; keep the inner logic
identical (accumulating total/processing values) and preserve existing calls to
estimateJsonChars/updateFingerprintHash and the seen/limit handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/services/AgentSummary/__tests__/agentSummary.test.ts`:
- Around line 140-143: Replace the fragile query-string cache-bust used when
re-importing agentSummary (e.g., importing '../agentSummary.js?shortContext')
with explicit mock isolation: before importing startAgentSummarization, call
mock.restore() then use mock.module() to remock the dependent modules you need
refreshed (notably getAgentTranscript and sessionStorage) so their updated
bindings are visible to the module loader, then import the plain
'../agentSummary.js' and destructure startAgentSummarization; ensure you call
mock.module() for each dependency that should be isolated prior to the import so
Bun doesn't rely on query strings for cache-busting.

In `@src/utils/ndjsonFramer.ts`:
- Around line 71-93: The oversized-frame handlers currently call
rejectOversizedFrame and return without clearing framer state, leaving
buffer/bufferBytes/start stale; update both call sites that invoke
rejectOversizedFrame (the segment-path inside the loop and the tail-path after
the loop) to reset buffer = '' and bufferBytes = 0 and advance start to the byte
after the next 0x0a (or to chunk.length if none) so the parser resynchronizes
within the same chunk instead of silently keeping rejected bytes; ensure you
still call rejectOversizedFrame and only return early if destroyOnFrameError
dictates teardown, otherwise continue parsing the remainder of the chunk and
call emitLine when a newline is reached.

---

Duplicate comments:
In `@src/utils/ndjsonFramer.ts`:
- Around line 86-98: The tail path in ndjsonFramer still decodes trailing bytes
with chunk.subarray(start).toString('utf8'), which corrupts multibyte UTF‑8
sequences; fix by importing and instantiating a StringDecoder('utf8') (same
pattern used in udsResponseReader) and replace the toString call with
decoder.write(...) when appending to buffer (update the code around buffer +=
chunk.subarray(start)... and bufferBytes handling); also ensure decoder.end() is
called on socket 'end'/'close' to flush any remaining bytes and reject if the
flushed residual produces a non-empty leftover without a newline (so
rejectOversizedFrame/invalid-frame handling remains correct).

In `@src/utils/udsMessaging.ts`:
- Around line 650-666: The authToken reset is incorrectly placed inside the if
(socketPath) block; move the statement authToken = null so it executes
unconditionally after the socketPath conditional (matching how
capabilityFilePath and defaultSocketPath are cleared), i.e., keep removing the
socket and nulling socketPath inside the if (socketPath) block but hoist
authToken = null out of that block so any partial-stop path still wipes the
in-memory token.

---

Nitpick comments:
In `@packages/builtin-tools/src/tools/AgentTool/runAgent.ts`:
- Around line 89-93: Add a short explanatory comment near the use of
filterIncompleteToolCalls in runAgent (the call handling forkContextMessages
before forwarding) stating why it is safe here but must be skipped in
src/utils/forkedAgent.ts to avoid orphaning paired tool_results and 400s;
reference the symbols filterIncompleteToolCalls, forkContextMessages, runAgent
and forkedAgent to locate the logic and ensure the comment explains the
differing input shapes/source of subagent initialMessages that justify the
divergence.

In
`@packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts`:
- Line 5: Hoist the dynamic import so tests stop repeating it: remove the
repeated "const { SendMessageTool } = await import('../SendMessageTool.js')"
lines from each of the seven tests and add a single top-level import like
"import { SendMessageTool } from '../SendMessageTool.js'" at module scope in
udsRecipientSanitization.test.ts; update the individual test callbacks to be
synchronous where no other awaits are used so they no longer need to await the
module load.

In `@src/services/AgentSummary/summaryContext.ts`:
- Around line 33-41: Replace the for...in + Object.hasOwn check with a direct
Object.keys iteration to avoid prototype-chain traversal: in summaryContext.ts
update the loop inside estimateJsonChars that currently uses "for (const key in
record) { if (!Object.hasOwn(record, key)) continue; ... }" to "for (const key
of Object.keys(record)) { ... }", and make the same change in
updateFingerprintHash where a similar for...in + Object.hasOwn pattern is used;
keep the inner logic identical (accumulating total/processing values) and
preserve existing calls to estimateJsonChars/updateFingerprintHash and the
seen/limit handling.

In `@src/utils/__tests__/teammateMailbox.test.ts`:
- Around line 126-143: The test uses a loose substring assertion that can match
req-10 etc.; update the assertion to pin the exact retained identifier: locate
the test "caps unread protocol messages with an independent bound" and change
the final assertion on compactMailboxMessages' result (function
compactMailboxMessages, constant MAX_UNREAD_PROTOCOL_MAILBOX_MESSAGES, helper
message) to verify the exact request_id for the first retained message—either by
parsing compacted[0].text as JSON and asserting request_id === 'req-1' or by
checking for the exact key/value string (e.g. '"request_id":"req-1"') so
off-by-one matches like req-10 cannot satisfy the test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 01e51345-76a0-4ff3-8ece-ba5a1a6b461a

📥 Commits

Reviewing files that changed from the base of the PR and between c9ddecc and 550f5de.

📒 Files selected for processing (18)
  • packages/builtin-tools/src/tools/AgentTool/__tests__/filterIncompleteToolCalls.test.ts
  • packages/builtin-tools/src/tools/AgentTool/filterIncompleteToolCalls.ts
  • packages/builtin-tools/src/tools/AgentTool/runAgent.ts
  • packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts
  • packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts
  • src/commands/poor/__tests__/poorMode.test.ts
  • src/services/AgentSummary/__tests__/agentSummary.test.ts
  • src/services/AgentSummary/__tests__/summaryContext.test.ts
  • src/services/AgentSummary/agentSummary.ts
  • src/services/AgentSummary/summaryContext.ts
  • src/utils/__tests__/ndjsonFramer.test.ts
  • src/utils/__tests__/teammateMailbox.test.ts
  • src/utils/__tests__/udsMessaging.test.ts
  • src/utils/__tests__/udsResponseReader.test.ts
  • src/utils/ndjsonFramer.ts
  • src/utils/swarm/inProcessRunner.ts
  • src/utils/udsMessaging.ts
  • src/utils/udsResponseReader.ts
✅ Files skipped from review due to trivial changes (2)
  • packages/builtin-tools/src/tools/AgentTool/filterIncompleteToolCalls.ts
  • src/utils/swarm/inProcessRunner.ts
🚧 Files skipped from review as they are similar to previous changes (7)
  • src/commands/poor/tests/poorMode.test.ts
  • packages/builtin-tools/src/tools/AgentTool/tests/filterIncompleteToolCalls.test.ts
  • src/utils/tests/ndjsonFramer.test.ts
  • src/utils/tests/udsResponseReader.test.ts
  • src/utils/tests/udsMessaging.test.ts
  • src/services/AgentSummary/agentSummary.ts
  • packages/builtin-tools/src/tools/SendMessageTool/SendMessageTool.ts

Comment thread src/services/AgentSummary/__tests__/agentSummary.test.ts Outdated
Comment thread src/utils/ndjsonFramer.ts
Comment on lines +71 to +93
if (
Number.isFinite(maxFrameBytes) &&
bufferBytes + segmentBytes > maxFrameBytes
) {
rejectOversizedFrame(bufferBytes + segmentBytes)
return
}

buffer += chunk.subarray(start, index).toString('utf8')
emitLine(buffer)
buffer = ''
bufferBytes = 0
start = index + 1
}

const tailBytes = chunk.length - start
if (
Number.isFinite(maxFrameBytes) &&
bufferBytes + tailBytes > maxFrameBytes
) {
rejectOversizedFrame(bufferBytes + tailBytes)
return
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Framer state is not reset when an oversized frame is rejected.

Both rejectOversizedFrame paths (lines 75 and 91) return without resetting buffer / bufferBytes and silently drop the rest of the chunk. With the default destroyOnFrameError: true this is harmless because the socket is torn down. But when callers opt out (destroyOnFrameError: false), the next data event keeps appending to the stale buffer (which contains the prefix of the just-rejected frame), so subsequent valid frames are conflated with rejected bytes and tend to cascade into further parse/size failures. Note this is asymmetric with the invalid-frame path, where lines 81–82 reset state and parsing continues cleanly.

🛡️ Suggested fix: reset state and resync past the next newline before resuming
   const rejectOversizedFrame = (bytes: number): void => {
     const error = new Error(
       `NDJSON frame exceeded ${maxFrameBytes} bytes (${bytes})`,
     )
     options.onFrameError?.(error)
+    buffer = ''
+    bufferBytes = 0
     if (options.destroyOnFrameError ?? true) {
       socket.destroy(error)
     }
   }

If you also want the opt-out path to recover within the same chunk, advance start to the byte after the next 0x0a (or to chunk.length) before continuing the loop, instead of returning early.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (
Number.isFinite(maxFrameBytes) &&
bufferBytes + segmentBytes > maxFrameBytes
) {
rejectOversizedFrame(bufferBytes + segmentBytes)
return
}
buffer += chunk.subarray(start, index).toString('utf8')
emitLine(buffer)
buffer = ''
bufferBytes = 0
start = index + 1
}
const tailBytes = chunk.length - start
if (
Number.isFinite(maxFrameBytes) &&
bufferBytes + tailBytes > maxFrameBytes
) {
rejectOversizedFrame(bufferBytes + tailBytes)
return
}
const rejectOversizedFrame = (bytes: number): void => {
const error = new Error(
`NDJSON frame exceeded ${maxFrameBytes} bytes (${bytes})`,
)
options.onFrameError?.(error)
buffer = ''
bufferBytes = 0
if (options.destroyOnFrameError ?? true) {
socket.destroy(error)
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/ndjsonFramer.ts` around lines 71 - 93, The oversized-frame handlers
currently call rejectOversizedFrame and return without clearing framer state,
leaving buffer/bufferBytes/start stale; update both call sites that invoke
rejectOversizedFrame (the segment-path inside the loop and the tail-path after
the loop) to reset buffer = '' and bufferBytes = 0 and advance start to the byte
after the next 0x0a (or to chunk.length if none) so the parser resynchronizes
within the same chunk instead of silently keeping rejected bytes; ensure you
still call rejectOversizedFrame and only return early if destroyOnFrameError
dictates teardown, otherwise continue parsing the remainder of the chunk and
call emitLine when a newline is reached.

Tighten the UDS auth, framing, and response-reader boundaries while keeping the AgentSummary lifecycle covered so Codecov and CI fail on real regressions instead of missing coverage. The poorMode settings mock mirrors unrelated real settings defaults to avoid Bun mock retention changing later permission tests.

Constraint: PR #369 must fix Codecov/CI precisely without warning suppression, fallback masking, or mock pollution

Rejected: Delete AgentSummary lifecycle coverage | would hide Codecov loss and stale-summary behavior

Rejected: Store inline UDS rejection in a hidden input sentinel | cloned observable inputs can drop it and bypass rejection

Rejected: Ignore malformed UDS frames until timeout | leaves client slots and SendMessage calls open to exhaustion

Confidence: high

Scope-risk: moderate

Directive: Keep empty #token= markers rejected; do not require a non-empty token value in hasInlineUdsToken

Tested: bun test packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts src/utils/__tests__/udsMessaging.test.ts src/utils/__tests__/udsResponseReader.test.ts src/utils/__tests__/ndjsonFramer.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run test:all

Tested: bun audit

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted Codecov upload until pushed PR checks rerun
@amDosion amDosion force-pushed the codex/memory-overflow-bounds branch from 550f5de to 4e2ff56 Compare April 27, 2026 06:23
@claude-code-best claude-code-best merged commit 52b61c2 into main Apr 27, 2026
8 checks passed
amDosion pushed a commit that referenced this pull request Apr 27, 2026
Tighten the UDS auth, framing, and response-reader boundaries while keeping the AgentSummary lifecycle covered so Codecov and CI fail on real regressions instead of missing coverage. The poorMode settings mock mirrors unrelated real settings defaults to avoid Bun mock retention changing later permission tests.

Constraint: PR #369 must fix Codecov/CI precisely without warning suppression, fallback masking, or mock pollution

Rejected: Delete AgentSummary lifecycle coverage | would hide Codecov loss and stale-summary behavior

Rejected: Store inline UDS rejection in a hidden input sentinel | cloned observable inputs can drop it and bypass rejection

Rejected: Ignore malformed UDS frames until timeout | leaves client slots and SendMessage calls open to exhaustion

Confidence: high

Scope-risk: moderate

Directive: Keep empty #token= markers rejected; do not require a non-empty token value in hasInlineUdsToken

Tested: bun test packages/builtin-tools/src/tools/SendMessageTool/__tests__/udsRecipientSanitization.test.ts src/utils/__tests__/udsMessaging.test.ts src/utils/__tests__/udsResponseReader.test.ts src/utils/__tests__/ndjsonFramer.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run test:all

Tested: bun audit

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted Codecov upload until pushed PR checks rerun
amDosion pushed a commit that referenced this pull request Apr 27, 2026
PR #369 was merged before the final Codecov coverage fix landed, so this follow-up carries only the incremental real-path tests needed on top of main. The tests exercise AgentSummary lifecycle branches, mailbox fail-closed behavior, UDS client connection failure through a real capability file, and UDS response-reader framing without mock.module, warning suppression, feature fallback, or production-code churn.

Constraint: PR #369 is already merged; this branch must contain only the incremental Codecov repair on top of latest main

Rejected: Reopen or keep pushing the merged PR branch | merged PR refs do not update and would leave Codecov stale

Rejected: Mock bun:bundle or hide warnings | would reintroduce cross-test pollution and pseudo coverage

Rejected: Keep unrelated SendMessageTool production diff | it created avoidable patch-coverage debt without improving the runtime path

Confidence: high

Scope-risk: narrow

Directive: Keep these coverage tests on real paths; do not replace them with output suppression or feature-flag mocks

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test src\services\AgentSummary\__tests__\agentSummary.test.ts src\services\AgentSummary\__tests__\summaryContext.test.ts src\utils\__tests__\teammateMailbox.test.ts src\utils\__tests__\udsMessaging.test.ts src\utils\__tests__\udsResponseReader.test.ts packages\builtin-tools\src\tools\SendMessageTool\__tests__\udsRecipientSanitization.test.ts

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Tested: bun audit

Tested: git diff --check

Tested: Claude simplify review GO (.omx/artifacts/claude-simplify-codecov-20260427-144906.md)

Tested: Claude security review GO (.omx/artifacts/claude-security-codecov-20260427-145007.md)

Not-tested: GitHub-hosted Codecov upload after this amended commit until PR checks rerun
amDosion pushed a commit that referenced this pull request Apr 27, 2026
PR #369 was merged before the final Codecov coverage fix landed, so this follow-up carries only the incremental real-path tests needed on top of main. The tests exercise AgentSummary lifecycle branches, mailbox fail-closed behavior, UDS client connection failure through a real capability file, and UDS response-reader framing without mock.module, warning suppression, feature fallback, or production-code churn.

Constraint: PR #369 is already merged; this branch must contain only the incremental Codecov repair on top of latest main

Rejected: Reopen or keep pushing the merged PR branch | merged PR refs do not update and would leave Codecov stale

Rejected: Mock bun:bundle or hide warnings | would reintroduce cross-test pollution and pseudo coverage

Rejected: Keep unrelated SendMessageTool production diff | it created avoidable patch-coverage debt without improving the runtime path

Confidence: high

Scope-risk: narrow

Directive: Keep these coverage tests on real paths; do not replace them with output suppression or feature-flag mocks

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test src\utils\__tests__\teammateMailbox.test.ts

Tested: bun test src\services\AgentSummary\__tests__\agentSummary.test.ts src\services\AgentSummary\__tests__\summaryContext.test.ts src\utils\__tests__\teammateMailbox.test.ts src\utils\__tests__\udsMessaging.test.ts src\utils\__tests__\udsResponseReader.test.ts packages\builtin-tools\src\tools\SendMessageTool\__tests__\udsRecipientSanitization.test.ts

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Tested: bun audit

Tested: git diff --check

Tested: Claude simplify review GO (.omx/artifacts/claude-simplify-codecov-20260427-1521.md)

Tested: Claude security review GO (.omx/artifacts/claude-security-codecov-20260427-1522.md)

Not-tested: GitHub-hosted Codecov upload after this amended commit until PR checks rerun
claude-code-best pushed a commit that referenced this pull request Apr 27, 2026
* test: keep Codecov coverage on real agent communication paths

PR #369 was merged before the final Codecov coverage fix landed, so this follow-up carries only the incremental real-path tests needed on top of main. The tests exercise AgentSummary lifecycle branches, mailbox fail-closed behavior, UDS client connection failure through a real capability file, and UDS response-reader framing without mock.module, warning suppression, feature fallback, or production-code churn.

Constraint: PR #369 is already merged; this branch must contain only the incremental Codecov repair on top of latest main

Rejected: Reopen or keep pushing the merged PR branch | merged PR refs do not update and would leave Codecov stale

Rejected: Mock bun:bundle or hide warnings | would reintroduce cross-test pollution and pseudo coverage

Rejected: Keep unrelated SendMessageTool production diff | it created avoidable patch-coverage debt without improving the runtime path

Confidence: high

Scope-risk: narrow

Directive: Keep these coverage tests on real paths; do not replace them with output suppression or feature-flag mocks

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun test src\utils\__tests__\teammateMailbox.test.ts

Tested: bun test src\services\AgentSummary\__tests__\agentSummary.test.ts src\services\AgentSummary\__tests__\summaryContext.test.ts src\utils\__tests__\teammateMailbox.test.ts src\utils\__tests__\udsMessaging.test.ts src\utils\__tests__\udsResponseReader.test.ts packages\builtin-tools\src\tools\SendMessageTool\__tests__\udsRecipientSanitization.test.ts

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Tested: bun audit

Tested: git diff --check

Tested: Claude simplify review GO (.omx/artifacts/claude-simplify-codecov-20260427-1521.md)

Tested: Claude security review GO (.omx/artifacts/claude-security-codecov-20260427-1522.md)

Not-tested: GitHub-hosted Codecov upload after this amended commit until PR checks rerun

* test: keep review assertions tied to real failure paths

CodeRabbit flagged three non-blocking but valid review gaps: platform-specific mailbox errno checks, brittle UDS connection-failure message assertions, and missing AgentSummary reschedule proof after fork errors. This keeps the fixes narrow by tightening the affected assertions and adding a structured UDS connection error for tests to assert behavior instead of prose.

Constraint: PR #374 is a review follow-up and must not hide warnings, skip tests, or merge the PR.

Rejected: Matching the UDS failure message literal | preserves the brittle coupling CodeRabbit flagged.

Rejected: Asserting only that mailbox writes throw | would allow unrelated pre-path failures to pass.

Confidence: high

Scope-risk: narrow

Directive: Keep UDS connection-failure tests on structured error data, not display wording.

Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/teammateMailbox.test.ts src/utils/__tests__/udsMessaging.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.

* test: remove brittle review follow-up assumptions

CodeRabbit's second pass found two valid brittleness issues and one suggested callback-reference assertion that would not match production behavior. This keeps the production behavior unchanged: timers still schedule the summarizer closure, tests now assert timer-handle identity, and UDS connection errors use native Error.cause instead of shadowing it.

Constraint: Do not manufacture behavior just to satisfy a review hint; assertions must match the real AgentSummary scheduling contract.

Rejected: Assert a fresh scheduled callback function | scheduleNext intentionally passes the same runSummary closure each time.

Rejected: Store a custom cause field on UdsPeerConnectionError | native Error.cause is available under ESNext/Bun.

Confidence: high

Scope-risk: narrow

Directive: Timer tests should assert returned handle identity for ownership, not incidental numeric values.

Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.

* test: enforce structured UDS timeout failures

CodeRabbit's follow-up surfaced a real consistency gap: UDS send socket errors used UdsPeerConnectionError while response timeouts still rejected a generic Error. Timeouts now use the same structured peer failure contract, and the test exercises that path through a short explicit timeout instead of waiting for the production default.

The AgentSummary unchanged-fingerprint test now also asserts that the second unchanged tick does not log errors, preserving the existing behavior checks without changing production scheduling semantics.

Constraint: Keep the production timeout default at 5000ms while allowing tests to exercise the timeout path quickly.

Rejected: Leave timeout failures as generic Error | callers would need separate handling for the same peer connection failure class.

Confidence: high

Scope-risk: narrow

Directive: Keep UDS send timeout and socket-error branches on the same structured error contract.

Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts

Tested: bunx tsc --noEmit --pretty false

Tested: bun run lint

Tested: bun run test:all

Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage

Tested: bun run build

Tested: bun run build:vite

Not-tested: GitHub-hosted CodeRabbit refresh until pushed.

---------

Co-authored-by: unraid <local@unraid.local>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants