Skip to content

Commit b47731a

Browse files
amDosionunraid
andauthored
test: keep Codecov coverage on real agent communication paths (#374)
* test: keep Codecov coverage on real agent communication paths PR #369 was merged before the final Codecov coverage fix landed, so this follow-up carries only the incremental real-path tests needed on top of main. The tests exercise AgentSummary lifecycle branches, mailbox fail-closed behavior, UDS client connection failure through a real capability file, and UDS response-reader framing without mock.module, warning suppression, feature fallback, or production-code churn. Constraint: PR #369 is already merged; this branch must contain only the incremental Codecov repair on top of latest main Rejected: Reopen or keep pushing the merged PR branch | merged PR refs do not update and would leave Codecov stale Rejected: Mock bun:bundle or hide warnings | would reintroduce cross-test pollution and pseudo coverage Rejected: Keep unrelated SendMessageTool production diff | it created avoidable patch-coverage debt without improving the runtime path Confidence: high Scope-risk: narrow Directive: Keep these coverage tests on real paths; do not replace them with output suppression or feature-flag mocks Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun test src\utils\__tests__\teammateMailbox.test.ts Tested: bun test src\services\AgentSummary\__tests__\agentSummary.test.ts src\services\AgentSummary\__tests__\summaryContext.test.ts src\utils\__tests__\teammateMailbox.test.ts src\utils\__tests__\udsMessaging.test.ts src\utils\__tests__\udsResponseReader.test.ts packages\builtin-tools\src\tools\SendMessageTool\__tests__\udsRecipientSanitization.test.ts Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Tested: bun audit Tested: git diff --check Tested: Claude simplify review GO (.omx/artifacts/claude-simplify-codecov-20260427-1521.md) Tested: Claude security review GO (.omx/artifacts/claude-security-codecov-20260427-1522.md) Not-tested: GitHub-hosted Codecov upload after this amended commit until PR checks rerun * test: keep review assertions tied to real failure paths CodeRabbit flagged three non-blocking but valid review gaps: platform-specific mailbox errno checks, brittle UDS connection-failure message assertions, and missing AgentSummary reschedule proof after fork errors. This keeps the fixes narrow by tightening the affected assertions and adding a structured UDS connection error for tests to assert behavior instead of prose. Constraint: PR #374 is a review follow-up and must not hide warnings, skip tests, or merge the PR. Rejected: Matching the UDS failure message literal | preserves the brittle coupling CodeRabbit flagged. Rejected: Asserting only that mailbox writes throw | would allow unrelated pre-path failures to pass. Confidence: high Scope-risk: narrow Directive: Keep UDS connection-failure tests on structured error data, not display wording. Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/teammateMailbox.test.ts src/utils/__tests__/udsMessaging.test.ts Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Not-tested: GitHub-hosted CodeRabbit refresh until pushed. * test: remove brittle review follow-up assumptions CodeRabbit's second pass found two valid brittleness issues and one suggested callback-reference assertion that would not match production behavior. This keeps the production behavior unchanged: timers still schedule the summarizer closure, tests now assert timer-handle identity, and UDS connection errors use native Error.cause instead of shadowing it. Constraint: Do not manufacture behavior just to satisfy a review hint; assertions must match the real AgentSummary scheduling contract. Rejected: Assert a fresh scheduled callback function | scheduleNext intentionally passes the same runSummary closure each time. Rejected: Store a custom cause field on UdsPeerConnectionError | native Error.cause is available under ESNext/Bun. Confidence: high Scope-risk: narrow Directive: Timer tests should assert returned handle identity for ownership, not incidental numeric values. Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Not-tested: GitHub-hosted CodeRabbit refresh until pushed. * test: enforce structured UDS timeout failures CodeRabbit's follow-up surfaced a real consistency gap: UDS send socket errors used UdsPeerConnectionError while response timeouts still rejected a generic Error. Timeouts now use the same structured peer failure contract, and the test exercises that path through a short explicit timeout instead of waiting for the production default. The AgentSummary unchanged-fingerprint test now also asserts that the second unchanged tick does not log errors, preserving the existing behavior checks without changing production scheduling semantics. Constraint: Keep the production timeout default at 5000ms while allowing tests to exercise the timeout path quickly. Rejected: Leave timeout failures as generic Error | callers would need separate handling for the same peer connection failure class. Confidence: high Scope-risk: narrow Directive: Keep UDS send timeout and socket-error branches on the same structured error contract. Tested: bun test src/services/AgentSummary/__tests__/agentSummary.test.ts src/utils/__tests__/udsMessaging.test.ts Tested: bunx tsc --noEmit --pretty false Tested: bun run lint Tested: bun run test:all Tested: bun test --coverage --coverage-reporter lcov --coverage-dir coverage Tested: bun run build Tested: bun run build:vite Not-tested: GitHub-hosted CodeRabbit refresh until pushed. --------- Co-authored-by: unraid <local@unraid.local>
1 parent a65df4a commit b47731a

6 files changed

Lines changed: 323 additions & 54 deletions

File tree

src/services/AgentSummary/__tests__/agentSummary.test.ts

Lines changed: 123 additions & 47 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,10 @@ import type {
55
CacheSafeParams,
66
ForkedAgentResult,
77
} from '../../../utils/forkedAgent.js'
8-
import { startAgentSummarization } from '../agentSummary.js'
8+
import {
9+
type AgentSummaryDependencies,
10+
startAgentSummarization,
11+
} from '../agentSummary.js'
912

1013
const transcriptMessages = [
1114
{ type: 'user', message: { content: 'start' }, uuid: 'u1' },
@@ -27,17 +30,16 @@ describe('startAgentSummarization', () => {
2730
let forkCalls: ForkCall[]
2831
let updateCalls: Array<{ taskId: string; summary: string }>
2932
let transcriptMessagesForTest: Message[]
33+
let debugLogs: string[]
34+
let loggedErrors: Error[]
35+
let clearedHandles: unknown[]
36+
let scheduledCount: number
37+
let lastTimerHandle: unknown
3038

31-
beforeEach(() => {
32-
forkCalls = []
33-
updateCalls = []
34-
scheduled = undefined
35-
handle = undefined
36-
transcriptMessagesForTest = transcriptMessages
37-
})
38-
39-
test('summarizes bounded transcript once and skips unchanged fingerprints', async () => {
40-
handle = startAgentSummarization(
39+
function startTestSummarization(
40+
dependencies: AgentSummaryDependencies = {},
41+
): { stop: () => void } {
42+
return startAgentSummarization(
4143
'task-1',
4244
asAgentId('a0000000000000000'),
4345
{
@@ -48,14 +50,22 @@ describe('startAgentSummarization', () => {
4850
} as unknown as CacheSafeParams,
4951
() => undefined,
5052
{
51-
clearTimeout: () => undefined,
53+
clearTimeout: ((timeoutId: unknown) => {
54+
clearedHandles.push(timeoutId)
55+
}) as typeof clearTimeout,
5256
getAgentTranscript: async () => ({
5357
messages: transcriptMessagesForTest,
5458
contentReplacements: [],
5559
}),
5660
isPoorModeActive: () => false,
57-
logError: () => undefined,
58-
logForDebugging: () => undefined,
61+
logError: error => {
62+
loggedErrors.push(
63+
error instanceof Error ? error : new Error(String(error)),
64+
)
65+
},
66+
logForDebugging: message => {
67+
debugLogs.push(message)
68+
},
5969
runForkedAgent: async (args: ForkCall) => {
6070
forkCalls.push(args)
6171
return {
@@ -73,14 +83,34 @@ describe('startAgentSummarization', () => {
7383
if (typeof callback !== 'function') {
7484
throw new Error('Expected timer callback')
7585
}
86+
scheduledCount += 1
7687
scheduled = callback as () => void | Promise<void>
77-
return 1 as unknown as ReturnType<typeof setTimeout>
88+
lastTimerHandle = { id: scheduledCount }
89+
return lastTimerHandle as ReturnType<typeof setTimeout>
7890
}) as unknown as typeof setTimeout,
7991
updateAgentSummary: (taskId: string, summary: string) => {
8092
updateCalls.push({ taskId, summary })
8193
},
94+
...dependencies,
8295
},
8396
)
97+
}
98+
99+
beforeEach(() => {
100+
forkCalls = []
101+
updateCalls = []
102+
scheduled = undefined
103+
handle = undefined
104+
transcriptMessagesForTest = transcriptMessages
105+
debugLogs = []
106+
loggedErrors = []
107+
clearedHandles = []
108+
scheduledCount = 0
109+
lastTimerHandle = undefined
110+
})
111+
112+
test('summarizes bounded transcript once and skips unchanged fingerprints', async () => {
113+
handle = startTestSummarization()
84114

85115
expect(typeof scheduled).toBe('function')
86116
await scheduled!()
@@ -104,49 +134,95 @@ describe('startAgentSummarization', () => {
104134

105135
expect(forkCalls).toHaveLength(1)
106136
expect(updateCalls).toHaveLength(1)
137+
expect(loggedErrors).toEqual([])
107138
})
108139

109-
test('skips summarization when bounded context is too small', async () => {
110-
transcriptMessagesForTest = transcriptMessages.slice(0, 2)
111-
112-
handle = startAgentSummarization(
113-
'task-1',
114-
asAgentId('a0000000000000000'),
140+
test('skips summarization when filtering leaves too little bounded context', async () => {
141+
transcriptMessagesForTest = [
142+
{ type: 'user', message: { content: 'start' }, uuid: 'u1' },
115143
{
116-
forkContextMessages: transcriptMessages,
117-
model: 'claude-test',
118-
} as unknown as CacheSafeParams,
119-
() => undefined,
120-
{
121-
clearTimeout: () => undefined,
122-
getAgentTranscript: async () => ({
123-
messages: transcriptMessagesForTest,
124-
contentReplacements: [],
125-
}),
126-
isPoorModeActive: () => false,
127-
logError: () => undefined,
128-
logForDebugging: () => undefined,
129-
runForkedAgent: async (args: ForkCall) => {
130-
forkCalls.push(args)
131-
return { messages: [] } as unknown as ForkedAgentResult
132-
},
133-
setTimeout: ((callback: TimerHandler) => {
134-
if (typeof callback !== 'function') {
135-
throw new Error('Expected timer callback')
136-
}
137-
scheduled = callback as () => void | Promise<void>
138-
return 1 as unknown as ReturnType<typeof setTimeout>
139-
}) as unknown as typeof setTimeout,
140-
updateAgentSummary: (taskId: string, summary: string) => {
141-
updateCalls.push({ taskId, summary })
144+
type: 'assistant',
145+
uuid: 'a1',
146+
message: {
147+
content: [{ type: 'tool_use', id: 'missing', name: 'Read' }],
142148
},
143149
},
150+
{ type: 'user', message: { content: 'continue' }, uuid: 'u2' },
151+
] as unknown as Message[]
152+
153+
handle = startTestSummarization()
154+
155+
expect(typeof scheduled).toBe('function')
156+
await scheduled!()
157+
158+
expect(forkCalls).toEqual([])
159+
expect(updateCalls).toEqual([])
160+
expect(debugLogs).toContain(
161+
'[AgentSummary] Skipping summary for task-1: no bounded context available',
144162
)
163+
})
164+
165+
test('skips summarization before building context when transcript is too short', async () => {
166+
transcriptMessagesForTest = transcriptMessages.slice(0, 2)
167+
handle = startTestSummarization()
145168

146169
expect(typeof scheduled).toBe('function')
147170
await scheduled!()
148171

149172
expect(forkCalls).toEqual([])
150173
expect(updateCalls).toEqual([])
174+
expect(debugLogs).toContain(
175+
'[AgentSummary] Skipping summary for task-1: not enough messages (2)',
176+
)
177+
})
178+
179+
test('skips and reschedules while poor mode is active', async () => {
180+
handle = startTestSummarization({
181+
isPoorModeActive: () => true,
182+
})
183+
184+
expect(typeof scheduled).toBe('function')
185+
const initialScheduledCount = scheduledCount
186+
const initialTimerHandle = lastTimerHandle
187+
await scheduled!()
188+
189+
expect(forkCalls).toEqual([])
190+
expect(updateCalls).toEqual([])
191+
expect(debugLogs).toContain(
192+
'[AgentSummary] Skipping summary — poor mode active',
193+
)
194+
expect(scheduledCount).toBe(initialScheduledCount + 1)
195+
expect(lastTimerHandle).not.toBe(initialTimerHandle)
196+
})
197+
198+
test('logs summary errors and schedules the next timer', async () => {
199+
const error = new Error('fork failed')
200+
handle = startTestSummarization({
201+
runForkedAgent: async () => {
202+
throw error
203+
},
204+
})
205+
206+
expect(typeof scheduled).toBe('function')
207+
const initialScheduledCount = scheduledCount
208+
const initialTimerHandle = lastTimerHandle
209+
await scheduled!()
210+
211+
expect(loggedErrors).toEqual([error])
212+
expect(updateCalls).toEqual([])
213+
expect(scheduledCount).toBe(initialScheduledCount + 1)
214+
expect(lastTimerHandle).not.toBe(initialTimerHandle)
215+
})
216+
217+
test('stop clears the pending summary timer', () => {
218+
handle = startTestSummarization()
219+
const pendingHandle = lastTimerHandle
220+
221+
handle.stop()
222+
223+
expect(debugLogs).toContain(
224+
'[AgentSummary] Stopping summarization for task-1',
225+
)
226+
expect(clearedHandles).toEqual([pendingHandle])
151227
})
152228
})

src/services/AgentSummary/__tests__/summaryContext.test.ts

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -141,6 +141,13 @@ describe('getSummaryContextFingerprint', () => {
141141
expect(estimateMessageChars(message)).toBeGreaterThan(0)
142142
})
143143

144+
test('treats unsupported top-level primitives as zero-size estimates', () => {
145+
expect(
146+
estimateMessageChars((() => undefined) as unknown as Message),
147+
).toBe(0)
148+
expect(estimateMessageChars(1n as unknown as Message)).toBe(0)
149+
})
150+
144151
test('returns null for an empty transcript', () => {
145152
expect(getSummaryContextFingerprint([])).toBeNull()
146153
})

src/utils/__tests__/teammateMailbox.test.ts

Lines changed: 39 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,10 @@
11
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
2-
import { mkdir, readFile, rm, writeFile } from 'node:fs/promises'
2+
import { mkdir, readFile, rm, stat, writeFile } from 'node:fs/promises'
33
import { mkdtempSync } from 'node:fs'
44
import { tmpdir } from 'node:os'
55
import { dirname, join } from 'node:path'
66
import type { Message } from 'src/types/message.js'
7+
import { getErrnoCode } from 'src/utils/errors.js'
78
import {
89
compactMailboxMessages,
910
getLastPeerDmSummary,
@@ -171,6 +172,17 @@ describe('compactMailboxMessages', () => {
171172

172173
expect(compacted).toEqual([])
173174
})
175+
176+
test('returns an empty mailbox when all retention lanes are disabled', () => {
177+
const compacted = compactMailboxMessages([message('unread', false)], {
178+
maxMessages: 0,
179+
maxReadMessages: 0,
180+
maxUnreadProtocolMessages: 0,
181+
maxRetainedBytes: 1_000,
182+
})
183+
184+
expect(compacted).toEqual([])
185+
})
174186
})
175187

176188
describe('teammate mailbox retention', () => {
@@ -331,6 +343,32 @@ describe('teammate mailbox retention', () => {
331343
expect(await readFile(inboxPath, 'utf-8')).toBe('{not-json')
332344
})
333345

346+
test('writeToMailbox rejects when the inbox path is already a directory', async () => {
347+
const inboxPath = getInboxPath('worker', 'alpha')
348+
await mkdir(inboxPath, { recursive: true })
349+
350+
const error = await writeToMailbox(
351+
'worker',
352+
{
353+
from: 'team-lead',
354+
text: 'new',
355+
timestamp: new Date(5).toISOString(),
356+
},
357+
'alpha',
358+
).then(
359+
() => undefined,
360+
err => err,
361+
)
362+
363+
const code = getErrnoCode(error)
364+
expect(code).toBeDefined()
365+
if (code === undefined) {
366+
throw new Error('Expected filesystem errno code')
367+
}
368+
expect(['EISDIR', 'EPERM', 'EACCES']).toContain(code)
369+
expect((await stat(inboxPath)).isDirectory()).toBe(true)
370+
})
371+
334372
test('readMailbox fails closed on corrupt mailbox content', async () => {
335373
const inboxPath = getInboxPath('worker', 'alpha')
336374
await mkdir(dirname(inboxPath), { recursive: true })

0 commit comments

Comments
 (0)