Commit 86a8929

feat: extend chaos testing to proxy/record mode
Pre-flight chaos (drop/disconnect) prevents upstream contact. Post-response chaos (malformed) corrupts relay after recording. SSE bypass tracked via aimock_chaos_bypassed_total metric. Explicit source label (fixture/proxy) on all chaos metrics. All handlers migrated to new applyChaos signature.
1 parent 6d09323 commit 86a8929

26 files changed

Lines changed: 929 additions & 184 deletions
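
The ordering described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not aimock's implementation: `rollChaos`, `planProxyRequest`, `ChaosRates`, and `ProxyPlan` are hypothetical names, and the cumulative-threshold roll is only one assumed way to guarantee that a single action fires per request. The real `applyChaos` signature is not shown in this diff.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not aimock's API.
type ChaosAction = "drop" | "disconnect" | "malformed";
type ChaosRates = Partial<Record<ChaosAction, number>>;

// One random draw per request, compared against cumulative thresholds,
// so at most one action is selected.
function rollChaos(rates: ChaosRates, rand: () => number = Math.random): ChaosAction | null {
  const draw = rand();
  let threshold = 0;
  for (const action of ["drop", "disconnect", "malformed"] as const) {
    threshold += rates[action] ?? 0;
    if (draw < threshold) return action;
  }
  return null;
}

interface ProxyPlan {
  contactUpstream: boolean; // false for pre-flight actions
  clientSees: string;
  bypassed: boolean; // true when the rolled action cannot be applied
}

// Pre-flight actions (drop, disconnect) short-circuit before upstream is
// contacted; malformed is post-response and is bypassed for SSE responses
// because the bytes have already been streamed to the client.
function planProxyRequest(action: ChaosAction | null, upstreamIsSse: boolean): ProxyPlan {
  switch (action) {
    case "drop":
      return { contactUpstream: false, clientSees: "HTTP 500 chaos body", bypassed: false };
    case "disconnect":
      return { contactUpstream: false, clientSees: "connection destroyed", bypassed: false };
    case "malformed":
      return upstreamIsSse
        ? { contactUpstream: true, clientSees: "real streamed response", bypassed: true }
        : { contactUpstream: true, clientSees: "invalid JSON body", bypassed: false };
    default:
      return { contactUpstream: true, clientSees: "real upstream response", bypassed: false };
  }
}
```

Passing a fixed `rand` (for example `() => 0.3`) makes the roll deterministic, which is convenient when unit-testing the dispatch logic separately from the randomness.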

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 # @copilotkit/aimock

+## [Unreleased]
+
+### Added
+
+- **Chaos testing in proxy mode** — Pre-flight chaos (drop/disconnect) prevents upstream contact; post-response chaos (malformed) corrupts relay body after recording the real upstream response. SSE bypass tracked via `aimock_chaos_bypassed_total` metric. Explicit `source` label (`fixture`/`proxy`/`internal`) on all chaos Prometheus counters and journal entries.
+
 ## [1.17.0] - 2026-05-04

 ### Added

docs/chaos-testing/index.html

Lines changed: 76 additions & 7 deletions
@@ -70,7 +70,10 @@ <h2>Failure Modes</h2>
 <td>HTTP 500</td>
 <td>
 Returns a 500 error with
-<code>{"error":{"message":"Chaos: request dropped","code":"chaos_drop"}}</code>
+<code
+>{"error":{"message":"Chaos: request
+dropped","type":"server_error","code":"chaos_drop"}}</code
+>
 </td>
 </tr>
 <tr>
@@ -205,7 +208,7 @@ <h2>Per-Request Headers</h2>
 <span class="str">"Content-Type"</span>: <span class="str">"application/json"</span>,
 <span class="str">"x-aimock-chaos-disconnect"</span>: <span class="str">"1.0"</span>,
 },
-<span class="prop">body</span>: <span class="fn">JSON.stringify</span>({ <span class="prop">model</span>: <span class="str">"gpt-4"</span>, <span class="prop">messages</span>: [...] }),
+<span class="prop">body</span>: <span class="fn">JSON.stringify</span>({ <span class="prop">model</span>: <span class="str">"gpt-4"</span>, <span class="prop">messages</span>: [{ <span class="prop">role</span>: <span class="str">"user"</span>, <span class="prop">content</span>: <span class="str">"hello"</span> }] }),
 });</code></pre>
 </div>

@@ -218,7 +221,7 @@ <h2>CLI Flags</h2>
 <div class="code-block-header">
 CLI chaos flags <span class="lang-tag">shell</span>
 </div>
-<pre><code>$ npx -p @copilotkit/aimock llmock --fixtures ./fixtures \
+<pre><code>$ npx -p @copilotkit/aimock aimock --fixtures ./fixtures \
 --chaos-drop 0.1 \
 --chaos-malformed 0.05 \
 --chaos-disconnect 0.02</code></pre>
@@ -240,6 +243,55 @@ <h2>CLI Flags</h2>
 </div>
 </div>

+<h2>Proxy Mode</h2>
+<p>
+When aimock is configured as a record/replay proxy (<code>--record</code>), chaos applies
+to proxied requests too &mdash; so a staging environment pointed at real upstream APIs
+still sees the failure modes your tests expect. Chaos is rolled <em>once per request</em>,
+after fixture matching, with the same headers&nbsp;&gt;&nbsp;fixture&nbsp;&gt;&nbsp;server
+precedence.
+</p>
+<table class="endpoint-table">
+<thead>
+<tr>
+<th>Mode</th>
+<th>When upstream is contacted</th>
+<th>What the client sees</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td><code>drop</code></td>
+<td>Never &mdash; upstream not contacted</td>
+<td>HTTP 500 chaos body; upstream is not called</td>
+</tr>
+<tr>
+<td><code>disconnect</code></td>
+<td>Never &mdash; upstream not contacted</td>
+<td>Connection destroyed; upstream is not called</td>
+</tr>
+<tr>
+<td><code>malformed</code></td>
+<td>Called &mdash; post-response</td>
+<td>
+Request proxies normally; the upstream response is captured, then the body is
+replaced with invalid JSON before relay. The recorded fixture (if recording) keeps
+the real upstream response &mdash; chaos is a live-traffic decoration, not a fixture
+mutation.
+</td>
+</tr>
+</tbody>
+</table>
+<p>
+<strong>SSE bypass.</strong> If upstream returns
+<code>Content-Type: text/event-stream</code>, aimock streams chunks to the client
+progressively. By the time <code>malformed</code> would fire, the bytes are already on the
+wire &mdash; the chaos action cannot be applied. This bypass is observable via the
+<code>aimock_chaos_bypassed_total</code> counter (see Prometheus Metrics below) and a
+warning in the server log, so a configured chaos rate doesn't silently drop to 0% on SSE
+traffic. Streaming mutation is planned for a future phase.
+</p>
+
 <h2>Journal Tracking</h2>
 <p>
 When chaos triggers, the journal entry includes a <code>chaosAction</code> field recording
@@ -255,6 +307,7 @@ <h2>Journal Tracking</h2>
 "path": "/v1/chat/completions",
 "response": {
 "status": 500,
+"source": "fixture",
 "fixture": { "...": "elided for brevity" },
 "chaosAction": "drop"
 }
@@ -269,15 +322,31 @@ <h2>Journal Tracking</h2>
 <h2>Prometheus Metrics</h2>
 <p>
 When metrics are enabled (<code>--metrics</code>), each chaos trigger increments the
-<code>aimock_chaos_triggered_total</code> counter with an <code>action</code> label:
+<code>aimock_chaos_triggered_total</code> counter, tagged with <code>action</code> and
+<code>source</code>. <code>source="fixture"</code> means a fixture matched (or would have,
+before chaos intervened); <code>source="proxy"</code> means the request was on the proxy
+dispatch path.
 </p>

 <div class="code-block">
 <div class="code-block-header">Metrics output <span class="lang-tag">text</span></div>
 <pre><code># TYPE aimock_chaos_triggered_total counter
-aimock_chaos_triggered_total{action="drop"} 3
-aimock_chaos_triggered_total{action="malformed"} 1
-aimock_chaos_triggered_total{action="disconnect"} 2</code></pre>
+aimock_chaos_triggered_total{action="drop",source="fixture"} 3
+aimock_chaos_triggered_total{action="malformed",source="fixture"} 1
+aimock_chaos_triggered_total{action="disconnect",source="proxy"} 2</code></pre>
+</div>
+
+<p>
+When a chaos action is rolled but can't be applied &mdash; today, only
+<code>malformed</code> on an SSE proxy response &mdash; the bypass is recorded in a
+separate counter so operators can distinguish "chaos didn't roll" from "chaos rolled but
+was bypassed":
+</p>
+
+<div class="code-block">
+<div class="code-block-header">Bypass counter <span class="lang-tag">text</span></div>
+<pre><code># TYPE aimock_chaos_bypassed_total counter
+aimock_chaos_bypassed_total{action="malformed",source="proxy",reason="sse_streamed"} 4</code></pre>
 </div>
 </main>
 <aside class="page-toc" id="page-toc"></aside>
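
To make the "chaos didn't roll" versus "chaos rolled but was bypassed" distinction concrete, here is a minimal sketch of how an operator script might read the two counters documented in the diff above. `parseCounter`, `sumWhere`, and `Sample` are hypothetical helpers written for this illustration; they are not part of aimock. The parser is deliberately simplified: it assumes every sample carries a label set and that label values contain no escaped quotes or `}` characters.

```typescript
// Hypothetical helpers for reading Prometheus text exposition output.
interface Sample {
  labels: Record<string, string>;
  value: number;
}

// Collect all samples of one metric, skipping comments and other metrics.
function parseCounter(text: string, metric: string): Sample[] {
  const samples: Sample[] = [];
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const open = line.indexOf("{");
    const close = line.lastIndexOf("}");
    if (open === -1 || close < open) continue;
    if (line.slice(0, open) !== metric) continue;

    const labels: Record<string, string> = {};
    const labelText = line.slice(open + 1, close);
    const labelPattern = /(\w+)="([^"]*)"/g;
    let m: RegExpExecArray | null;
    while ((m = labelPattern.exec(labelText)) !== null) {
      labels[m[1]] = m[2];
    }
    samples.push({ labels, value: Number(line.slice(close + 1).trim()) });
  }
  return samples;
}

// Sum the samples whose labels include every key/value pair in `want`.
function sumWhere(samples: Sample[], want: Record<string, string>): number {
  return samples
    .filter((s) => Object.keys(want).every((k) => s.labels[k] === want[k]))
    .reduce((acc, s) => acc + s.value, 0);
}

// The sample output shown in the documentation diff above.
const exposition = [
  "# TYPE aimock_chaos_triggered_total counter",
  'aimock_chaos_triggered_total{action="drop",source="fixture"} 3',
  'aimock_chaos_triggered_total{action="malformed",source="fixture"} 1',
  'aimock_chaos_triggered_total{action="disconnect",source="proxy"} 2',
  "# TYPE aimock_chaos_bypassed_total counter",
  'aimock_chaos_bypassed_total{action="malformed",source="proxy",reason="sse_streamed"} 4',
].join("\n");

const proxyApplied = sumWhere(parseCounter(exposition, "aimock_chaos_triggered_total"), {
  source: "proxy",
});
const proxyBypassed = sumWhere(parseCounter(exposition, "aimock_chaos_bypassed_total"), {
  source: "proxy",
});
console.log({ proxyApplied, proxyBypassed });
```

For anything beyond a quick check, a full Prometheus client or a PromQL query is likely a better fit than hand-rolled parsing.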
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
+import { describe, it, expect, afterEach } from "vitest";
+import { createServer, type ServerInstance } from "../server.js";
+
+// minimal helpers duplicated to keep this test isolated
+import * as http from "node:http";
+
+function post(url: string, body: unknown): Promise<{ status: number; body: string }> {
+  return new Promise((resolve, reject) => {
+    const data = JSON.stringify(body);
+    const parsed = new URL(url);
+    const req = http.request(
+      {
+        hostname: parsed.hostname,
+        port: parsed.port,
+        path: parsed.pathname,
+        method: "POST",
+        headers: {
+          "Content-Type": "application/json",
+          "Content-Length": Buffer.byteLength(data),
+        },
+      },
+      (res) => {
+        const chunks: Buffer[] = [];
+        res.on("data", (c: Buffer) => chunks.push(c));
+        res.on("end", () => {
+          resolve({ status: res.statusCode ?? 0, body: Buffer.concat(chunks).toString() });
+        });
+      },
+    );
+    req.on("error", reject);
+    req.write(data);
+    req.end();
+  });
+}
+
+let server: ServerInstance | undefined;
+
+afterEach(async () => {
+  if (server) {
+    await new Promise<void>((resolve) => server!.server.close(() => resolve()));
+    server = undefined;
+  }
+});
+
+const CHAT_REQUEST = {
+  model: "gpt-4",
+  messages: [{ role: "user", content: "What is the capital of France?" }],
+};
+
+describe("chaos (fixture mode)", () => {
+  it("chaos short-circuits even when fixture would match", async () => {
+    const fixture = {
+      match: { userMessage: "capital of France" },
+      response: { content: "Paris" },
+    };
+
+    server = await createServer([fixture], {
+      port: 0,
+      chaos: { dropRate: 1.0 },
+    });
+
+    const resp = await post(`${server.url}/v1/chat/completions`, CHAT_REQUEST);
+
+    expect(resp.status).toBe(500);
+    const body = JSON.parse(resp.body);
+    expect(body).toMatchObject({ error: { code: "chaos_drop" } });
+  });
+
+  it("rolls chaos once per request: drop journals the matched fixture, not null", async () => {
+    // Pins the single-roll behavior: chaos evaluation happens AFTER fixture
+    // matching, so when drop fires on a request that matches a fixture, the
+    // journal entry reflects the match (not null, as the old double-roll
+    // pre-flight path would have recorded).
+    const fixture = {
+      match: { userMessage: "capital of France" },
+      response: { content: "Paris" },
+    };
+
+    server = await createServer([fixture], {
+      port: 0,
+      chaos: { dropRate: 1.0 },
+    });
+
+    const resp = await post(`${server.url}/v1/chat/completions`, CHAT_REQUEST);
+    expect(resp.status).toBe(500);
+
+    const last = server.journal.getLast();
+    expect(last?.response.chaosAction).toBe("drop");
+    expect(last?.response.fixture).toBe(fixture);
+    // Match count reflects that the fixture did participate in the decision
+    expect(server.journal.getFixtureMatchCount(fixture)).toBe(1);
+  });
+
+  it("disconnect journals the matched fixture with status 0", async () => {
+    // Symmetric to the drop test above. Disconnect's status is 0 (no response
+    // ever written before res.destroy()) which is a slightly unusual shape;
+    // pin it so future refactors don't silently change it to e.g. 500.
+    const fixture = {
+      match: { userMessage: "capital of France" },
+      response: { content: "Paris" },
+    };
+
+    server = await createServer([fixture], {
+      port: 0,
+      chaos: { disconnectRate: 1.0 },
+    });
+
+    // Client sees a socket destroy mid-request → post() rejects
+    await expect(post(`${server.url}/v1/chat/completions`, CHAT_REQUEST)).rejects.toThrow();
+
+    const last = server.journal.getLast();
+    expect(last?.response.chaosAction).toBe("disconnect");
+    expect(last?.response.status).toBe(0);
+    expect(last?.response.fixture).toBe(fixture);
+    expect(server.journal.getFixtureMatchCount(fixture)).toBe(1);
+  });
+});

src/__tests__/metrics.test.ts

Lines changed: 41 additions & 3 deletions
@@ -549,7 +549,7 @@ describe("integration: /metrics endpoint", () => {
     expect(infMatch![1]).toBe(countMatch![1]);
   });

-  it("increments chaos counter when chaos triggers", async () => {
+  it("increments chaos counter when chaos triggers (fixture source)", async () => {
     const fixtures: Fixture[] = [
       {
         match: { userMessage: "hello" },
@@ -565,7 +565,43 @@ describe("integration: /metrics endpoint", () => {

     const res = await httpGet(`${instance.url}/metrics`);
     expect(res.body).toContain("aimock_chaos_triggered_total");
-    expect(res.body).toMatch(/aimock_chaos_triggered_total\{[^}]*action="drop"[^}]*\} 1/);
+    // Require both labels: action AND source. The source label is part of the
+    // public metric contract (added when chaos was extended to proxy mode) and
+    // an unasserted label is a regression hazard — future callers that forget
+    // to pass source would serialize `source=""` and pass a bare action match.
+    expect(res.body).toMatch(
+      /aimock_chaos_triggered_total\{[^}]*action="drop"[^}]*source="fixture"[^}]*\} 1/,
+    );
+  });
+
+  it('chaos counter carries source="proxy" on proxy path', async () => {
+    // Counterpart to the fixture-source test: proves the source label flips
+    // correctly when the chaos roll belongs to the proxy dispatch branch.
+    // Together these two tests pin both label values of the source dimension.
+    const upstream = await createServer(
+      [{ match: { userMessage: "hi" }, response: { content: "upstream" } }],
+      { port: 0 },
+    );
+    try {
+      instance = await createServer([], {
+        metrics: true,
+        chaos: { dropRate: 1.0 },
+        record: {
+          providers: { openai: upstream.url },
+          fixturePath: "/tmp/aimock-metrics-proxy-source",
+          proxyOnly: true,
+        },
+      });
+
+      await httpPost(`${instance.url}/v1/chat/completions`, chatRequest("hi"));
+
+      const res = await httpGet(`${instance.url}/metrics`);
+      expect(res.body).toMatch(
+        /aimock_chaos_triggered_total\{[^}]*action="drop"[^}]*source="proxy"[^}]*\} 1/,
+      );
+    } finally {
+      await new Promise<void>((resolve) => upstream.server.close(() => resolve()));
+    }
   });

   it("increments chaos counter on Anthropic /v1/messages endpoint", async () => {
@@ -588,7 +624,9 @@ describe("integration: /metrics endpoint", () => {

     const res = await httpGet(`${instance.url}/metrics`);
     expect(res.body).toContain("aimock_chaos_triggered_total");
-    expect(res.body).toMatch(/aimock_chaos_triggered_total\{[^}]*action="drop"[^}]*\} 1/);
+    expect(res.body).toMatch(
+      /aimock_chaos_triggered_total\{[^}]*action="drop"[^}]*source="fixture"[^}]*\} 1/,
+    );
   });

   it("tracks fixtures loaded gauge", async () => {
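
The regular expressions in the assertions above expect `action` to be serialized before `source`, and because they are not anchored at the end of the line they would also accept a value such as `10`. If either property matters, one possible alternative is a matcher built from lookaheads, sketched below. `sampleMatcher` and `escapeRegExp` are hypothetical helpers, not functions from the aimock test suite, and the sketch assumes label values contain no `}` characters.

```typescript
// Hypothetical helper: escape a string for literal use inside a RegExp.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a RegExp that matches one exposition line for `metric` carrying all
// of `labels` (in any order, extra labels allowed) with exactly `value`.
function sampleMatcher(metric: string, labels: Record<string, string>, value: number): RegExp {
  const lookaheads = Object.keys(labels)
    .map((k) => `(?=[^}]*\\b${escapeRegExp(k)}="${escapeRegExp(labels[k])}")`)
    .join("");
  return new RegExp(`^${escapeRegExp(metric)}\\{${lookaheads}[^}]*\\} ${value}$`, "m");
}

const dropFromFixture = sampleMatcher(
  "aimock_chaos_triggered_total",
  { action: "drop", source: "fixture" },
  1,
);
console.log(dropFromFixture.test('aimock_chaos_triggered_total{source="fixture",action="drop"} 1'));
```

In a Vitest test this could be used as `expect(res.body).toMatch(dropFromFixture)`. Whether the extra strictness is worth it depends on how stable the label ordering of the metrics serializer is expected to be.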

src/__tests__/multimedia-record.test.ts

Lines changed: 2 additions & 2 deletions
@@ -105,7 +105,7 @@ describe("multimedia record: image response detection", () => {
     };

     const { req, res } = createMockReqRes("/v1/images/generations");
-    const proxied = await proxyAndRecord(
+    const outcome = await proxyAndRecord(
       req,
       res,
       request,
@@ -115,7 +115,7 @@ describe("multimedia record: image response detection", () => {
       { record, logger },
     );

-    expect(proxied).toBe(true);
+    expect(outcome).toBe("relayed");
     expect(fixtures).toHaveLength(1);
     const fixture = fixtures[0];
     expect(fixture.match.endpoint).toBe("image");
