mock-server
diff --git a/changelog.md b/changelog.md
Lines changed: 1 addition & 0 deletions
diff --git a/docs/code/llm-mocking.md b/docs/code/llm-mocking.md
Lines changed: 12 additions & 0 deletions
diff --git a/docs/plans/mockserver-llm-mocking.md b/docs/plans/mockserver-llm-mocking.md
Lines changed: 1 addition & 1 deletion
diff --git a/jekyll-www.mock-server.com/mock_server/ai_mcp_tools.html b/jekyll-www.mock-server.com/mock_server/ai_mcp_tools.html
Lines changed: 22 additions & 0 deletions
diff --git a/mockserver-ui/src/__tests__/conversationCodegen.test.ts b/mockserver-ui/src/__tests__/conversationCodegen.test.ts
Lines changed: 58 additions & 0 deletions
diff --git a/mockserver-ui/src/components/ConversationWizardStep2.tsx b/mockserver-ui/src/components/ConversationWizardStep2.tsx
Lines changed: 95 additions & 1 deletion
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
+- Added declarative **LLM fault/chaos profiles** for resilience testing, attachable to any mock LLM response (`mock_llm_completion`, each `create_llm_conversation` turn, the Java `LlmConversationBuilder`, and raw expectation JSON via a `chaos` block). Supports probabilistic provider errors (e.g. 429/529 with a `Retry-After` header), mid-stream truncation of an SSE stream (keep a leading fraction of events), and appending a malformed (broken-JSON) SSE chunk. Errors are deterministic at probability 0.0/1.0 and reproducible at fractional probabilities via a `seed`; truncation and malformed-SSE are always deterministic. A new `LLM_CHAOS_INJECTED_COUNT` metric tracks injections. The dashboard conversation wizard exposes the profile per turn. See the AI/MCP tools page and `docs/code/llm-mocking.md`.
 - Added two MCP tools for **agent-run analysis and tool-call assertions**, both backed by a new deterministic `org.mockserver.llm.analysis.AgentRunAnalyzer` that reconstructs an agent run by decoding the LLM requests MockServer recorded. `verify_tool_call` asserts that an agent called a named tool a given number of times (`atLeast`/`atMost`, with an optional regex over the tool-call arguments); `explain_agent_run` summarises the run's structure (message and assistant-turn counts, the ordered tool-call sequence, tool results, and the latest message role). Read-only and offline — no LLM call. See the AI/MCP tools page and `docs/code/llm-mocking.md`.
 - Added a **runtime-LLM client SPI** (`org.mockserver.llm.client`) that lets MockServer call a real LLM you already run, as the foundation for opt-in features such as drift detection and exploratory semantic matching. Mirrors the existing codec registry: an `LlmClient` per provider (Ollama, OpenAI, OpenAI Responses, Azure OpenAI, Anthropic, Gemini, Bedrock) registered in `LlmClientRegistry`, an immutable `LlmBackend` config (with the API key redacted in logs), and a three-layer `LlmBackendResolver` (provider env vars → `mockserver.llmProvider`/`llmApiKey`/`llmModel`/`llmBaseUrl` → named-backends JSON via `mockserver.llmBackendsConfig`). All runtime-LLM use goes through `LlmCompletionService`, which is **off unless a backend is configured**, **fails closed** on any timeout/error/non-2xx (never flipping a deterministic result), and caches per normalised prompt for reproducibility. Ollama is the reference backend (no key, local); Bedrock builds the Anthropic-on-Bedrock request and relies on the `headers` escape hatch pending automatic SigV4 signing. See the configuration properties page and `docs/code/llm-mocking.md`.
 - LLM conversation mocks can now opt into deterministic **prompt normalisation** before the `latestMessageContains` / `latestMessageMatches` predicates are evaluated, so a match is not blocked by cosmetic differences in dynamically-assembled agent prompts. A new `normalization` block on `conversationPredicates` (also exposed per-turn in the `create_llm_conversation` MCP tool and the dashboard conversation wizard) supports collapsing whitespace, lowercasing, sorting JSON object keys, dropping built-in volatile values (ISO-8601 timestamps, UUIDs, `req_`/`msg_`/`call_` ids), and dropping named JSON fields. Normalisation is pure and idempotent — it never makes a test flaky — and has no effect unless a text predicate is set. See the AI/MCP tools page and `docs/code/llm-mocking.md`.
 
@@ -167,6 +167,16 @@ Two MCP tools expose the LLM mocking feature to agents:
 
 The first two validate provider availability against `ProviderCodecRegistry` at registration time. The analysis tools delegate to `org.mockserver.llm.analysis.AgentRunAnalyzer`.
 
+## Fault / chaos injection
+
+`LlmChaosProfile` (`org.mockserver.model`) attaches a fault profile to any `HttpLlmResponse` for resilience testing. `HttpLlmResponseActionHandler` applies it in three ways:
+
+- **Probabilistic error** — `chaosErrorResponseOrNull(...)` returns an error `HttpResponse` (`errorStatus` + optional `Retry-After`) when triggered. An `errorStatus` with no `errorProbability` always fires; a fractional probability draws once (reproducible via `seed`). `HttpActionHandler` checks this first and, if present, returns the error on the normal (non-streaming) path — a provider error is a plain HTTP response, not an SSE stream, even for a would-be streaming completion.
+- **Mid-stream truncation** — `applyStreamingChaos(...)` keeps a leading `truncateAtFraction` of the SSE events (default 0.5) so the stream ends early.
+- **Malformed SSE** — appends a deliberately broken-JSON chunk so the client must handle a corrupt event.
+
+Truncation and malformed-SSE are fully deterministic. The error path is deterministic at probability 0.0/1.0 and reproducible at fractional probabilities when a `seed` is set. Each injection increments the `LLM_CHAOS_INJECTED_COUNT` metric. The profile round-trips as the top-level `chaos` field on `HttpLlmResponse` (alongside `completion`, `embedding`, and `conversationPredicates`) and is exposed per turn in the dashboard wizard and via the `chaos` MCP parameter.
+
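The three behaviours in the Fault / chaos injection section can be illustrated with a small TypeScript sketch. It is not the Java implementation: the names `seededRandom`, `shouldInjectError`, and `truncateAndCorrupt`, the Park-Miller generator, the rounding-down rule, and the exact malformed payload are all assumptions made for illustration.

```typescript
// Illustrative sketch only: names and rounding rules here are assumptions,
// not MockServer's actual Java implementation.
type TruncateMode = "NONE" | "MID_STREAM";

interface ChaosProfile {
  errorStatus?: number;
  retryAfter?: string;
  errorProbability?: number;
  truncateMode?: TruncateMode;
  truncateAtFraction?: number;
  malformedSse?: boolean;
  seed?: number;
}

// Park-Miller generator, used here as a stand-in for a seeded random source.
function seededRandom(seed: number): () => number {
  let state = (Math.abs(Math.floor(seed)) % 2147483646) + 1;
  return () => {
    state = (state * 48271) % 2147483647;
    return (state - 1) / 2147483646; // value in [0, 1)
  };
}

// Decide whether to return the error instead of the normal response.
function shouldInjectError(chaos: ChaosProfile, random: () => number): boolean {
  if (chaos.errorStatus === undefined) return false; // assumption: no status, no error
  const p = chaos.errorProbability === undefined ? 1.0 : chaos.errorProbability;
  if (p <= 0) return false; // deterministic: never fires
  if (p >= 1) return true; // deterministic: always fires
  return random() < p; // single draw for a fractional probability
}

// Keep a leading fraction of SSE events, then optionally append a corrupt chunk.
function truncateAndCorrupt(events: string[], chaos: ChaosProfile): string[] {
  let out = events.slice();
  if (chaos.truncateMode === "MID_STREAM") {
    const raw = chaos.truncateAtFraction === undefined ? 0.5 : chaos.truncateAtFraction;
    const fraction = Math.min(1, Math.max(0, raw));
    out = out.slice(0, Math.floor(out.length * fraction)); // rounding down is an assumption
  }
  if (chaos.malformedSse) {
    out.push('data: {"broken": \n\n'); // payload is deliberately invalid JSON
  }
  return out;
}
```

Two generators created with the same seed produce the same sequence of draws, which is the property that makes a fractional `errorProbability` reproducible in this sketch.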
 ## Agent-run analysis
 
 `AgentRunAnalyzer` (`org.mockserver.llm.analysis`) is a deterministic, read-only inspector. Given the LLM requests MockServer recorded (retrieved via the normal request log), it decodes each with the provider's `ProviderCodec` and treats the **richest** conversation (most messages — the latest dialogue snapshot) as the canonical run. From that it derives:
@@ -346,3 +356,5 @@ Key source files under `mockserver/mockserver-core/src/main/java/org/mockserver/
 | `llm/client/LlmCompletionService.java` | Orchestrator: off-unless-configured, fail-closed, cached |
 | `llm/client/LlmTransport.java` + `NettyHttpClientLlmTransport.java` | Transport seam over `NettyHttpClient` |
 | `llm/analysis/AgentRunAnalyzer.java` | Deterministic read-only agent-run inspection (tool-call counts, run summary) |
+| `model/LlmChaosProfile.java` | Fault/chaos profile carried on `HttpLlmResponse` |
+| `mock/action/http/HttpLlmResponseActionHandler.java` | Encodes LLM responses and applies chaos (error / truncation / malformed SSE) |
@@ -23,7 +23,7 @@ The original RFC (RFC-1 LLM Response Builder + RFC-2 Stateful Scripted Conversat
 | # | Item | Status |
 |---|---|---|
 | 5 | Token/cost analytics + budget assertions | ✅ Shipped (U3 — token/cost rollup tile + session inspector) |
-| 6 | LLM fault/chaos profiles (429/529 + Retry-After, mid-stream truncation, malformed SSE, probabilistic error rates) | ❌ Not started (was U6, ~8–12 days) |
+| 6 | LLM fault/chaos profiles (429/529 + Retry-After, mid-stream truncation, malformed SSE, probabilistic error rates) | ✅ Shipped — `LlmChaosProfile` on `HttpLlmResponse`, applied in `HttpLlmResponseActionHandler` (+ dispatcher); MCP `chaos` on `mock_llm_completion` and per conversation turn; dashboard wizard control; `LLM_CHAOS_INJECTED_COUNT` metric |
 | 7 | VCR mode + strict mode + body redaction + field normalisation | 🟡 Partial — cassette manager shipped in U4; strict-mode, body redaction, and field normalisation still open |
 
 ### Tier 3 — valuable / specialised
 
@@ -907,9 +907,29 @@ <h3>mock_llm_completion</h3>
         <tr><td><code>stopReason</code></td><td>string</td><td>No</td><td>Stop reason to encode in the provider format (e.g. <code>end_turn</code>, <code>tool_use</code>, <code>stop</code>)</td></tr>
         <tr><td><code>usage</code></td><td>object</td><td>No</td><td>Token usage. Accepts <code>inputTokens</code> (integer) and <code>outputTokens</code> (integer).</td></tr>
         <tr><td><code>streaming</code></td><td>boolean</td><td>No</td><td>When <code>true</code>, the response is delivered as a Server-Sent Events stream. Defaults to <code>false</code>.</td></tr>
+        <tr><td><code>chaos</code></td><td>object</td><td>No</td><td>Optional fault/chaos profile for resilience testing (see the table below). Also accepted per turn in <a href="#create_llm_conversation"><code>create_llm_conversation</code></a>.</td></tr>
     </tbody>
 </table>
 
+<p><strong><code>chaos</code> fields</strong> (all optional):</p>
+
+<table>
+    <thead>
+        <tr><th>Field</th><th>Type</th><th>Description</th></tr>
+    </thead>
+    <tbody>
+        <tr><td><code>errorStatus</code></td><td>integer</td><td>HTTP error status to return instead of a normal response (e.g. <code>429</code>, <code>529</code>). Fires every time unless <code>errorProbability</code> is set. A provider error is returned as a normal HTTP response even for a streaming completion.</td></tr>
+        <tr><td><code>retryAfter</code></td><td>string</td><td>Value for the <code>Retry-After</code> header on an injected error (e.g. <code>"30"</code>).</td></tr>
+        <tr><td><code>errorProbability</code></td><td>number</td><td>Probability 0.0&ndash;1.0 of injecting the error. <code>1.0</code> (or omitted with <code>errorStatus</code> set) always fires; <code>0.0</code> never does. Fractional values are non-deterministic unless <code>seed</code> is set.</td></tr>
+        <tr><td><code>truncateMode</code></td><td>string</td><td><code>NONE</code> or <code>MID_STREAM</code>. <code>MID_STREAM</code> truncates a streaming response after a leading fraction of events.</td></tr>
+        <tr><td><code>truncateAtFraction</code></td><td>number</td><td>Fraction 0.0&ndash;1.0 of SSE events to keep before truncating (default <code>0.5</code>).</td></tr>
+        <tr><td><code>malformedSse</code></td><td>boolean</td><td>Append a malformed (broken-JSON) SSE chunk so the client must handle a corrupt event.</td></tr>
+        <tr><td><code>seed</code></td><td>integer</td><td>Makes a fractional <code>errorProbability</code> reproducible.</td></tr>
+    </tbody>
+</table>
+
+<p>Chaos is deterministic for truncation, malformed SSE, and an <code>errorProbability</code> of 0.0 or 1.0, so these settings are safe for repeatable tests. Use a fractional probability only when you want intermittent failures, and set a <code>seed</code> if the sequence of failures must be reproducible.</p>
+
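The field table above can be turned into a client-side pre-flight check before a `chaos` object is sent. The sketch below is a hypothetical helper (`validateChaos` is not a MockServer API), and the rules it enforces are only the types and ranges stated in the table. MockServer's own validation may be stricter or more lenient.

```typescript
// Hypothetical pre-flight check built from the documented field table;
// MockServer's own validation rules may differ.
function isInteger(value: unknown): boolean {
  return typeof value === "number" && isFinite(value) && Math.floor(value) === value;
}

function isUnitInterval(value: unknown): boolean {
  return typeof value === "number" && value >= 0 && value <= 1;
}

// Returns a list of human-readable problems; an empty list means the object looks valid.
function validateChaos(chaos: Record<string, unknown>): string[] {
  const problems: string[] = [];
  const errorStatus = chaos["errorStatus"];
  const retryAfter = chaos["retryAfter"];
  const errorProbability = chaos["errorProbability"];
  const truncateMode = chaos["truncateMode"];
  const truncateAtFraction = chaos["truncateAtFraction"];
  const malformedSse = chaos["malformedSse"];
  const seed = chaos["seed"];

  if (errorStatus !== undefined && !isInteger(errorStatus)) {
    problems.push("errorStatus must be an integer");
  }
  if (retryAfter !== undefined && typeof retryAfter !== "string") {
    problems.push("retryAfter must be a string");
  }
  if (errorProbability !== undefined && !isUnitInterval(errorProbability)) {
    problems.push("errorProbability must be between 0.0 and 1.0");
  }
  if (truncateMode !== undefined && truncateMode !== "NONE" && truncateMode !== "MID_STREAM") {
    problems.push("truncateMode must be NONE or MID_STREAM");
  }
  if (truncateAtFraction !== undefined && !isUnitInterval(truncateAtFraction)) {
    problems.push("truncateAtFraction must be between 0.0 and 1.0");
  }
  if (malformedSse !== undefined && typeof malformedSse !== "boolean") {
    problems.push("malformedSse must be a boolean");
  }
  if (seed !== undefined && !isInteger(seed)) {
    problems.push("seed must be an integer");
  }
  return problems;
}
```

Returning a list instead of throwing on the first problem lets a caller report every invalid field in one pass.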
 <p><strong>Example request (Anthropic text completion):</strong></p>
 
 <pre class="prettyprint code"><code class="code">{
@@ -1033,6 +1053,8 @@ <h3>create_llm_conversation</h3>
     </tbody>
 </table>
 
+<p>Each turn may also carry an optional <code>chaos</code> object (a sibling of <code>match</code> and <code>response</code>) with the same fields as the <a href="#mock_llm_completion"><code>mock_llm_completion</code></a> <code>chaos</code> profile, to inject faults into that turn's response.</p>
+
 <p><strong>Example request (2-turn conversation isolated by session header):</strong></p>
 
 <pre class="prettyprint code"><code class="code">{
 
@@ -305,3 +305,61 @@ describe('latestMessageMatches (regex predicate)', () => {
     expect(draft.turns[0]!.predicates.latestMessageMatches).toBe('weather.*paris');
   });
 });
+
+describe('chaos profile', () => {
+  function chaosDraft(): ConversationDraft {
+    const draft = baseDraft();
+    draft.turns[0]!.chaos = {
+      errorStatus: 429,
+      retryAfter: '30',
+      errorProbability: 1.0,
+      truncateMode: 'MID_STREAM',
+      truncateAtFraction: 0.5,
+      malformedSse: true,
+      seed: 7,
+    };
+    return draft;
+  }
+
+  it('emits withChaos in Java', () => {
+    const java = conversationToJava(chaosDraft());
+    expect(java).toContain('.withChaos(');
+    expect(java).toContain('.withErrorStatus(429)');
+    expect(java).toContain('.withTruncateMode(org.mockserver.model.LlmChaosProfile.TruncateMode.MID_STREAM)');
+    expect(java).toContain('.withSeed(7L)');
+  });
+
+  it('emits chaos object in JSON httpLlmResponse', () => {
+    const json = JSON.parse(conversationToJson(chaosDraft()));
+    const chaos = json[0].httpLlmResponse.chaos;
+    expect(chaos.errorStatus).toBe(429);
+    expect(chaos.malformedSse).toBe(true);
+  });
+
+  it('emits chaos object in MCP turn', () => {
+    const args = conversationToMcpArgs(chaosDraft());
+    const turns = args['turns'] as Array<Record<string, unknown>>;
+    const chaos = turns[0]!['chaos'] as Record<string, unknown>;
+    expect(chaos['errorStatus']).toBe(429);
+    expect(chaos['truncateMode']).toBe('MID_STREAM');
+  });
+
+  it('round-trips chaos through draftFromScenarioExpectations', () => {
+    const json = JSON.parse(conversationToJson(chaosDraft())) as Array<Record<string, unknown>>;
+    const { draft } = draftFromScenarioExpectations(
+      json.map((value, i) => ({ key: `k${i}`, value })),
+    );
+    expect(draft.turns[0]!.chaos?.errorStatus).toBe(429);
+    expect(draft.turns[0]!.chaos?.malformedSse).toBe(true);
+  });
+
+  it('omits NONE truncateMode from wire output', () => {
+    const draft = baseDraft();
+    draft.turns[0]!.chaos = { truncateMode: 'NONE', errorStatus: 500 };
+    const args = conversationToMcpArgs(draft);
+    const turns = args['turns'] as Array<Record<string, unknown>>;
+    const chaos = turns[0]!['chaos'] as Record<string, unknown>;
+    expect(chaos['truncateMode']).toBeUndefined();
+    expect(chaos['errorStatus']).toBe(500);
+  });
+});
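The last test above pins down one serialisation rule: a `truncateMode` of `NONE` is dropped from the wire output while other fields pass through. A minimal sketch of a serialiser with that behaviour follows. The `ChaosDraft` shape is inferred from the test fixtures and `chaosToWire` is a hypothetical name, so the real code in `../lib/conversationCodegen` may be organised differently.

```typescript
// Hypothetical shape inferred from the test fixtures; the real ChaosDraft
// lives in ../lib/conversationCodegen and may differ.
interface ChaosDraft {
  errorStatus?: number;
  retryAfter?: string;
  errorProbability?: number;
  truncateMode?: "NONE" | "MID_STREAM";
  truncateAtFraction?: number;
  malformedSse?: boolean;
  seed?: number;
}

// Copy only the fields that carry information onto the wire object.
function chaosToWire(chaos: ChaosDraft): Record<string, unknown> {
  const wire: Record<string, unknown> = {};
  const put = (key: string, value: unknown): void => {
    if (value !== undefined) wire[key] = value;
  };
  put("errorStatus", chaos.errorStatus);
  put("retryAfter", chaos.retryAfter);
  put("errorProbability", chaos.errorProbability);
  // NONE means "no truncation", so it is left off the wire.
  put("truncateMode", chaos.truncateMode === "NONE" ? undefined : chaos.truncateMode);
  put("truncateAtFraction", chaos.truncateAtFraction);
  put("malformedSse", chaos.malformedSse);
  put("seed", chaos.seed);
  return wire;
}
```

Listing the fields explicitly keeps the output key order stable, which helps when generated JSON is compared in snapshot-style tests.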
@@ -14,7 +14,7 @@ import Collapse from '@mui/material/Collapse';
 import AddIcon from '@mui/icons-material/Add';
 import DeleteIcon from '@mui/icons-material/Delete';
 import PredicatePills from './PredicatePills';
-import type { TurnDraft, TurnMatchPredicates, TurnResponse, NormalizationDraft } from '../lib/conversationCodegen';
+import type { TurnDraft, TurnMatchPredicates, TurnResponse, NormalizationDraft, ChaosDraft } from '../lib/conversationCodegen';
 import type { ToolCallDraft } from '../lib/expectationFromCapture';
 
 // ---------------------------------------------------------------------------
@@ -88,6 +88,21 @@ export default function ConversationWizardStep2({ turns, onTurnsChange }: Step2P
     [turns, updatePredicates],
   );
 
+  const toggleChaos = useCallback(
+    (index: number, enabled: boolean) => {
+      updateTurn(index, { chaos: enabled ? {} : undefined });
+    },
+    [updateTurn],
+  );
+
+  const updateChaos = useCallback(
+    (index: number, partial: Partial<ChaosDraft>) => {
+      const turn = turns[index]!;
+      updateTurn(index, { chaos: { ...(turn.chaos ?? {}), ...partial } });
+    },
+    [turns, updateTurn],
+  );
+
   const updateToolCall = useCallback(
     (turnIndex: number, tcIndex: number, partial: Partial<ToolCallDraft>) => {
       const turn = turns[turnIndex]!;
@@ -354,6 +369,85 @@ export default function ConversationWizardStep2({ turns, onTurnsChange }: Step2P
                 />
               </Box>
             </Box>
+
+            {/* Fault / chaos injection (resilience testing) */}
+            <FormControlLabel
+              control={
+                <Switch
+                  checked={turn.chaos != null}
+                  onChange={(e) => toggleChaos(i, e.target.checked)}
+                  size="small"
+                />
+              }
+              label="Inject fault / chaos"
+              sx={{ '& .MuiFormControlLabel-label': { fontSize: '0.75rem' }, mt: 0.5 }}
+            />
+            <Collapse in={turn.chaos != null} unmountOnExit>
+              <Box sx={{ pl: 1.5, mb: 1, display: 'flex', flexWrap: 'wrap', gap: 1, alignItems: 'center' }}>
+                <TextField
+                  label="Error status"
+                  size="small"
+                  type="number"
+                  value={turn.chaos?.errorStatus ?? ''}
+                  onChange={(e) => updateChaos(i, { errorStatus: e.target.value === '' ? undefined : parseInt(e.target.value, 10) })}
+                  sx={{ width: 110 }}
+                />
+                <TextField
+                  label="Retry-After"
+                  size="small"
+                  value={turn.chaos?.retryAfter ?? ''}
+                  onChange={(e) => updateChaos(i, { retryAfter: e.target.value || undefined })}
+                  sx={{ width: 110 }}
+                />
+                <TextField
+                  label="Error prob (0-1)"
+                  size="small"
+                  type="number"
+                  value={turn.chaos?.errorProbability ?? ''}
+                  onChange={(e) => updateChaos(i, { errorProbability: e.target.value === '' ? undefined : parseFloat(e.target.value) })}
+                  sx={{ width: 130 }}
+                />
+                <TextField
+                  label="Truncate"
+                  size="small"
+                  select
+                  value={turn.chaos?.truncateMode ?? 'NONE'}
+                  onChange={(e) => updateChaos(i, { truncateMode: e.target.value as ChaosDraft['truncateMode'] })}
+                  sx={{ width: 130 }}
+                >
+                  <MenuItem value="NONE">None</MenuItem>
+                  <MenuItem value="MID_STREAM">Mid-stream</MenuItem>
+                </TextField>
+                <TextField
+                  label="Truncate frac"
+                  size="small"
+                  type="number"
+                  value={turn.chaos?.truncateAtFraction ?? ''}
+                  onChange={(e) => updateChaos(i, { truncateAtFraction: e.target.value === '' ? undefined : parseFloat(e.target.value) })}
+                  sx={{ width: 120 }}
+                />
+                <FormControlLabel
+                  control={
+                    <Checkbox
+                      size="small"
+                      checked={turn.chaos?.malformedSse === true}
+                      onChange={(e) => updateChaos(i, { malformedSse: e.target.checked })}
+                    />
+                  }
+                  label="Malformed SSE"
+                  sx={{ '& .MuiFormControlLabel-label': { fontSize: '0.75rem' } }}
+                />
+                <TextField
+                  label="Seed"
+                  size="small"
+                  type="number"
+                  value={turn.chaos?.seed ?? ''}
+                  onChange={(e) => updateChaos(i, { seed: e.target.value === '' ? undefined : parseInt(e.target.value, 10) })}
+                  sx={{ width: 100 }}
+                  helperText="reproducible prob"
+                />
+              </Box>
+            </Collapse>
           </CardContent>
         </Card>
       ))}
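The numeric fields in the chaos panel repeat the same "empty string means unset" conversion inline. One possible refactor, sketched below, moves that into small helpers that also map unparseable input to `undefined`. The helper names are hypothetical and are not part of the component in this diff.

```typescript
// Hypothetical helpers; not part of the component shown in the diff.

// Empty or unparseable input clears the field instead of storing NaN.
function parseOptionalInt(raw: string): number | undefined {
  if (raw.trim() === "") return undefined;
  const parsed = parseInt(raw, 10);
  return isNaN(parsed) ? undefined : parsed;
}

function parseOptionalFloat(raw: string): number | undefined {
  if (raw.trim() === "") return undefined;
  const parsed = parseFloat(raw);
  return isNaN(parsed) ? undefined : parsed;
}

// Example: building the partial update that an onChange handler would pass on.
function errorStatusPatch(raw: string): { errorStatus: number | undefined } {
  return { errorStatus: parseOptionalInt(raw) };
}
```

With helpers like these, a handler could read `updateChaos(i, errorStatusPatch(e.target.value))`, assuming `updateChaos` keeps the signature shown in the diff.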