Commit b7f2d4b

docs(tutorials): chapter-by-chapter drift + cross-ref pass
10 fixes "six runtimes" → "seven" (Semantic Kernel). 12 adds OutputLengthZScoreGuardrail tenant partition note, new CostCeilingGuardrail section with TokenPricing + CostAccountingSession wiring, updates the "Together" summary. 14 adds Admin control plane section (triple-gate, favicon, read-auth flag). 15 adds Admin extension section (X-Atmosphere-Auth principal, AdminReadAuthFilter, UT000048 workaround). 17 adds Companion reattach section (RunRegistry + X-Atmosphere-Run-Id + ownership check). 18 adds Business-outcome MDC subsection cross-referencing Chapter 27.
1 parent edfbf1a commit b7f2d4b

6 files changed

Lines changed: 231 additions & 1 deletion

docs/src/content/docs/tutorial/10-ai-tools.md

Lines changed: 1 addition & 1 deletion
@@ -503,7 +503,7 @@ The canonical tool-calling sample is:

 - **[`samples/spring-boot-ai-tools/`](https://github.com/Atmosphere/atmosphere/tree/main/samples/spring-boot-ai-tools)** — uses the built-in LLM client with `@AiTool` methods (`AssistantTools`), conversation memory, and the `CostMeteringInterceptor`. Run with: `./mvnw spring-boot:run -pl samples/spring-boot-ai-tools`

-Because `@AiTool` definitions are framework-agnostic, the same `AssistantTools` class works with any of the six runtimes (built-in, Spring AI, LangChain4j, ADK, Embabel, Koog) by swapping the adapter dependency.
+Because `@AiTool` definitions are framework-agnostic, the same `AssistantTools` class works with any of the seven runtimes (built-in, Spring AI, LangChain4j, ADK, Embabel, Koog, Semantic Kernel) by swapping the adapter dependency.

 ## Summary

docs/src/content/docs/tutorial/12-ai-filters.md

Lines changed: 130 additions & 0 deletions
@@ -245,6 +245,136 @@ Register guardrails on an endpoint:
 @AiEndpoint(path = "/chat", guardrails = {PiiGuardrail.class})
 ```

+## Built-in guardrails
+
+Two zero-dep guardrails ship in `org.atmosphere.ai.guardrails`. Both
+resolve through the framework-scoped wiring pattern (Spring bean
+bridge → `ServiceLoader` → annotation — same order
+`FactResolver` uses), so an annotation-declared `@AiEndpoint` picks
+up Spring-registered beans automatically.
+
+### PiiRedactionGuardrail
+
+Regex-based detection of email, phone, credit card number, US SSN,
+and IPv4 in both requests and responses.
+
+**Request path** (default mode) returns `Modify` with a redacted
+message — the model never sees the raw PII. `.blocking()` switches
+to `Block` on match.
+
+**Response path** Blocks on match in both modes. This is an **early
+termination**, not a retroactive redaction — by the time the guardrail
+sees the accumulated response, earlier tokens have already streamed
+to the client. Blocking suppresses *subsequent* tokens and surfaces a
+`SecurityException` on the session; it does not unsend bytes already
+on the wire. For synchronous per-token scrubbing use
+`PiiRedactionFilter` (covered earlier in this page under
+"PiiRedactionFilter") — it runs inside the broadcaster chain and
+rewrites each text frame before it is sent. The guardrail here is a
+safety net that halts the leak before more PII flows and writes the
+hit to the audit log.
+
+Enable via Spring property (recommended):
+
+```properties
+atmosphere.ai.guardrails.pii.enabled=true
+# Optional — default is redact-on-request, block-on-response
+atmosphere.ai.guardrails.pii.blocking=true
+```
+
+Or via annotation:
+
+```java
+@AiEndpoint(path = "/chat",
+            guardrails = { PiiRedactionGuardrail.class })
+```
+
+Patterns bundled:
+
+| Kind     | Example input                | Replacement          |
+|----------|------------------------------|----------------------|
+| email    | `alice@example.com`          | `[redacted-email]`   |
+| phone    | `(555) 123-4567`             | `[redacted-phone]`   |
+| us-ssn   | `123-45-6789`                | `[redacted-us-ssn]`  |
+| credit   | `4111 1111 1111 1111`        | `[redacted-card]`    |
+| ipv4     | `203.0.113.42`               | `[redacted-ip]`      |
+
+Extend by subclassing and adding patterns before calling `super`.
+
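To make the request-path `Modify` behavior concrete, here is a minimal, self-contained sketch of regex-based redaction over the five kinds in the table above. It is an illustration only: `PiiRedactionSketch` and its deliberately simple patterns are assumptions and are not the patterns bundled with `PiiRedactionGuardrail`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; NOT the framework's PiiRedactionGuardrail.
final class PiiRedactionSketch {
    // Simplified patterns (assumptions). Insertion order is the order of application.
    private static final Map<Pattern, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put(Pattern.compile("[\\w.+-]+@[\\w-]+(?:\\.[\\w-]+)+"), "[redacted-email]");
        RULES.put(Pattern.compile("\\b\\d{4}[ -]\\d{4}[ -]\\d{4}[ -]\\d{4}\\b"), "[redacted-card]");
        RULES.put(Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b"), "[redacted-us-ssn]");
        RULES.put(Pattern.compile("\\(\\d{3}\\) ?\\d{3}-\\d{4}"), "[redacted-phone]");
        RULES.put(Pattern.compile("\\b(?:\\d{1,3}\\.){3}\\d{1,3}\\b"), "[redacted-ip]");
    }

    // Applies every rule in turn and returns the redacted message.
    static String redact(String text) {
        String out = text;
        for (Map.Entry<Pattern, String> rule : RULES.entrySet()) {
            out = rule.getKey().matcher(out).replaceAll(rule.getValue());
        }
        return out;
    }
}
```

A blocking variant would test for any match and reject the message instead of rewriting it.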
+### OutputLengthZScoreGuardrail
+
+Statistical drift detector. Maintains a rolling window of response
+lengths and Blocks any response whose length is more than N standard
+deviations above the window mean. Catches runaway prompts and
+injection payloads that balloon responses without a specific
+signature.
+
+```properties
+atmosphere.ai.guardrails.drift.enabled=true
+atmosphere.ai.guardrails.drift.window-size=50
+atmosphere.ai.guardrails.drift.z-score-threshold=3.0
+atmosphere.ai.guardrails.drift.min-samples=10
+```
+
+The first `min-samples` responses always pass — the guardrail needs a
+baseline before it can flag an outlier. A falling hit rate means the
+model's output distribution is stabilizing; a rising rate is a
+regression signal worth investigating.
+
+**Multi-tenant deployments** — the rolling window partitions by the
+`business.tenant.id` MDC tag (populated by the Business Metadata
+bridge — see [Chapter 27](./27-business-metadata-observability/)).
+A noisy tenant cannot poison another tenant's baseline. Turns without
+a tenant tag fall into a shared `__default__` bucket, so single-tenant
+apps behave unchanged.
+
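The z-score check and the per-tenant partitioning can be sketched in a few lines. This is a rough illustration under stated assumptions, not the guardrail's implementation: `LengthDriftSketch` is a hypothetical class, it uses the population standard deviation, it never flags while the deviation is zero, and it keeps blocked responses out of the baseline.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a per-tenant rolling z-score check on response length.
final class LengthDriftSketch {
    private final int windowSize;
    private final int minSamples;
    private final double threshold;
    private final Map<String, Deque<Integer>> windows = new ConcurrentHashMap<>();

    LengthDriftSketch(int windowSize, int minSamples, double threshold) {
        this.windowSize = windowSize;
        this.minSamples = minSamples;
        this.threshold = threshold;
    }

    // Returns true when the response length is an outlier for this tenant.
    synchronized boolean isOutlier(String tenantId, int length) {
        String key = tenantId == null ? "__default__" : tenantId;
        Deque<Integer> window = windows.computeIfAbsent(key, k -> new ArrayDeque<>());
        boolean outlier = false;
        if (window.size() >= minSamples) {
            double mean = window.stream().mapToInt(Integer::intValue).average().orElse(0);
            double variance = window.stream()
                    .mapToDouble(v -> (v - mean) * (v - mean)).sum() / window.size();
            double std = Math.sqrt(variance);
            // Only lengths ABOVE the mean are flagged, matching the prose above.
            outlier = std > 0 && (length - mean) / std > threshold;
        }
        if (!outlier) {
            // Assumption: blocked responses do not enter the baseline.
            window.addLast(length);
            if (window.size() > windowSize) {
                window.removeFirst();
            }
        }
        return outlier;
    }
}
```

Because each tenant key owns its own window, a burst of long responses for one tenant leaves the mean and deviation of every other tenant untouched.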
+### CostCeilingGuardrail
+
+Blocks outbound `@Prompt` dispatch when a tenant's cumulative LLM
+cost hits a configured budget. Closes the
+observability→enforcement loop: you tag tenants via
+`BusinessMetadata`, the built-in `CostCeilingAccountant` meters
+`TokenUsage` from every runtime, and this guardrail stops the next
+request before it spends more.
+
+```java
+@Bean
+CostCeilingGuardrail costCeiling() {
+    return new CostCeilingGuardrail(/* per-tenant budget USD */ 100.00);
+}
+
+@Bean
+TokenPricing openAiPricing() {
+    // Your provider's per-model input/output dollar-per-token rates.
+    return (usage, model) -> switch (model) {
+        case "gpt-4o" -> usage.inputTokens() * 0.0000025
+                + usage.outputTokens() * 0.00001;
+        default -> 0.0;
+    };
+}
+```
+
+Spring Boot auto-wires the `CostAccountingSession` decorator when
+both beans are present: every `@Prompt` session's `TokenUsage` event
+flows into `addCost(tenantId, dollars)` keyed by the same
+`business.tenant.id` MDC tag the drift guardrail uses. Reset
+counters on a monthly boundary via `resetTenant(...)` /
+`resetAll()`.
+
+Request-side blocks surface on the session as a `SecurityException`
+with the reason string `cost ceiling reached for tenant X (spent Y
+of budget Z)`, so the client sees a terminal error envelope rather
+than an infinite hang.
+
### Together
370+
371+
Register all three in a production deployment — PII for compliance
372+
(blocks the leak), drift for incident detection (flags a model
373+
behaving badly), cost ceiling for spend enforcement (halts a runaway
374+
tenant before the next invoice). None is on by default: the
375+
framework's position is that a guardrail that fires unexpectedly is
376+
worse than no guardrail at all, so you opt in explicitly.
377+
248378
## Model routing
249379

250380
The `ModelRouter` interface (in `org.atmosphere.ai`) mirrors Atmosphere's transport failover pattern (WebSocket -> SSE -> long-polling) applied to the AI layer (GPT-4 -> Claude -> Gemini).

docs/src/content/docs/tutorial/14-spring-boot.md

Lines changed: 21 additions & 0 deletions
@@ -48,6 +48,27 @@ Spring Boot's embedded containers do not process `@HandlesTypes` from `ServletCo

 The starter also performs a second pass to discover custom annotation types registered via `@AtmosphereAnnotation` processors (e.g., the AI module's `@AiEndpoint` processor), and re-scans user packages for classes annotated with those custom annotations.

+### Admin control plane
+
+When `atmosphere-admin` is on the classpath, the starter also registers
+`AtmosphereAdminEndpoint` under `/api/admin/*` (read endpoints open by
+default, write endpoints triple-gated: feature flag → Principal →
+`ControlAuthorizer`), plus `AtmosphereFaviconAutoConfiguration` which
+serves `/favicon.ico` and `/favicon.png` with the framework logo so
+browsers stop logging a 404 on the admin dashboard and every sample.
+
+Three properties you'll want to know about (the first two are opt-in):
+
+| Property | Default | Effect |
+|----------|---------|--------|
+| `atmosphere.admin.http-write-enabled` | `false` | Enables `POST` / `DELETE` on `/api/admin/*`; still requires a Principal + `ControlAuthorizer` grant. |
+| `atmosphere.admin.http-read-auth-required` | `false` | Requires the same principal chain (minus `ControlAuthorizer`) for `GET` / `HEAD` / `OPTIONS`. Flip for multi-tenant deployments exposing `/api/admin/*` on a routable network. |
+| `atmosphere.favicon.enabled` | `true` | Set to `false` if the application ships its own `/favicon.ico`. |
+
+See [Chapter 29 — Agent-to-Agent Flow Viewer](./29-admin-flow-viewer/)
+and [`modules/admin/README.md`](https://github.com/Atmosphere/atmosphere/blob/main/modules/admin/README.md)
+for the full admin surface and principal-chain wiring.
+
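The order of the three write gates (feature flag, then Principal, then `ControlAuthorizer`) can be sketched as a plain function. This is a minimal illustration rather than the starter's code: `WriteGateSketch`, `Decision`, and the `BiPredicate` standing in for `ControlAuthorizer` are hypothetical names.

```java
import java.security.Principal;
import java.util.function.BiPredicate;

// Hypothetical sketch of a triple-gated write check; not the starter's code.
final class WriteGateSketch {
    enum Decision { ALLOWED, DISABLED, UNAUTHENTICATED, FORBIDDEN }

    static Decision check(boolean httpWriteEnabled,
                          Principal caller,
                          String action,
                          BiPredicate<Principal, String> authorizer) {
        // Gate 1: the feature flag (atmosphere.admin.http-write-enabled).
        if (!httpWriteEnabled) {
            return Decision.DISABLED;
        }
        // Gate 2: an authenticated Principal must be present.
        if (caller == null) {
            return Decision.UNAUTHENTICATED;
        }
        // Gate 3: the authorizer must grant this specific action.
        return authorizer.test(caller, action) ? Decision.ALLOWED : Decision.FORBIDDEN;
    }
}
```

Evaluating the cheapest gate first means a deployment that leaves the flag at its default never reaches principal resolution or the authorizer.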
 ## Application class

 The application class is a standard Spring Boot entry point:

docs/src/content/docs/tutorial/15-quarkus.md

Lines changed: 29 additions & 0 deletions
@@ -164,6 +164,35 @@ The extension supports Quarkus dev mode (`quarkus:dev`) with live reload. The `A
 | Config phase | Runtime | `BUILD_AND_RUN_TIME_FIXED` |
 | SCI handling | Overridden via servlet context attribute | Suppressed via `IgnoredServletContainerInitializerBuildItem` |

+## Admin extension
+
+`atmosphere-quarkus-admin-extension` mirrors the Spring `/api/admin/*`
+surface for Quarkus apps, with three Quarkus-shaped deltas:
+
+- **Fourth principal source** — on top of the Spring chain
+  (`SecurityContext` → `org.atmosphere.auth.principal` attribute →
+  `ai.userId` attribute), `AdminResource` also accepts the
+  `X-Atmosphere-Auth` header and validates it constant-time against
+  `atmosphere.admin.auth.token`. Intended for sample fixtures and
+  operator tooling that haven't integrated Jakarta Security yet;
+  production stacks still resolve Jakarta Security first.
+- **`AdminReadAuthFilter`** — Quarkus JAX-RS `@Provider` that
+  enforces the same `atmosphere.admin.http-read-auth-required`
+  opt-in flag as the Spring `AdminApiAuthFilter`. Default off;
+  multi-tenant operators flip it when exposing `/api/admin/*` on a
+  routable network.
+- **Vert.x dispatch**: `resteasy-reactive` runs on Vert.x, so
+  servlet attribute access throws `IllegalStateException: UT000048`.
+  `AdminResource` guards against this and reads `X-Atmosphere-Auth`
+  via `@Context HttpHeaders`, which works on both servlet and
+  reactive transports.
+
+Everything else matches the Spring starter one-for-one — triple-gate
+writes, audit log, `ControlAuthorizer` bean resolution via CDI (the
+Quarkus `AdminProducer` looks up a user-supplied `ControlAuthorizer`
+bean before falling back to `REQUIRE_PRINCIPAL`), and the MCP-tool
+admin surface when `atmosphere-mcp` is also on the classpath.
+
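For readers unfamiliar with constant-time validation of a header token, the idea can be sketched with the JDK's `MessageDigest.isEqual`, whose running time does not depend on where the first mismatching byte occurs. `TokenCheckSketch` is a hypothetical helper, and treating an empty configured token as "feature off" is an assumption; neither is taken from `AdminResource`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Hypothetical sketch of a constant-time shared-token check.
final class TokenCheckSketch {
    static boolean matches(String presented, String expected) {
        // Assumption: a missing or empty configured token disables this source.
        if (presented == null || expected == null || expected.isEmpty()) {
            return false;
        }
        // MessageDigest.isEqual compares without short-circuiting on the
        // first differing byte, unlike String.equals.
        return MessageDigest.isEqual(
                presented.getBytes(StandardCharsets.UTF_8),
                expected.getBytes(StandardCharsets.UTF_8));
    }
}
```

A plain `String.equals` returns as soon as a character differs, which lets an attacker who can measure response times recover the token prefix by prefix.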
 ## Running the sample

 The `quarkus-chat` sample demonstrates a complete chat application:

docs/src/content/docs/tutorial/17-durable-sessions.md

Lines changed: 29 additions & 0 deletions
@@ -234,6 +234,35 @@ Two implementations are provided:

 These share the same backend connections as the corresponding `SessionStore` implementations. The `PersistentConversationMemory` class handles serialization and sliding-window logic on top of the persistence SPI.

+## Companion: mid-stream reattach for `@Prompt` runs
+
+Durable sessions restore room + broadcaster memberships after a
+disconnect — the client sees the same presence and subscriptions it
+had before. But an `@Prompt` that was *actively streaming* when the
+client dropped needs a parallel mechanism: the buffered events the
+client didn't finish receiving.
+
+That's what `RunRegistry` + `X-Atmosphere-Run-Id` do. On every
+`@Prompt` dispatch `AiEndpointHandler` registers a run with an
+`AgentResumeHandle` and returns the run id as the
+`X-Atmosphere-Run-Id` response header. `RunEventCapturingSession`
+mirrors every `session.send` / `complete` / `error` into the run's
+bounded `RunEventReplayBuffer`. When the client reconnects carrying
+the run id, `RunReattachSupport.replayPendingRun` drains the buffer
+onto the new resource so the user catches up on the tokens they
+missed — routed through the broadcaster's filter chain so
+`PiiRedactionFilter` applies identically to replay and live frames.
+
+`DurableSessionInterceptor` stashes the header into the request
+attribute `org.atmosphere.session.runId` so `AiEndpointHandler.onReady`
+sees it without a compile-time dependency on the durable-sessions
+module.
+
+Ownership is enforced: replay refuses when the reconnecting caller's
+`userId` does not match the run's registered `userId`, so a bearer
+token leak cannot replay someone else's conversation. Anonymous runs
+keep the open-mode carve-out for demo deployments.
+
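The capture, drain, and ownership steps above can be sketched as a small bounded buffer. This is a hypothetical illustration only: `RunReplaySketch` is not `RunEventReplayBuffer`, and both the drop-oldest overflow policy and the use of `SecurityException` to refuse a replay are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of a bounded replay buffer with an ownership check.
final class RunReplaySketch {
    private final String ownerUserId; // null marks an anonymous run
    private final int capacity;
    private final Deque<String> events = new ArrayDeque<>();

    RunReplaySketch(String ownerUserId, int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.ownerUserId = ownerUserId;
        this.capacity = capacity;
    }

    // Mirrors one outbound event into the buffer.
    synchronized void capture(String event) {
        if (events.size() == capacity) {
            events.removeFirst(); // assumption: the oldest event is dropped
        }
        events.addLast(event);
    }

    // Drains pending events for the reconnecting caller, or refuses.
    synchronized List<String> replayFor(String callerUserId) {
        if (ownerUserId != null && !ownerUserId.equals(callerUserId)) {
            throw new SecurityException("run belongs to another user");
        }
        List<String> pending = new ArrayList<>(events);
        events.clear();
        return pending;
    }
}
```

Checking ownership before draining matters: a refused caller must not consume events that the rightful owner still needs to receive.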
 ## Combining with Clustering

 Durable sessions and clustering serve different purposes and work well together:

docs/src/content/docs/tutorial/18-observability.md

Lines changed: 21 additions & 0 deletions
@@ -309,6 +309,27 @@ framework.interceptor(new MDCInterceptor());

 MDC keys are automatically included as top-level fields in JSON layouts (logstash-logback-encoder, logback-contrib).

+### Business-outcome MDC for AI calls
+
+When `atmosphere-ai` is on the classpath, `AiEndpointHandler` layers a
+second MDC snapshot on top of the transport-level keys above. For
+every `@Prompt` turn it publishes the `business.*` keys
+(`tenant.id`, `customer.id`, `customer.segment`, `session.revenue`,
+`session.cost`, `session.currency`, `session.id`, `event.kind`,
+`event.subject`) onto the dispatching virtual thread and clears them
+in `finally`. Any log record emitted during the turn — pipeline,
+runtime, tool calls, guardrails — carries the tags, so Dynatrace /
+Datadog / OpenTelemetry log exporters can join LLM cost and latency
+against per-tenant KPIs without a separate correlation pipeline.
+
+Performance of the snapshot → apply → clear cycle is pinned by
+`BusinessMdcBenchmark` in `modules/benchmarks` (JMH, baseline vs.
+six-key production vs. empty-snapshot) so regressions on the hot
+path show as numbers.
+
+See [Chapter 27 — Tag agent calls with business outcomes](/docs/tutorial/27-business-metadata-observability/)
+for the complete wiring and exporter integration.
+
 ## BackpressureInterceptor

 The `BackpressureInterceptor` protects against slow consumers by limiting the number of pending messages per client:
