mock-server
diff --git a/changelog.md b/changelog.md
Lines changed: 1 addition & 0 deletions
diff --git a/docs/code/llm-mocking.md b/docs/code/llm-mocking.md
Lines changed: 11 additions & 0 deletions
diff --git a/docs/plans/mockserver-llm-mocking.md b/docs/plans/mockserver-llm-mocking.md
Lines changed: 1 addition & 1 deletion
diff --git a/jekyll-www.mock-server.com/mock_server/configuration_properties.html b/jekyll-www.mock-server.com/mock_server/configuration_properties.html
Lines changed: 20 additions & 0 deletions
diff --git a/mockserver/mockserver-core/pom.xml b/mockserver/mockserver-core/pom.xml
Lines changed: 39 additions & 0 deletions
diff --git a/mockserver/mockserver-core/src/main/java/org/mockserver/configuration/ConfigurationProperties.java b/mockserver/mockserver-core/src/main/java/org/mockserver/configuration/ConfigurationProperties.java
Lines changed: 58 additions & 0 deletions
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
+- Added optional **OpenTelemetry (OTLP) export**, in two independent, off-by-default parts. (1) **Metrics export** — MockServer's existing metrics (the same explicitly-defined gauges already exposed for Prometheus: `REQUESTS_RECEIVED_COUNT`, `RESPONSE_EXPECTATIONS_MATCHED_COUNT`, the LLM/SSE/chaos counters, etc.) can also be pushed to an OTLP collector as an alternative to Prometheus (`mockserver.otelMetricsEnabled`). Implemented as OTel observable gauges reading the current values, so the Prometheus and OTLP views stay in lock-step. (2) **GenAI span export** — MockServer emits one explicit OpenTelemetry GenAI semantic-convention span per LLM completion it serves (`gen_ai.system`, `gen_ai.request.model`, `gen_ai.usage.input_tokens`/`output_tokens`, `gen_ai.response.finish_reasons`, tool-call count) (`mockserver.otelTracesEnabled`). These are spans MockServer codes deliberately — **no auto-instrumentation** is added. Both use the OTLP HTTP/protobuf exporter with the JDK HttpClient sender (no gRPC/OkHttp), share `mockserver.otelEndpoint`, and are fail-soft (a setup error logs one line and never stops the server or affects a response). `io.opentelemetry.*` is relocated in the shaded JAR. See the configuration properties page.
 - Added **drift detection** for LLM fixtures (`detect_llm_drift` MCP tool): replays a recorded cassette's exchanges against the live provider (via the runtime-LLM client SPI) and reports **structural** drift — new/removed fields and type changes in the responses — not semantic differences, so benign wording changes never flag. Built on a reusable, pure `StructuralShapeDiff` and a `DriftDetector` that **fails closed** per exchange (a network error or non-2xx live response is reported as could-not-check, never as drift, never thrown). Off unless a runtime backend is configured. Intended for an opt-in/scheduled CI lane (real API keys + tokens), never the per-commit build. See the AI/MCP tools page and `docs/code/llm-mocking.md`.
 - Completed the **VCR (record/replay) toolkit** for LLM fixtures with three additions. (1) **Strict mode** — `load_expectations_from_file` accepts `strict` (or set `mockserver.llmVcrStrict`), which registers a low-priority catch-all per cassette path so a request matching no recorded fixture returns HTTP 599 instead of silently falling through. (2) **Body-field redaction** — `record_llm_fixtures` accepts `redactBodyFields` (or set `mockserver.fixtureBodyRedactFields`) to redact named JSON fields from recorded request/response bodies, complementing the existing header redaction. (3) **Replay field normalisation** — `load_expectations_from_file` accepts `normalizeRequestBodyFields` to drop volatile JSON fields from each recorded request body and match the remainder loosely (ignoring extra fields), so per-run values (request ids, timestamps) do not block replay. These are operational settings exposed via config and MCP. See the AI/MCP tools and configuration properties pages.
 - Added declarative **LLM fault/chaos profiles** for resilience testing, attachable to any mock LLM response (`mock_llm_completion`, each `create_llm_conversation` turn, the Java `LlmConversationBuilder`, and raw expectation JSON via a `chaos` block). Supports probabilistic provider errors (e.g. 429/529 with a `Retry-After` header), mid-stream truncation of an SSE stream (keep a leading fraction of events), and appending a malformed (broken-JSON) SSE chunk. Errors are deterministic at probability 0.0/1.0 and reproducible at fractional probabilities via a `seed`; truncation and malformed-SSE are always deterministic. A new `LLM_CHAOS_INJECTED_COUNT` metric tracks injections. The dashboard conversation wizard exposes the profile per turn. See the AI/MCP tools page and `docs/code/llm-mocking.md`.
 
@@ -300,6 +300,15 @@ Adding a provider = implement `LlmClient` + one `register(...)` line — the sam
 
 This SPI is never on the deterministic assertion/matching path. The features that consume it (drift detection, semantic matching) are tracked in `docs/plans/mockserver-llm-mocking.md`.
 
+## OpenTelemetry export
+
+Optional, off-by-default OTLP export, in two independent parts (both fail-soft — a setup error logs one line and never affects the server or a response; `io.opentelemetry` is relocated in the shaded jar):
+
+- **Metrics** (`org.mockserver.metrics.OtelMetricsExporter`, `mockserver.otelMetricsEnabled`) — bridges the existing `Metrics.Name` gauges (the same set exposed for Prometheus, including the LLM/SSE/chaos counters) to OTLP as observable gauges that read the current values, so Prometheus and OTLP stay consistent. An alternative to the Prometheus endpoint.
+- **GenAI spans** (`org.mockserver.telemetry.GenAiSpanExporter` + `GenAiSpans`, `mockserver.otelTracesEnabled`) — `HttpLlmResponseActionHandler` calls `GenAiSpans.recordCompletion(provider, model, completion)` on each served completion (streaming and non-streaming), emitting one span with GenAI semantic-convention attributes (`gen_ai.system`, `gen_ai.request.model`, `gen_ai.usage.*`, `gen_ai.response.finish_reasons`, tool-call count). These are spans MockServer codes deliberately — **no auto-instrumentation**. `GenAiSpans` is a process-wide no-op until `GenAiSpanExporter` installs a tracer.
+
+Both use the OTLP HTTP/protobuf exporter with the JDK HttpClient sender (no gRPC/OkHttp) and share `mockserver.otelEndpoint` (a base collector URL; `/v1/metrics` and `/v1/traces` appended per signal, resolved by `telemetry.OtelEndpoints`).
+
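As a reading aid for the GenAI spans bullet above, the attribute set can be pictured as a plain map. This is only a sketch under stated assumptions: it does not use the OpenTelemetry API, the class and method names are hypothetical, and the source does not name the key used for the tool-call count, so `mockserver.llm.tool_call_count` below is a placeholder rather than the real attribute name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not MockServer's GenAiSpans implementation.
public class GenAiSpanAttributesSketch {

    // Builds the attribute set described above for one served completion.
    static Map<String, Object> attributes(String provider,
                                          String model,
                                          long inputTokens,
                                          long outputTokens,
                                          List<String> finishReasons,
                                          int toolCallCount) {
        Map<String, Object> attrs = new LinkedHashMap<>();
        attrs.put("gen_ai.system", provider);
        attrs.put("gen_ai.request.model", model);
        attrs.put("gen_ai.usage.input_tokens", inputTokens);
        attrs.put("gen_ai.usage.output_tokens", outputTokens);
        attrs.put("gen_ai.response.finish_reasons", List.copyOf(finishReasons));
        // Assumed key: the source mentions a tool-call count but not its attribute name.
        attrs.put("mockserver.llm.tool_call_count", toolCallCount);
        return attrs;
    }

    public static void main(String[] args) {
        // Provider and model values are illustrative.
        attributes("openai", "example-model", 12L, 34L, List.of("stop"), 1)
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

One span per completion would carry one such attribute set, for streaming and non-streaming responses alike.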
 ## Drift detection
 
 `detect_llm_drift` (MCP) closes the loop on stale cassettes: it replays a recorded cassette's exchanges against the **live** provider and reports structural drift in the responses. Built from two pieces in `org.mockserver.llm.drift`:
@@ -381,3 +390,5 @@ Key source files under `mockserver/mockserver-core/src/main/java/org/mockserver/
 | `fixture/FixtureRedactor.java` | Masks sensitive headers and (optional) JSON body fields when recording fixtures |
 | `llm/drift/StructuralShapeDiff.java` | Pure JSON shape diff (added/removed/type-changed paths) |
 | `llm/drift/DriftDetector.java` + `DriftReport.java` | Replays a cassette against the live provider and reports structural drift, fail-closed |
+| `metrics/OtelMetricsExporter.java` | Optional OTLP metrics export bridging the Prometheus gauges (off by default) |
+| `telemetry/GenAiSpanExporter.java` + `GenAiSpans.java` + `OtelEndpoints.java` | Optional explicit GenAI span export per served completion (off by default) |
@@ -33,7 +33,7 @@ The original RFC (RFC-1 LLM Response Builder + RFC-2 Stateful Scripted Conversat
 | 8 | MCP/A2A conformance contract testing (`run_mcp_contract_test`) | ❌ Not started — see Item assessments below |
 | 9a | Normalised prompt matching (deterministic) | ✅ Shipped — `NormalizationOptions` on `ConversationPredicates`, applied by `PromptNormalizer` before the text predicates (whitespace/case/JSON-key-sort/volatile-field drop) |
 | 9b | Semantic prompt matching (runtime LLM / embeddings) | ❌ Not started — opt-in only, not for assertions |
-| 10 | OTel GenAI / OpenInference span export | ❌ Not started |
+| 10 | OTel GenAI span export (+ metrics export) | ✅ Shipped — explicit GenAI spans per served completion (`GenAiSpanExporter`/`GenAiSpans`, `mockserver.otelTracesEnabled`) AND metrics export bridging the existing Prometheus gauges to OTLP (`OtelMetricsExporter`, `mockserver.otelMetricsEnabled`), as an alternative to Prometheus. Explicit spans only — no auto-instrumentation. Both off by default, fail-soft, OTLP HTTP/JDK-sender, `io.opentelemetry` relocated in shade. |
 | 11 | Correlated agent-run session / call-graph view | ❌ Not started |
 | 12 | Prompt-injection / adversarial-response harness | ❌ Not started |
 | 13 | Drift detection (fixtures vs real API in CI) | ✅ Shipped — `detect_llm_drift` MCP tool over `StructuralShapeDiff` + `DriftDetector` (replays cassette via runtime-LLM SPI, diffs response shape, fails closed); off unless a backend resolves; opt-in CI lane |
 
@@ -888,6 +888,26 @@ <h2>Streaming Proxy Configuration:</h2>
     <pre class="code" style="padding: 2px;"><code class="code">-Dmockserver.llmVcrStrict="true"</code></pre>
 </div>
 
+<button id="button_configuration_otel" class="accordion title"><strong>OpenTelemetry Export (Metrics &amp; GenAI Spans)</strong></button>
+<div class="panel title">
+    <p>MockServer can export to an OpenTelemetry (OTLP) collector, in two independent parts that are each <strong>off by default</strong> and <strong>fail-soft</strong> (a startup error logs one line and never stops the server or affects a response). Both use the OTLP HTTP/protobuf exporter with the JDK HTTP client (no gRPC/OkHttp) and share the same endpoint.</p>
+    <p><strong>1. Metrics export</strong> &mdash; push MockServer's explicitly-defined metrics (request counts, expectation-match counts, action counts including the LLM and chaos counters) to OTLP, as an alternative to the Prometheus endpoint. Implemented as observable gauges reading the current values, so the Prometheus and OTLP views stay consistent. It does <strong>not</strong> add tracing or automatic instrumentation.</p>
+    <p>Type: <span class="keyword">boolean</span> Default: <span class="this_value">false</span></p>
+    <pre class="prettyprint lang-java code"><code class="code">ConfigurationProperties.otelMetricsEnabled(boolean enabled)</code></pre>
+    <pre class="code" style="padding: 2px;"><code class="code">-Dmockserver.otelMetricsEnabled=... &nbsp; MOCKSERVER_OTEL_METRICS_ENABLED=...</code></pre>
+    <p>Export interval (seconds), default <span class="this_value">60</span>; values below 1 are clamped to 1:</p>
+    <pre class="code" style="padding: 2px;"><code class="code">-Dmockserver.otelMetricsExportIntervalSeconds=... &nbsp; MOCKSERVER_OTEL_METRICS_EXPORT_INTERVAL_SECONDS=...</code></pre>
+    <p><strong>2. GenAI span export</strong> &mdash; emit one OpenTelemetry GenAI semantic-convention span per LLM completion MockServer serves, carrying provider (<span class="inline_code">gen_ai.system</span>), model, token usage and finish reason. These are spans MockServer codes deliberately &mdash; <strong>no</strong> auto-instrumentation is added.</p>
+    <p>Type: <span class="keyword">boolean</span> Default: <span class="this_value">false</span></p>
+    <pre class="prettyprint lang-java code"><code class="code">ConfigurationProperties.otelTracesEnabled(boolean enabled)</code></pre>
+    <pre class="code" style="padding: 2px;"><code class="code">-Dmockserver.otelTracesEnabled=... &nbsp; MOCKSERVER_OTEL_TRACES_ENABLED=...</code></pre>
+    <p><strong>OTLP endpoint (shared)</strong> &mdash; the collector base URL (e.g. <span class="inline_code">http://localhost:4318</span>); the <span class="inline_code">/v1/metrics</span> and <span class="inline_code">/v1/traces</span> paths are appended per signal. An empty value uses the OTLP default (<span class="inline_code">http://localhost:4318</span>). A value that already ends in <span class="inline_code">/v1/metrics</span> or <span class="inline_code">/v1/traces</span> is accepted and normalised to the base.</p>
+    <pre class="prettyprint lang-java code"><code class="code">ConfigurationProperties.otelEndpoint(String baseUrl)</code></pre>
+    <pre class="code" style="padding: 2px;"><code class="code">-Dmockserver.otelEndpoint=... &nbsp; MOCKSERVER_OTEL_ENDPOINT=...</code></pre>
+    <p>Example (both signals to a collector):</p>
+    <pre class="code" style="padding: 2px;"><code class="code">-Dmockserver.otelMetricsEnabled="true" -Dmockserver.otelTracesEnabled="true" -Dmockserver.otelEndpoint="http://otel-collector:4318"</code></pre>
+</div>
+
 <button id="button_configuration_stream_idle_timeout_seconds" class="accordion title"><strong>Streaming Response Idle Timeout</strong></button>
 <div class="panel title">
     <p>The maximum time in seconds a streaming response connection may be idle (no chunk received from the upstream server) before MockServer closes it and logs the captured portion as truncated. This replaces the fixed global socket timeout for streaming responses, which would otherwise terminate long-lived LLM completions. The timeout resets on every chunk received, so a slow-but-active stream is never cut off prematurely.</p>
 
@@ -287,6 +287,45 @@
             <artifactId>prometheus-metrics-model</artifactId>
         </dependency>
 
+        <!-- opentelemetry (optional metrics and GenAI span export; both off by default).
+             OTLP HTTP/protobuf via the JDK HttpClient sender — okhttp sender excluded
+             to avoid pulling okhttp + kotlin-stdlib into the shaded jar. -->
+        <dependency>
+            <groupId>io.opentelemetry</groupId>
+            <artifactId>opentelemetry-api</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>io.opentelemetry</groupId>
+            <artifactId>opentelemetry-sdk-common</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>io.opentelemetry</groupId>
+            <artifactId>opentelemetry-sdk-metrics</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>io.opentelemetry</groupId>
+            <artifactId>opentelemetry-sdk-trace</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>io.opentelemetry</groupId>
+            <artifactId>opentelemetry-exporter-otlp</artifactId>
+            <exclusions>
+                <exclusion>
+                    <groupId>io.opentelemetry</groupId>
+                    <artifactId>opentelemetry-exporter-sender-okhttp</artifactId>
+                </exclusion>
+            </exclusions>
+        </dependency>
+        <dependency>
+            <groupId>io.opentelemetry</groupId>
+            <artifactId>opentelemetry-exporter-sender-jdk</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>io.opentelemetry</groupId>
+            <artifactId>opentelemetry-sdk-testing</artifactId>
+            <scope>test</scope>
+        </dependency>
+
         <!-- test -->
         <dependency>
             <groupId>junit</groupId>
 
@@ -93,6 +93,10 @@ public class ConfigurationProperties {
     private static final String MOCKSERVER_LLM_REQUEST_TIMEOUT_MILLIS = "mockserver.llmRequestTimeoutMillis";
     private static final String MOCKSERVER_FIXTURE_BODY_REDACT_FIELDS = "mockserver.fixtureBodyRedactFields";
     private static final String MOCKSERVER_LLM_VCR_STRICT = "mockserver.llmVcrStrict";
+    private static final String MOCKSERVER_OTEL_METRICS_ENABLED = "mockserver.otelMetricsEnabled";
+    private static final String MOCKSERVER_OTEL_TRACES_ENABLED = "mockserver.otelTracesEnabled";
+    private static final String MOCKSERVER_OTEL_ENDPOINT = "mockserver.otelEndpoint";
+    private static final String MOCKSERVER_OTEL_METRICS_EXPORT_INTERVAL_SECONDS = "mockserver.otelMetricsExportIntervalSeconds";
     private static final String MOCKSERVER_USE_SEMICOLON_AS_QUERY_PARAMETER_SEPARATOR = "mockserver.useSemicolonAsQueryParameterSeparator";
     private static final String MOCKSERVER_ASSUME_ALL_REQUESTS_ARE_HTTP = "mockserver.assumeAllRequestsAreHttp";
     private static final String MOCKSERVER_HTTP2_ENABLED = "mockserver.http2Enabled";
@@ -1068,6 +1072,60 @@ public static void llmVcrStrict(boolean strict) {
         setProperty(MOCKSERVER_LLM_VCR_STRICT, "" + strict);
     }
 
+    /**
+     * When true, MockServer's explicitly-defined metrics (the same gauges exposed
+     * for Prometheus) are also exported via OpenTelemetry OTLP. Off by default.
+     * This property adds no spans or auto-instrumentation; it controls metrics only.
+     */
+    public static boolean otelMetricsEnabled() {
+        return Boolean.parseBoolean(readPropertyHierarchically(PROPERTIES, MOCKSERVER_OTEL_METRICS_ENABLED, "MOCKSERVER_OTEL_METRICS_ENABLED", "" + false));
+    }
+
+    public static void otelMetricsEnabled(boolean enabled) {
+        setProperty(MOCKSERVER_OTEL_METRICS_ENABLED, "" + enabled);
+    }
+
+    /**
+     * When true, MockServer emits explicit GenAI semantic-convention spans for LLM
+     * traffic it serves (one span per completion, carrying provider, model, token
+     * usage and finish reason) via OpenTelemetry OTLP. Off by default. These are
+     * spans MockServer codes deliberately — no auto-instrumentation is added.
+     */
+    public static boolean otelTracesEnabled() {
+        return Boolean.parseBoolean(readPropertyHierarchically(PROPERTIES, MOCKSERVER_OTEL_TRACES_ENABLED, "MOCKSERVER_OTEL_TRACES_ENABLED", "" + false));
+    }
+
+    public static void otelTracesEnabled(boolean enabled) {
+        setProperty(MOCKSERVER_OTEL_TRACES_ENABLED, "" + enabled);
+    }
+
+    /**
+     * Base OTLP HTTP endpoint for the collector (e.g. {@code http://localhost:4318}).
+     * The {@code /v1/metrics} and {@code /v1/traces} paths are appended per signal.
+     * An empty value uses the OTLP exporter default ({@code http://localhost:4318}). A value
+     * that already ends in {@code /v1/metrics} or {@code /v1/traces} is accepted and
+     * normalised to the base.
+     */
+    public static String otelEndpoint() {
+        return readPropertyHierarchically(PROPERTIES, MOCKSERVER_OTEL_ENDPOINT, "MOCKSERVER_OTEL_ENDPOINT", "");
+    }
+
+    public static void otelEndpoint(String endpoint) {
+        setProperty(MOCKSERVER_OTEL_ENDPOINT, endpoint);
+    }
+
+    /**
+     * How often, in seconds, OTel metrics are exported. Default 60; values below 1 are clamped to 1.
+     */
+    public static long otelMetricsExportIntervalSeconds() {
+        // clamp to >= 1s; a zero/negative interval would make PeriodicMetricReader throw
+        return Math.max(1L, readLongProperty(MOCKSERVER_OTEL_METRICS_EXPORT_INTERVAL_SECONDS, "MOCKSERVER_OTEL_METRICS_EXPORT_INTERVAL_SECONDS", 60L));
+    }
+
+    public static void otelMetricsExportIntervalSeconds(long seconds) {
+        setProperty(MOCKSERVER_OTEL_METRICS_EXPORT_INTERVAL_SECONDS, "" + seconds);
+    }
+
     public static long regexMatchingTimeoutMillis() {
         return readLongProperty(MOCKSERVER_REGEX_MATCHING_TIMEOUT_MILLIS, "MOCKSERVER_REGEX_MATCHING_TIMEOUT_MILLIS", 5000L);
     }
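
The `otelEndpoint` Javadoc above describes a base URL that gets `/v1/metrics` or `/v1/traces` appended per signal, with an already-suffixed value normalised back to the base. The diff does not show `telemetry.OtelEndpoints`, so the following is only a minimal sketch of that documented behaviour. The class name, method names, and the trailing-slash handling are assumptions, not the PR's code.

```java
import java.util.List;

// Hypothetical sketch of the documented endpoint rules; not the PR's OtelEndpoints class.
public class OtelEndpointSketch {

    // Default OTLP/HTTP base URL, as stated in the property documentation.
    static final String DEFAULT_BASE = "http://localhost:4318";

    // Per-signal paths that a configured value may already end with.
    static final List<String> SIGNAL_PATHS = List.of("/v1/metrics", "/v1/traces");

    // Reduces a configured value to the collector base URL.
    static String normaliseBase(String configured) {
        String base = (configured == null || configured.isBlank())
            ? DEFAULT_BASE
            : configured.trim();
        // Assumption: trailing slashes are dropped so the appended path never yields "//".
        while (base.endsWith("/")) {
            base = base.substring(0, base.length() - 1);
        }
        for (String path : SIGNAL_PATHS) {
            if (base.endsWith(path)) {
                base = base.substring(0, base.length() - path.length());
                break;
            }
        }
        return base;
    }

    static String metricsUrl(String configured) {
        return normaliseBase(configured) + "/v1/metrics";
    }

    static String tracesUrl(String configured) {
        return normaliseBase(configured) + "/v1/traces";
    }

    public static void main(String[] args) {
        System.out.println(metricsUrl(""));                                      // http://localhost:4318/v1/metrics
        System.out.println(tracesUrl("http://otel-collector:4318/v1/metrics"));  // http://otel-collector:4318/v1/traces
    }
}
```

Normalising to the base first means one property can serve both signals, even when a user pastes a full per-signal URL.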