diff --git a/CHANGELOG.md b/CHANGELOG.md
index 0aac2b4..613754c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -8,6 +8,7 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
 
 ### Added
 
+- **Implicit prefix-cache wire-byte stability** (proposal 0047, spec v0.39.0). The OpenAI Chat Completions wire body is now byte-stable across equivalent OA inputs — equivalent calls produce byte-identical request bodies regardless of dict insertion order at every user-supplied-dict boundary (tool definitions including the top-level `function` record + the `parameters` JSON Schema, `response_format.json_schema.schema`, `RuntimeConfig` extras, `tool_call.arguments` JSON encoding). A new `_canonicalize_dict_keys` helper recursively sorts dict keys at every nesting level while preserving caller-supplied array ordering (the spec's split between "object keys MUST be sorted" and "array order MUST be preserved per caller-supplied order"). A top-level belt-and-suspenders canonicalization pass over the assembled body catches anything the per-field passes miss. Combined with the existing `Response.usage.cached_tokens` / `cache_creation_tokens` fields sourced from `prompt_tokens_details` (v0.12.0) and the OTel observer's `openarmature.llm.cache_read.input_tokens` + `openarmature.llm.cache_creation.input_tokens` attributes (also v0.12.0), this closes proposal 0047 end-to-end. Prompt-management §13 *Cross-variable substring stability* is satisfied by the existing Jinja2 `StrictUndefined` render path; pinned by a new test. Scope is the Chat Completions endpoint only — the OpenAI Responses API endpoint and the Anthropic / Gemini wire-format mappings are deferred (the providers aren't implemented in python today).
 - **`LlmFailedEvent` typed event variant** (proposal 0058, spec v0.53.0). Carves LLM provider failures into a spec-normatively-typed event variant alongside `LlmCompletionEvent`. 17 mirrored identity / scoping / request-side fields + 3 failure-specific fields (`error_category` always-present from the llm-provider §7 normative category enumeration; optional `error_type` for vendor-specific detail or upstream exception class name; always-present `error_message`). `OpenAIProvider.complete()` emits the typed event alongside the §7 exception on both raise paths — adapter-caught provider exceptions AND pre-send validation raises. Caller-side exception flow unchanged; the exception still raises out of `complete()`. Mutually exclusive with `LlmCompletionEvent` on the same call. Both bundled observers (OTel + Langfuse) consume `LlmFailedEvent` directly: same `openarmature.llm.complete` span / Generation shape as the success path with ERROR status / level + `openarmature.error.category` attribute (OTel) / `error_category` as statusMessage (Langfuse), `start_time` back-dated by `latency_ms` so the failure duration reflects the time-to-raise.
 
 ### Changed
diff --git a/conformance.toml b/conformance.toml
index da4501d..4d31243 100644
--- a/conformance.toml
+++ b/conformance.toml
@@ -266,11 +266,36 @@ status = "implemented"
 since = "0.11.0"
 
 # Spec v0.39.0 (proposal 0047). Implicit prefix-cache wire-byte
-# stability. Cross-provider invariant requiring intra-impl byte
-# equality across calls with equivalent inputs. Queued for v0.13.0
-# alongside 0049 (LLM provider hardening + typed event batch).
+# stability. Cross-capability proposal landed in v0.13.0 across
+# three pieces: (1) ``Response.usage`` cache-stat fields
+# (``cached_tokens`` / ``cache_creation_tokens``) sourced from the
+# OpenAI ``prompt_tokens_details`` payload, with conditional emission
+# preserved (absent-vs-zero distinction stays observable) — landed
+# in the v0.12.0 cycle as the proposal's payload-side prerequisite;
+# (2) OTel observer emits ``openarmature.llm.cache_read.input_tokens``
+# (and optional ``openarmature.llm.cache_creation.input_tokens``)
+# when the corresponding usage field is populated — also v0.12.0;
+# (3) §8.1 intra-impl wire-byte canonicalization in the OpenAI
+# adapter — landed here. The canonicalizer recursively sorts dict
+# keys at every nesting level while preserving caller-supplied
+# array order, applied at the four user-input boundaries
+# (``tool.parameters`` / ``tool.function`` record top-level per
+# spec Q5, ``response_format.json_schema.schema``, ``RuntimeConfig``
+# extras, ``tool_call.arguments`` JSON encoding) plus a top-level
+# belt-and-suspenders pass over the assembled request body. Scope
+# is the Chat Completions endpoint only; the OpenAI Responses API
+# endpoint is deferred to a future cycle (no python consumer
+# today). Prompt-management §13 cross-variable substring stability
+# is satisfied by the existing Jinja2 ``StrictUndefined`` render
+# path; pinned by ``tests/unit/test_prompts.py::
+# test_cross_variable_substring_stability_text_prompt`` and
+# ``test_cross_variable_substring_stability_chat_prompt``.
+# Anthropic / Gemini
+# wire-byte conformance fixtures stay deferred — neither provider
+# is implemented in python today.
 [proposals."0047"]
-status = "not-yet"
+status = "implemented"
+since = "0.13.0"
 
 # Spec v0.40.0 (proposal 0048). Read-symmetric invocation metadata.
 # Adds ``get_invocation_metadata()`` symmetric to the existing
diff --git a/docs/concepts/prompts.md b/docs/concepts/prompts.md
index bb99691..08f87b0 100644
--- a/docs/concepts/prompts.md
+++ b/docs/concepts/prompts.md
@@ -365,6 +365,73 @@ The filesystem backend layout is `/