Changelog

All notable changes to openarmature-python are documented in this file.

The format follows Keep a Changelog. The package follows Semantic Versioning; pre-1.0 minor bumps may carry behavioral changes per spec governance.

[Unreleased]

Fixed

  • OTel: an orphaned LLM span inside a fan-out instance now parents under the per-instance dispatch span (observability §5.5). An LLM provider span whose calling node has no open span (for example a provider call originating from middleware or a wrapper) and which fires inside a fan-out instance now parents under the per-instance fan-out dispatch span, matching the Langfuse observer, instead of falling through to the subgraph or invocation span. This resolves a divergence between the two observers. The generalized nearest-open-ancestor fallback (nested instances at any depth, and the instance-vs-deeper-subgraph ordering) is pending a spec clause and fixture; this aligns the top-level instance case now.
  • current_fan_out_index() inside fan-out instance middleware now returns the executing instance's index (and current_fan_out_index_chain() its lineage) instead of None. The engine set the fan-out lineage ContextVars per-node, inside the inner subgraph, which left them unset in instance_middleware that wraps the subgraph from outside; they are now set around the instance-middleware chain. The documented instance_middleware use (RetryMiddleware) does not read the index, so no shipped behavior changes. This corrects the value seen by custom instance middleware that reads the index or calls set_invocation_metadata.
  • Langfuse per-branch dispatch-span observation (observability §4.3 / §8.4.2, proposals 0042 / 0044). The Langfuse observer now synthesizes a per-branch Span observation under a parallel_branches dispatcher node, so each branch's inner observations nest under their own branch span (a three-level dispatcher / per-branch-span / inner-nodes tree) instead of parenting directly under the dispatcher. The per-branch observation carries the OA-emitted branch_name alongside the caller baseline metadata and any per-branch augmentation, and the Generation observation now carries branch_name too. The OTel observer already produced this shape (proposal 0044 shipped OTel-only in v0.11.0); this brings the Langfuse mapping into line. Callable branches (proposal 0075) are unchanged.
  • Augmentation no longer lands on a shared-parent fan-out / parallel-branches node (observability §3.4, proposal 0045). A key set via set_invocation_metadata inside a fan-out instance or a parallel-branches branch was incorrectly applied to the shared fan-out / dispatcher NODE span (the fork point) when the augmenting context executed at that node's own namespace, in addition to the per-instance / per-branch dispatch span where it belongs. Both the OTel and Langfuse observers now skip the shared-parent node in that case, matching the behavior already applied to strict-ancestor shared parents. The per-instance / per-branch dispatch spans and the lineage ancestors that carry the augmentation are unaffected.
  • Langfuse fan-out-instance dispatch nested below the top level (observability §5.4, proposal 0013). The Langfuse observer's per-instance dispatch synthesis and parent resolution are now prefix-general, so a fan-out node sitting inside a serial subgraph wrapper (rather than at the top namespace level) gets its per-instance dispatch observation synthesized and its inner observations parented under it. This matches the OTel observer, which already resolved across every namespace prefix.
  • Dispatch spans nested inside an outer fan-out instance no longer collide across outer instances (observability §5.4 / §3.4, proposals 0013 / 0044 / 0045). A fan-out instance dispatch or a parallel-branches per-branch dispatch sitting inside an outer fan-out instance (a fan-out within a fan-out, or parallel-branches within a fan-out) was keyed by its local namespace only, so the same dispatch in different outer instances shared one key: the second outer instance's inner nodes reparented under the first instance's dispatch, and an inner augmentation reached the wrong outer instance's dispatch. Both the OTel and Langfuse observers now key dispatches by their full enclosing fan-out instance / branch lineage, and resolve a nested node's parent by that lineage too, so each outer instance gets its own correctly-parented dispatch subtree with isolated augmentation. Top-level and serial-nested dispatch behavior is unchanged.
  • Nested fan-out no longer collapses under concurrency (engine). A fan-out nested inside an outer fan-out instance shared a single per-fan-out tracking entry across all outer instances, because that entry was keyed by namespace plus node name only. With concurrent outer instances the second instance found the first's entry already marked complete and rolled its result forward, so every outer instance returned the first instance's inner result (silently wrong output) and the inner subgraph ran only once. The tracking key now carries the enclosing fan-out instance lineage, so each outer instance gets its own inner fan-out progress and correct per-instance results. Top-level and subgraph- or branch-nested fan-outs are unaffected (their enclosing lineage is empty). Resume of a fan-out nested inside an outer fan-out instance does not yet round-trip per-outer-instance progress, so it re-runs rather than skipping on resume; tracked as a follow-up.
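
The nested fan-out collapse in the last entry above comes down to how the per-fan-out tracking entry is keyed. The following is a minimal, self-contained sketch of that idea, not the engine's code: `key_without_lineage`, `key_with_lineage`, `run_outer_instances`, and the tuple layout are hypothetical names chosen for illustration, and the outer instances run sequentially here to keep the example deterministic.

```python
from typing import Callable, Hashable

# Hypothetical stand-in for the engine's per-fan-out progress table.
Progress = dict[Hashable, str]


def key_without_lineage(namespace: tuple[str, ...], node: str, lineage: tuple[int, ...]) -> Hashable:
    # Old shape: namespace plus node name only, so every outer instance shares one key.
    return (namespace, node)


def key_with_lineage(namespace: tuple[str, ...], node: str, lineage: tuple[int, ...]) -> Hashable:
    # New shape: the enclosing fan-out instance lineage is part of the key.
    # For a top-level fan-out the lineage is empty, so the key is effectively unchanged.
    return (namespace, node, lineage)


def run_outer_instances(make_key: Callable[..., Hashable], outer_count: int) -> list[str]:
    progress: Progress = {}
    results = []
    for outer_index in range(outer_count):
        key = make_key(("outer",), "inner_fan_out", (outer_index,))
        if key not in progress:
            # The inner fan-out only runs when no completed entry exists for this key.
            progress[key] = f"inner result for outer instance {outer_index}"
        # Otherwise the existing entry is rolled forward as this instance's result.
        results.append(progress[key])
    return results


shared = run_outer_instances(key_without_lineage, 2)
isolated = run_outer_instances(key_with_lineage, 2)
```

With the namespace-only key, both outer instances receive the first instance's inner result; with the lineage in the key, each outer instance gets its own entry.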

[0.15.0] — 2026-06-22

Added

  • Detached-trace invocation span (proposal 0061, observability §4.4, spec v0.61.0). The OTel observer now synthesizes an openarmature.invocation span at the root of each detached trace (a detached subgraph and each detached fan-out instance), carrying the parent's shared invocation_id (detached mode is observer-side trace rendering, not a new run) and the detached unit's own entry_node; the detached subgraph / instance span nests under it. A raising detached subgraph surfaces ERROR plus the error category and an OTel exception event on both the parent dispatch span and the detached invocation span. This is observer-side only, with no graph-engine change; the Langfuse observer is unchanged (its Trace entity already plays the invocation-level-container role). Conformance fixtures 008 (rewritten) and 058 (newly wired) run in test_observability.
  • Per-attempt LLM spans under call-level retry (proposal 0050, observability §5.5 / llm-provider §7.1). Completes proposal 0050, which shipped partial in v0.14.0 (failure-isolation middleware and the complete(retry=...) loop landed then; the per-attempt span surface was deferred). Under call-level retry the OTel observer now emits one openarmature.llm.complete span per attempt, each carrying openarmature.llm.attempt_index (0-based, 0..N-1, and 0 for a no-retry call). An intermediate failed attempt's span carries ERROR status plus its error category and the request-side attributes; the final attempt's span carries the terminal outcome and, on success, the full response surface. A python-internal LlmRetryAttemptEvent, dispatched once per attempt, is the sole source of the OTel span; the terminal LlmCompletionEvent / LlmFailedEvent stay one per call (payload, latency, Langfuse Generation) and no longer drive the OTel span. Langfuse renders one terminal Generation per call, with the per-attempt detail on the OTel span surface only (a spec-side §8 clarification to pin this is tracked, non-blocking). conformance.toml flips proposal 0050 to implemented; the call-level fixtures 056-058 are driven through the provider plus OTel observer and the single-attempt observability fixture 057 is wired.
  • Langfuse trace.userId / trace.sessionId population (proposal 0064, observability §8.4.1, spec v0.62.0). The Langfuse observer now promotes a recognized userId key in the caller-supplied invocation metadata to Langfuse's first-class trace.userId field (the Users dashboard), additively: the key also remains at trace.metadata.userId. Promotion is automatic and unconditional; an absent key leaves trace.userId unset. The LangfuseClient.trace() surface (the Protocol, the in-memory client, and the SDK adapter) gains session_id / user_id. trace.sessionId is sourced from openarmature.session_id, which the sessions capability (proposal 0020) establishes; that capability is not yet implemented in python, so the sessionId plumbing is in place but dormant (no source) and unset in the interim. conformance.toml records proposal 0064 partial on that basis: fixture 084 cases 2/3/4 (not session-bound, userId present additively, userId absent) run, and the session-bound cases 1/5 defer until 0020. Langfuse-only: the OTel side already carries openarmature.session_id and openarmature.user.* as span attributes, and OTel has no trace-level session/user field.
  • Per-fetch prompt cache control: cache_ttl_seconds (proposal 0072, prompt-management §5 / §6, spec v0.63.0). PromptBackend.fetch, PromptManager.fetch, and PromptManager.get gain an optional cache_ttl_seconds read-side control: None preserves current behavior, 0 forces a fresh read past any client-side cache, and N > 0 bounds a served entry's staleness to N seconds; a negative value is rejected at the manager. It governs only which cached entry may be served, not whether or how results are cached. The bundled filesystem backend is cacheless and ignores it; the bundled Langfuse backend forwards it to the Langfuse SDK's get_prompt cache. Conformance fixtures 033/034 run through a caching harness backend (conformance-adapter §6.8: source_read_count plus a controllable advance_clock).
  • Failure-isolation catch gate + cause-chain classification primitive (proposal 0074, pipeline-utilities §6.3 / §6.4, spec v0.65.0). FailureIsolationMiddleware gains an optional catch: a set of error categories. An exception is caught only if the derived category of its cause chain (the outermost non-carrier link's category, resolved through the engine's node_exception carriers, the same value reported as caught_exception.category) is in the set. This closes a degrade-into-crash footgun: at a wrapping placement (subgraph, fan-out instance, branch) the engine wraps the originating failure in a carrier, so a predicate inspecting the surface exception sees only the carrier and misses it, whereas catch classifies through the carrier. catch composes with predicate as a conjunction; both default permissive (both unset stays catch-all), and a null derived category never matches a non-empty set. The carrier-skipping walk behind catch and caught_exception is promoted to a public primitive, classify_cause_chain(exc) -> CaughtException (the ordered chain, the derived category, and its message — the same record the event carries), exported from openarmature.graph for use in a custom predicate, a router, a metric, or a full-chain retry classifier. The default retry classifier stays deliberately single-level (it classifies at re-attempt granularity); this is now documented, with no behavior change. Conformance fixture 072 (catch matches through an instance-placement carrier and degrades; a non-matching catch propagates with no event). The optional native-exception-type catch form (spec MAY) is not shipped.
  • Inline-callable parallel branches and conditional when (proposal 0075, pipeline-utilities §11, spec v0.66.0). ParallelBranchesNode gains two additive branch forms. A branch may now give its work as call, an inline async function over the parent state returning a parent-shaped partial update, instead of a compiled subgraph with its own state schema and inputs / outputs projection; the returned partial is the branch's contribution directly, merged via the parent reducer with no projection. This makes the primitive adoptable for the "M heterogeneous lightweight parallel calls over shared state, each independently failure-isolated" shape (hybrid recall, paired reads) that previously dropped to a hand-rolled gather, while reusing the existing concurrency, fail-fast cancellation, per-branch failure isolation, and reducer fan-in. A branch gives its work as exactly one of subgraph / call, and a callable branch declares no inputs / outputs; violating either rule is a new compile-time error, ParallelBranchesInvalidBranchSpec. A node may mix the two forms freely. A branch (either form) may also carry an optional when predicate over the parent state, evaluated once at dispatch: a False result skips the branch entirely (no dispatch, contribution, observer events, or span), and an all-skipped node is a valid no-op distinct from the compile-time ParallelBranchesNoBranches. A callable branch is the unit of work, so it emits one started / completed observer pair keyed by branch_name (rendered as a single branch span); a skipped branch emits nothing. ParallelBranchesInvalidBranchSpec is exported from openarmature.graph. Conformance fixtures 073 (two callable branches merge to disjoint fields), 074 (conditional when skips / dispatches), and 075 (callable branch failure-isolation degrade) run in test_pipeline_utilities.
  • Tool-call request observability on LLM spans (proposal 0076, observability §5.5.1 / §5.5.10 / §5.5.5, spec v0.67.0). The tool calls a model requests in its completion now have an output-side home on the openarmature.llm.complete span, closing the gap where they surfaced only incidentally on the next turn's input history. Which tools were requested renders by default as three ungated identity projections (the class of openarmature.llm.model): openarmature.llm.output.tool_calls.count, .names, and .ids, with .names and .ids index-aligned in request order and .count equal to their length. The full request, arguments included, renders as the payload-gated openarmature.llm.output.tool_calls, a JSON [{id, name, arguments}] array reusing the input tool-call encoding, surfaced only with disable_provider_payload=False. The whole family is emitted only on a tool-calling completion; a completion that requests no tools emits none of it (absence, not count = 0). The typed LlmCompletionEvent gains an additive output_tool_calls field carrying the ToolCall records, the source the span attributes render from (in python the OTel span renders from the per-attempt LlmRetryAttemptEvent, which carries the field too). This is the request side; the tool-execution complement (a separate openarmature.tool.call span) is a later proposal, joined to this one by the ToolCall.id. A Langfuse request-side mapping is out of scope. Conformance fixtures 085 (two requested calls surface count / names / ids), 086 (no calls, family absent), and 087 (payload gating: identity survives payload-off while the full serialization is suppressed) run in test_observability.
  • OTel GenAI metrics (proposal 0067, observability §11, spec v0.68.0). The OTel observer can now emit the OpenTelemetry metrics signal alongside its spans: two histogram instruments over provider calls, opt-in via enable_metrics=True (default off, independent of span emission). openarmature.gen_ai.client.token.usage records an LLM completion's input and output token counts (one observation each, tagged openarmature.gen_ai.token.type); openarmature.gen_ai.client.operation.duration records the call's wall-clock duration, once per attempt under call-level retry, including a failed attempt (which carries error.type). Both carry openarmature.gen_ai.operation ("chat"), gen_ai.request.model, and gen_ai.system, and use the spec's explicit bucket advisories. The Meter comes from the configured MeterProvider (injectable via meter_provider=...; the OTel global is the no-op fallback when none is set). The instrument names are OA-namespaced, mirroring the upstream gen_ai.client.* instruments (at Development status) so a future cutover is a mechanical prefix-strip; metrics target OTel only (no Langfuse mapping). They are a projection of the per-attempt event stream, so they record even with spans disabled. conformance.toml records proposal 0067 partial: the LLM-call metrics (fixtures 088 / 090 / 091) are implemented, and the embedding-call metrics (fixture 089) are deferred until the embedding capability (proposal 0059) lands. The LLM fixtures run in test_observability via an in-memory MetricReader capture (the conformance-adapter §6.9 primitive).
  • Tool-execution observability (proposal 0063, graph-engine §6 + observability §5.5 / §8.4, spec v0.69.0). A model requests tools in its completion (the request side, proposal 0076); the caller executes them in node-body code, and that execution is now observable. with_tool_call(tool_name, arguments, tool_call_id=...) is a node-body instrumentation scope (a context manager, like with_active_prompt, exported from openarmature.observability): you run the tool inside it and report the outcome with scope.set_result(...). OpenArmature observes the execution and emits a typed ToolCallEvent on success or a ToolCallFailedEvent (carrying error_type / error_message, deliberately with no error_category) on a raise, then re-raises (it observes, it does not run, select, loop, or swallow). Both events carry the identity / scoping baseline plus tool_name, tool_call_id (the link back to the requesting LlmCompletionEvent.output_tool_calls entry, or None for a standalone instrumented function), arguments, latency_ms, and call_id; ToolCallEvent adds result. The OTel observer renders an openarmature.tool.call span parented under the calling node, with OA-namespace openarmature.tool.{name,call.id,call.arguments,call.result} attributes and the standard error.type on failure; the Development gen_ai.tool.* / execute_tool surface is mirrored, not emitted in v1. The Langfuse observer renders a dedicated Tool observation (asType="tool", not a Generation) under the node's Span observation, with the arguments / result as input / output and the tool name / call id in metadata, ERROR level on failure. Arguments and result are payload, gated by disable_provider_payload (no new flag); disable_llm_spans does not gate the tool span. Conformance fixtures 092-098 run in test_observability.
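
The "observes, does not run or swallow" contract of the tool-execution entry above can be illustrated with a small context manager. This is a sketch under stated assumptions, not the library's with_tool_call: `observed_tool_call`, `ToolScope`, and the dict-shaped events are hypothetical stand-ins, and only the recorded fields named in the entry (tool_name, tool_call_id, arguments, latency_ms, result, error_type, error_message) are modeled.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolScope:
    """Hypothetical scope object; the caller reports the outcome through it."""
    result: Any = None

    def set_result(self, value: Any) -> None:
        self.result = value


@contextmanager
def observed_tool_call(events: list, tool_name: str, arguments: dict,
                       tool_call_id: Optional[str] = None):
    scope = ToolScope()
    started = time.perf_counter()
    base = {"tool_name": tool_name, "tool_call_id": tool_call_id, "arguments": arguments}
    try:
        yield scope  # the caller runs the tool inside the with-block
    except Exception as exc:  # assumption: this sketch observes Exception only
        latency_ms = (time.perf_counter() - started) * 1000.0
        # Failure record: error_type / error_message, deliberately no error_category.
        events.append({**base, "kind": "failed", "latency_ms": latency_ms,
                       "error_type": type(exc).__name__, "error_message": str(exc)})
        raise  # observe only: the original exception still reaches the caller
    else:
        latency_ms = (time.perf_counter() - started) * 1000.0
        events.append({**base, "kind": "ok", "latency_ms": latency_ms,
                       "result": scope.result})


events: list = []
with observed_tool_call(events, "add", {"a": 1, "b": 2}, tool_call_id="call_1") as scope:
    scope.set_result(1 + 2)
```

A standalone instrumented function would simply omit tool_call_id, leaving it None on the recorded event.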

Changed

  • Pinned spec advances v0.60.0 → v0.70.1 across the v0.15.0 cycle: v0.61.0 (proposal 0061, the detached-trace invocation span above); v0.62.0 (proposal 0064, the Langfuse session/user population above); v0.63.0 (proposal 0072, the prompt cache control above); the v0.63.1 patch (pipeline-utilities coverage fixtures 070/071 for the already-implemented 0069 / 0070 behavior, no new proposal); v0.64.0 (proposal 0073, GenAI semconv adoption reconciliation: OA retains gen_ai.system despite the upstream rename to gen_ai.provider.name; textual-only, with no emitted-attribute or fixture change, so the existing gen_ai.* fixtures stand as the retention regression check); v0.65.0 (proposal 0074, the failure-isolation catch gate above); v0.66.0 (proposal 0075, the inline-callable parallel branches and conditional when above); the v0.66.1 patch (an observability §8 call-level-retry Langfuse-mapping clarification reconciling §8 with the per-attempt §5.5 spans: one terminal Generation per complete() call, not one per attempt, which the Langfuse observer already renders by driving the Generation from the terminal LlmCompletionEvent / LlmFailedEvent and skipping the per-attempt LlmRetryAttemptEvent; no behavior or fixture change); v0.67.0 (proposal 0076, the tool-call request observability above); v0.68.0 (proposal 0067, the OTel GenAI metrics above); v0.69.0 (proposal 0063, the tool-execution observability above); the v0.70.0 step (proposal 0060, retrieval-provider rerank, not implemented in python so it rides as not-yet); and the v0.70.1 patch (observability conformance fixture 110, pinning the already-shipped proposal 0075 callable-branch span shape; conformance coverage only, no behavior change).
conformance.toml records 0061 / 0072 / 0074 / 0075 / 0076 / 0063 implemented, 0064 partial (its sessionId half is dormant pending the sessions capability) and 0067 partial (its embedding-call metrics await the embedding capability), 0073 textual-only, and 0060 not-yet (its 11 rerank fixtures 099-109 defer with it). Proposal 0050 needed no pin bump of its own (it was already within the pin from its v0.42.0 acceptance); its v0.14.0 partial entry flips to implemented with the per-attempt span surface above.

[0.14.0] — 2026-06-17

Added

  • FailureIsolationMiddleware (proposal 0050, pipeline-utilities §6.3). A third bundled middleware primitive alongside RetryMiddleware and TimingMiddleware. It catches exceptions escaping the wrapped node's inner chain and returns a configured degraded partial update, so a non-critical node can fail without aborting the whole invocation. Configuration: degraded_update (a static mapping or a state -> partial_update callable, resolved at catch time), event_name (required, no default, since a generic name makes downstream telemetry strictly worse), an optional predicate (Exception -> bool; only matching exceptions are caught, others propagate), and an optional async on_caught hook. It catches Exception; BaseException (cancellation) propagates, matching RetryMiddleware. On a catch it dispatches a new framework-emitted FailureIsolatedEvent (a distinct observer-event variant carrying event_name, the wrapped node's lineage identity, pre_state / post_state, and a CaughtException record of category plus message) onto the observer delivery queue; the bundled OTel and Langfuse observers render it as a marker span / observation. Compose it OUTER of RetryMiddleware for the "retry transients, degrade gracefully on exhaustion" pattern; in that composition FailureIsolatedEvent.attempt_index reports the wrapped node's final (exhausting) attempt rather than the post-retry-reset baseline. Additive: existing pipelines see no behavior change, and 0050 itself needed no pin bump (it was already within the v0.53.0 pin the cycle started from).
  • Call-level retry on Provider.complete() (proposal 0050, llm-provider §7). The provider's complete() gains an optional retry: RetryConfig | None parameter. When supplied, the wire call is retried in-call on transient provider errors per the config (classifier, backoff, on_retry, max_attempts), so a node issuing several LLM calls in a loop does not re-run the already-successful calls when a later call hits a transient failure. The request is built and validated once (pre-send validation errors are never retried), and the call stays terminal-only on the observability surface: exactly one LlmCompletionEvent (eventual success) or LlmFailedEvent (retry exhaustion or a non-transient error) fires per complete() call, with a single call_id shared across attempts. The per-attempt span surface (N per-attempt spans and the openarmature.llm.attempt_index attribute) is deferred to a future cycle; conformance.toml marks proposal 0050 partial accordingly; 0050 needed no pin bump of its own.
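
The terminal-only contract of call-level retry described above (several attempts, exactly one completion or failure event per call) can be sketched as a plain loop. This is an illustrative, synchronous sketch with hypothetical names (`SketchRetryConfig`, `complete_with_retry`, `TransientError`); it omits backoff and on_retry and is not the provider's implementation.

```python
from dataclasses import dataclass
from typing import Callable


class TransientError(Exception):
    """Stand-in for a transient provider error."""


def is_transient(exc: Exception) -> bool:
    return isinstance(exc, TransientError)


@dataclass(frozen=True)
class SketchRetryConfig:
    # Assumed shape for illustration; max_attempts is expected to be >= 1.
    max_attempts: int = 3
    classifier: Callable[[Exception], bool] = is_transient


def complete_with_retry(send: Callable[[], str], retry: SketchRetryConfig,
                        events: list) -> str:
    for attempt_index in range(retry.max_attempts):
        try:
            response = send()
        except Exception as exc:
            is_last = attempt_index == retry.max_attempts - 1
            if is_last or not retry.classifier(exc):
                # Retry exhaustion or a non-transient error: one terminal failure event.
                events.append("failed")
                raise
            continue  # transient and attempts remain: retry in-call, no event yet
        # Eventual success: one terminal completion event for the whole call.
        events.append("completed")
        return response
    raise ValueError("max_attempts must be at least 1")


attempts = {"n": 0}


def flaky_send() -> str:
    # Fails twice with a transient error, then succeeds.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("try again")
    return "ok"


call_events: list = []
answer = complete_with_retry(flaky_send, SketchRetryConfig(max_attempts=3), call_events)
```

Here the wire call runs three times but the event list holds a single "completed" entry, which is the shape the entry describes for v0.14.0.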

Changed

  • RetryMiddleware now takes a RetryConfig record instead of individual constructor kwargs (proposal 0050 prep). The four retry settings (max_attempts / classifier / backoff / on_retry, each optional) move onto a frozen RetryConfig; construct as RetryMiddleware(RetryConfig(max_attempts=...)), while bare RetryMiddleware() still applies the defaults. This is a breaking change to the RetryMiddleware constructor. The record is the same shape the upcoming call-level complete(retry=...) parameter will accept, so one retry config serves both the per-node and per-call layers. None fields resolve to the canonical defaults (default_classifier / exponential_jitter_backoff) at use, preserving the prior behavior.
  • Failure-isolation events report the originating cause's category at non-node placements (proposal 0065, pipeline-utilities §6.3). When FailureIsolationMiddleware runs as instance middleware (§9.7), branch middleware (§11.7), or parent-node middleware on a fan-out / parallel-branches node, the graph engine has already wrapped the originating error as a node_exception carrier before the middleware catches it. FailureIsolatedEvent.caught_exception.category now resolves through that carrier (and any nested carriers) to the nearest categorized originating cause and reports its category instead of the masking node_exception, so the reported category agrees with what the §6.1 retry classifier acted on. For example, an instance whose retries exhaust on provider_unavailable now surfaces provider_unavailable rather than node_exception. The message tracks the resolved cause for category/message coherence. Node-level placement was already faithful and is unchanged, and catch/degrade behavior is unchanged at every site (only the event's reported cause changes). The spec's SHOULD-level wrapped-instance/branch lineage (fan_out_index / branch_name) is deferred to a follow-up, since it needs the engine to surface per-instance identity to the wrapping-site middleware. conformance.toml marks proposal 0065 implemented, and conformance fixture 064 (three cases: the §9.7 instance and §11.7 branch sites plus an uncategorized cause) passes.
  • Observer privacy flag disable_llm_payload renamed to disable_provider_payload (proposal 0059, observability §5.5.4, spec v0.54.0). The observer-level flag on both bundled observers (OTelObserver and LangfuseObserver) is renamed, and its scope broadens from LLM-completion payload to any provider-call payload (LLM completion today; embedding and rerank when those land). This is a breaking change to both observer constructors: config passing disable_llm_payload=True (or False) updates to disable_provider_payload=... with no other change. The default stays True (payload suppressed), and the gating behavior for LlmCompletionEvent / LlmFailedEvent rendering is unchanged at every existing site. The rename is the only part of proposal 0059 adopted this cycle: the retrieval-provider capability itself (the EmbeddingProvider protocol, the EmbeddingEvent / EmbeddingFailedEvent typed variants, and the embedding span / observation mapping) is not yet implemented and rides as not-yet in conformance.toml. The §5.5.4 rename touches existing LLM-payload gating, so it lands with the pin.
  • Fan-out failure-isolation degrade contribution implemented (proposal 0066, pipeline-utilities §9.3 / §9.8 / §11.7, spec v0.56.0). When FailureIsolationMiddleware degrades a fan-out instance, that instance is a success whose contribution is its degraded_update, read in subgraph-field-name space and never merged onto the failed instance's pre-failure state. This also fixes a latent bug: an instance degraded_update's extra_outputs values were previously looked up by the parent field name and silently dropped (collect_field was unaffected). A static degraded_update that omits the node's collect_field is now a compile-time error (FanOutDegradedUpdateMissingCollectField); a callable degraded_update that omits it yields a graceful null slot rather than raising, preserving one collection slot per item. The parallel-branches counterpart (a branch degraded_update omitting a projected outputs field skips that field) was already correct as of the parallel-branches fix listed under Fixed below and is now pinned by fixture 065. Success-path and resume behavior for correctly-configured fan-outs is unchanged.
  • Failure-isolation events carry the full structured cause chain (proposal 0068, pipeline-utilities §6.3, spec v0.57.0). FailureIsolatedEvent.caught_exception gains a chain: an ordered list of CauseLink records (each carrying category, message, and a carrier flag), from the caught exception (outermost) to the originating raise (innermost), with graph-engine node_exception carrier wrappers flagged carrier=True. The existing category and message are retained and redefined as a derivation over the chain: the category of the outermost non-carrier link whose category is a non-empty string (else category is null and message is the outermost non-carrier link's message). This supersedes proposal 0065's single "originating cause" representation, which was ambiguous once the post-carrier chain held more than one non-carrier link; the derivation reproduces 0065's single-carrier values, so fixture 064 is unchanged. A new CauseLink type is exported from openarmature.graph. The bundled OTel and Langfuse observers continue to render the derived category; surfacing the full chain is left to custom observers. The change is additive to the event shape, and catch/degrade behavior is unchanged. Conformance fixture 066 (three cases: an instance-site carrier chain, a node-level single non-carrier link, and an uncategorized null-category cause) passes.
  • Pinned spec advances v0.53.0 → v0.60.0 across the v0.14.0 cycle, in seven steps: v0.54.0 (proposal 0059, the observer-flag rename above), v0.55.1 (proposal 0065 above; the v0.55.1 patch also carries an observability §11 span-links text reconciliation that narrows an Out of scope bullet, with no python-observable change), v0.56.0 (proposal 0066, the fan-out degrade contribution above), v0.57.0 (proposal 0068, the failure-isolation cause chain above), v0.58.0 (proposal 0070, conformance-adapter crash-injection and cause-chaining test vocabulary: a crash_injection directive and a recursive mock cause, with conformance fixtures 067 and 068, no library behavior change), v0.59.0 (proposal 0069, fan-out degrade contribution refinements to 0066: an omitted extra_outputs source is a positional null slot, an absent collect_field is a null slot the fan-in does not raise on except under a strict-element reducer, and a degraded slot survives resume; python already satisfied these, so the change is conformance coverage via fixture 069 plus a strict-reducer unit test, no library behavior change), and v0.60.0 (proposal 0071, conformance-adapter failure-mock directive catalog: a descriptive §5.1 catalog of the flaky* family the adapter already implements, no new fixtures and no code change). conformance.toml records 0065, 0066, 0068, 0070, and 0069 as implemented, 0071 as textual-only, and 0059 as not-yet (only its cross-spec flag rename was adopted).
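
The category / message derivation over the cause chain (the proposal 0068 entry above) is compact enough to sketch directly. The following is an illustrative reading of that rule, not the library's code: `Link` and `derive` are hypothetical names, and returning (None, None) for a chain with no non-carrier link is an assumption the entry does not cover.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    """Hypothetical stand-in for one cause-chain record."""
    category: Optional[str]
    message: str
    carrier: bool = False


def derive(chain: list) -> tuple:
    """Return (category, message) for a chain ordered outermost to innermost."""
    non_carriers = [link for link in chain if not link.carrier]
    for link in non_carriers:
        if link.category:  # a non-empty string category
            return link.category, link.message
    if non_carriers:
        # No categorized link: null category, outermost non-carrier message.
        return None, non_carriers[0].message
    return None, None  # assumption: nothing but carriers in the chain


instance_chain = [
    Link("node_exception", "instance failed", carrier=True),
    Link("provider_unavailable", "upstream unavailable"),
]
derived = derive(instance_chain)
```

For a single carrier wrapping one categorized cause, this yields the same value the earlier single "originating cause" representation reported, which is why the entry notes fixture 064 is unchanged.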

Fixed

  • Parallel-branches branch middleware now runs in the branch subgraph's state space (pipeline-utilities §11.7). Branch middleware wraps the branch's subgraph invocation, so a middleware that short-circuits with a subgraph-space partial update (notably FailureIsolationMiddleware's degraded_update) is now projected to the parent through the branch's outputs mapping, exactly like a real subgraph result. Previously the outputs projection ran inside the middleware chain, so a branch-level degraded_update written in the subgraph's fields reached the parent state unprojected and tripped extra-field validation. The bug was latent because the only bundled branch middleware exercised until now was RetryMiddleware, which re-invokes the chain rather than returning a cross-space update; it surfaces with failure isolation at a branch placement. A degraded_update that does not cover a projected outputs field contributes nothing for that field (the parent keeps its prior value) rather than raising, consistent with the §11.4 buffer-then-merge model for partial contributions. The success path, fan-out instance middleware (which already operated in subgraph space), and node-level placement are unchanged.
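
The ordering change in the fix above (middleware chain first, outputs projection afterwards) can be sketched as follows. All names here are hypothetical (`project_to_parent`, `degrade_on_error`, `run_branch`), the field names are invented for the example, and the mapping direction (parent field name to subgraph field name) is an assumption made for illustration only.

```python
from typing import Callable


def project_to_parent(subgraph_update: dict, outputs: dict) -> dict:
    # outputs: parent field name -> subgraph field name (assumed direction).
    # A subgraph field missing from the update contributes nothing for that
    # parent field, so the parent keeps its prior value.
    return {parent: subgraph_update[sub]
            for parent, sub in outputs.items() if sub in subgraph_update}


def degrade_on_error(degraded_update: dict) -> Callable:
    # Minimal stand-in for a failure-isolating middleware that short-circuits
    # with a partial update written in the subgraph's field names.
    def middleware(call_next: Callable[[], dict]) -> dict:
        try:
            return call_next()
        except Exception:
            return dict(degraded_update)
    return middleware


def run_branch(invoke_subgraph: Callable[[], dict], middleware: Callable,
               outputs: dict) -> dict:
    # The middleware wraps the subgraph invocation and works in subgraph space;
    # projection to the parent happens only after the chain returns.
    subgraph_update = middleware(invoke_subgraph)
    return project_to_parent(subgraph_update, outputs)


def failing_subgraph() -> dict:
    raise RuntimeError("branch failed")


parent_update = run_branch(
    failing_subgraph,
    degrade_on_error({"summary_text": "n/a"}),
    outputs={"summary": "summary_text", "score": "score_value"},
)
```

The degraded update reaches the parent under the parent's field name, and the uncovered score field is simply absent from the contribution.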

[0.13.0] — 2026-06-09

LLM provider hardening release. The pinned spec advances from v0.46.0 to v0.53.0, absorbing four implemented proposals. Proposal 0049 introduces the first spec-normatively-typed observer event variant, LlmCompletionEvent, dispatched on every successful LLM provider call; proposal 0058 adds the failure-side counterpart, LlmFailedEvent; proposal 0057 extends the completion variant with eight request-side fields. The bundled OpenAIProvider retires its sentinel-namespace NodeEvent emission for LLM calls entirely, and the OTel and Langfuse observers now drive their LLM span / Generation from the typed events with back-dated timestamps so durations reflect the adapter boundary. Proposal 0047 closes implicit prefix-cache wire-byte stability: Response.usage gains cache-stat fields, the OTel observer emits openarmature.llm.cache_read attributes, and the OpenAI Chat Completions request body is byte-stable across equivalent inputs regardless of dict insertion order. Custom observers that filtered LLM calls by sentinel namespace MUST migrate to isinstance discrimination; LLM_NAMESPACE and LlmEventPayload remain as a documented compatibility surface.

Added

  • Implicit prefix-cache wire-byte stability (proposal 0047, spec v0.39.0). Closes proposal 0047 end-to-end across three pieces all landing in v0.13.0: (1) Response.usage.cached_tokens / cache_creation_tokens fields sourced from the OpenAI prompt_tokens_details payload (PR #136); (2) the OTel observer emits openarmature.llm.cache_read.input_tokens and optional openarmature.llm.cache_creation.input_tokens when the corresponding usage field is populated (PR #140); (3) the OpenAI Chat Completions wire body is now byte-stable across equivalent OA inputs — equivalent calls produce byte-identical request bodies regardless of dict insertion order at every user-supplied-dict boundary (tool definitions including the top-level function record + the parameters JSON Schema, response_format.json_schema.schema, RuntimeConfig extras, tool_call.arguments JSON encoding) via a new _canonicalize_dict_keys helper that recursively sorts dict keys at every nesting level while preserving caller-supplied array ordering, plus a top-level belt-and-suspenders canonicalization pass over the assembled body (PR #145). Prompt-management §13 Cross-variable substring stability is satisfied by the existing Jinja2 StrictUndefined render path; pinned by a new test. Scope is the Chat Completions endpoint only — the OpenAI Responses API endpoint and the Anthropic / Gemini wire-format mappings are deferred (the providers aren't implemented in python today).
  • LlmFailedEvent typed event variant (proposal 0058, spec v0.53.0). Carves LLM provider failures into a spec-normatively-typed event variant alongside LlmCompletionEvent. 17 mirrored identity / scoping / request-side fields + 3 failure-specific fields (error_category always-present from the llm-provider §7 normative category enumeration; optional error_type for vendor-specific detail or upstream exception class name; always-present error_message). OpenAIProvider.complete() emits the typed event alongside the §7 exception on both raise paths — adapter-caught provider exceptions AND pre-send validation raises. Caller-side exception flow unchanged; the exception still raises out of complete(). Mutually exclusive with LlmCompletionEvent on the same call. Both bundled observers (OTel + Langfuse) consume LlmFailedEvent directly: same openarmature.llm.complete span / Generation shape as the success path with ERROR status / level + openarmature.error.category attribute (OTel) / error_category as statusMessage (Langfuse), start_time back-dated by latency_ms so the failure duration reflects the time-to-raise.
  • LlmCompletionEvent extended with proposal 0057 request-side fields (spec v0.51.0). The typed event now carries input_messages, output_content, request_params, request_extras, active_prompt, active_prompt_group, call_id, and response_model alongside the existing v0.49.0 fields. request_id is renamed to response_id per the proposal's response-side naming. Inline image bytes in input_messages stay redacted per observability §5.5.5; the OpenAI provider reuses the existing message-serialization helper for the projection. Observer-side privacy gates (OTel disable_llm_payload, Langfuse equivalents) apply at rendering, symmetric with the §5.5.1 span attribute path.
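The byte-stability idea behind the proposal 0047 entry (recursively sorted dict keys, caller-supplied array order preserved) can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the package's `_canonicalize_dict_keys` source: the function names, the compact JSON separators, and the sample request bodies are all invented for the example.

```python
import json
from typing import Any


def canonicalize_dict_keys(value: Any) -> Any:
    """Recursively sort dict keys; keep list order exactly as supplied.

    Assumes JSON-style dicts whose keys are all strings.
    """
    if isinstance(value, dict):
        return {key: canonicalize_dict_keys(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        return [canonicalize_dict_keys(item) for item in value]
    return value


def encode_body(body: dict) -> bytes:
    """Encode a request body deterministically (separators are an assumption)."""
    canonical = canonicalize_dict_keys(body)
    return json.dumps(canonical, separators=(",", ":"), sort_keys=True).encode("utf-8")


# Two equivalent bodies built with different dict insertion order.
body_a = {
    "model": "m",
    "tools": [{"name": "x", "parameters": {"type": "object", "required": ["b", "a"]}}],
}
body_b = {
    "tools": [{"parameters": {"required": ["b", "a"], "type": "object"}, "name": "x"}],
    "model": "m",
}

same_bytes = encode_body(body_a) == encode_body(body_b)
print(same_bytes)  # True
```

List order is deliberately left alone: reordering `required` would change the bytes, because array ordering is caller-supplied meaning rather than incidental insertion order.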

Changed

  • Sentinel-namespace NodeEvent emission for LLM events retired entirely from OpenAIProvider (proposal 0058 cleanup). The provider no longer dispatches the ("openarmature.llm.complete",)-namespaced NodeEvents on either outcome path; both success and failure flow through their respective typed variants exclusively. The _make_llm_event helper is removed. External custom observers that filtered LLM calls by event.namespace == LLM_NAMESPACE MUST migrate to isinstance(event, LlmCompletionEvent) for success and isinstance(event, LlmFailedEvent) for failure to keep receiving LLM-call notifications. LlmEventPayload and LLM_NAMESPACE remain in openarmature.observability.llm_event as a documented compatibility surface for custom providers that haven't migrated; neither is referenced by the bundled provider or observers anymore.
  • Pinned spec advances from v0.46.0 to v0.53.0 across the v0.13.0 cycle. Absorbs four implemented proposals (0047 — implicit prefix-cache wire-byte stability; 0049 — typed LlmCompletionEvent; 0057 — LlmCompletionEvent request-side field-set extension; 0058 — typed LlmFailedEvent) plus 0023 (canonical state reducers, v0.52.0) carried as not-yet in the manifest. Pin journey: v0.46.0 → v0.51.0 (PR #141 absorbs 0057) → v0.53.0 (PR #144 absorbs 0058; spec v0.52.0's 0023 entry rides along as not-yet). Fixtures 034–038 (0023) stay parser-deferred.
  • tool_call.arguments JSON encoding now uses sort_keys=True (proposal 0047 §8 byte-stability requirement for caller-supplied dicts JSON-encoded into a string field). Functionally equivalent — the encoded string parses to the same dict — but byte-different from the previous insertion-order encoding. Downstream consumers that snapshot wire bodies (golden-file tests, audit logging, recorded fixtures) will see byte-different tool_calls[].function.arguments strings across this upgrade for any call whose argument dict was emitted in non-sorted insertion order before.
  • OTel and Langfuse observers drive the openarmature.llm.complete span / Generation observation lifecycle from the typed LlmCompletionEvent (proposal 0049 + 0057, observability §5.5.7). Successful LLM-provider calls now open + close the OTel span and the Langfuse Generation in one shot at typed-event arrival, with start_time back-dated by LlmCompletionEvent.latency_ms so duration reflects the adapter-boundary measurement rather than dispatcher queue delay. The §5.5 attribute set and §8.4 Generation metadata are unchanged. (Failure paths land on LlmFailedEvent later in the same cycle — see the proposal 0058 entry above.)
  • OpenAIProvider.complete() no longer emits the sentinel NodeEvent pair on the success path (v0.13.0 cleanup). The bundled OTel and Langfuse observers now consume the typed LlmCompletionEvent directly; the sentinel pair was kept on the success path through earlier releases for compatibility with pre-typed-event observers. External custom observers that filtered LLM calls by event.namespace == LLM_NAMESPACE MUST migrate to isinstance(event, LlmCompletionEvent) to continue seeing successful LLM calls. (The failure-path sentinel emission is retired entirely later in the same cycle — see the proposal 0058 entry above.)
  • LangfuseClient Protocol gains optional start_time / end_time timestamps on generation(...) and the Generation/Span handles' end(...). The Langfuse observer passes back-dated timestamps on the typed-event success path so the Langfuse UI shows the actual adapter-boundary duration. The SDK adapter handles v4 Langfuse SDK quirks transparently: Langfuse.start_observation() does NOT accept start_time, so back-dated generations are routed through the private _otel_tracer.start_span(name=..., start_time=int_ns) API (mirroring the SDK's own create_event precedent) and the resulting OTel span is wrapped in LangfuseGeneration directly; the non-back-dated path still uses start_observation. LangfuseSpan.end() is typed Optional[int] (nanoseconds), so the adapter converts the Protocol's datetime surface to int nanoseconds before forwarding. The InMemoryLangfuseClient stores both fields verbatim on LangfuseObservation for test assertions.
  • OpenAIProvider(populate_caller_metadata=...) default flipped from False to True. The python implementation now populates LlmCompletionEvent.caller_invocation_metadata by default so the bundled OTel and Langfuse observers can emit the §5.6 openarmature.user.<key> span-attribute family without a separate opt-in. Pass populate_caller_metadata=False to suppress the snapshot when no downstream consumer needs it. The spec-defined opt-in mechanism is unchanged; only the python default flips.
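The back-dating arithmetic used by the observers (start time derived from the event's arrival time minus `latency_ms`, then converted to integer nanoseconds for an SDK that expects them) looks roughly like the following. The helper names are hypothetical and the timestamps are made up; only the arithmetic is the point.

```python
from datetime import datetime, timedelta, timezone


def back_dated_start(end_time: datetime, latency_ms: float) -> datetime:
    """Derive the span start from the arrival time and the measured latency."""
    return end_time - timedelta(milliseconds=latency_ms)


def to_unix_nanos(moment: datetime) -> int:
    """Convert an aware datetime to integer nanoseconds since the Unix epoch."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = moment - epoch
    # Integer arithmetic avoids the float rounding of timestamp() * 1e9.
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


end_time = datetime(2026, 6, 9, 12, 0, 0, tzinfo=timezone.utc)
start_time = back_dated_start(end_time, latency_ms=250.0)
duration_ns = to_unix_nanos(end_time) - to_unix_nanos(start_time)

print(start_time.isoformat())  # 2026-06-09T11:59:59.750000+00:00
print(duration_ns)  # 250000000
```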

[0.12.0] — 2026-06-05

Observability release. The pinned spec advances from v0.38.0 to v0.46.0, absorbing eight accepted proposals (0047-0054). Three ship as fully implemented this cycle: proposal 0048 grows a read-symmetric get_invocation_metadata() API + a §9 Queryable observer pattern concept doc section; proposal 0052 puts openarmature.implementation.name + .version attribution attributes on every OTel invocation span + every Langfuse Trace; proposal 0054 ships CompiledGraph.drain_events_for(invocation_id, *, timeout) as the architectural pair to 0048's §9.4 accumulator lifecycle. Two ship as textual-only acks (0051 Langfuse trace I/O caveat; 0053 §3.4 shared-parent boundary clarification). One Fixed: the retry middleware now resets the invocation-metadata ContextVar between attempts per §3.4. The production-observability example grows the queryable accumulator + drain_events_for pattern end-to-end so the new APIs have a runnable demo.

Changed

  • Pinned spec advanced from v0.38.0 to v0.46.0. Submodule + [tool.openarmature].spec_version + conformance.toml spec_pin advance together. Absorbs eight new proposals (0047-0054) into the conformance manifest. Two ship as textual-only acknowledgments with no code change required: proposal 0051 (observability §8.4.1 Langfuse trace.input / trace.output implementation-surface caveat — documents that vendor SDK round-trip is required to project caller-side trace I/O updates onto the wire; the v0.11.0 (proposal 0043) caller-hook shape already matches the documented behavior) and proposal 0053 (observability §3.4 shared-parent boundary clarification — tightens the structural-shared-parent classification to predicate the invocation span on whether at least one fan-out or parallel-branches dispatch is on the augmenter's call-stack path; behavior already matches via fixtures 034 + 039). Three ship as fully implemented this cycle: proposal 0048 (read-symmetric metadata + queryable observer pattern docs — see Added below), proposal 0052 (implementation attribution attributes — see Added below), and proposal 0054 (per-invocation observer event drain — see Added below; bundled with 0048 as the §9.4 accumulator-lifecycle pair). The remaining proposals are marked not-yet in the conformance manifest with roadmap targets: 0047 + 0049 (v0.13.0 LLM provider hardening batch) and 0050 (v0.14.0 retry & reliability batch).
  • README and docs homepage refreshed around reasons-to-choose. Replaced the 10-bullet "Why OpenArmature" feature inventory in README.md with 5 differentiating reasons (LLM-infused workflows to agents on one engine; crash-safe resume by contract; destination-pluggable observability with OTel + Langfuse, no SaaS lock-in; compile-time topology checks; spec + conformance). The docs homepage (docs/index.md) card grid carries the same five plus a sixth card retained from the previous grid for async-first / LLM-agnostic: workflows-to-agents, crash-safe, pluggable observability, bad-graphs-don't-compile, parallelism (fan-out + parallel-branches + nested correctness), async-first.
  • Docs sweep: stale references and em-dash normalization. Fixed three definite stale references (spec_version='0.26.0' in the Langfuse example output now reads '0.38.0'; the dangling v0.16.1 qualifier dropped from the parallel-branches concept page; compiled.attach_observer corrected to graph.attach_observer in non-obvious-shapes.md for variable-name consistency with the rest of the docs). Swept em dashes out of the user-facing docs (130 instances across 17 files) per the convention set during the patterns expansion. mkdocs strict build clean; no broken intra-docs links.
  • The checkpointing-and-migration example grows a crash-and-resume drama. The first invoke of the v1 graph now hits a simulated transient failure inside size_crew (raises a RuntimeError on its first attempt only). The example catches NodeException at the invoke() boundary, prints what's saved on disk (define_objective's position is already in completed_positions), then re-invokes with resume_invocation=<id>. The retried size_crew succeeds, draft_timeline runs, and the pipeline finishes - dramatizing the synchronous-checkpoint-by-contract reliability claim from the README pitch. The existing v1->v2 migration phase rides on top of the crash-survived checkpoint, so both reliability stories compose in one demo. Walk-through doc rewritten to cover both phases.
  • Examples renamed and catalog reorganized by topic. All 13 example directories drop their numeric prefixes (examples/00-hello-world/ -> examples/hello-world/, etc.); the corresponding docs/examples/NN-name.md files do the same. The catalog at docs/examples/index.md and the mkdocs nav are regrouped into seven topical sections: Foundations, Composition, Concurrency, Prompts, Tool use, Reliability, Observability. Catalog entries and walk-through H1s drop the number prefix; cross-references between examples are rewritten by-name (no more "Example 03"). The examples/README.md is brought up to date with the post-v0.11.0 catalog (entries for chat-with-multimodal, langfuse-observability, production-observability now present) under the same grouping. tests/test_examples_smoke.py's DEMOS list updated; the pytest parametrize IDs follow the new names. Em dashes scrubbed from the example sources and examples/README.md to match the convention established in the v0.11.0-cycle docs sweep.

Added

  • Implementation attribution attributes (proposal 0052, observability §5.1 + §8.4.1, spec v0.44.0). Every OTel invocation span now carries openarmature.implementation.name ("openarmature-python") and openarmature.implementation.version (the package's __version__) alongside the existing openarmature.graph.spec_version. Every Langfuse Trace mirrors the rows as trace.metadata.implementation_name / trace.metadata.implementation_version. The values let operators triage in their observability backend without a separate deployment-manifest lookup: "which library, at which version, produced this trace" — the first question operators reach for. Always-emit invariant: neither disable_state_payload, disable_llm_payload, nor any other privacy knob gates these attributes, since they describe runtime identity rather than runtime data. Both observers expose implementation_name and implementation_version dataclass fields for test parameterization; the defaults read from the package identity via the same lazy-import pattern as spec_version. A new openarmature.__implementation_name__ = "openarmature-python" constant joins __version__ and __spec_version__ at the package root. The §3.4 reserved-key set grows from 24 to 26 names — implementation_name and implementation_version are reserved against caller-supplied collision, so a caller passing invocation_metadata={"implementation_name": "spoof"} is rejected at the invoke() boundary with ValueError.
  • CompiledGraph.drain_events_for(invocation_id, *, timeout=5.0) -> DrainSummary (proposal 0054, spec graph-engine §6 Per-invocation drain, v0.46.0). The architectural pair to proposal 0048's §9.4 queryable observer accumulator lifecycle: a terminal node calling await graph.drain_events_for(state.invocation_id) blocks until every event dispatched for that invocation has reached every attached observer, typically followed by a read against a queryable observer accumulator whose bucket the drain has now caught up to. Snapshot semantic: the drain awaits the events dispatched as of call time; new emissions after the call are out of scope. Reuses the existing DrainSummary shape verbatim — no new InvocationDrainSummary variant. Load-bearing divergence from drain(): a per-invocation drain timeout MUST NOT cancel the delivery worker, in contrast to drain()'s shutdown semantics. The graph stays serving other invocations after the timeout fires; the deliver loop keeps processing the queue. Default timeout is 5.0 seconds; None waits indefinitely; 0.0 is a non-blocking check. Negative or NaN timeout raises ValueError at the API boundary. Unknown invocation_id (already drained or never started) returns an empty summary, not an error.
  • get_invocation_metadata() read-symmetric API (proposal 0048, observability §3.4, spec v0.40.0). The canonical spec-idiomatic public name for the §3.4 read access pairs with set_invocation_metadata() on the write side: same function object as the historical current_invocation_metadata, exposed for callers wishing to use the symmetric get_/set_ naming. Returns the MappingProxyType snapshot of the current async context's view (caller baseline + in-node augments), or the empty mapping outside any active invocation. Read-only — callers MUST NOT mutate it. Both names are now exported from openarmature.observability; existing current_invocation_metadata callers continue to work unchanged.
  • docs/concepts/observability.md §9 Queryable observer pattern documents the convention-only observer-attached read methods that proposal 0048 §9 blesses: how to add a get_* read method to a custom observer (§9.1), the async-safety contract for concurrent reads under in-flight delivery (§9.2), the three-channel data-access guidance (typed State / untyped invocation metadata / queryable observer accumulator, §9.3, with a side-by-side table), and the lifecycle / explicit drop(invocation_id) discipline (§9.4). No new abstract surface on Observer per the spec — the pattern is convention-only and exists to bless the existing observer-state read shape used in production code.
  • Production observability example. examples/production-observability/ demonstrates the production-grade observability stack end-to-end: OTelObserver + LangfuseObserver attached to the same graph (proposal 0031), trace_input_from_state / trace_output_from_state caller hooks on the Langfuse observer (proposal 0043 §8.4.1) deriving domain dicts from State, the built-in TimingMiddleware recording per-node duration via an on_complete callback, and invoke(metadata={...}) carrying multi-tenant identifiers (tenantId / requestId / featureFlag) that both observers pick up at once. InMemoryLangfuseClient + InMemorySpanExporter capture in-process so the demo prints what each backend would have ingested without needing real production credentials. The v0.12.0 cycle extends the demo with a third observer: an LlmUsageAccumulator queryable-accumulator pattern subscribing to LLM-namespace events, accumulating per-invocation token totals keyed by current_invocation_id(), and a terminal persist node that calls await graph.drain_events_for(current_invocation_id(), timeout=2.0) to synchronize on the deliver loop before reading the bucket and dropping it. The accumulator's __call__ handles InvocationCompletedEvent as a backstop so a drain timeout doesn't leak buckets. The example's OTel formatter also surfaces the root openarmature.invocation span with its proposal 0052 implementation attribution attributes alongside the per-node spans.
  • Chat-with-multimodal example. examples/chat-with-multimodal/ demonstrates ChatPrompt + PlaceholderSegment (proposal 0046) end-to-end: a four-turn lunar-mission Q&A conversation with conversation memory threaded through state, one mid-conversation turn attaching a photograph via ImageURLBlockTemplate, the agent processing the multimodal turn naturally without changing the chat-history shape. Complementary to the tool-use example; chat history threading and tool calling are separate primitives.
  • docs/examples/index.md catalog now lists the langfuse-observability example. The example was missing from the catalog; the gap was caught and fixed alongside the chat-with-multimodal entry.
  • PyPI + spec-version shields on the docs homepage. docs/index.md now carries dynamic shields for the published PyPI version and the pinned spec version, sourced from img.shields.io. Both auto-update on every publish or spec bump; no maintenance burden. Mirrors the same shield URLs the README already uses.
  • vLLM production deployment notes. docs/model-providers/vllm.md grows a "Production deployment" section covering the VLLM_HTTP_TIMEOUT_KEEP_ALIVE gotcha (vLLM's stock 5s uvicorn keep-alive lapses pooled OA-side httpx connections and surfaces as ProviderUnavailable; widen to roughly 300s), a systemd unit skeleton, and the three throughput knobs that interact with OA's shared connection pool (--max-model-len, --max-num-seqs, --gpu-memory-utilization). The existing "Tool calling" section grows a --tool-call-parser family table verified against vLLM's docs (Llama 3.x / Llama 4 / Mistral / Hermes / Qwen3 / DeepSeek V3 / GPT-OSS), plus explicit "not supported here" callouts for Anthropic / Gemini (proprietary cloud) and mainstream Gemma (no vLLM parser).
  • Three new patterns docs. docs/patterns/state-migration-on-resume.md, docs/patterns/caller-supplied-trace-identifiers.md, and docs/patterns/observer-state-reconciliation.md graduate the corresponding entries from docs/agent/non-obvious-shapes.md into full pattern recipes with code snippets and "when this is right / when it isn't" guidance. The programmatic patterns API (openarmature.patterns.list() / get(name)) grows from 4 to 7 entries.
  • HyperDX OTel integration test path and "Production swap" docs in the observer-hooks example. examples/observer-hooks/main.py's module docstring grows a "Production swap" section showing how to replace the demo's SimpleSpanProcessor + ConsoleSpanExporter with BatchSpanProcessor + OTLPSpanExporter pointed at HyperDX (or any other OTLP-HTTP collector). A new opt-in integration test (tests/integration/test_otel_hyperdx_export.py, gated by HYPERDX_API_KEY + HYPERDX_OTLP_ENDPOINT env vars and @pytest.mark.integration) drives the same production export path end-to-end against a live endpoint. opentelemetry-exporter-otlp-proto-http lands as a dev-only dep; not promoted to a public extras group yet.
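The key property of the per-invocation drain described in the drain_events_for entry (a timeout abandons only the wait and never cancels the delivery worker) can be modeled with a toy asyncio dispatcher. Everything here is a hypothetical stand-in: `TinyDispatcher` is not the engine, the method returns a plain bool instead of a DrainSummary, and the snapshot semantic is not modeled.

```python
import asyncio
import math
from collections import defaultdict
from typing import Optional


class TinyDispatcher:
    """Toy event queue with a per-invocation drain (illustrative only)."""

    def __init__(self) -> None:
        self._queue = asyncio.Queue()
        self._pending = defaultdict(int)  # invocation_id -> undelivered count
        self._idle = {}  # invocation_id -> asyncio.Event
        self.delivered = []

    def dispatch(self, invocation_id: str, event: str) -> None:
        self._pending[invocation_id] += 1
        self._idle.setdefault(invocation_id, asyncio.Event()).clear()
        self._queue.put_nowait((invocation_id, event))

    async def deliver_loop(self) -> None:
        while True:
            invocation_id, event = await self._queue.get()
            await asyncio.sleep(0.01)  # stand-in for observer work
            self.delivered.append((invocation_id, event))
            self._pending[invocation_id] -= 1
            if self._pending[invocation_id] == 0:
                self._idle[invocation_id].set()

    async def drain_events_for(
        self, invocation_id: str, timeout: Optional[float] = 5.0
    ) -> bool:
        if timeout is not None and (math.isnan(timeout) or timeout < 0):
            raise ValueError("timeout must be None or a non-negative number")
        idle = self._idle.get(invocation_id)
        if idle is None or idle.is_set():
            return True  # unknown or already caught up: nothing to wait for
        try:
            # Only this wait is cancelled on timeout; the worker is untouched.
            await asyncio.wait_for(idle.wait(), timeout)
            return True
        except asyncio.TimeoutError:
            return False


async def main():
    dispatcher = TinyDispatcher()
    worker = asyncio.create_task(dispatcher.deliver_loop())
    for n in range(5):
        dispatcher.dispatch("inv-1", f"event-{n}")

    timed_out = not await dispatcher.drain_events_for("inv-1", timeout=0.0)
    caught_up = await dispatcher.drain_events_for("inv-1", timeout=2.0)
    worker_alive = not worker.done()  # still serving after the earlier timeout

    worker.cancel()  # demo shutdown only
    return timed_out, caught_up and worker_alive, len(dispatcher.delivered)


result = asyncio.run(main())
print(result)  # (True, True, 5)
```

The first call with `timeout=0.0` acts as a non-blocking check and reports that delivery has not caught up; the second call succeeds because the worker kept running through the earlier timeout.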

Fixed

  • RetryMiddleware now enforces per-attempt invocation-metadata scoping (proposal 0048 / spec observability §3.4). Each retry attempt sees only the metadata in scope at retry-loop entry plus that attempt's own writes; writes from a prior attempt that subsequently failed are discarded. Prior to this fix the retry middleware only managed the attempt_index ContextVar and left the invocation-metadata ContextVar unchanged across attempts, so a set_invocation_metadata(...) call inside a failed attempt remained visible on the retry. The fix captures the pre-attempt baseline once at retry-loop entry, resets the metadata ContextVar to that baseline on each iteration, discards the failed attempt's writes on retry-eligible and terminal failure paths, and leaves the successful attempt's writes in place so downstream nodes see them. Closes the v0.12.0 cycle's partial claim on proposal 0048; manifest entry flips back to implemented.
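The capture-once, reset-per-attempt discipline can be sketched with `contextvars` alone. This is a minimal model of the behavior described in the fix, not the middleware's source: `set_metadata`, `retry`, and `flaky` are hypothetical names, and a real middleware would be async and would apply a retry-eligibility policy.

```python
from contextvars import ContextVar
from types import MappingProxyType
from typing import Callable, Mapping

_metadata: ContextVar[Mapping[str, object]] = ContextVar(
    "metadata", default=MappingProxyType({})
)


def set_metadata(**entries: object) -> None:
    """Hypothetical stand-in for a metadata-augmentation helper."""
    _metadata.set(MappingProxyType({**_metadata.get(), **entries}))


def retry(fn: Callable[[int], str], max_attempts: int) -> str:
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    baseline = _metadata.get()  # captured once at retry-loop entry
    for attempt in range(max_attempts):
        _metadata.set(baseline)  # each attempt starts from the baseline
        try:
            return fn(attempt)  # success: this attempt's writes stay visible
        except RuntimeError:
            _metadata.set(baseline)  # discard the failed attempt's writes
            if attempt == max_attempts - 1:
                raise
    raise AssertionError("unreachable")


def flaky(attempt: int) -> str:
    set_metadata(attempt_note=f"attempt-{attempt}")
    if attempt == 0:
        set_metadata(stale="written-by-failed-attempt")
        raise RuntimeError("transient")
    return "ok"


set_metadata(tenant="acme")
result = retry(flaky, max_attempts=3)
final_metadata = dict(_metadata.get())
print(result, final_metadata)
# ok {'tenant': 'acme', 'attempt_note': 'attempt-1'}
```

The `stale` key written by the failed first attempt never reaches the second attempt or any downstream reader, while the baseline `tenant` key and the successful attempt's own write both survive.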

Changed (breaking)

  • OpenAIProvider.ready() default probe flipped to chat_completions. A new constructor kwarg readiness_probe: Literal["models", "chat_completions", "both"] selects which wire path ready() exercises; the default is now the chat-completions path (POST /v1/chat/completions with max_tokens=1), which actually exercises the inference path. The previous catalog-only behavior is still available as readiness_probe="models", and readiness_probe="both" runs catalog then chat for the strongest signal. Motivation: OpenAI-compatible proxies (Bifrost and similar) can return 200 on GET /v1/models while rejecting POST /v1/chat/completions, leaving the catalog probe green while every real call fails. The new default surfaces that class of failure at preflight rather than at first inference. Non-200 chat-probe responses route through classify_http_error, so the canonical error categories (provider_authentication, provider_unavailable, provider_invalid_model, etc.) surface consistently. Callers that depended on the catalog-only behavior (cost-sensitive cloud setups where every ready() would now bill prompt tokens) can opt back in by passing readiness_probe="models".

[0.11.0] — 2026-06-01

Observability + prompt-management release. The pinned spec advances from v0.27.1 to v0.38.0, absorbing eight accepted proposals (0039-0046). Two headlines: (1) the Langfuse observer grows native trace.input / trace.output sourcing with caller hooks (0043) and the per-async-context augmentation boundary becomes lineage-aware for nested fan-out / parallel-branches topologies (0045); (2) prompt-management gains a Chat-prompt variant alongside the existing Text-prompt (0046) and LangfusePromptBackend lands for both Langfuse text and chat prompts. Caller-supplied invocation_id (0039), mid-invocation open-span metadata update (0040), three reserved-key surfaces (0041 + 0042), and the parallel-branches OTel dispatch span (0044) round out the cycle.

Added

  • Multi-message chat-prompt rendering (proposal 0046, prompt-management §3.1 / §6, spec v0.38.0). The Prompt type splits into a discriminated union over TextPrompt (existing single-string template) and the new ChatPrompt carrying an ordered list of ChatSegment entries — ContentSegment (role-tagged content; text-template OR content-blocks-template) and PlaceholderSegment (caller-supplied message-list injection). Content-block templates mirror llm-provider §3.1 (TextBlockTemplate, ImageURLBlockTemplate, ImageInlineBlockTemplate). PromptManager.render accepts a new placeholders: Mapping[str, Sequence[Message]] | None kwarg; chat prompts render segment-by-segment with strict-undefined per segment and per block. The Langfuse backend now maps Langfuse ChatPromptClient to ChatPrompt. Conformance fixtures 017-031 activate against the extended harness. Single-string Text-prompt rendering is unchanged at the call surface — existing prompt.template callers continue to work via the TextPrompt variant.

  • Inline image base64 validated at render time. A chat-prompt content-blocks template with an ImageInlineBlockTemplate whose rendered base64_data fails base64.b64decode(..., validate=True) now raises prompt_render_error at the prompt-manager boundary rather than letting the malformed payload reach the LLM provider, where the error would be provider-specific.

  • Nested-lineage augmentation containment scope (proposal 0045, observability §3.4, spec v0.37.0). The per-async-context augmentation boundary rewrites as three lineage-aware rules: the augmenter's call-stack ancestor chain MUST update (every strict dispatch ancestor on the path — each outer fan-out instance dispatch span, each outer parallel-branches branch dispatch span, each outer serial subgraph wrapper); siblings at any depth MUST NOT; shared parents (fan-out NODE, parallel-branches NODE, invocation span) MUST NOT. Engine-side: tracks per-depth lineage chains (fan_out_index_chain / branch_name_chain) parallel to namespace_prefix, available on NodeEvent and MetadataAugmentationEvent. Observer-side: OTelObserver._collect_augmentation_targets and LangfuseObserver._handle_metadata_augmentation rewrite against the three-step boundary decision tree. Single-level behavior (fixtures 029 / 030 / 034) is unchanged.

  • LangfuseObserver Trace input/output sourcing (proposal 0043, observability §8.4.1). New observer construction knobs populate trace.input and trace.output per the three-lever decision tree:

    • disable_state_payload: bool = True — privacy knob symmetric to disable_llm_payload. When ON (default), Trace fields receive the minimal stub {entry_node, correlation_id} / {final_node, status}; when OFF, the raw state object is serialized.
    • trace_input_from_state / trace_output_from_state — optional caller hooks returning the domain-shaped value to use for trace.input / trace.output. Returning None falls through to the next applicable lever.
    • status is the closed Literal["completed", "failed"] enum from spec §8.4.1.
  • Two new observer event types delivered through the existing graph.observer.Observer queue:

    • InvocationStartedEvent(initial_state, invocation_id, correlation_id, entry_node) — emitted once at invocation entry before any node fires.
    • InvocationCompletedEvent(final_state, status, final_node, invocation_id, correlation_id) — emitted once at invocation exit on both the success path (status="completed") and failure path (status="failed").

    The Observer.__call__ signature widens to NodeEvent | MetadataAugmentationEvent | InvocationStartedEvent | InvocationCompletedEvent. The new ObserverEvent type alias (re-exported from openarmature.graph) gives observer authors a one-name handle on the union; existing observers that ignore non-NodeEvent variants early-return after an isinstance(event, NodeEvent) check.

  • LangfuseTrace.input / LangfuseTrace.output dataclass fields on the in-memory recorder, populated by the new observer paths.

  • Parallel-branches OTel dispatch span synthesis (proposal 0044, observability §5.7). Mirroring the fan-out per-instance dispatch synthesis (proposal 0013), the OTel observer now synthesizes a per-branch dispatch span between the parallel-branches NODE span and each branch's inner-node spans. New ParallelBranchesEventConfig payload on NodeEvent (branch_names, branch_count, error_policy, parent_node_name); engine populates it on the parallel-branches NODE's started / completed events. New OTel span attributes:

    • openarmature.parallel_branches.branch_count + openarmature.parallel_branches.error_policy on the parallel-branches NODE span.
    • openarmature.node.branch_name + openarmature.parallel_branches.parent_node_name on each per-branch dispatch span.
    • openarmature.node.branch_name on every inner-node span beneath a per-branch dispatch span.
  • Caller-supplied invocation_id (proposal 0039, observability §5.1, spec v0.32.0). invoke(invocation_id=...) now accepts a caller-supplied non-empty URL-safe string in place of the framework-minted UUIDv4. Mirrors the correlation_id shape: caller-supplied wins; framework mints a UUIDv4 only when absent. Resume mints a fresh invocation_id per attempt (the previous attempt's id remains on the saved record). The Langfuse mapping derives trace.id via the SDK's create_trace_id(seed=invocation_id) for non-UUID values (raw id preserved under trace.metadata.invocation_id); UUID values continue to map via dashes-stripped hex.

  • Mid-invocation open-span metadata update (proposal 0040, observability §3.4, spec v0.31.0). set_invocation_metadata(**entries) called mid-invocation now updates currently-open spans in the augmenting async context in place, via the backend SDK's attribute / metadata update path (Span.set_attribute for OTel, observation update(metadata=...) for Langfuse). Tightens 0034's per-async-context delivery from SHOULD to MUST; preserves the ancestor / sibling boundary (spans in ancestor or sibling contexts MUST NOT be updated). Per spec §3.4 v0.31.0; proposal 0045 (v0.37.0) extends the boundary rule to be lineage-aware for nested dispatch.

  • LangfusePromptBackend (text and chat variants). A PromptBackend impl backed by the Langfuse SDK's prompt-registry surface. Gated behind the existing [langfuse] extra so the base package stays SDK-free. Maps Langfuse TextPromptClient to TextPrompt; ChatPromptClient to ChatPrompt (added by proposal 0046 — see entry above). Fails closed (PromptNotFound) when a Langfuse chat entry has an unsupported shape rather than silently dropping. The fetched Prompt carries the SDK Prompt object under observability_entities['langfuse_prompt'] so the existing Generation → Prompt link in the Langfuse observer fires automatically.
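The render-time inline-image check described in the Added list above relies on strict base64 decoding from the standard library. A minimal sketch follows; `PromptRenderError` and `validate_inline_image` are hypothetical stand-ins, since the changelog names only the `prompt_render_error` category and the `base64.b64decode(..., validate=True)` call.

```python
import base64


class PromptRenderError(ValueError):
    """Hypothetical stand-in for the prompt manager's render error."""


def validate_inline_image(base64_data: str) -> bytes:
    """Decode strictly, turning any malformed payload into a render error."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(base64_data, validate=True)
    except ValueError as exc:  # binascii.Error is a ValueError subclass
        raise PromptRenderError(f"invalid inline image base64: {exc}") from exc


decoded = validate_inline_image("aGVsbG8=")
print(decoded)  # b'hello'

try:
    validate_inline_image("not base64!!")
except PromptRenderError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False
```

Failing at this boundary keeps the error shape uniform, instead of depending on how each LLM provider happens to reject a malformed payload.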

Changed (breaking, pre-1.0)

  • Prompt is now a discriminated-union type alias over TextPrompt | ChatPrompt (proposal 0046). Code that instantiated the previous Prompt(...) class MUST switch to TextPrompt(...); type annotations using Prompt as a return / parameter type continue to work (the alias is the union). The Langfuse backend no longer raises on Langfuse chat prompts; it returns a ChatPrompt instead of raising PromptNotFound. Per spec §6 narrowing, Text prompts render to exactly one UserMessage; multi-message / multimodal prompts MUST use the Chat variant.
  • OTel span attribute openarmature.branch_name is renamed to openarmature.node.branch_name to align with the spec §5.7 attribute namespace. Prior python releases emitted openarmature.branch_name as a workaround because the spec hadn't defined an OTel attribute carrying branch_name yet; proposal 0044 (v0.36.0) formalizes the namespace. Downstream dashboards, queries, or alerts filtering on the old attribute name MUST update. Pre-1.0 break; the prior name was python-implementation-only and was never spec-normative.

Changed

  • Reserved-key extension (proposal 0042, observability §3.4, spec v0.34.0). Three additional bare key names — branch_name, detached, detached_from_invocation_id — are reserved against caller-supplied invocation_metadata and set_invocation_metadata collision; the framework rejects them at the invoke() boundary and at the mid-invocation augmentation helper with ValueError. The reserved-name set grows from 21 to 24. These three are top-level Langfuse metadata keys the observer mapping already writes; without reservation a caller key matching one would silently shadow the OA-emitted field. Maintenance extension of the 0041 mechanism.
  • Langfuse-emitted top-level metadata key collision reservation (proposal 0041, observability §3.4, spec v0.31.0). The reserved caller-metadata key set extends to cover every OA-emitted top-level key the §8.4 Langfuse mapping writes alongside caller keys (20 names — correlation_id, entry_node, spec_version, namespace, step, attempt_index, fan_out_index, subgraph_name, etc.). Whole-key exact match; rejected at the invoke() boundary independently of which backend is attached. Prevents a caller key from silently overwriting an OA-emitted field in Langfuse's flat top-level metadata. Pre-existing openarmature.* / gen_ai.* prefix reservation unchanged.
  • observation.metadata.detached: true moves to the parent-side dispatching observation (proposal 0042, observability §8.4.2). The Langfuse mapping previously emitted detached: true on the dispatch observation inside the detached child trace; the §8.4.2 row added by 0042 places it on the parent-side dispatching observation that fires the detached child (the link observation in the main trace for detached subgraphs; the parent fan-out node observation for detached fan-outs). The detached-side observation no longer carries the flag.
  • LangfuseClient.update_trace Protocol grows input / output keyword parameters so observer-supplied values land on the Trace's headline fields.
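
The whole-key reservation described in the two reserved-key entries above can be pictured with a minimal sketch. The function name check_metadata_keys and the RESERVED_EXACT set are hypothetical: the set holds only the names this changelog spells out, not the full 24-name list, and the library's real boundary check may be structured differently.

```python
# Illustrative only: a stand-in for the boundary check, not openarmature's code.
# RESERVED_EXACT lists just the reserved names quoted in this changelog.
RESERVED_EXACT = frozenset({
    "branch_name", "detached", "detached_from_invocation_id",
    "correlation_id", "entry_node", "spec_version", "namespace",
    "step", "attempt_index", "fan_out_index", "subgraph_name",
})
RESERVED_PREFIXES = ("openarmature.", "gen_ai.")


def check_metadata_keys(metadata):
    """Raise ValueError if any caller key collides with a reserved name."""
    for key in metadata:
        if key in RESERVED_EXACT:  # whole-key exact match, not substring
            raise ValueError(f"metadata key {key!r} is reserved")
        if key.startswith(RESERVED_PREFIXES):  # prefix reservation
            raise ValueError(f"metadata key {key!r} uses a reserved prefix")


# Accepted: "branch_name_hint" merely contains a reserved name.
check_metadata_keys({"tenant": "acme", "branch_name_hint": "left"})
```

Because the match is on the whole key, a caller key such as branch_name_hint passes while branch_name itself is rejected.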

Fixed

  • install_log_bridge no-ops when an OTel logging handler is already attached to the root logger by an SDK that auto-installs one (e.g., HyperDX). The previous attach-always behavior produced duplicate LoggingHandler instances and double-emitted log records when used alongside such an SDK. The installer now detects an existing LoggingHandler whose provider matches the current LoggerProvider and skips the re-attach.
  • InvocationCompletedEvent.final_state on the failure path now surfaces the partial state at the failure point. Spec §8.4.1 Resume semantics requires the failure-path trace.output hook to receive "the partial final state captured at the failure point"; the original PR #99 implementation defaulted to starting_state, so the hook saw pre-execution state when it should have seen post-execution-up-to-failure state. The engine now tracks the latest post-merge state via a latest_state_box on _InvocationContext, updated after every successful step and read on the failure path. Success-path behavior unchanged.
  • latest_state_box is per-context, not shared across subgraph descents. Unlike the sibling final_node_box (which shares by reference because the spec wants the innermost failing node's name — the real culprit), latest_state_box must isolate per level so the outermost Langfuse trace receives outer-state-typed values. Without the isolation, a subgraph-internal step's inner-typed state would leak up to the outer trace.output hook, breaking the hook's typed contract. Each subgraph / fan-out instance / parallel-branches branch gets its own fresh box. Pinned by four regression tests covering flat, subgraph, fan-out, and parallel-branches failure paths.
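
The detect-and-skip idea behind the install_log_bridge fix above can be sketched with the standard library alone. BridgeHandler, its provider attribute, and install_bridge are hypothetical stand-ins; the real installer works with the OTel LoggingHandler and LoggerProvider, whose details are not reproduced here.

```python
import logging


class BridgeHandler(logging.Handler):
    """Stand-in for an OTel logging handler; `provider` is a hypothetical attribute."""

    def __init__(self, provider):
        super().__init__()
        self.provider = provider

    def emit(self, record):
        pass  # a real handler would export the record


def install_bridge(provider, logger):
    """Attach a BridgeHandler unless one for the same provider is already present.

    Returns True if a handler was attached, False if the call was a no-op.
    """
    for handler in logger.handlers:
        if isinstance(handler, BridgeHandler) and handler.provider is provider:
            return False  # already bridged: skip the re-attach
    logger.addHandler(BridgeHandler(provider))
    return True


demo_logger = logging.getLogger("bridge-demo")  # a scratch logger, not the root
provider = object()                             # placeholder for a LoggerProvider
first = install_bridge(provider, demo_logger)
second = install_bridge(provider, demo_logger)
```

The second call finds the handler installed by the first and returns without attaching a duplicate, which is what prevents double-emitted records.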

Notes

  • Pinned spec version bumped from v0.27.1 to v0.38.0 over the v0.11.0 cycle. Eight proposals absorbed: 0039 (caller-supplied invocation_id, v0.32.0), 0040 (mid-invocation open-span metadata update, v0.31.0), 0041 (Langfuse top-level metadata key collision reservation, v0.31.0), 0042 (reserved-key extension to 24 names, v0.34.0), 0043 (Langfuse trace input/output sourcing, v0.35.0), 0044 (parallel-branches OTel dispatch span, v0.36.0), 0045 (nested-lineage augmentation containment scope, v0.37.0), and 0046 (multi-message chat-prompt rendering, v0.38.0). The pinned spec also carries the textual additions in v0.32.0 (Gemini wire-format mapping, 0038, not yet implemented) and v0.33.0 (sessions capability, 0020, not yet implemented).
  • LangfuseSDKAdapter now applies trace.input / trace.output to the live Langfuse Trace. Input lands on the first real observation under the trace via set_trace_io; output uses a synthetic short-lived openarmature.trace_io observation as the carrier. The InMemoryLangfuseClient used by tests applies the fields directly.
  • Conformance fixture observability/conformance/037-langfuse-trace-input-output activated for all five cases (default stub / disable_state_payload=False / hooks non-null / hooks null-fallthrough / resume re-fire). The langfuse harness grew per-case checkpointer: in_memory wiring, a compact flaky: test seam, and a two-phase resume-flow assertion path.
  • The Langfuse v4 SDK marks set_current_trace_io / Span.set_trace_io deprecated ("removal in a future major version"). Empirical verification against Langfuse Cloud (v4.7.1, 2026-05-29) confirms it remains the only path that populates the Traces list view's headline Input / Output columns; propagate_attributes(metadata=...) does not substitute for it in the current UI. We will revisit when Langfuse publishes a concrete migration guide for v5.

[0.10.0] — 2026-05-27

Langfuse observability release. The pinned spec advances from v0.22.1 to v0.27.1, absorbing six accepted proposals (0031-0036). The headline is a native Langfuse backend mapping (a sibling to the OTel mapping) driven by a downstream production project integrating OpenArmature with Langfuse; this release also adds caller-supplied invocation metadata, two fan-out collection reducers, and a batch of provider / observability hardening surfaced by that same downstream integration.

Added

  • LangfuseObserver — native Langfuse backend mapping (proposal 0031, observability §8). An observer that consumes the §6 event stream as a sibling to the OTel observer (both can be attached to one graph; each honors its own opt-out). Maps invocation → Langfuse Trace, node / subgraph / fan-out → Span observation, LLM provider call → Generation observation. Sets the Trace id equal to the OA invocation_id so cross-system lookup is a direct hit; routes correlation_id to trace.metadata and every observation.metadata. Full subgraph dispatch, per-instance fan-out, and detached-trace-mode parenting (§8.3 / §8.5). Decoupled from the SDK via the LangfuseClient Protocol.
  • InMemoryLangfuseClient — an in-process recorder satisfying LangfuseClient, used by the conformance harness and useful for unit tests; captures Traces / Observations verbatim for assertion.
  • LangfuseSDKAdapter — bridges the real langfuse>=4.6 SDK to the LangfuseClient Protocol (UUID4 → OTel-hex trace-id conversion, propagate_attributes on every observation, usage translation). Gated behind the new [langfuse] extra (pip install openarmature[langfuse]); the observer itself needs no SDK install because the Protocol decouples it.
  • Public force_flush(timeout_ms=30_000) on OTelObserver and LangfuseObserver (downstream ask). Wraps the underlying provider / client flush so fast-teardown harnesses (serverless functions, CLI one-shots, FastAPI TestClient teardown) can drain the export buffer without reaching into the private _provider attribute. Distinct from CompiledGraph.drain(), which covers the engine's observer-event queue; force_flush() covers the outbound span-export buffer.
  • Caller-supplied invocation metadata (proposal 0034, observability §3.4 + §5.6 + §8.4). invoke(metadata={...}) accepts a per-invocation mapping of str → AttributeValue (OTel scalars or homogeneous arrays). The framework propagates every entry to all observability backends: the OTel observer emits each as an openarmature.user.<key> cross-cutting span attribute on every span; the Langfuse observer merges each as a top-level key into trace.metadata and every observation.metadata. openarmature.observability.set_invocation_metadata(**entries) augments the in-scope mapping mid-invocation (additive; respects fan-out / parallel-branches per-instance COW scoping); current_invocation_metadata() reads it. Boundary validation rejects keys under the reserved openarmature.* / gen_ai.* prefixes and non-OTel-compatible value types with a synchronous ValueError.
  • concat_flatten and merge_all reducers (proposal 0036, graph-engine §2). The fan-out collection analogs of append / merge: a fan-out subgraph emitting list[X] per instance lands list[list[X]] at the parent target_field (use concat_flatten to flatten one level); emitting dict[str, X] lands list[dict] (use merge_all to fold with last-write-wins per key). Both are strict — they raise ReducerError (graph-engine §4) when an update element isn't the expected list / mapping shape. Exported from openarmature.graph; the required-built-in set grows from three to five.
  • Three new RuntimeConfig declared fields (proposal 0032, llm-provider §6): frequency_penalty, presence_penalty, and stop_sequences. Surfaced on the OpenAI wire body per §8.1 (with stop_sequences renaming to OpenAI's stop key) and as gen_ai.request.* span attributes. Per the §6 null-skip rule, each declared field with value None is omitted from the wire body.
  • Prompt-management surface refinements (proposal 0033). Prompt.sampling (a SamplingConfig subclass of RuntimeConfig), Prompt.observability_entities, the LabelResolver three-step resolution chain (explicit > resolver > "production"), and filesystem layout / sampling-source ergonomics for the prompt-management capability.
  • Self-hosted vLLM cookbook at docs/model-providers/vllm.md — base-URL contract, the structured-output fallback flag, the genai_system="vllm" override, readiness-probe limitations + warm-up pattern, and tool calling.
  • conformance.toml manifest + CI guard. A machine-readable record of which spec proposals are implemented and since which version, validated against the pinned spec submodule by scripts/check_conformance_manifest.py on every PR. Consumed by the spec docs site to render per-implementation status.
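
To make the concat_flatten / merge_all entry above concrete, here is a minimal sketch of the two behaviors it describes. The (current, update) signature and the local ReducerError class are assumptions made for illustration; the reducers exported from openarmature.graph may take different arguments.

```python
from collections.abc import Mapping


class ReducerError(Exception):
    """Local stand-in for the library's ReducerError."""


def concat_flatten(current, update):
    """Flatten one level of a list-of-lists update onto the current list."""
    out = list(current)
    for item in update:
        if not isinstance(item, list):  # strict: every element must be a list
            raise ReducerError(f"expected list, got {type(item).__name__}")
        out.extend(item)
    return out


def merge_all(current, update):
    """Fold a list of mappings into one dict, last write wins per key."""
    out = dict(current)
    for item in update:
        if not isinstance(item, Mapping):  # strict: every element must be a mapping
            raise ReducerError(f"expected mapping, got {type(item).__name__}")
        out.update(item)
    return out


flat = concat_flatten([], [[1, 2], [3]])                   # [1, 2, 3]
merged = merge_all({}, [{"a": 1}, {"a": 2, "b": 3}])       # {"a": 2, "b": 3}
```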

Changed

  • OpenAIProvider rejects a /v1 suffix on base_url (downstream-surfaced bug). httpx joins base URLs by appending, so base_url="https://host/v1" plus the provider's /v1/chat/completions request produced a doubled /v1/v1/... wire path that silently 404/405'd on most backends while the readiness probe stayed green. The provider now raises ValueError at construction when base_url's path ends in /v1 (with or without a trailing slash, and through query strings / fragments). Other non-empty paths (proxy prefixes) are left intact. No existing users were affected; this is the first production integration.
  • metadata.subgraph_name / openarmature.subgraph.name carries the compiled-subgraph identity (proposal 0035 resolution), not the wrapper node name. SubgraphNode and FanOutConfig gain an optional subgraph_identity; the engine threads it through NodeEvent.subgraph_identities to the observers. Falls back to the empty string when no identity is tracked (observability §5.3). Distinct from the observation's name / namespace, which remain the wrapper node name.
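
The /v1 suffix check from the OpenAIProvider entry above can be approximated as follows. This is a sketch under stated assumptions: reject_v1_suffix is a hypothetical helper name, and the provider's actual construction-time validation may differ in detail.

```python
from urllib.parse import urlsplit


def reject_v1_suffix(base_url):
    """Return base_url unchanged, or raise ValueError if its path ends in /v1."""
    # urlsplit separates the path from any query string or fragment,
    # so "https://host/v1?x=1" and "https://host/v1#frag" are both caught.
    path = urlsplit(base_url).path.rstrip("/")  # tolerate a trailing slash
    if path.endswith("/v1"):
        raise ValueError(
            f"base_url {base_url!r} must not end in /v1; "
            "the provider appends /v1/chat/completions itself"
        )
    return base_url


reject_v1_suffix("https://host")            # accepted
reject_v1_suffix("https://host/proxy")      # proxy prefix left intact
```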

Fixed

  • entry_node / trace name when the outer entry is a SubgraphNode. Subgraph wrappers don't emit their own events, so the first event the observer saw came from inside the subgraph; the Langfuse observer recorded the inner node as the trace's entry_node. Now resolves to event.namespace[0] (the outer entry).
  • Detached-mode link observation no longer carries subgraph_name. In detached-trace mode the wrapper role migrates to the detached trace; the parent trace's link observation is the SubgraphNode span (no wrapper role) and must not carry subgraph_name.

Notes

  • Pinned spec version bumped from v0.22.1 to v0.27.1 over the v0.10.0 cycle. Six proposals absorbed: 0031 (observability Langfuse mapping, v0.23.0), 0032 (RuntimeConfig declared-field expansion, v0.24.0), 0033 (prompt-management surface refinements, v0.25.0), 0034 (caller-supplied invocation metadata, v0.26.0), 0035 (Langfuse graph-topology conformance fixtures, v0.26.1 + v0.27.1 fixture corrections), and 0036 (fan-out collection reducers concat_flatten / merge_all, v0.27.0). All conformance fixtures pass against the v0.27.1 pin, including the un-deferred Langfuse subgraph / fan-out / detached-trace fixtures and the two new reducer fixtures.
  • langfuse>=4.6,<5 is the supported SDK range for LangfuseSDKAdapter, validated end-to-end against Langfuse Cloud. The v4 SDK's flush() is synchronous but exposes no timeout parameter, so LangfuseObserver.force_flush(timeout_ms=...) accepts the argument for Protocol symmetry but the underlying flush honors the SDK's own deadlines (best-effort).

[0.9.0] — 2026-05-25

Added

  • openarmature.patterns programmatic API. Two-function surface (list() -> list[str], get(name: str) -> str) exposing the same patterns content shipped in the bundled AGENTS.md. Each pattern is returned as a standalone markdown document: no heading demotion (patterns keep their original # Title), and relative ../concepts/...md / ../examples/...md / intra-pattern links are rewritten to absolute openarmature.ai URLs at build time so cross-references resolve outside the source tree. Useful for agents in sandboxed environments that can import openarmature but can't freely read arbitrary package paths. Content lives at src/openarmature/_patterns/<slug>.md, generated alongside the bundled AGENTS.md and drift-checked by tests/test_agents_md_drift.py. Unknown names raise KeyError with a message listing the known names.
  • openarmature CLI registered as a [project.scripts] entry point with two subcommands:
    • openarmature init appends a discovery pointer block (the python -c "..." one-liner + openarmature docs recipe) into the current project's AGENTS.md and CLAUDE.md so agent sessions opening the project find the bundled OpenArmature docs. Creates files when absent, appends when they exist, and skips re-runs via a <!-- openarmature-init --> comment marker. Flags: --force (re-append despite the marker), --dry-run (print what would be written), --cwd PATH (operate against a path other than the current directory).
    • openarmature docs prints the absolute path to the bundled AGENTS.md. Equivalent to the README discovery one-liner but ergonomic to type and remember.
    • The same surface is reachable as python -m openarmature ... via src/openarmature/__main__.py, so environments where the [project.scripts] entry doesn't land cleanly (some pip install --target layouts, path-shadowed venvs) still work as long as the package is importable.
  • Bundled agent documentation at openarmature/AGENTS.md. The wheel now ships a generated AGENTS.md file at the installed package root, agent-discoverable via python -c "import openarmature; print(openarmature.__path__[0] + '/AGENTS.md')". Sections include a TL;DR, capability summaries pulled from the pinned spec submodule's §1 (Purpose) + §2 (Concepts), the patterns docs, hand-written non-obvious-shapes recipes, and a one-line example index. Generator lives at scripts/build_agents_md.py; the committed file is CI-drift-checked by tests/test_agents_md_drift.py. The submodule pin discipline (build refuses unless the submodule HEAD is AT a v* tag via git tag --points-at HEAD) prevents draft (untagged) spec text — or text from a commit between two release tags — from leaking into a release bundle. Adopting projects can point their own AGENTS.md / CLAUDE.md at this path so agent sessions in their codebase find it automatically (or use openarmature init to do the wiring automatically).
  • FanOutInstanceProgress.result_is_error field (proposal 0027, accepted in spec v0.21.0). Explicit boolean discriminator on each per-instance entry in CheckpointRecord.fan_out_progress: True for collect-mode error contributions (roll forward into errors_field), False for success contributions (roll forward into target_field). The engine reads the explicit field on resume rather than inferring routing from result's shape; the previous structural heuristic (_looks_like_error_record) is removed. Backward-compat path on load: pre-0027 records that omit the key default to False.
  • Strict CheckpointRecordInvalid on fan-out count drift (proposal 0029, accepted in spec v0.22.0). When the resumed run's resolved instance count differs from the saved fan_out_progress entry's instance_count, the engine raises CheckpointRecordInvalid before any fan-out instance work runs on the resumed path. Replaces the pre-0029 pad/truncate behavior which silently dropped completed contributions on shrink (breaking §10.11.1's exactly-once guarantee) and dispatched unsaved work on grow.
  • tool_choice parameter on Provider.complete() (proposal 0025, accepted in spec v0.20.0). Optional discriminated-union value constraining the model's tool-calling behavior — one of "auto", "required", "none", or a ForceTool(name=...) record. Validation runs pre-send: "required" and ForceTool both demand non-empty tools, and ForceTool.name must appear in the supplied list; violations raise ProviderInvalidRequest (§7's existing category — no new error category). When tool_choice is None (the default) the wire field is omitted and the provider's own default applies, preserving pre-0025 behavior exactly. The OpenAIProvider maps the spec shape onto OpenAI's wire shape per §8.1.1 (the ForceTool.type="tool" renames to wire type="function").
  • ForceTool and ToolChoice public types at openarmature.llm.ForceTool / openarmature.llm.ToolChoice. ForceTool is a frozen Pydantic model with type: Literal["tool"] = "tool" and name: str; ToolChoice = Literal["auto", "required", "none"] | ForceTool is the type alias used in Provider.complete()'s signature.
  • validate_tool_choice public validator at openarmature.llm.validate_tool_choice. Standalone validator covering the three §5 pre-send rules; useful for third-party Provider implementations that want to reuse the canonical validation logic.
  • Bounded drain timeout on CompiledGraph.drain() (proposal 0010, accepted in spec v0.19.0). drain() accepts an optional timeout: float | None = None parameter (non-negative seconds). When supplied, drain returns no later than the deadline; any observer events still queued or in-flight are reported as undelivered. Workers are cancelled cleanly so the compiled graph remains usable for subsequent invocations — partial delivery state from one drain does NOT leak into the next. Solves the "slow / hung / misbehaving observer blocks process exit" footgun for short-lived processes (CLIs, scripts, serverless functions). Observers SHOULD be cancellation-safe (idempotent writes, try/finally cleanup); the spec doesn't mandate it but the docs recommend it.
  • DrainSummary frozen dataclass at openarmature.graph.DrainSummary. Returned from every drain() call (with or without timeout). Fields: undelivered_count: int, timeout_reached: bool. The shape is consistent across timed and untimed drains — callers receive the same dataclass whether the timeout was supplied or not. Per the v0.19.0 contract the two declared fields are the spec-mandated minimum; richer diagnostic detail (per-observer counts, sampled event metadata) is reserved for follow-on PRs.
  • Per-instance fan-out resume contract (proposal 0009, accepted in spec v0.18.0). The engine now writes a checkpoint record at every completed event inside a fan-out instance (in addition to the existing outermost-graph + subgraph-internal + fan-out node completion saves). On resume the engine consults the saved record's fan_out_progress field and treats each instance as completed (skip, contribution rolls forward), in_flight (re-run from subgraph entry), or not_started (dispatch normally). The append reducer's no-double-merge guarantee holds across resume because completed is a one-shot accumulator state.
  • FanOutProgress and FanOutInstanceProgress public dataclasses on openarmature.checkpoint. The CheckpointRecord.fan_out_progress field is now tuple[FanOutProgress, ...] (default empty tuple), with per-instance state, result, and completed_inner_positions observability. Was a None placeholder under proposal 0008.
  • FanOutInternalSaveBatching config on InMemoryCheckpointer. Backends MAY opt into batching scoped to fan-out instance internal saves to bound the write volume of high-instance-count fan-outs. Outermost-graph, subgraph-internal, and the fan-out node's own completion save remain synchronous regardless. Default off. Buffered-but-unflushed saves are lost on crash by design; on resume, instances whose completed state was only buffered revert and re-run. Surfaces a new optional save_fan_out_internal / save_fan_out_in_flight_failure Checkpointer Protocol seam; backends that don't implement either fall back to the standard save.
  • Patterns docs section at docs/patterns/, sibling to Concepts. Seeded with four recipes drawn from downstream usage and proposal 0008's alternatives section: parameterized entry point, tool-dispatch-as-node, session-as-checkpoint-resume, and bypass-if-output-exists. Patterns are user-level how-to recipes composing existing primitives, not framework contracts; new patterns can be added without spec coordination. Each page follows a problem / approach / snippet / when this is the right pattern / when it isn't / cross-references structure.
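
The three pre-send rules listed in the tool_choice entry above can be sketched in a few lines. Everything here is a simplified stand-in: the library's ForceTool is a frozen Pydantic model rather than a dataclass, check_tool_choice is a hypothetical name (not the signature of openarmature.llm.validate_tool_choice), and tools are reduced to a list of names.

```python
from dataclasses import dataclass


class ProviderInvalidRequest(Exception):
    """Local stand-in for the library's error category."""


@dataclass(frozen=True)
class ForceTool:
    name: str
    type: str = "tool"


def check_tool_choice(tool_choice, tool_names):
    """Apply the pre-send rules; return None when the request is acceptable."""
    if tool_choice is None or tool_choice in ("auto", "none"):
        return  # nothing to validate
    if tool_choice != "required" and not isinstance(tool_choice, ForceTool):
        raise ProviderInvalidRequest(f"unsupported tool_choice: {tool_choice!r}")
    if not tool_names:  # rules 1 and 2: both forms need a non-empty tools list
        raise ProviderInvalidRequest("tool_choice requires a non-empty tools list")
    if isinstance(tool_choice, ForceTool) and tool_choice.name not in tool_names:
        # rule 3: the forced tool must be one of the supplied tools
        raise ProviderInvalidRequest(f"unknown tool {tool_choice.name!r}")


check_tool_choice(ForceTool(name="search"), ["search", "fetch"])  # accepted
```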

Changed

  • CheckpointRecord.schema_version sourcing clarified per proposal 0028 (spec v0.21.1). Every save site within an invocation now reads schema_version from the declared graph state class — the class passed to GraphBuilder(...) — threaded as context.state_cls. Previously the outer dispatch save read from the declared class while fan-out instance internal saves read from type(state) at save time; the inconsistency only surfaced when a user passed a State subclass that shadowed schema_version, but the divergence made §10.12 migration lookups unreliable across save sites. Now uniform across outer / subgraph-internal / fan-out instance internal saves.
  • Provider.complete() signature extended with an optional tool_choice: ToolChoice | None = None parameter (per proposal 0025 v0.20.0). Backward-compatible: callers that omit the new argument see no wire-shape change. Third-party Provider implementations MUST add the parameter to remain Protocol-conformant under strict type checking (and to accept calls that pass tool_choice without raising TypeError); they MAY ignore it in their wire-body emission, which is how "provider doesn't honor tool_choice" looks at the impl level. The OpenAIProvider wire mapping is implemented per §8.1.1.
  • CompiledGraph.drain() return type changed from None to DrainSummary (pre-1.0; per proposal 0010 v0.19.0 contract). Callers that ignored the return are unaffected — await graph.drain() discards the returned dataclass exactly as before. Callers that explicitly typed the return as None will need to update their annotation.
  • Fan-out resume behavior flipped from atomic restart (0008's v1 contract) to per-instance resume. A crash mid-fan-out used to re-run the entire fan-out on resume; now only the instances that did not complete-and-record their contribution re-run. The economics matter for large fan-outs of expensive work (LLM calls, long extractions): a fan-out that crashes at 80% complete now restores that 80% of its results rather than discarding them.
  • SQLiteCheckpointer schema picks up a new fan_out_progress_blob column (added via ALTER TABLE for backward compatibility with pre-0009 databases). Pre-0009 rows back-fill as NULL on load and round-trip as the empty-tuple default. Both pickle and json serialization modes round-trip the new field.
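
The backward-compatible column addition described in the SQLiteCheckpointer entry above follows a common SQLite pattern, sketched here with the standard library. Only the column name fan_out_progress_blob comes from this changelog; the table name checkpoints, the other columns, and the helper ensure_fan_out_column are hypothetical.

```python
import sqlite3


def ensure_fan_out_column(conn, table="checkpoints"):
    """Add fan_out_progress_blob if it is missing; return True when added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if "fan_out_progress_blob" in columns:
        return False  # already migrated
    conn.execute(f"ALTER TABLE {table} ADD COLUMN fan_out_progress_blob BLOB")
    return True


# Simulate a database created before the column existed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkpoints (invocation_id TEXT PRIMARY KEY, state_blob BLOB)")
conn.execute("INSERT INTO checkpoints VALUES ('inv-1', x'00')")

added = ensure_fan_out_column(conn)        # first call performs the ALTER TABLE
added_again = ensure_fan_out_column(conn)  # second call is a no-op
old_row = conn.execute("SELECT fan_out_progress_blob FROM checkpoints").fetchone()
```

Existing rows read back NULL (None in Python) for the new column, which is the value a loader would map to the empty-tuple default.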

Notes

  • Pinned spec version bumped from v0.17.0 to v0.22.1 over the v0.9.0 cycle. Ten spec versions absorbed:
    • v0.17.1 (proposal 0019, multi-provider wire-format extension): purely textual reframe of llm-provider §8 as a catalog of wire-format mappings; OpenAI-compatible body nested under §8.1.
    • v0.18.0 (proposal 0009, per-instance fan-out resume): pipeline-utilities §10.3 / §10.7 revised, §10.11 added; the append reducer no-double-merge invariant is the load-bearing correctness story.
    • v0.18.1: fixture-only patch correcting an off-by-one literal in fixture 052's expected results.
    • v0.19.0 (proposal 0010, bounded drain timeout): graph-engine §6 amended with the timeout parameter and DrainSummary return contract.
    • v0.20.0 (proposal 0025, llm-provider tool_choice): §5 / §7 / §8.1.1 amended.
    • v0.20.1 (proposal 0026, llm-provider §8.X wire-format mapping subsection template): purely textual §8 framing paragraph; the existing OpenAI §8.1 mapping is the template's reference shape so no python module-level work was needed.
    • v0.21.0 (proposal 0027, explicit result_is_error discriminator on fan_out_progress per-instance entries): see Added above.
    • v0.21.1 (proposal 0028, canonical source for schema_version): declared graph state class wins over runtime subclass shadowing; see Changed above.
    • v0.22.0 (proposal 0029, strict CheckpointRecordInvalid on fan-out count drift): see Added above.
    • v0.22.1 (proposal 0030, drain snapshot semantic + timeout-input validation): purely textual; python already implemented both behaviors per the 0010 impl PR, so no module-level work needed.
    All existing conformance fixtures continue to pass.

[0.8.0] — 2026-05-23

LLM-provider span payload and GenAI semconv release. Pinned spec jumps from v0.16.1 to v0.17.0 (proposal 0024 / observability §5.5 expansion). The trigger was a friction report from a downstream agent integrating OA with Langfuse over OTLP: LLM spans rendered "naked" (model + tokens only), prompt linkage silently dropped at the dispatch-worker task boundary, and every backend needed a per-service attribute-mapping shim. This release clears all eight items in that report.

Added

  • openarmature.llm.input.messages / openarmature.llm.output.content / openarmature.llm.request.extras span attributes (spec §5.5.1). When the OTel observer is constructed with disable_llm_payload=False, LLM spans carry the messages sent, the assistant response content, and the RuntimeConfig extras bag — JSON-encoded with sorted keys, no insignificant whitespace, UTF-8. Default-off (the flag is disable_llm_payload: bool = True) because the payload may contain PII the user hasn't audited; opt in deliberately. Subject to the §5.5.5 truncation contract.
  • GenAI semantic-conventions attributes (spec §5.5.2 + §5.5.3). LLM spans now carry gen_ai.system, gen_ai.request.model, gen_ai.response.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.response.finish_reasons (single-element string array), gen_ai.response.id, and per-set gen_ai.request.{temperature,max_tokens,top_p,seed} (only set fields — absence is meaningful per §5.5.2). The existing openarmature.llm.* attribute set is preserved alongside; both namespaces emit. Default-on (disable_genai_semconv: bool = False); opt out when an external auto-instrumentation library (OpenInference, opentelemetry-instrumentation-openai, etc.) is the canonical source of GenAI attributes for your stack.
  • OTelObserver(resource=...) constructor argument. Optional opentelemetry.sdk.resources.Resource passed to the private TracerProvider. Lets callers set service.name / service.version directly rather than via OTEL_SERVICE_NAME / OTEL_RESOURCE_ATTRIBUTES environment variables (which had to be set BEFORE constructing the observer to take effect — a footgun the explicit kwarg avoids).
  • Multi-processor support on OTelObserver. The span_processor constructor argument now accepts a SpanProcessor | Sequence[SpanProcessor]. Multi-destination export (e.g., HyperDX + Langfuse on one observer) becomes a one-line constructor call instead of a per-service CompoundSpanProcessor workaround.
  • OTelObserver(attribute_enrichers=...) hook. Sequence of Callable[[Span, NodeEvent | None], None] invoked just before the observer ends each span. Lets users add backend-specific attributes (custom langfuse.* keys, vendor span kinds, etc.) without subclassing or mutating span._attributes post-on_end. The event is None on synthetic close sites (subgraph dispatch, detached root, fan-out instance, invocation span, shutdown drain); enrichers that need per-event context short-circuit on None. Exceptions are caught and warned, never propagated to the dispatch worker.
  • OTelObserver(payload_max_bytes=...) truncation cap. Per-attribute byte cap for the §5.5.1 payload attributes. Default 65,536 (64 KiB) per attribute; minimum 256 bytes (rejected at construction). The truncation algorithm (spec §5.5.5) emits the largest UTF-8 code-point-aligned prefix that fits within cap - len(marker) bytes followed by the marker …[truncated, M bytes total]. Inline image bytes are unconditionally redacted at the provider before any cap applies (see Image redaction below).
  • OpenAIProvider(genai_system="openai") constructor argument. Default "openai"; override for non-OpenAI endpoints that speak the OpenAI Chat Completions wire format (vLLM, LM Studio, llama.cpp, sglang). Surfaces as the gen_ai.system span attribute. No base-URL sniffing happens — the same host:port could be any of several servers, and a wrong inference is worse than the explicit opt-in.
  • openarmature.observability.LLM_NAMESPACE and openarmature.observability.LlmEventPayload public exports. The ("openarmature.llm.complete",) sentinel namespace used by the LLM-provider hook and the payload shape backend observers consume. Third-party Provider implementations can dispatch their own LLM events via current_dispatch()(NodeEvent(..., namespace=LLM_NAMESPACE, pre_state=LlmEventPayload(...))); custom observers can recognize the same sentinel and read attributes off the payload. Previously private (_LLM_NAMESPACE, _LlmEventState); the old underscore-prefixed names are no longer exported.
  • Response.response_id and Response.response_model typed fields. Mirror the wire response's id and model fields when the provider returns them. Surface as gen_ai.response.id and gen_ai.response.model per spec §5.5.3; also useful for downstream cross-referencing with provider-side billing or audit logs without reaching into Response.raw.
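
The truncation rule quoted in the payload_max_bytes entry above (largest code-point-aligned UTF-8 prefix within cap - len(marker) bytes, then the marker) can be sketched as below. truncate_payload is a hypothetical name, and this is one reading of the changelog's description of spec §5.5.5, not the observer's actual code.

```python
def truncate_payload(value, cap=65_536):
    """Return value unchanged if it fits in cap bytes, else a marked prefix."""
    if cap < 256:
        raise ValueError("cap must be at least 256 bytes")
    raw = value.encode("utf-8")
    if len(raw) <= cap:
        return value
    marker = f"…[truncated, {len(raw)} bytes total]".encode("utf-8")
    end = cap - len(marker)  # byte budget left for the prefix
    # Back off while `end` points at a UTF-8 continuation byte (0b10xxxxxx),
    # so the cut never lands in the middle of a code point.
    while end > 0 and (raw[end] & 0xC0) == 0x80:
        end -= 1
    return raw[:end].decode("utf-8") + marker.decode("utf-8")


sample = "a" + "é" * 1000          # 2001 bytes: 1 ASCII byte + 1000 two-byte chars
clipped = truncate_payload(sample, cap=256)
```

The marker length is measured in bytes, not characters, because the leading ellipsis occupies three bytes in UTF-8.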

Changed

  • Prompt-context attribute propagation now survives the dispatch-worker task boundary. Previously the OTel observer read current_prompt_result() / current_prompt_group() from inside _handle_llm_event, which runs in the engine's delivery-worker task. asyncio.create_task(deliver_loop(queue)) snapshots the current Context at task creation, before any node body runs — so the ContextVars set by with_active_prompt(...) were never visible to the worker. openarmature.prompt.* attributes silently went missing on the LLM span. Fixed by capturing both ContextVars at dispatch time inside the OpenAIProvider.complete() call (which runs in the node task, where with_active_prompt IS active) and threading the snapshots through the LlmEventPayload. The observer reads from the payload, not the ContextVar.
  • Inline image bytes are redacted at the provider, not the observer. Image content blocks with ImageSourceInline are serialized with source replaced by {type: "inline_redacted", byte_count: N} per §5.5.5 before the payload reaches the observability dispatch queue. Defense-in-depth: bytes never leave the provider in event form, so custom observers subscribing to the LLM event (enabled by LlmEventPayload being public) cannot accidentally leak raw image bytes regardless of their implementation. media_type and detail are preserved at the image-block level per llm-provider §3.1.2. URL-form images pass through unchanged.
  • OTelObserver.shutdown() docstring documents the BatchSpanProcessor flush gotcha. Under fast or unusual teardown orderings (e.g., FastAPI TestClient teardown that closes the event loop before the batch processor's export thread finishes), spans can appear dropped. Documented workarounds: call provider.force_flush(timeout_millis=…) explicitly before shutdown(), or use SimpleSpanProcessor in tests.
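
The task-boundary problem described in the prompt-context entry above is plain asyncio behavior and can be reproduced without the library. In this sketch active_prompt, worker, and the dict payload are hypothetical stand-ins for the prompt ContextVars, the delivery worker, and LlmEventPayload.

```python
import asyncio
import contextvars

# Stand-in for the ContextVar that with_active_prompt(...) would set.
active_prompt = contextvars.ContextVar("active_prompt", default=None)


async def worker(queue, seen):
    payload = await queue.get()
    # The worker's context was copied when the task was created, so the
    # ContextVar read here does not see values set afterwards.
    seen.append((active_prompt.get(), payload["prompt"]))


async def main():
    queue = asyncio.Queue()
    seen = []
    task = asyncio.create_task(worker(queue, seen))  # context snapshot taken here
    active_prompt.set("greeting-v2")                 # set after the snapshot
    # Capture the value at dispatch time and carry it in the payload instead.
    await queue.put({"prompt": active_prompt.get()})
    await task
    return seen


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The worker reads None from the ContextVar but receives "greeting-v2" through the payload, which mirrors why the fix threads snapshots through the event rather than reading the ContextVar in the worker.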

Notes

  • Pinned spec version bumped to v0.17.0. Per the additive-only governance rule (proposal 0024 adds; never renames), implementations passing v0.16.1 conformance fixtures continue to pass under v0.17.0; the new fixtures (012-021) add cases without modifying existing ones.

[0.7.0] — 2026-05-23

Docs-and-examples release. Pinned spec stays at v0.16.1; no proposals implemented this cycle. The focus was bringing the docs site, README, and examples up to par with the v0.6.0 implementation and filling reference-doc gaps that mkdocstrings was silently dropping.

Added

  • openarmature.graph.NextCall and openarmature.graph.default_classifier exports. Promoted from the openarmature.graph.middleware submodule. NextCall is the Protocol describing the next_ callable a middleware receives; default_classifier is the retry classifier's default predicate (matches category against TRANSIENT_CATEGORIES). Users writing custom middleware can type their next_ parameter and extend the default classifier without reaching into the submodule.
  • Middleware concept page. New docs/concepts/middleware.md covering the protocol shape, four registration sites (per-node, per-graph, per-branch, per-fan-out-instance), composition order, subgraph boundary, error semantics, and the built-in RetryMiddleware and TimingMiddleware.
  • Complete reference docs. Added docstrings to 35 previously-undocumented public members across graph, prompts, and checkpoint. mkdocstrings silently omits entries without a docstring, which meant the most fundamental builder methods (add_node, add_edge, set_entry, compile) and the entire Checkpointer backend method surface were invisible in the rendered reference. Every name in each subpackage's __all__ now renders.
  • Examples 05–09. New examples covering fan-out with retry, parallel branches, multimodal prompts, checkpointing with state migration, and tool use. Per-example docs pages with mermaid diagrams under docs/examples/. Examples 00–04 were scrubbed and standardized for consistency with the new set.
  • RELEASING.md. Documents the rc-first release flow (TestPyPI then PyPI), the tag-name dispatch rules, the pre-release checklist, rc iteration, and rollback via PyPI yank.
  • Docs site UX, nav, and reference cleanup. Sweep of nav structure, internal links, and reference page organization to match the v0.6.0 surface.

Changed

  • FanOutNode.run and ParallelBranchesNode.run raise NotImplementedError instead of RuntimeError. Both methods exist only to satisfy the Node protocol; the engine dispatches these node types through run_with_context. NotImplementedError is the right signal and stays backwards-compatible since it subclasses RuntimeError (existing except RuntimeError catches still work).
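The compatibility claim rests on Python's built-in exception hierarchy, where NotImplementedError is a subclass of RuntimeError. A small sketch follows; `PlaceholderNode` is a hypothetical class used only for illustration and is not the library's FanOutNode or ParallelBranchesNode.

```python
class PlaceholderNode:
    """Hypothetical stand-in for a node whose run() exists only to satisfy a protocol."""

    def run(self, state):
        raise NotImplementedError("this node type is dispatched through run_with_context")


def call_with_legacy_handler(node):
    """Caller code written against the old behavior, catching RuntimeError."""
    try:
        node.run({})
    except RuntimeError as exc:
        # NotImplementedError is caught here because it subclasses RuntimeError.
        return type(exc).__name__
    return "no error"


outcome = call_with_legacy_handler(PlaceholderNode())
print(outcome)  # NotImplementedError
```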

Notes

  • Pinned spec version unchanged at v0.16.1. No proposals landed this cycle; the release is docs- and examples-focused. The next functional release will resume with new spec proposals.

[0.6.0] — 2026-05-16

Consolidated release for the five-PR batch: structured output (proposal 0016), image content blocks (proposal 0015), prompt management (proposal 0017), state migration for checkpoints (proposal 0014), and parallel branches (proposal 0011). Pinned spec jumps from v0.10.0 to v0.16.1.

Added

  • Parallel branches (proposal 0011, introduced in spec v0.11.0; attempt-index propagation clarified in spec v0.16.1). New GraphBuilder.add_parallel_branches_node(name, *, branches, error_policy, errors_field, middleware) surface dispatches M heterogeneous compiled subgraphs concurrently per pipeline-utilities §11. BranchSpec (subgraph + inputs/outputs projection + branch middleware) and ParallelBranchesNode types exported from openarmature.graph. Branch insertion order determines fan-in merge order regardless of completion timing (§11.8). Two error policies: "fail_fast" raises ParallelBranchesBranchFailed (a NodeException subtype) with branch_name, original cause as __cause__, and recoverable_state carrying the parent's pre-dispatch snapshot — no buffered branch contributions are visible (§11.5 buffer-and-apply). "collect" records per-branch failures in an optional errors_field (each record carries branch_name + category + implementation-defined extras) and continues. Two new error categories: ParallelBranchesNoBranches (compile time, empty branches map) and ParallelBranchesBranchFailed (runtime, fail_fast branch raise).
  • NodeEvent.branch_name: str | None (proposal 0011 / graph-engine §6). Populated on events from nodes inside a parallel-branches branch, absent outside. Independent of fan_out_index — both may be present simultaneously when a branch contains a fan-out (or a fan-out instance contains a parallel-branches node). The combined (namespace, branch_name, fan_out_index, attempt_index, phase) tuple is the event-source uniqueness key.
  • openarmature.branch_name OTel span attribute. Mirrors the existing openarmature.node.fan_out_index. Emitted on synthesized inner-node spans when branch_name is populated on the event. The two attributes coexist on inner nodes of a fan-out-inside-a-branch composition.
  • Attempt-index ContextVar propagation through transitive retry (graph-engine §6 v0.16.1). Retry middleware now sets the attempt_index ContextVar before each next call; the engine reads current_attempt_index() when emitting events. This makes retry semantics symmetric across direct (per-node middleware) and transitive (instance / branch / fan-out instance_middleware) wrapping — events from inner nodes of a subgraph the retry re-invokes carry the wrapping retry's counter, not a freshly-zeroed inner counter. Innermost-wins precedence falls out of Python's ContextVar set/reset token stack. Pre-existing node-level retry behavior is unchanged.
  • State migration for checkpointed graphs (proposal 0014, introduced in spec v0.15.0; refined by proposal 0018 in spec v0.16.0). Saved checkpoints whose schema_version doesn't match the current state class now route through a registered migration chain instead of failing on resume. Surface: State.schema_version: ClassVar[str] = "" (declare a non-empty value to opt in), GraphBuilder.with_state_migration(from_version, to_version, migrate) and with_state_migrations(*migrations) for registration, StateMigration and MigrationRegistry types exported from openarmature.checkpoint. Chain resolution is BFS over the registered edges; the shortest path wins. Three new error categories: CheckpointStateMigrationChainAmbiguous (proposal 0018: duplicate (from, to) pair at registration time, or multiple distinct shortest paths between the saved and current versions at resume time), CheckpointStateMigrationMissing (no chain bridges the versions), and CheckpointStateMigrationFailed (a migration function raised). All non-transient. Post-migration deserialization failures still route to CheckpointRecordInvalid per §10.12.4. The same chain applies to each entry in parent_states in lockstep with the outer state per §10.12.2. Routing precedence per §10.10 (v0.16.0): chain-ambiguous → missing → failed → record-invalid.
  • Checkpointer.supports_state_migration Protocol attribute. Marks whether a backend can expose the structural intermediate form (a plain dict, JSON tree) the migration registry consumes. SQLiteCheckpointer(serialization="json") opts in; SQLiteCheckpointer(serialization="pickle") and InMemoryCheckpointer opt out. On version mismatch against a non-migration-eligible backend the engine raises CheckpointRecordInvalid per spec §10.12.1.
  • openarmature.checkpoint.migrate OTel span (proposal 0014 §6 cross-ref). Versioned resumes whose migration chain runs emit a zero-duration openarmature.checkpoint.migrate span on the OTel observer, parented under the invocation root span. Attributes: openarmature.checkpoint.migrate.from_version, openarmature.checkpoint.migrate.to_version (the final target), openarmature.checkpoint.migrate.chain_length. The §10.12.3 fast path (versions match, registry not consulted) emits no span. Engine-side: a synthetic checkpoint_migrated observer phase carries a _MigrationSummary payload from _migrate_record through to the OTel observer; the new phase is gated off default subscriptions (observers opt in explicitly via phases={..., "checkpoint_migrated"}).
  • Prompt-management capability (proposal 0017, introduced in spec v0.15.0). New openarmature.prompts subpackage. PromptManager composes one or more PromptBackends, exposes fetch / render / get, applies the §8 fallback semantics (prompt_store_unavailable continues to the next backend; prompt_not_found stops the chain), and renders templates with Jinja2's StrictUndefined per §7. Prompt / PromptResult / PromptGroup are Pydantic models matching spec §3 / §4 / §9. Three error categories (PromptNotFound, PromptRenderError, PromptStoreUnavailable) with PROMPT_TRANSIENT_CATEGORIES exported for retry-middleware classifiers. FilesystemPromptBackend is the minimum local-filesystem reference backend (layout: <root>/<label>/<name>.j2; version derived from the first 16 hex chars of template_hash). New runtime dependency: jinja2>=3.1.
  • openarmature.prompts.context — observability propagation per spec §11. with_active_prompt(result) and with_active_prompt_group(group) context managers + current_prompt_result() / current_prompt_group() inspectors. When the OTel observer is active and an LLM call fires inside with_active_prompt, the openarmature.llm.complete span carries the normative openarmature.prompt.* attributes (name, version, label, template_hash, rendered_hash, group_name). Nesting is innermost-wins.
  • Image content blocks for user messages (proposal 0015, introduced in spec v0.13.0). UserMessage.content now accepts str | list[ContentBlock]. The block surface introduces TextBlock, ImageBlock, ImageSourceURL, ImageSourceInline, and the ContentBlock / ImageSource discriminated unions over the block / source type field. ImageBlock carries a media_type (required for inline sources; ignored for URL sources; typed as str | None so callers MAY pass any image/* type the bound model supports) and an optional detail hint ("auto" / "low" / "high"; None default omits the field from the wire so providers apply their own default). System, assistant, and tool messages stay text-string-only; image inputs are user-only in v1.
  • OpenAIProvider content-array wire mapping. When UserMessage.content is a content-block sequence, the wire body uses OpenAI's content array per §8.1.1. TextBlock → {type: "text", text}. ImageBlock with a URL source maps to {type: "image_url", image_url: {url, detail?}}. ImageBlock with an inline source constructs an RFC 2397 data:<media_type>;base64,<base64_data> URI and goes through the same image_url entry shape. Inline bytes pass through unchanged — no inspection, transcoding, or re-encoding.
  • New error category ProviderUnsupportedContentBlock (non-transient). Raised when the bound model rejects a content block type / media variant. Distinct from ProviderInvalidRequest (which covers spec-shape malformation): this category surfaces a capability mismatch, letting callers route differently (e.g., fall back to a multimodal-capable provider) without overloading the malformed-request category. Carries block_type ("image" / "audio" / "video") and reason (provider's human-readable message) when those are recoverable from the rejection. OpenAIProvider detects content rejection via HTTP 400 bodies — heuristic on error.code (known set: image_content_not_supported, unsupported_image_media_type, audio_content_not_supported, etc.), error.type (image_parse_error), and error.message ("does not support" + image/audio/video).
  • Structured output (proposal 0016, introduced in spec v0.14.0). Provider.complete() now accepts an optional response_schema parameter — either a JSON Schema dict or a Pydantic BaseModel subclass. When supplied, the provider constrains the model's output to the schema and populates Response.parsed with the validated value (dict for dict-schema input, a BaseModel instance for class input). New StructuredOutputInvalid error category (non-transient by default) raises on JSON parse failure or schema validation failure; carries the requested schema, the raw response content, and a failure description.
  • OpenAIProvider native response_format wire path. When response_schema is supplied, the chat-completions request body carries response_format: { type: "json_schema", json_schema: { name, schema, strict } }. The strict flag is determined by a deep recursive walk over the schema (object-property required-coverage rule across anyOf / oneOf / allOf and $ref targets, with cycle protection); unresolvable refs fall through to strict: false. The name field uses schema.title when present, otherwise a deterministic sha256-prefix hash.
  • OpenAIProvider prompt-augmentation fallback. Constructor flag force_prompt_augmentation_fallback: bool (default False) and read-only inspect property uses_prompt_augmentation_fallback: bool. When the flag is on, structured-output calls build a fresh message list with a system directive containing the serialized schema, omit response_format from the wire, and validate the response post-receive. The caller's original messages list is never mutated. Use for OpenAI-compatible servers (older vLLM, some LM Studio releases, llama.cpp variants) that reject or silently ignore response_format.
  • Provider-agnostic schema helpers. openarmature.llm.validate_response_schema(schema) (raises ProviderInvalidRequest when the schema is not a dict with a top-level type: "object") and openarmature.llm.strict_mode_supported(schema) (the deep-tree strict-mode constraint check) are exported for reuse by future Anthropic/Gemini providers.
  • Capability-agnostic conformance harness helpers. tests/conformance/harness/wire.py adds match_wire_body (recursive deep-equal with "*" wildcard support), assert_response_format_absent, assert_system_references_schema, and assert_error_carries for the expected_wire_request[_checks] and expected.raises.carries.{...} fixture shapes. Used by the 0016 fixtures; available for the upcoming 0014 / 0015 / 0017 fixture sets.
  • Runtime dependency: jsonschema>=4.0. Used by the dict-schema validation path. The Pydantic-class path uses Pydantic's native validator and does not need jsonschema.
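The migration-chain resolution described in the state-migration item above (BFS over registered edges, shortest path wins, ambiguity when several distinct shortest paths exist) can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `resolve_chain`, `ChainMissing`, and `ChainAmbiguous` are hypothetical names standing in for the real registry and the CheckpointStateMigrationMissing / CheckpointStateMigrationChainAmbiguous categories.

```python
from collections import deque


class ChainMissing(Exception):
    """Hypothetical: no chain bridges the saved and current versions."""


class ChainAmbiguous(Exception):
    """Hypothetical: more than one distinct shortest chain exists."""


def resolve_chain(edges, saved, current):
    """Return the unique shortest list of (from, to) steps from saved to current."""
    if saved == current:
        return []  # fast path: versions match, nothing to migrate

    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    dist = {saved: 0}   # BFS depth of each reachable version
    paths = {saved: 1}  # number of distinct shortest paths to each version
    parent = {}
    queue = deque([saved])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                paths[nxt] = paths[node]
                parent[nxt] = node
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # Another equally short route reaches nxt.
                paths[nxt] += paths[node]

    if current not in dist:
        raise ChainMissing(f"no migration chain from {saved!r} to {current!r}")
    if paths[current] > 1:
        raise ChainAmbiguous(f"{paths[current]} shortest chains from {saved!r} to {current!r}")

    chain = []
    node = current
    while node != saved:
        chain.append((parent[node], node))
        node = parent[node]
    chain.reverse()
    return chain


linear = resolve_chain([("1", "2"), ("2", "3")], "1", "3")
shortcut = resolve_chain([("1", "2"), ("2", "3"), ("1", "3")], "1", "3")
print(linear)    # [('1', '2'), ('2', '3')]
print(shortcut)  # [('1', '3')]
```

Counting shortest paths during the same BFS pass keeps the ambiguity check linear in the number of registered edges.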

Changed

  • Pinned spec version: 0.10.0 → 0.16.1. Adopts the skip-ahead governance principle: the submodule jumps across v0.11.0–v0.16.1 (proposals 0009, 0011, 0014, 0015, 0016, 0017, 0018) in one bump. All five proposals (0011, 0014, 0015, 0016, 0017) are implemented in the batch's release; the v0.16.1 clarification of attempt-index propagation through transitive retry middleware lands with the proposal 0011 implementation.
  • CheckpointRecord.schema_version semantic shift (proposal 0014). Previously a backend-internal record-shape version (CHECKPOINT_SCHEMA_VERSION = "1" constant), now the user-facing state-schema version per spec §10.2. The framework reads type(state).schema_version at save time. Pre-PR-4 records carrying "1" are reinterpreted as user-facing v1 identifiers; users with such records either declare schema_version="1" on their state class or discard the pre-PR-4 records. SQLiteCheckpointer no longer rejects records with non-default schema_version at the backend boundary; version-mismatch routing is now an engine concern at resume time. The CHECKPOINT_SCHEMA_VERSION module constant is removed; future record-shape evolution can add backend-private metadata fields if needed.
  • NodeEvent.pre_state typed Any (was State). Required by the new checkpoint_migrated phase which carries a _MigrationSummary payload rather than a State instance. Observer authors who type-narrowed pre_state to State should treat it as Any and narrow per-phase (e.g., if event.phase == "completed": ...). The checkpoint_saved phase already carried a State-flavored shape (not necessarily a typed State subclass instance), so this widens the declared type to match runtime reality rather than introducing a new constraint.

Notes

  • Pre-1.0 MINOR. Two behavioral changes ship in this release:

    • Retry-MW attempt-index propagation. Events from inner nodes of a subgraph wrapped by retry middleware (branch middleware, fan-out instance_middleware, or any retry on a wrapping subgraph) now carry the wrapping retry's attempt counter on each re-invocation rather than starting at 0. Per-node retry behavior is unchanged. Matches spec v0.16.1's clarification of the graph-engine §6 contract.
    • CheckpointRecord.schema_version semantic shift. Previously a backend-internal record-shape version (the removed CHECKPOINT_SCHEMA_VERSION = "1" constant), now the user-facing state-schema version per spec §10.2. Pre-v0.6.0 records carrying "1" are reinterpreted as user-facing v1 identifiers; declare schema_version="1" on the corresponding state class or discard the records.

    Existing callers who don't wrap subgraphs in retry middleware and don't declare a state-schema version see no behavior change.
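The attempt-index behavior can be illustrated with a plain ContextVar and its set/reset tokens. The sketch below is synchronous and simplified, and every name in it (`attempt_index`, `with_retry`, the node functions) is hypothetical rather than the library's API. It shows inner nodes of a re-invoked subgraph reading the wrapping retry's counter, and a nested retry temporarily taking precedence.

```python
from contextvars import ContextVar

# Hypothetical stand-in for the attempt-index ContextVar.
attempt_index = ContextVar("attempt_index", default=0)


def with_retry(fn, max_attempts):
    """Call fn up to max_attempts times, publishing the attempt counter."""
    last_error = None
    for attempt in range(max_attempts):
        token = attempt_index.set(attempt)
        try:
            return fn()
        except ValueError as exc:
            last_error = exc
        finally:
            # Restores whatever value an outer retry had set.
            attempt_index.reset(token)
    raise last_error


events = []
calls = {"flaky": 0}


def inner_node():
    events.append(("inner_node", attempt_index.get()))


def flaky_node():
    calls["flaky"] += 1
    events.append(("flaky_node", attempt_index.get()))
    if calls["flaky"] == 1:
        raise ValueError("transient failure on the first pass")


def subgraph():
    inner_node()
    flaky_node()


# The retry wraps the whole subgraph, so both inner nodes see its counter.
with_retry(subgraph, max_attempts=3)


def nested_probe():
    before = attempt_index.get()
    inside = with_retry(attempt_index.get, max_attempts=1)  # innermost wins
    after = attempt_index.get()
    return before, inside, after


outer_token = attempt_index.set(2)
try:
    probe = nested_probe()
finally:
    attempt_index.reset(outer_token)

print(events)
print(probe)  # (2, 0, 2)
```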

[0.5.0] — 2026-05-10

First release on real PyPI. Catches the implementation up from spec v0.5.x to v0.10.0 across six phases: the spec accepted eight proposals while the Python library was at v0.3.1, and v0.5.0 lands all of them in one curated drop.

Added

  • Typed conformance harness (Phase 0). Single parametrised test target driving all 68 spec fixtures under discriminated-union YAML parsers. Replaces the earlier hand-rolled per-fixture wiring.
  • Observer pair model (Phase 1, spec v0.6.0 / proposal 0005 §6). Observer Protocol (async callable), SubscribedObserver with phase subscription set ({"started", "completed", "checkpoint_saved"}), RemoveHandle.remove(), and a serial delivery queue per spec §6 ordering. Observer exceptions don't propagate; reported via warnings.warn.
  • Middleware (Phase 2, proposal 0004). Middleware Protocol with the canonical (state, next) → partial_update shape, compose_chain runtime, and five stdlib middlewares: RetryMiddleware, TimingMiddleware, ErrorRecoveryMiddleware, ShortCircuitMiddleware, TraceRecorderMiddleware. Per-graph and per-node middleware composition.
  • Fan-out runtime (Phase 3, proposal 0005 pipeline-utilities side). FanOutNode for parallel fan-out over an items_field or a count (int or callable resolver). Configurable concurrency, error policy (fail_fast / collect), inputs / extra_outputs projection, optional errors_field collection. Composes with retry middleware on the fan-out node and on per-instance subgraphs.
  • LLM provider (Phase 4, proposal 0006). New openarmature.llm package: Provider Protocol with ready() / complete(messages, tools=None, config=None); OpenAIProvider (HTTPX-based, OpenAI-compatible wire); typed Message / ToolCall / Tool / Response / RuntimeConfig; seven error categories (ProviderAuthentication, ProviderUnavailable, ProviderInvalidRequest, ProviderInvalidResponse, ProviderInvalidModel, ProviderModelNotLoaded, ProviderRateLimit with retry_after). Tool-call ids preserved verbatim through the wire.
  • Checkpointing (Phase 5, proposal 0008). Checkpointer Protocol (save / load / list / delete) with CheckpointRecord and NodePosition shapes; InMemoryCheckpointer reference impl; CheckpointNotFound / CheckpointRecordInvalid / CheckpointSaveFailed error categories; checkpoint_saved observer phase; resume-from-checkpoint semantics for fan-out and subgraph compositions.
  • Observability / OTel (Phase 6, proposal 0007). OTelObserver mapping observer events → OpenTelemetry spans with private TracerProvider (no global pollution); §4.4 detached subgraph + detached fan-out trace mode; §5.5 LLM-provider span emission with disable_llm_spans opt-out; §5.6 cross-cutting openarmature.correlation_id on every span; §10.8 checkpoint_saved zero-duration span. install_log_bridge wires the stdlib root logger through OTel's Logs Bridge (deprecation-aware via opentelemetry-instrumentation-logging) so log records emitted within an invocation carry the active span's trace_id/span_id plus openarmature.correlation_id. prepare_sync synchronous observer hook so logs emitted on the FIRST line of a node body (before any await) pick up the right span. Fan-out per-instance dispatch span synthesis (§5.4) with parent_node_name cached and applied per-instance.
  • current_correlation_id() public API. Read the per-invocation cross-backend join key from anywhere within the invocation's async call tree.
  • Subgraph configuration plural form. Builder accepts subgraphs: alongside subgraph: for fixture compatibility.
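The (state, next) → partial_update middleware shape named in the Middleware item above lends itself to a short illustration. The sketch below is an assumption-laden stand-in, not the library's compose_chain: `compose_sketch`, `make_tracer`, and `increment_node` are hypothetical, middleware is assumed to be async, and the first middleware in the list is assumed to be outermost, none of which this changelog states.

```python
import asyncio


async def increment_node(state):
    """Hypothetical node body returning a partial state update."""
    return {"count": state["count"] + 1}


def compose_sketch(middlewares, node):
    """Wrap node so that middlewares[0] runs outermost (an assumed ordering)."""
    next_ = node
    for middleware in reversed(middlewares):
        next_ = _bind(middleware, next_)
    return next_


def _bind(middleware, next_):
    # A helper function avoids the late-binding closure pitfall in the loop above.
    async def call(state):
        return await middleware(state, next_)

    return call


trace = []


def make_tracer(label):
    """Build a middleware that records when it runs relative to the node."""

    async def tracer(state, next_):
        trace.append(f"{label}:before")
        update = await next_(state)
        trace.append(f"{label}:after")
        return update

    return tracer


chain = compose_sketch([make_tracer("outer"), make_tracer("inner")], increment_node)
update = asyncio.run(chain({"count": 1}))
print(update)  # {'count': 2}
print(trace)   # ['outer:before', 'inner:before', 'inner:after', 'outer:after']
```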

Changed

  • Pinned spec version: 0.5.x → 0.10.0. Lands proposals 0004 (middleware), 0005 (fan-out + observer pair model), 0006 (llm-provider), 0007 (observability/OTel), 0008 (checkpointing), 0010 (prepare_sync hook), 0012 (completed event after edge eval), 0013 (fan_out_config on NodeEvent).
  • Edge-resolution failures share the preceding node's event pair (spec v0.9.0 / proposal 0012). routing_error and edge_exception populate error on the preceding node's completed event with post_state=None instead of producing a separate pair. All five §4 runtime error categories now land via the same uniform mechanism.
  • Observer protocol contract. Async-only callable; phase-filtered delivery via SubscribedObserver.phases; serial single-task delivery worker; observer errors isolated via warnings.warn.

Fixed

  • Log bridge filter placement. Phase 6.0's _CorrelationIdFilter lived on the root logger; Python's logging propagation walks ancestor handlers but not ancestor filters, so child-logger records (the normal logging.getLogger("module") pattern) were missed. Replaced with a process-global LogRecord factory that fires uniformly at record construction.
  • OTelObserver concurrency-safe state scoping. Per-invocation span state now keyed by invocation_id so concurrent invocations sharing one observer instance don't collide on the in-flight span maps.
  • Spec submodule pin sync. Internal spec_version now matches the submodule HEAD across phase boundaries; tracked via tests/test_smoke.py.
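The filter-placement issue in the log bridge item above comes from standard `logging` behavior: a filter attached to a logger applies only to records logged through that logger, while handlers on ancestors still receive records from descendants. The sketch below uses a named parent logger in place of the root logger to avoid touching global configuration. The logger names and the fixed "abc123" id are hypothetical, and this is not the library's bridge code.

```python
import logging


class CorrelationFilter(logging.Filter):
    """Stamps a correlation id on records that pass through it."""

    def filter(self, record):
        record.correlation_id = "abc123"
        return True


class Capture(logging.Handler):
    """Collects the correlation id seen on each handled record."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def emit(self, record):
        self.seen.append(getattr(record, "correlation_id", None))


parent = logging.getLogger("oa_sketch")
child = logging.getLogger("oa_sketch.module")
parent.setLevel(logging.INFO)
parent.propagate = False  # keep sketch records away from the real root logger
capture = Capture()
parent.addHandler(capture)

# 1. Filter on the ancestor logger: skipped for records created by the child.
stamp = CorrelationFilter()
parent.addFilter(stamp)
child.info("from child")    # reaches the parent's handler without the id
parent.info("from parent")  # passes the parent's filter, so it is stamped
parent.removeFilter(stamp)

# 2. LogRecord factory: runs at record construction for every logger.
previous_factory = logging.getLogRecordFactory()


def stamping_factory(*args, **kwargs):
    record = previous_factory(*args, **kwargs)
    record.correlation_id = "abc123"
    return record


logging.setLogRecordFactory(stamping_factory)
try:
    child.info("from child, via factory")
finally:
    logging.setLogRecordFactory(previous_factory)
    parent.removeHandler(capture)

print(capture.seen)  # [None, 'abc123', 'abc123']
```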

Notes

  • First real PyPI publish. Pre-release verification continues to flow through TestPyPI per docs/RELEASING.md. The pypi GitHub Environment requires a manual approval click before any real-PyPI upload — keep it on.
  • Pre-1.0 SemVer. Behavioral changes may land in MINOR bumps. Several Phase 1+ contracts changed shape vs. v0.4.0 — most user-visible: the observer pair model in Phase 1, the edge-resolution failure mechanism in Phase 6.1.
  • Cross-language posture. This release tracks spec v0.10.0; the OpenArmature TypeScript implementation will land separately under the same spec.