Add failure-isolation catch gate and cause-chain classification primitive (0074) (#174)
* Pin spec v0.65.0 for proposal 0074
Advance the spec submodule pin v0.64.0 -> v0.65.0 for accepted proposal
0074 (failure-isolation catch gate + §6.4 cause-chain classification
primitive). Updates __spec_version__, the pyproject spec_version, the
smoke assertion, the conformance.toml spec_pin, and regenerates the
bundled AGENTS.md. conformance.toml records 0074 as implemented.
* Add failure-isolation catch gate and §6.4 primitive (0074)
FailureIsolationMiddleware gains an optional `catch` set of error
categories: an exception is caught only if the derived category of its
cause chain (resolved through node_exception carriers) is in the set,
conjoined with `predicate` (catch checked first, short-circuiting).
The carrier-skipping walk behind `catch` and `caught_exception` becomes
a public primitive, classify_cause_chain(exc) -> CaughtException. The
cause-chain types (CauseLink, CaughtException) move into the new
cause_chain module alongside it, so the concept has one home and events
consumes it; the public openarmature.graph paths are unchanged. The
default retry classifier's single-level depth is documented as
deliberate (no behavior change). Unit tests cover the gate, the
short-circuit, and the primitive.
* Wire failure-isolation catch conformance fixture 072
Parse the `catch` directive on the failure_isolation fixture middleware
config and add fixture 072 to the failure-isolation fixture set. 072
(two cases) drives the catch gate matching through a §9.7 instance
node_exception carrier (degrade) and a non-matching catch (propagate).
* Document failure-isolation catch + classification (0074)
Document the `catch` category gate and the public classify_cause_chain
primitive in the middleware concepts page, and add the 0.15.0 changelog
entry (advancing the spec-pin bullet to v0.65.0).
* Harden catch typing and tighten derived-category wording
PR #174 review: reject a bare str for FailureIsolationMiddleware.catch
(a str is a Collection[str], so it would substring-match and silently
mis-gate) and normalize to a frozenset. Tighten the derived-category
wording in the docstring, the concepts page, and the classify example
to the outermost non-carrier link with a category (an uncategorized
surface link resolves to the deeper categorized cause). Fix the stale
events/errors import comment now that cause_chain imports only errors.
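The bare-`str` rejection described in the review commit can be illustrated with a minimal sketch. `normalize_catch` and the choice of `TypeError` are hypothetical stand-ins for illustration, not the package's actual code; only the behavior (reject a bare `str`, normalize to a `frozenset`) comes from the commit message.

```python
from __future__ import annotations

from collections.abc import Collection


def normalize_catch(catch: Collection[str] | None) -> frozenset[str] | None:
    """Hypothetical helper: validate and freeze a `catch` argument."""
    if catch is None:
        # Unset: the category gate stays permissive.
        return None
    if isinstance(catch, str):
        # A str is itself a Collection[str], and `in` on a str is a
        # substring test, so a bare string would silently mis-gate.
        raise TypeError("catch must be a collection of categories, not a bare str")
    return frozenset(catch)


# The footgun being closed: membership on a str matches substrings.
# "timeout" is an illustrative category name, not one taken from the spec.
substring_hit = "timeout" in "provider_timeout"        # substring match on a str
set_hit = "timeout" in frozenset({"provider_timeout"})  # exact match on a set
```

Normalizing to a `frozenset` also makes the gate immutable after construction, so a caller mutating the original collection cannot change what is caught.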
CHANGELOG.md (+2 -1: 2 additions, 1 deletion)
@@ -12,10 +12,11 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
- **Per-attempt LLM spans under call-level retry** (proposal 0050, observability §5.5 / llm-provider §7.1). Completes proposal 0050, which shipped `partial` in v0.14.0 (failure-isolation middleware and the `complete(retry=...)` loop landed then; the per-attempt span surface was deferred). Under call-level retry the OTel observer now emits one `openarmature.llm.complete` span per attempt, each carrying `openarmature.llm.attempt_index` (0-based, 0..N-1, and 0 for a no-retry call). An intermediate failed attempt's span carries ERROR status plus its error category and the request-side attributes; the final attempt's span carries the terminal outcome and, on success, the full response surface. A python-internal `LlmRetryAttemptEvent`, dispatched once per attempt, is the sole source of the OTel span; the terminal `LlmCompletionEvent` / `LlmFailedEvent` stay one per call (payload, latency, Langfuse Generation) and no longer drive the OTel span. Langfuse renders one terminal Generation per call, with the per-attempt detail on the OTel span surface only (a spec-side §8 clarification to pin this is tracked, non-blocking). `conformance.toml` flips proposal 0050 to `implemented`; the call-level fixtures 056-058 are driven through the provider plus OTel observer and the single-attempt observability fixture 057 is wired.
- **Langfuse `trace.userId` / `trace.sessionId` population** (proposal 0064, observability §8.4.1, spec v0.62.0). The Langfuse observer now promotes a recognized `userId` key in the caller-supplied invocation metadata to Langfuse's first-class `trace.userId` field (the Users dashboard), additively: the key also remains at `trace.metadata.userId`. Promotion is automatic and unconditional; an absent key leaves `trace.userId` unset. The `LangfuseClient.trace()` surface (the Protocol, the in-memory client, and the SDK adapter) gains `session_id` / `user_id`. `trace.sessionId` is sourced from `openarmature.session_id`, which the sessions capability (proposal 0020) establishes; that capability is not yet implemented in python, so the `sessionId` plumbing is in place but dormant (no source) and unset in the interim. `conformance.toml` records proposal 0064 `partial` on that basis: fixture 084 cases 2/3/4 (not session-bound, `userId` present additively, `userId` absent) run, and the session-bound cases 1/5 defer until 0020. Langfuse-only: the OTel side already carries `openarmature.session_id` and `openarmature.user.*` as span attributes, and OTel has no trace-level session/user field.
- **Per-fetch prompt cache control: `cache_ttl_seconds`** (proposal 0072, prompt-management §5 / §6, spec v0.63.0). `PromptBackend.fetch`, `PromptManager.fetch`, and `PromptManager.get` gain an optional `cache_ttl_seconds` read-side control: `None` preserves current behavior, `0` forces a fresh read past any client-side cache, and `N > 0` bounds a served entry's staleness to N seconds; a negative value is rejected at the manager. It governs only which cached entry may be served, not whether or how results are cached. The bundled filesystem backend is cacheless and ignores it; the bundled Langfuse backend forwards it to the Langfuse SDK's `get_prompt` cache. Conformance fixtures 033/034 run through a caching harness backend (conformance-adapter §6.8: `source_read_count` plus a controllable `advance_clock`).
+
- **Failure-isolation `catch` gate + cause-chain classification primitive** (proposal 0074, pipeline-utilities §6.3 / §6.4, spec v0.65.0). `FailureIsolationMiddleware` gains an optional `catch`: a set of error categories. An exception is caught only if the *derived category* of its cause chain (the outermost non-carrier link's category, resolved through the engine's `node_exception` carriers, the same value reported as `caught_exception.category`) is in the set. This closes a degrade-into-crash footgun: at a wrapping placement (subgraph, fan-out instance, branch) the engine wraps the originating failure in a carrier, so a `predicate` inspecting the surface exception sees only the carrier and misses it, whereas `catch` classifies through the carrier. `catch` composes with `predicate` as a conjunction; both default permissive (both unset stays catch-all), and a null derived category never matches a non-empty set. The carrier-skipping walk behind `catch` and `caught_exception` is promoted to a public primitive, `classify_cause_chain(exc) -> CaughtException` (the ordered `chain`, the derived `category`, and its `message` — the same record the event carries), exported from `openarmature.graph` for use in a custom `predicate`, a router, a metric, or a full-chain retry classifier. The default retry classifier stays deliberately single-level (it classifies at re-attempt granularity); this is now documented, with no behavior change. Conformance fixture 072 (catch matches through an instance-placement carrier and degrades; a non-matching catch propagates with no event). The optional native-exception-type `catch` form (spec MAY) is not shipped.
### Changed
-
- **Pinned spec advances v0.60.0 → v0.64.0** across the v0.15.0 cycle: v0.61.0 (proposal 0061, the detached-trace invocation span above), v0.62.0 (proposal 0064, the Langfuse session/user population above), v0.63.0 (proposal 0072, the prompt cache control above), the v0.63.1 patch (pipeline-utilities coverage fixtures 070/071 for the already-implemented 0069 / 0070 behavior, no new proposal), and v0.64.0 (proposal 0073, GenAI semconv adoption reconciliation: OA retains `gen_ai.system` despite the upstream rename to `gen_ai.provider.name`; textual-only, with no emitted-attribute or fixture change, so the existing `gen_ai.*` fixtures stand as the retention regression). `conformance.toml` records 0061 / 0072 `implemented`, 0064 `partial` (its `sessionId` half is dormant pending the sessions capability), and 0073 `textual-only`. Proposal 0050 needed no pin bump of its own (it was already within the pin from its v0.42.0 acceptance); its v0.14.0 `partial` entry flips to `implemented` with the per-attempt span surface above.
+
- **Pinned spec advances v0.60.0 → v0.65.0** across the v0.15.0 cycle: v0.61.0 (proposal 0061, the detached-trace invocation span above), v0.62.0 (proposal 0064, the Langfuse session/user population above), v0.63.0 (proposal 0072, the prompt cache control above), the v0.63.1 patch (pipeline-utilities coverage fixtures 070/071 for the already-implemented 0069 / 0070 behavior, no new proposal), and v0.64.0 (proposal 0073, GenAI semconv adoption reconciliation: OA retains `gen_ai.system` despite the upstream rename to `gen_ai.provider.name`; textual-only, with no emitted-attribute or fixture change, so the existing `gen_ai.*` fixtures stand as the retention regression), and v0.65.0 (proposal 0074, the failure-isolation `catch` gate above). `conformance.toml` records 0061 / 0072 / 0074 `implemented`, 0064 `partial` (its `sessionId` half is dormant pending the sessions capability), and 0073 `textual-only`. Proposal 0050 needed no pin bump of its own (it was already within the pin from its v0.42.0 acceptance); its v0.14.0 `partial` entry flips to `implemented` with the per-attempt span surface above.
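As a rough illustration of the `catch` semantics in the changelog entry above, the sketch below derives a category by walking past carrier links and then applies `catch` before `predicate`. `Carrier`, `CategorizedError`, `derived_category`, `should_catch`, and the `provider_timeout` category are all hypothetical stand-ins; the package's own entry point is `classify_cause_chain`, whose internals are not shown in this diff. That the walk follows `__cause__`, and that an empty `catch` set matches nothing, are assumptions of this sketch.

```python
from __future__ import annotations

from typing import Callable, Collection, Optional


class Carrier(Exception):
    """Stand-in for an engine carrier that wraps an originating failure."""


class CategorizedError(Exception):
    """Stand-in for an error that may carry a category."""

    def __init__(self, message: str, category: Optional[str] = None) -> None:
        super().__init__(message)
        self.category = category


def derived_category(exc: BaseException) -> Optional[str]:
    """Category of the outermost non-carrier link that has one, else None."""
    seen: set[int] = set()
    link: Optional[BaseException] = exc
    while link is not None and id(link) not in seen:
        seen.add(id(link))  # guard against a cyclic cause chain
        if not isinstance(link, Carrier):
            category = getattr(link, "category", None)
            if category is not None:
                return category
        link = link.__cause__  # assumption: the chain is linked via __cause__
    return None


def should_catch(
    exc: BaseException,
    catch: Optional[Collection[str]] = None,
    predicate: Optional[Callable[[BaseException], bool]] = None,
) -> bool:
    """Conjunction of the category gate and the predicate; catch runs first."""
    if catch is not None and derived_category(exc) not in catch:
        return False  # short-circuit: predicate is never consulted
    if predicate is not None and not predicate(exc):
        return False
    return True  # both unset: catch-all


# A failure wrapped by a carrier, as at a wrapping placement.
origin = CategorizedError("upstream timed out", category="provider_timeout")
wrapped = Carrier("instance failed")
wrapped.__cause__ = origin
```

A `predicate` that inspects only `wrapped` sees the carrier and nothing else, while `should_catch(wrapped, catch=frozenset({"provider_timeout"}))` classifies through it.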
conformance.toml (+8 -1: 8 additions, 1 deletion)
@@ -32,7 +32,7 @@
[manifest]
implementation = "openarmature-python"
-
spec_pin = "v0.64.0"
+
spec_pin = "v0.65.0"
# Status values:
# implemented — shipped behavior matches the proposal's contract
@@ -719,3 +719,10 @@ note = "PromptBackend.fetch / PromptManager.fetch / get gain an optional cache_t
status = "textual-only"
since = "0.15.0"
note = "Governance + observability §5.5 rationale change: reconciles the gen_ai.* adoption with upstream reality (the whole GenAI semconv surface is at Development status, and gen_ai.system was removed upstream in favor of gen_ai.provider.name). Adds a GenAI-scoped de-facto-interoperability carve-out (OA adopts the recognized core gen_ai.* names directly even at Development; peripheral attributes are mirrored to openarmature.*) and a post-adoption RETENTION rule (an adopted name is kept through an upstream rename / removal). No emitted-attribute change and no conformance-expectation change: python already emits the recognized core gen_ai.* set (including gen_ai.system, now RETAINED despite the upstream rename), so the existing gen_ai.* observability fixtures (e.g. 019-021) stand as the retention regression coverage. No python code and no new fixtures. The gen_ai.system -> gen_ai.provider.name migration is a deferred follow-on."
# gate (§6.3) + public cause-chain classification primitive (§6.4).
+
[proposals."0074"]
+
status = "implemented"
+
since = "0.15.0"
+
note = "FailureIsolationMiddleware gains an optional `catch` set of error categories (§6.3): an exception is caught only if the DERIVED category of its cause chain (the outermost non-carrier link, resolved THROUGH node_exception carriers -- the same value reported as caught_exception.category) is in the set, composing with `predicate` as a conjunction (both default permissive, both unset = catch-all; a null derived category never matches a non-empty set). This classifies a carrier-wrapped failure correctly at a wrapping placement where a surface check sees only the carrier. The §6.4 cause-chain classification walk is promoted to a public primitive classify_cause_chain(exc) -> CaughtException (the existing failure-isolation record: chain + derived category + message) in openarmature.graph, shared by the catch gate, the emitted event, and any consumer. §6.1: the default retry classifier's single-level depth is documented as deliberate (re-run granularity vs §6.3 full-chain degrade); no behavior change. Fixture 072 (catch matches through an instance-placement carrier and degrades; a non-matching catch propagates with no event). The optional native-exception-type catch sugar (spec MAY) is not shipped."
src/openarmature/AGENTS.md (+2 -2: 2 additions, 2 deletions)
@@ -1,6 +1,6 @@
# OpenArmature — Agent documentation
-
*This is the agent guide bundled with the openarmature Python package, version 0.14.0 (spec v0.64.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
+
*This is the agent guide bundled with the openarmature Python package, version 0.14.0 (spec v0.65.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
## TL;DR
@@ -10,7 +10,7 @@ OpenArmature is a workflow framework for LLM pipelines and tool-calling agents:
## Capability contracts
-
_Sourced from openarmature-spec v0.64.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._
+
_Sourced from openarmature-spec v0.65.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._