Merged
Changes from 1 commit
3 changes: 2 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,8 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
- **Failure-isolation events report the originating cause's category at non-node placements** (proposal 0065, pipeline-utilities §6.3). When `FailureIsolationMiddleware` runs as instance middleware (§9.7), branch middleware (§11.7), or parent-node middleware on a fan-out / parallel-branches node, the graph engine has already wrapped the originating error as a `node_exception` carrier before the middleware catches it. `FailureIsolatedEvent.caught_exception.category` now resolves through that carrier (and any nested carriers) to the nearest categorized originating cause and reports its category instead of the masking `node_exception`, so the reported category agrees with what the §6.1 retry classifier acted on. For example, an instance whose retries exhaust on `provider_unavailable` now surfaces `provider_unavailable` rather than `node_exception`. The `message` tracks the resolved cause for category/message coherence. Node-level placement was already faithful and is unchanged, and catch/degrade behavior is unchanged at every site (only the event's reported cause changes). The wrapped-instance/branch lineage SHOULD (`fan_out_index` / `branch_name`) is deferred to a follow-up, since it needs the engine to surface per-instance identity to the wrapping-site middleware. `conformance.toml` marks proposal 0065 `implemented`, and conformance fixture 064 (three cases: the §9.7 instance and §11.7 branch sites plus an uncategorized cause) passes.
- **Observer privacy flag `disable_llm_payload` renamed to `disable_provider_payload`** (proposal 0059, observability §5.5.4, spec v0.54.0). The observer-level flag on both bundled observers (`OTelObserver` and `LangfuseObserver`) is renamed, and its scope broadens from LLM-completion payload to any provider-call payload (LLM completion today; embedding and rerank when those land). This is a breaking change to both observer constructors: config passing `disable_llm_payload=True` (or `False`) updates to `disable_provider_payload=...` with no other change. The default stays `True` (payload suppressed), and the gating behavior for `LlmCompletionEvent` / `LlmFailedEvent` rendering is unchanged at every existing site. The rename is the only part of proposal 0059 adopted this cycle: the retrieval-provider capability itself (the `EmbeddingProvider` protocol, the `EmbeddingEvent` / `EmbeddingFailedEvent` typed variants, and the embedding span / observation mapping) is not yet implemented and rides as `not-yet` in `conformance.toml`. The §5.5.4 rename touches existing LLM-payload gating, so it lands with the pin.
- **Fan-out failure-isolation degrade contribution implemented** (proposal 0066, pipeline-utilities §9.3 / §9.8 / §11.7, spec v0.56.0). When `FailureIsolationMiddleware` degrades a fan-out instance, that instance is a success whose contribution is its `degraded_update`, read in subgraph-field-name space and never merged onto the failed instance's pre-failure state. This also fixes a latent bug: an instance `degraded_update`'s `extra_outputs` values were previously looked up by the parent field name and silently dropped (`collect_field` was unaffected). A static `degraded_update` that omits the node's `collect_field` is now a compile-time error (`FanOutDegradedUpdateMissingCollectField`); a callable `degraded_update` that omits it yields a graceful null slot rather than raising, preserving one collection slot per item. The parallel-branches counterpart (a branch `degraded_update` omitting a projected `outputs` field skips that field) was already correct as of the parallel-branches fix above and is now pinned by fixture 065. Success-path and resume behavior for correctly-configured fan-outs is unchanged.
- **Pinned spec advances v0.53.0 → v0.56.0 across the v0.14.0 cycle**, in three steps: v0.54.0 (proposal 0059, the observer-flag rename above), v0.55.1 (proposal 0065 above; the v0.55.1 patch also carries an observability §11 span-links text reconciliation that narrows an *Out of scope* bullet, with no python-observable change), and v0.56.0 (proposal 0066, the fan-out degrade contribution above). `conformance.toml` records 0065 and 0066 as `implemented` and 0059 as `not-yet` (only its cross-spec flag rename was adopted).
- **Failure-isolation events carry the full structured cause chain** (proposal 0068, pipeline-utilities §6.3, spec v0.57.0). `FailureIsolatedEvent.caught_exception` gains a `chain`: an ordered list of `CauseLink` records (each carrying `category`, `message`, and a `carrier` flag), from the caught exception (outermost) to the originating raise (innermost), with graph-engine `node_exception` carrier wrappers flagged `carrier=True`. The existing `category` and `message` are retained and redefined as a derivation over the chain: the category of the outermost non-carrier link whose category is a non-empty string (else `category` is `null` and `message` is the outermost non-carrier link's message). This supersedes proposal 0065's single "originating cause" representation, which was ambiguous once the post-carrier chain held more than one non-carrier link; the derivation reproduces 0065's single-carrier values, so fixture 064 is unchanged. A new `CauseLink` type is exported from `openarmature.graph`. The bundled OTel and Langfuse observers continue to render the derived `category`; surfacing the full chain is left to custom observers. The change is additive to the event shape, and catch/degrade behavior is unchanged. Conformance fixture 066 (three cases: an instance-site carrier chain, a node-level single non-carrier link, and an uncategorized null-category cause) passes.
- **Pinned spec advances v0.53.0 → v0.57.0 across the v0.14.0 cycle**, in four steps: v0.54.0 (proposal 0059, the observer-flag rename above), v0.55.1 (proposal 0065 above; the v0.55.1 patch also carries an observability §11 span-links text reconciliation that narrows an *Out of scope* bullet, with no python-observable change), v0.56.0 (proposal 0066, the fan-out degrade contribution above), and v0.57.0 (proposal 0068, the failure-isolation cause chain above). `conformance.toml` records 0065, 0066, and 0068 as `implemented` and 0059 as `not-yet` (only its cross-spec flag rename was adopted).

### Fixed

Expand Down
15 changes: 14 additions & 1 deletion conformance.toml
Expand Up @@ -32,7 +32,7 @@

[manifest]
implementation = "openarmature-python"
spec_pin = "v0.56.0"
spec_pin = "v0.57.0"

# Status values:
# implemented — shipped behavior matches the proposal's contract
Expand Down Expand Up @@ -635,3 +635,16 @@ since = "0.14.0"
[proposals."0066"]
status = "implemented"
since = "0.14.0"

# Spec v0.57.0 (proposal 0068). Failure-isolation event structured cause
# chain (pipeline-utilities §6.3). ``caught_exception`` gains a ``chain`` of
# cause links (``{category, message, carrier}``, outermost->innermost), with
# graph-engine §4 ``node_exception`` carriers flagged. The existing
# ``category`` / ``message`` are redefined as a derivation over the chain (the
# outermost non-carrier link carrying a category), superseding 0065's single
# "originating cause" prose; the derivation reproduces 0065's values, so
# fixture 064 is unchanged. Fixture 066 (three cases: instance-site carrier
# chain, node-level single link, uncategorized null category) passes.
[proposals."0068"]
status = "implemented"
since = "0.14.0"
9 changes: 6 additions & 3 deletions docs/concepts/middleware.md
Expand Up @@ -256,9 +256,12 @@ Like `RetryMiddleware`, it catches `Exception` only; `BaseException`
On a catch, the middleware dispatches a `FailureIsolatedEvent` onto the
observer stream. It is a distinct event variant, not a node event: it
carries the `event_name`, the wrapped node's lineage identity, the input
and degraded states, and a `CaughtException` record holding the
exception's `category` (when it has one) and message. Observers narrow
on it with `isinstance(event, FailureIsolatedEvent)`. The bundled OTel
and degraded states, and a `CaughtException` record. That record holds a
derived `category` (when the cause has one) and `message` for simple
consumers, plus a `chain` of cause links (`CauseLink`) from the caught
exception down to the originating raise, with graph-engine carrier
wrappers flagged so a consumer can skip them. Observers narrow on it with
`isinstance(event, FailureIsolatedEvent)`. The bundled OTel
and Langfuse observers render it as a marker span / observation so the
catch shows up alongside the node's own span. The default emission path
is the observer stream only, with no logging-library dependency;
Expand Down
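The paragraph in this hunk describes how an observer narrows on the event and skips carrier wrappers. A minimal sketch of that consumer side follows, using stand-in dataclasses that mirror the shapes in this PR so it runs without openarmature installed. The `describe` helper, the event name, and the sample messages are all hypothetical, and the stand-in `FailureIsolatedEvent` carries only the two fields the sketch reads.

```python
from __future__ import annotations

from dataclasses import dataclass


# Stand-ins for the types this PR exports from openarmature.graph.
@dataclass(frozen=True)
class CauseLink:
    category: str | None
    message: str
    carrier: bool


@dataclass(frozen=True)
class CaughtException:
    category: str | None
    message: str
    chain: tuple[CauseLink, ...]


@dataclass(frozen=True)
class FailureIsolatedEvent:
    # Only the fields this sketch reads; the real event carries more.
    event_name: str
    caught_exception: CaughtException


def describe(event: object) -> str | None:
    """Hypothetical observer helper: summarize a failure-isolation catch."""
    if not isinstance(event, FailureIsolatedEvent):
        return None  # some other event variant; ignore it
    caught = event.caught_exception
    # Drop engine-applied carrier wrappers, keep the ordinary exceptions.
    causes = [link for link in caught.chain if not link.carrier]
    trail = " <- ".join(link.category or "uncategorized" for link in causes)
    return f"{event.event_name}: {caught.category} ({trail})"


# Illustrative values only: one carrier wrapping one categorized cause.
event = FailureIsolatedEvent(
    event_name="fetch_degraded",
    caught_exception=CaughtException(
        category="provider_unavailable",
        message="upstream down",
        chain=(
            CauseLink("node_exception", "node failed", carrier=True),
            CauseLink("provider_unavailable", "upstream down", carrier=False),
        ),
    ),
)
summary = describe(event)
```

Simple consumers can keep reading `caught_exception.category` alone. The chain only matters when the catch needs to be grouped or displayed by its full provenance.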
2 changes: 1 addition & 1 deletion pyproject.toml
Expand Up @@ -63,7 +63,7 @@ Specification = "https://github.com/LunarCommand/openarmature-spec"
openarmature = "openarmature.cli:main"

[tool.openarmature]
spec_version = "0.56.0"
spec_version = "0.57.0"

[dependency-groups]
dev = [
Expand Down
2 changes: 1 addition & 1 deletion src/openarmature/__init__.py
Expand Up @@ -25,7 +25,7 @@
"""

__version__ = "0.13.0"
__spec_version__ = "0.56.0"
__spec_version__ = "0.57.0"
# Proposal 0052 (spec observability §5.1 / §8.4.1): canonical
# package-registry name for this implementation. Surfaces on every
# OTel invocation span as ``openarmature.implementation.name`` and on
Expand Down
2 changes: 2 additions & 0 deletions src/openarmature/graph/__init__.py
Expand Up @@ -38,6 +38,7 @@
)
from .events import (
CaughtException,
CauseLink,
FailureIsolatedEvent,
InvocationCompletedEvent,
InvocationStartedEvent,
Expand Down Expand Up @@ -71,6 +72,7 @@
__all__ = [
"END",
"CaughtException",
"CauseLink",
"CompileError",
"CompiledGraph",
"ConditionalEdge",
Expand Down
46 changes: 40 additions & 6 deletions src/openarmature/graph/events.py
Expand Up @@ -659,20 +659,52 @@ class LlmFailedEvent:
caller_invocation_metadata: Mapping[str, AttributeValue] | None = None


# Spec: pipeline-utilities §6.3 cause chain (proposal 0068). A ``carrier``
# link is a graph-engine §4 ``node_exception`` wrapper the engine applies at a
# non-node placement (§9.7 instance / §11.7 branch / §9.6 / §11.6 parent-node
# middleware); consumers grouping by the originating failure skip carriers via
# the flag.
@dataclass(frozen=True)
class CauseLink:
"""One link in a caught exception's resolved cause chain.

- ``category``: the link's failure category when it carries one (a
string), else ``None``.
- ``message``: the link's own message (the ``str`` of the exception).
- ``carrier``: ``True`` when the link is an engine-applied
``node_exception`` carrier wrapper, ``False`` for an ordinary
(non-carrier) exception.
"""

category: str | None
message: str
carrier: bool


# Spec: pipeline-utilities §6.3 (proposals 0050, 0065, 0068). ``chain`` is the
# full ordered cause chain; ``category`` / ``message`` are a derivation over it
# — the outermost non-carrier link whose category is a non-empty string (else
# ``None`` and the outermost non-carrier link's message). The derivation
# reproduces 0065's single-value results; the chain adds the full provenance.
@dataclass(frozen=True)
class CaughtException:
"""Structured record of an exception caught by
``FailureIsolationMiddleware``.

- ``category``: the exception's failure category when it carries
one (e.g. an llm-provider error's ``category`` attribute), else
``None`` for a bare exception that carries no category.
- ``message``: the human-readable exception message (``str(exc)``);
the empty string when the exception carried no message.
- ``category``: the caught failure's category (the derived single
value for simple consumers), or ``None`` when no non-carrier link
in the chain carries a category.
- ``message``: the message of the link ``category`` is derived from,
or (when no link carries a category) of the outermost non-carrier
link.
- ``chain``: the ordered cause chain, outermost (the caught
exception, index 0) to innermost (the originating raise), one
:class:`CauseLink` per exception.
"""

category: str | None
message: str
chain: tuple[CauseLink, ...]


# Spec: realizes pipeline-utilities §6.3 failure-isolation middleware
Expand Down Expand Up @@ -706,7 +738,8 @@ class FailureIsolatedEvent:
- ``post_state``: the degraded partial update the middleware
returned in place of the node's output.
- ``caught_exception``: a :class:`CaughtException` record of the
caught exception (category + message).
caught exception (its derived category / message and the full
cause ``chain``).
"""

event_name: str
Expand All @@ -721,6 +754,7 @@ class FailureIsolatedEvent:

__all__ = [
"CaughtException",
"CauseLink",
"FailureIsolatedEvent",
"FanOutEventConfig",
"InvocationCompletedEvent",
Expand Down
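The comments in this hunk define `category` / `message` as a derivation over `chain`, but the diff shown here does not include the code that computes it. A minimal sketch of that rule as worded in the comments follows, with a stand-in `CauseLink`. The `derive_category_message` name is hypothetical, and the fallback for a chain with no non-carrier link at all is an assumption of this sketch, since the text above does not specify that case.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CauseLink:
    # Stand-in mirroring the dataclass added in this hunk.
    category: str | None
    message: str
    carrier: bool


def derive_category_message(chain: tuple[CauseLink, ...]) -> tuple[str | None, str]:
    """Hypothetical helper: derive (category, message) from an ordered chain.

    The chain runs outermost (index 0) to innermost, as documented above.
    """
    non_carriers = [link for link in chain if not link.carrier]
    # Rule 1: outermost non-carrier link whose category is a non-empty string.
    for link in non_carriers:
        if isinstance(link.category, str) and link.category:
            return link.category, link.message
    # Rule 2: no categorized link, so use the outermost non-carrier's message.
    if non_carriers:
        return None, non_carriers[0].message
    # Assumption of this sketch: a chain with no non-carrier link yields an
    # empty message. The source text does not cover this case.
    return None, ""


# Illustrative chains only.
instance_site = (
    CauseLink("node_exception", "node failed", carrier=True),
    CauseLink("provider_unavailable", "upstream down", carrier=False),
)
uncategorized = (CauseLink(None, "boom", carrier=False),)

derived_instance = derive_category_message(instance_site)
derived_uncategorized = derive_category_message(uncategorized)
```

With a single carrier over a single categorized cause, the result matches what the changelog describes for proposal 0065 (the carrier's `node_exception` is skipped), which is consistent with fixture 064 staying unchanged.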