Commit fd5b18b

Clarify failure-isolation docs from PR review
Two doc-only changes from Copilot review of PR #149, no behavior change:

- The module docstring now states that degraded_update is resolved once at catch time (which populates the event's post_state), with the numbered steps covering only the observable order after that.
- A comment at the FailureIsolatedEvent dispatch documents that attempt_index is the intentional node-level baseline (not a per-attempt index) and that span parenting is unaffected.
1 parent 4a6dcee commit fd5b18b

1 file changed

Lines changed: 16 additions & 3 deletions

src/openarmature/graph/middleware/failure_isolation.py

@@ -14,15 +14,17 @@
 (``asyncio.CancelledError``, ``KeyboardInterrupt``) propagates so
 cancellation works as expected — the same rule as ``RetryMiddleware``.

-On a caught exception the middleware:
+On a caught exception the middleware first resolves ``degraded_update``
+(a static mapping, or a callable taking the pre-call state; invoked
+once, at catch time, which is also what populates the dispatched
+event's ``post_state``), then in order:

 1. Dispatches a ``FailureIsolatedEvent`` onto the engine's serial
 observer-delivery queue (a framework-emitted event; the bundled
 OTel and Langfuse observers render the catch). The default emission
 path is the observer event, with no logging-library dependency.
 2. Awaits the optional ``on_caught`` hook.
-3. Resolves ``degraded_update`` (static mapping or callable taking the
-pre-call state) and returns it as the node's partial update.
+3. Returns the resolved degraded update as the node's partial update.

 Composition with ``RetryMiddleware``: failure isolation MUST be the
 OUTER middleware (it only sees what escapes retry); retry MUST be INNER
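The order described in the revised docstring can be sketched roughly as follows. This is a minimal illustration, not the code in failure_isolation.py: the `isolate` function, its parameters, the dict-shaped event, and the way `post_state` is built by merging the update into the pre-call state are all assumptions made for the example.

```python
import asyncio
from typing import Any, Awaitable, Callable, Mapping, Optional, Union

State = Mapping[str, Any]
Update = Mapping[str, Any]
DegradedUpdate = Union[Update, Callable[[State], Update]]


async def isolate(
    call_next: Callable[[State], Awaitable[Update]],
    state: State,
    degraded_update: DegradedUpdate,
    dispatch: Callable[[dict], None],
    on_caught: Optional[Callable[[Exception], Awaitable[None]]] = None,
) -> Update:
    """Hypothetical stand-in for the middleware's catch path."""
    try:
        return await call_next(state)
    except Exception as exc:
        # CancelledError and KeyboardInterrupt derive from BaseException,
        # so they are not caught here and propagate.
        # Resolve once, at catch time, from the pre-call state.
        degraded = degraded_update(state) if callable(degraded_update) else degraded_update
        # 1. Dispatch the event; post_state is assumed to be state + update.
        dispatch({"error": repr(exc), "post_state": {**state, **degraded}})
        # 2. Await the optional hook.
        if on_caught is not None:
            await on_caught(exc)
        # 3. Return the already-resolved update as the node's partial update.
        return degraded


trace: list = []


async def failing_node(state: State) -> Update:
    raise ValueError("boom")


def degraded_from_state(state: State) -> Update:
    trace.append(("resolve", state))
    return {"answer": None, "degraded": True}


def record_event(event: dict) -> None:
    trace.append(("event", event))


async def record_hook(exc: Exception) -> None:
    trace.append(("hook", exc))


result = asyncio.run(
    isolate(failing_node, {"question": "q"}, degraded_from_state, record_event, record_hook)
)
```

In this sketch the callable runs exactly once, before the event is dispatched, so the event and the returned update cannot disagree.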
@@ -154,6 +156,17 @@ def _emit_event(self, state: Any, exc: Exception, degraded: Mapping[str, Any]) -
         cause = getattr(exc, "__cause__", None)
         cause_category = getattr(cause, "category", None) if cause is not None else None
         category = cause_category if isinstance(cause_category, str) else None
+        # ``attempt_index`` here is deliberately the NODE-level baseline,
+        # not a per-attempt wire index: failure isolation is a node-level
+        # concern ("the node, across its retries, was isolated"). When
+        # this middleware is OUTER of RetryMiddleware, retry has already
+        # reset the attempt ContextVar to that baseline (0) in its
+        # ``finally`` by the time the terminal exception reaches this
+        # catch, which is the frame we want (spec-confirmed). Parenting is
+        # unaffected: the node's attempt spans are already closed by
+        # delivery time (their completed event precedes this one on the
+        # serial queue), so observers parent the marker under the
+        # invocation span and correlate by ``namespace`` + node name.
         dispatch(
             FailureIsolatedEvent(
                 event_name=self.event_name,
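The baseline behavior that the new comment describes can be illustrated with a small standalone sketch. It assumes a ContextVar named `attempt_index` and a simplified `retry` helper, both hypothetical; the real RetryMiddleware may be structured differently. The point is only that a reset in `finally` runs before the terminal exception reaches an outer handler.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical stand-in for the engine's attempt ContextVar; baseline is 0.
attempt_index = ContextVar("attempt_index", default=0)


async def retry(call, max_attempts):
    """Simplified inner retry: sets the per-attempt index, resets in finally."""
    baseline = attempt_index.get()
    try:
        for i in range(max_attempts):
            attempt_index.set(i)
            try:
                return await call()
            except Exception:
                if i == max_attempts - 1:
                    raise  # terminal failure escapes to the outer layer
    finally:
        # Runs before the exception reaches any outer handler.
        attempt_index.set(baseline)


seen_inside = []


async def always_fails():
    seen_inside.append(attempt_index.get())
    raise RuntimeError("always fails")


async def outer():
    """Plays the role of the OUTER failure-isolation catch."""
    try:
        await retry(always_fails, 3)
    except Exception:
        return attempt_index.get()


seen_by_outer = asyncio.run(outer())
```

Here the node sees indices 0, 1, and 2 across its attempts, while the outer handler reads the baseline, which matches the node-level framing in the comment.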
