|
14 | 14 | (``asyncio.CancelledError``, ``KeyboardInterrupt``) propagates so |
15 | 15 | cancellation works as expected — the same rule as ``RetryMiddleware``. |
16 | 16 |
|
17 | | -On a caught exception the middleware: |
| 17 | +On a caught exception the middleware first resolves ``degraded_update`` |
| 18 | +(a static mapping, or a callable taking the pre-call state; invoked |
| 19 | +once, at catch time, which is also what populates the dispatched |
| 20 | +event's ``post_state``), then in order: |
18 | 21 |
|
19 | 22 | 1. Dispatches a ``FailureIsolatedEvent`` onto the engine's serial |
20 | 23 | observer-delivery queue (a framework-emitted event; the bundled |
21 | 24 | OTel and Langfuse observers render the catch). The default emission |
22 | 25 | path is the observer event, with no logging-library dependency. |
23 | 26 | 2. Awaits the optional ``on_caught`` hook. |
24 | | -3. Resolves ``degraded_update`` (static mapping or callable taking the |
25 | | - pre-call state) and returns it as the node's partial update. |
| 27 | +3. Returns the resolved degraded update as the node's partial update. |
26 | 28 |
|
27 | 29 | Composition with ``RetryMiddleware``: failure isolation MUST be the |
28 | 30 | OUTER middleware (it only sees what escapes retry); retry MUST be INNER |
@@ -154,6 +156,17 @@ def _emit_event(self, state: Any, exc: Exception, degraded: Mapping[str, Any]) - |
154 | 156 | cause = getattr(exc, "__cause__", None) |
155 | 157 | cause_category = getattr(cause, "category", None) if cause is not None else None |
156 | 158 | category = cause_category if isinstance(cause_category, str) else None |
| 159 | + # ``attempt_index`` here is deliberately the NODE-level baseline, |
| 160 | + # not a per-attempt wire index: failure isolation is a node-level |
| 161 | + # concern ("the node, across its retries, was isolated"). When |
| 162 | + # this middleware is OUTER of RetryMiddleware, retry has already |
| 163 | + # reset the attempt ContextVar to that baseline (0) in its |
| 164 | + # ``finally`` by the time the terminal exception reaches this |
| 165 | + # catch, which is the frame we want (spec-confirmed). Parenting is |
| 166 | + # unaffected: the node's attempt spans are already closed by |
| 167 | + # delivery time (their completed event precedes this one on the |
| 168 | + # serial queue), so observers parent the marker under the |
| 169 | + # invocation span and correlate by ``namespace`` + node name. |
157 | 170 | dispatch( |
158 | 171 | FailureIsolatedEvent( |
159 | 172 | event_name=self.event_name, |
|
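The catch-path ordering described in the revised docstring (resolve ``degraded_update`` once, dispatch the event, await ``on_caught``, return the update) can be sketched as follows. This is a minimal illustration under stated assumptions, not the middleware's actual code: ``isolate_failure``, the dict-shaped event, and the way ``post_state`` is built by merging the degraded update into the pre-call state are all hypothetical stand-ins.

```python
import asyncio
from typing import Any, Awaitable, Callable, Mapping, Optional, Union

State = Mapping[str, Any]
DegradedUpdate = Union[Mapping[str, Any], Callable[[State], Mapping[str, Any]]]


async def isolate_failure(
    node: Callable[[State], Awaitable[Mapping[str, Any]]],
    state: State,
    degraded_update: DegradedUpdate,
    dispatch: Callable[[dict], None],  # hypothetical stand-in for the engine's dispatch
    on_caught: Optional[Callable[[Exception], Awaitable[None]]] = None,
) -> Mapping[str, Any]:
    try:
        return await node(state)
    except Exception as exc:
        # asyncio.CancelledError and KeyboardInterrupt derive from
        # BaseException, so they are not caught here and propagate.
        # Resolve exactly once, at catch time, against the pre-call state.
        degraded = degraded_update(state) if callable(degraded_update) else degraded_update
        # 1. Dispatch the event (a plain dict here; the merged post_state
        #    shape is an assumption made for this sketch).
        dispatch({"error": repr(exc), "post_state": {**state, **degraded}})
        # 2. Await the optional hook.
        if on_caught is not None:
            await on_caught(exc)
        # 3. Return the already-resolved update as the node's partial update.
        return degraded
```

Resolving before dispatch means the event and the returned update come from the same single invocation of a callable ``degraded_update``, so the two cannot disagree.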
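The ``attempt_index`` comment relies on an inner retry layer restoring its ContextVar in a ``finally`` block before the terminal exception reaches the outer catch. The sketch below demonstrates that ordering with the standard ``contextvars`` module only. The names ``attempt_index``, ``with_retry``, and ``with_isolation`` are hypothetical, and the real ``RetryMiddleware`` may manage its attempt counter differently.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical attempt counter; the baseline (default) value is 0.
attempt_index = ContextVar("attempt_index", default=0)


async def with_retry(call, max_attempts=3):
    """Inner layer: sets the attempt index per try, restores it on exit."""
    token = attempt_index.set(0)
    try:
        for i in range(max_attempts):
            attempt_index.set(i)
            try:
                return await call()
            except Exception:
                if i == max_attempts - 1:
                    raise  # terminal failure escapes to the outer layer
    finally:
        # Runs before the exception reaches any outer handler, so the
        # outer layer observes the baseline rather than the last attempt.
        attempt_index.reset(token)


async def with_isolation(call):
    """Outer layer: records the attempt index it sees at catch time."""
    try:
        return await with_retry(call)
    except Exception:
        return {"seen_attempt_index": attempt_index.get()}
```

With a call that always raises, the inner layer sees indices 0, 1, and 2 across its attempts, while the outer catch reads the baseline 0.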