Surface failure-isolation cause in the fan-out-with-retry example #160
Merged
The fan-out-with-retry example already runs FailureIsolation wrapping Retry at a fan-out instance, but it only showed the degraded state, never the FailureIsolatedEvent. Add a failure_isolation_observer that captures the events and, in degrade mode, print each one's event_name, caught_exception.category, and attempt_index. The category is the point: at an instance placement it resolves to the originating cause (provider_unavailable) rather than the node_exception the engine wrapped it in, so the demo now shows the failure telemetry naming what actually failed. Update the example's docstring and doc page to describe the new block, and add a MODE=degrade line to the Run with section.
Pull request overview
Updates the fan-out-with-retry example to surface why instances were degraded by FailureIsolationMiddleware (via FailureIsolatedEvent.caught_exception.category), so the example demonstrates cause-fidelity telemetry in MODE=degrade.
Changes:
- Add a failure_isolation_observer to capture FailureIsolatedEvent instances and print a “Failure-isolation events” block when present.
- Update the example’s docstring/run instructions to include MODE=degrade.
- Extend the docs page to explain and show the new degrade-mode output block.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| examples/fan-out-with-retry/main.py | Captures and prints FailureIsolatedEvent details (event name, resolved cause category, attempt index) in degrade mode. |
| docs/examples/fan-out-with-retry.md | Documents what the example teaches and adds degrade-mode output demonstrating cause-fidelity. |
Summary
The fan-out-with-retry example already composes FailureIsolationMiddleware OUTER of RetryMiddleware at a fan-out instance, but in degrade mode it only showed the degraded state (the (unavailable) placeholder) and never the FailureIsolatedEvent itself. This adds a failure_isolation_observer that captures the events, and in degrade mode prints one line per degraded instance.
What it surfaces
cause is the resolved originating category, provider_unavailable, rather than the node_exception the engine wraps the failure in before isolation catches it at a non-node placement. attempt_index is the final exhausting attempt. This gives the example a runnable demonstration of the cause-fidelity telemetry, which previously had no example coverage.
Changes
- examples/fan-out-with-retry/main.py: import FailureIsolatedEvent; add _isolated + failure_isolation_observer (mirroring the existing _timings/fan_out_config_observer patterns); attach it in main(); print the block in the render path; a docstring "what it teaches" bullet; a MODE=degrade line in "Run with".
- docs/examples/fan-out-with-retry.md: a "What it teaches" bullet and the degrade-output block with explanation.
Validation
Verified against a live endpoint in both modes: fail_fast (default path unregressed, no block) and MODE=degrade (the block renders cause=provider_unavailable, attempt_index=2). ruff / pyright / mkdocs clean. Examples-only, no library or spec change.
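To illustrate what resolving to the originating category means, here is one way such a resolution could work. It is a sketch under loud assumptions: the exception classes, the category attribute on them, and the use of Python's `__cause__` chaining are all hypothetical choices made for this illustration, it does not model placement (node vs. instance), and the library's real mechanism may be entirely different.

```python
class CategorizedError(Exception):
    """Hypothetical base class carrying a telemetry category."""

    category = "unknown"


class ProviderUnavailable(CategorizedError):
    category = "provider_unavailable"


class NodeException(CategorizedError):
    """Hypothetical stand-in for the wrapper the engine raises around a failure."""

    category = "node_exception"


def resolve_category(exc: BaseException) -> str:
    """Follow __cause__ links and return the innermost categorized error's category."""
    resolved = getattr(exc, "category", "unknown")
    seen: set[int] = set()  # guard against a cyclic cause chain
    current: BaseException | None = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        if isinstance(current, CategorizedError):
            resolved = current.category
        current = current.__cause__
    return resolved


def run_node() -> None:
    """Simulate a provider failure that gets wrapped on its way out of a node."""
    try:
        raise ProviderUnavailable("upstream returned 503")
    except ProviderUnavailable as err:
        raise NodeException("node failed") from err


if __name__ == "__main__":
    try:
        run_node()
    except NodeException as exc:
        # The wrapper's own category versus the resolved originating cause.
        print("wrapper:", exc.category)
        print("resolved:", resolve_category(exc))
```

The point of the sketch is the contrast: reading the category off the caught wrapper reports that a node failed, while walking to the root of the chain reports what actually failed.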