Surface failure-isolation cause in the fan-out-with-retry example #160

Merged
chris-colinsky merged 1 commit into main from chore/fan-out-isolation-event-surfacing on Jun 15, 2026

@chris-colinsky
Member

Summary

The fan-out-with-retry example already composes FailureIsolationMiddleware outside RetryMiddleware at a fan-out instance, but in degrade mode it showed only the degraded state (the (unavailable) placeholder) and never the FailureIsolatedEvent itself. This PR adds a failure_isolation_observer that captures those events and, in degrade mode, prints one line per degraded instance.

What it surfaces

Failure-isolation events (1):
  event='headline_degraded'  cause=provider_unavailable  attempt_index=2

cause is the resolved originating category, provider_unavailable, rather than the node_exception the engine wraps the failure in before isolation catches it at a non-node placement. attempt_index is the index of the final, exhausting attempt. This gives the example a runnable demonstration of the cause-fidelity telemetry, which previously had no example coverage.
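
One way to picture the distinction between the wrapper category and the resolved cause is a walk down the exception's cause chain. The sketch below is an assumption about the general technique, not the library's resolution logic, which this PR does not show; CategorizedError, its subclasses, and resolve_category are hypothetical names.

```python
class CategorizedError(Exception):
    """Hypothetical base class: an exception tagged with a failure category."""

    category = "unknown"


class ProviderUnavailable(CategorizedError):
    category = "provider_unavailable"


class NodeException(CategorizedError):
    category = "node_exception"


def resolve_category(exc: BaseException) -> str:
    """Return the category of the innermost categorized exception in the cause chain."""
    resolved = getattr(exc, "category", "unknown")
    current = exc.__cause__
    while current is not None:
        if isinstance(current, CategorizedError):
            resolved = current.category
        current = current.__cause__
    return resolved


def failing_node() -> None:
    try:
        raise ProviderUnavailable("upstream returned 503")
    except ProviderUnavailable as original:
        # An engine-style wrapper that keeps the original as __cause__.
        raise NodeException("node failed") from original


try:
    failing_node()
except NodeException as wrapped:
    outer_category = wrapped.category  # what the wrapper alone reports
    resolved_category = resolve_category(wrapped)  # what actually failed
```

Here outer_category is the wrapper's label, while resolved_category names the originating failure, which is the value the telemetry line reports as cause.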

Changes

  • examples/fan-out-with-retry/main.py: import FailureIsolatedEvent; add _isolated and failure_isolation_observer (mirroring the existing _timings / fan_out_config_observer patterns); attach the observer in main(); print the block in the render path; add a "what it teaches" docstring bullet and a MODE=degrade line under "Run with".
  • docs/examples/fan-out-with-retry.md: add a "What it teaches" bullet and the degrade-output block with an explanation.
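
The render-path change might look something like the sketch below. The function name render_isolation_block and the SimpleNamespace stand-in events are hypothetical; only the output format is taken from the block shown under "What it surfaces".

```python
from types import SimpleNamespace


def render_isolation_block(events) -> list[str]:
    """Build the degrade-mode block; an empty list means nothing is printed."""
    if not events:
        return []
    lines = [f"Failure-isolation events ({len(events)}):"]
    for ev in events:
        lines.append(
            f"  event={ev.event_name!r}  "
            f"cause={ev.caught_exception.category}  "
            f"attempt_index={ev.attempt_index}"
        )
    return lines


# Hypothetical captured event, shaped like the fields the PR prints.
captured = [
    SimpleNamespace(
        event_name="headline_degraded",
        caught_exception=SimpleNamespace(category="provider_unavailable"),
        attempt_index=2,
    )
]
block = render_isolation_block(captured)
print("\n".join(block))
```

Returning no lines for an empty capture list matches the validated behavior that the fail_fast path prints no block.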

Validation

Verified against a live endpoint in both modes: fail_fast (default path unregressed, no block) and MODE=degrade (the block renders cause=provider_unavailable, attempt_index=2). ruff / pyright / mkdocs clean. Examples-only, no library or spec change.

The fan-out-with-retry example already runs FailureIsolation wrapping
Retry at a fan-out instance, but it only showed the degraded state,
never the FailureIsolatedEvent. Add a failure_isolation_observer that
captures the events and, in degrade mode, print each one's event_name,
caught_exception.category, and attempt_index.
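
As a rough picture of what "FailureIsolation wrapping Retry" means, the sketch below composes two plain function wrappers. It does not use the library's middleware classes; with_retry, with_isolation, and flaky_headline are hypothetical, and the choice of three zero-indexed attempts is an assumption made only so the last attempt index comes out as 2.

```python
from typing import Callable


class ProviderUnavailable(Exception):
    """Hypothetical error raised by the simulated provider call."""


def with_retry(call: Callable[[int], str], max_attempts: int) -> Callable[[], str]:
    """Inner layer: retry the call, re-raising once attempts are exhausted."""

    def run() -> str:
        for attempt_index in range(max_attempts):
            try:
                return call(attempt_index)
            except ProviderUnavailable:
                if attempt_index == max_attempts - 1:
                    raise
        raise RuntimeError("max_attempts must be at least 1")

    return run


def with_isolation(
    run: Callable[[], str], placeholder: str, events: list[str]
) -> Callable[[], str]:
    """Outer layer: catch the exhausted failure, record it, and degrade."""

    def isolated() -> str:
        try:
            return run()
        except ProviderUnavailable as exc:
            events.append(type(exc).__name__)
            return placeholder

    return isolated


attempts_seen: list[int] = []


def flaky_headline(attempt_index: int) -> str:
    attempts_seen.append(attempt_index)
    raise ProviderUnavailable("simulated outage")


events: list[str] = []
fetch = with_isolation(
    with_retry(flaky_headline, max_attempts=3), "(unavailable)", events
)
result = fetch()
```

Because isolation sits outside retry, it sees only the failure that survives all retries, so one degraded instance yields one recorded event rather than one per attempt.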

The category is the point: at an instance placement it resolves to the
originating cause (provider_unavailable) rather than the node_exception
the engine wrapped it in, so the demo now shows the failure telemetry
naming what actually failed.

Update the example's docstring and doc page to describe the new block,
and add a MODE=degrade line to the Run with section.

Copilot AI review requested due to automatic review settings, June 15, 2026 15:47

Copilot AI left a comment

Pull request overview

Updates the fan-out-with-retry example to surface why instances were degraded by FailureIsolationMiddleware (via FailureIsolatedEvent.caught_exception.category), so the example demonstrates cause-fidelity telemetry in MODE=degrade.

Changes:

  • Add a failure_isolation_observer to capture FailureIsolatedEvent instances and print a “Failure-isolation events” block when present.
  • Update the example’s docstring/run instructions to include MODE=degrade.
  • Extend the docs page to explain and show the new degrade-mode output block.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| examples/fan-out-with-retry/main.py | Captures and prints FailureIsolatedEvent details (event name, resolved cause category, attempt index) in degrade mode. |
| docs/examples/fan-out-with-retry.md | Documents what the example teaches and adds degrade-mode output demonstrating cause-fidelity. |

@chris-colinsky chris-colinsky merged commit 44b38f9 into main Jun 15, 2026
7 checks passed
@chris-colinsky chris-colinsky deleted the chore/fan-out-isolation-event-surfacing branch June 15, 2026 15:50