| 1 | +# feat(plugins): LlmResiliencePlugin – configurable retries/backoff and model fallbacks |
| 2 | + |
| 3 | +## Motivation |
| 4 | +Production agents need first-class resilience to transient LLM/API failures (timeouts, 429/5xx). Today, retry/fallback logic is ad-hoc and duplicated across projects. This PR introduces a plugin-based, opt-in resilience layer for LLM calls that aligns with ADK's extensibility philosophy and addresses recurring requests: |
| 5 | + |
| 6 | +- #1214 Add built-in retry mechanism |
| 7 | +- #2561 Retry mechanism gaps for common network errors (httpx…) |
| 8 | +- Discussions: #2292, #3199 on fallbacks and max retries |
| 9 | + |
| 10 | +## Summary |
| 11 | +Adds a new plugin `LlmResiliencePlugin` which intercepts model errors and performs: |
| 12 | +- Configurable retries with exponential backoff + jitter |
| 13 | +- Transient error detection (HTTP 429/500/502/503/504, httpx timeouts/connect errors, asyncio timeouts) |
| 14 | +- Optional model fallbacks (try a sequence of models if the primary model keeps failing after retries) |
| 15 | +- Support for standard `generate_content_async` flows, including SSE streaming, where the stream is consumed until the final response |
| 16 | + |
| 17 | +No core runner changes; this is a pure plugin. Default behavior remains unchanged unless the plugin is configured. |
| 18 | + |
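The transient error detection listed in the summary can be pictured with a minimal sketch. This is not the plugin's code: the function name `is_transient`, the `status_code`/`code` attribute lookup, and the name-based match on httpx exception classes (used here only to avoid importing httpx) are all assumptions made for illustration.

```python
import asyncio

# Status codes the PR description lists as transient.
TRANSIENT_STATUS_CODES = (429, 500, 502, 503, 504)


def is_transient(exc: BaseException) -> bool:
    """Return True if `exc` looks like a retryable, transient failure."""
    # asyncio timeouts (an alias of the builtin TimeoutError on Python 3.11+).
    if isinstance(exc, (asyncio.TimeoutError, TimeoutError)):
        return True
    # Assumption: HTTP-style errors expose their status as `status_code` or `code`.
    for attr in ("status_code", "code"):
        if getattr(exc, attr, None) in TRANSIENT_STATUS_CODES:
            return True
    # Match httpx timeout/connect errors by class name anywhere in the MRO,
    # so this sketch does not need httpx installed.
    for cls in type(exc).__mro__:
        if cls.__module__.startswith("httpx") and cls.__name__ in (
            "TimeoutException",
            "ConnectError",
        ):
            return True
    return False
```

Anything that does not match, such as a `ValueError` raised by bad input, is treated as non-transient and would be left to propagate.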
| 19 | +## Implementation Details |
| 20 | +- File: `src/google/adk/plugins/llm_resilience_plugin.py` |
| 21 | +- Hooks into `on_model_error_callback` to decide whether to handle an error |
| 22 | +- Retries use exponential backoff with jitter (configurable): |
| 23 | + - `max_retries`, `backoff_initial`, `backoff_multiplier`, `max_backoff`, `jitter` |
| 24 | +- Fallbacks use `LLMRegistry.new_llm(model)` to instantiate alternative models on failure |
| 25 | +- Robust handling of provider return types: |
| 26 | + - Async generator (iterates until final non-partial response) |
| 27 | + - Coroutine (some providers may return a single `LlmResponse`) |
| 28 | +- Avoids circular imports by accessing `InvocationContext` through duck typing (also works with the `Context` alias) |
| 29 | +- Maintains clean separation; no modification to runners or flows |
| 30 | + |
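The retry and fallback behavior described above can be sketched roughly as follows. The parameter names `max_retries`, `backoff_initial`, `backoff_multiplier`, `max_backoff`, and `jitter` come from the description; their default values, the interpretation of `jitter` as a fraction of the base delay, the helper names `backoff_delay` and `call_with_resilience`, and the choice to retry fallback models with the same policy are assumptions, not the plugin's confirmed behavior.

```python
import asyncio
import random


def backoff_delay(attempt, *, backoff_initial=1.0, backoff_multiplier=2.0,
                  max_backoff=10.0, jitter=0.2, rng=random.random):
    """Seconds to wait before retrying after failed attempt `attempt` (0-based)."""
    # Exponential growth, capped at max_backoff.
    base = min(backoff_initial * backoff_multiplier ** attempt, max_backoff)
    # Assumption: jitter adds up to `jitter` * base of random extra delay.
    return base + base * jitter * rng()


async def call_with_resilience(call, models, *, max_retries=3,
                               is_transient=lambda exc: True,
                               sleep=asyncio.sleep, **backoff):
    """Try each model in order, retrying transient failures with backoff."""
    if not models:
        raise ValueError("at least one model is required")
    last_exc = None
    for model in models:  # primary first, then fallbacks
        for attempt in range(max_retries + 1):
            try:
                return await call(model)
            except Exception as exc:
                if not is_transient(exc):
                    raise  # non-transient errors propagate unchanged
                last_exc = exc
                if attempt < max_retries:
                    await sleep(backoff_delay(attempt, **backoff))
    # Every model exhausted its retries: surface the last error.
    raise last_exc
```

In this sketch `call` is a hypothetical coroutine function taking a model name. Injecting `sleep` and `rng` keeps the logic deterministic in tests.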
| 31 | +## Tests |
| 32 | +- `tests/unittests/plugins/test_llm_resilience_plugin.py` |
| 33 | + - `test_retry_success_on_same_model`: transient error triggers retry → success |
| 34 | + - `test_fallback_model_used_after_retries`: failing primary uses fallback model → success |
| 35 | +  - `test_non_transient_error_bubbles`: non-transient error is not handled by the plugin and propagates to the caller |
| 36 | + |
| 37 | +All tests in this module pass locally: |
| 38 | + |
| 39 | +``` |
| 40 | +PYTHONPATH=src pytest -q tests/unittests/plugins/test_llm_resilience_plugin.py |
| 41 | +# 3 passed |
| 42 | +``` |
| 43 | + |
| 44 | +## Sample |
| 45 | +- `samples/resilient_agent.py` demonstrates configuring the plugin with an in-memory runner and a demo model that fails once then succeeds. |
| 46 | + |
| 47 | +Run sample: |
| 48 | + |
| 49 | +``` |
| 50 | +PYTHONPATH=$(pwd)/src python samples/resilient_agent.py |
| 51 | +``` |
| 52 | + |
| 53 | +## Backwards Compatibility |
| 54 | +- Non-breaking: users opt in by passing the plugin into `Runner(..., plugins=[LlmResiliencePlugin(...)])` |
| 55 | +- No changes to public APIs beyond exporting the plugin in `google.adk.plugins` |
| 56 | + |
| 57 | +## Limitations & Future Work |
| 58 | +- Focused on LLM failures. Tool-level resilience is addressed by `ReflectAndRetryToolPlugin`. |
| 59 | +- Circuit-breaking and per-exception policies could be added in a follow-up (`dev_3` item). |
| 60 | +- Live bidirectional streaming is not yet handled by this plugin; future work may extend it to `BaseLlmConnection` flows. |
| 61 | + |
| 62 | +## Docs |
| 63 | +- Exported via `google.adk.plugins.__all__` to ease discovery |
| 64 | +- Includes inline docstrings and a sample; both can be integrated into the docs site in a separate PR |
| 65 | + |
| 66 | +## Checklist |
| 67 | +- [x] Unit tests for new behavior |
| 68 | +- [x] Sample demonstrating usage |
| 69 | +- [x] No changes to core runner/flow logic |
| 70 | +- [x] Code formatted and linted per repository standards |