Commit 679f7ba

author
agent
committed
docs: add PR_BODY.md describing LlmResiliencePlugin motivation, design, tests, and usage
1 parent 971984b commit 679f7ba

1 file changed

Lines changed: 70 additions & 0 deletions

PR_BODY.md

@@ -0,0 +1,70 @@
# feat(plugins): LlmResiliencePlugin – configurable retries/backoff and model fallbacks

## Motivation
Production agents need first-class resilience to transient LLM/API failures (timeouts, 429/5xx). Today, retry/fallback logic is ad-hoc and duplicated across projects. This PR introduces a plugin-based, opt-in resilience layer for LLM calls that aligns with ADK's extensibility philosophy and addresses recurring requests:

- #1214 Add built-in retry mechanism
- #2561 Retry mechanism gaps for common network errors (httpx…)
- Discussions: #2292, #3199 on fallbacks and max retries

## Summary
Adds a new plugin, `LlmResiliencePlugin`, which intercepts model errors and provides:
- Configurable retries with exponential backoff + jitter
- Transient error detection (HTTP 429/500/502/503/504, httpx timeouts/connect errors, asyncio timeouts)
- Optional model fallbacks (try a sequence of models if the primary continues to fail)
- Support for standard `generate_content_async` flows, including SSE streaming, which is consumed through to the final response

No core runner changes; this is a pure plugin. Default behavior remains unchanged unless the plugin is configured.

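For reviewers, the retry and transient-detection behavior summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's code: `is_transient`, `backoff_delay`, and the `status_code`/`code` attribute lookup are hypothetical, the default values are placeholders, and treating `jitter` as a fraction of the delay is an assumption. The httpx exception classes are left out so the sketch depends only on the standard library.

```python
import asyncio
import random

# Status codes this PR description lists as transient.
TRANSIENT_STATUS_CODES = frozenset({429, 500, 502, 503, 504})


def is_transient(exc: BaseException) -> bool:
    """Heuristic: return True when the error looks worth retrying."""
    # Timeouts and dropped connections. A fuller version would also check
    # httpx timeout/connect errors when httpx is installed.
    if isinstance(exc, (asyncio.TimeoutError, TimeoutError, ConnectionError)):
        return True
    # Assumption: the provider error exposes a numeric HTTP status on one
    # of these attributes. Real client libraries differ.
    for attr in ("status_code", "code"):
        value = getattr(exc, attr, None)
        if isinstance(value, int) and value in TRANSIENT_STATUS_CODES:
            return True
    return False


def backoff_delay(
    attempt: int,
    backoff_initial: float = 1.0,
    backoff_multiplier: float = 2.0,
    max_backoff: float = 10.0,
    jitter: float = 0.2,
) -> float:
    """Delay in seconds before retry number `attempt` (0-based)."""
    # Exponential growth, capped at max_backoff.
    base = min(max_backoff, backoff_initial * backoff_multiplier**attempt)
    # Assumed jitter semantics: shift the delay by up to +/- jitter * base.
    return max(0.0, base + random.uniform(-jitter, jitter) * base)
```

With `jitter=0.0` the schedule is deterministic (1, 2, 4, 8 seconds, then capped at `max_backoff`), which is convenient in unit tests.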
## Implementation Details
- File: `src/google/adk/plugins/llm_resilience_plugin.py`
- Hooks into `on_model_error_callback` to decide whether to handle an error
- Retries use exponential backoff with jitter (configurable):
  - `max_retries`, `backoff_initial`, `backoff_multiplier`, `max_backoff`, `jitter`
- Fallbacks use `LLMRegistry.new_llm(model)` to instantiate alternative models on failure
- Robust handling of provider return types:
  - Async generator (iterates until final non-partial response)
  - Coroutine (some providers may return a single `LlmResponse`)
- Avoids circular imports using duck-typed access to InvocationContext (works with Context alias)
- Maintains clean separation; no modification to runners or flows

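The handling of provider return types described above could look roughly like the sketch below. It is an illustration under assumptions rather than the plugin's implementation: `FakeResponse` stands in for `LlmResponse`, `final_response` is a hypothetical helper name, and draining the whole stream before returning the last non-partial chunk is one possible reading of "iterates until final non-partial response".

```python
import asyncio
import inspect
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Stand-in for LlmResponse; only the fields this sketch needs."""

    text: str
    partial: bool = False


async def final_response(result):
    """Resolve a provider call result to a single final response."""
    if hasattr(result, "__aiter__"):
        # Streaming case: drain the async iterator and keep the last
        # non-partial chunk, falling back to the last chunk seen.
        final = last = None
        async for chunk in result:
            last = chunk
            if not getattr(chunk, "partial", False):
                final = chunk
        return final if final is not None else last
    if inspect.isawaitable(result):
        # Coroutine case: the provider returns one response directly.
        return await result
    return result  # already a plain response object


async def streaming():
    yield FakeResponse("Hel", partial=True)
    yield FakeResponse("Hello", partial=False)


async def single():
    return FakeResponse("Hi")


async def demo():
    a = await final_response(streaming())
    b = await final_response(single())
    return a.text, b.text
```

Running `asyncio.run(demo())` resolves both shapes through the same helper, so the retry logic never needs to know which one a provider returned.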
## Tests
- `tests/unittests/plugins/test_llm_resilience_plugin.py`
  - `test_retry_success_on_same_model`: transient error triggers retry → success
  - `test_fallback_model_used_after_retries`: failing primary uses fallback model → success
  - `test_non_transient_error_bubbles`: non-transient error is ignored by the plugin and propagates

All tests in this module pass locally:

```
PYTHONPATH=src pytest -q tests/unittests/plugins/test_llm_resilience_plugin.py
# 3 passed
```

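The three scenarios these tests cover can be reproduced in isolation with fake models. The sketch below is not taken from the test module: `TransientError`, `FlakyModel`, and `call_with_resilience` are hypothetical names, and giving each fallback model its own retry budget is an assumption about the plugin's behavior.

```python
import asyncio


class TransientError(Exception):
    """Stand-in for a retryable failure such as HTTP 503."""


class FlakyModel:
    """Fake model that fails a fixed number of times, then answers."""

    def __init__(self, name, failures):
        self.name = name
        self.failures = failures
        self.calls = 0

    async def generate(self, prompt):
        self.calls += 1
        if self.calls <= self.failures:
            raise TransientError(f"{self.name} unavailable")
        return f"{self.name}: {prompt}"


async def call_with_resilience(models, prompt, max_retries=2, delay=0.0):
    """Try models in order; retry transient errors up to max_retries each."""
    last_error = None
    for model in models:
        for attempt in range(max_retries + 1):
            try:
                return await model.generate(prompt)
            except TransientError as exc:
                # Only transient errors are caught here; anything else
                # propagates to the caller unchanged.
                last_error = exc
                if attempt < max_retries:
                    await asyncio.sleep(delay)
    if last_error is None:
        raise ValueError("no models configured")
    raise last_error
```

A model created with `failures=1` succeeds on its second call, a primary that keeps failing hands over to the next model in the list, and a `ValueError` raised inside `generate` is never swallowed.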
## Sample
- `samples/resilient_agent.py` demonstrates configuring the plugin with an in-memory runner and a demo model that fails once then succeeds.

Run sample:

```
PYTHONPATH=$(pwd)/src python samples/resilient_agent.py
```

## Backwards Compatibility
- Non-breaking: users opt in by passing the plugin into `Runner(..., plugins=[LlmResiliencePlugin(...)])`
- No changes to public APIs beyond exporting the plugin in `google.adk.plugins`

## Limitations & Future Work
- Focused on LLM failures. Tool-level resilience is addressed by `ReflectAndRetryToolPlugin`.
- Circuit-breaking and per-exception policies could be added in a follow-up (`dev_3` item).
- Live bidirectional streaming is not yet handled by this plugin; future work may extend to `BaseLlmConnection` flows.

## Docs
- Exported via `google.adk.plugins.__all__` to ease discovery
- Included inline docstrings and sample; can be integrated into the docs site in a separate PR

## Checklist
- [x] Unit tests for new behavior
- [x] Sample demonstrating usage
- [x] No changes to core runner/flow logic
- [x] Code formatted and linted per repository standards
