RESOURCE_EXHAUSTED (429) errors when triggering ADK agents. #4323
Closed
Labels
core: [Component] This issue is related to the core interface and implementation
request clarification: [Status] The maintainer need clarification or more information from the author
stale: [Status] Issues which have been marked inactive since there is no user response
RESOURCE_EXHAUSTED (429) errors when triggering ADK agents concurrently via Vertex AI Reasoning Engine
Issue Description
Describe the Bug:
When triggering an ADK agent multiple times in quick succession, the request fails with a streaming error that ultimately resolves to a 429 RESOURCE_EXHAUSTED error from Vertex AI. The error is surfaced by ADK as a 500 during response streaming.
Observed error:
{ "error": "500: An error occurred while streaming the response: 429 Too Many Requests." "details": { "message": "Resource exhausted. Please try again later.", "status": "RESOURCE_EXHAUSTED" } }
The error message points to ADK and Vertex AI 429 documentation but it’s unclear where the actual bottleneck is and how it should be handled when using ADK in production.
Steps to Reproduce:
us-central1).
429 RESOURCE_EXHAUSTED errors surfaced as streaming failures.
Expected Behavior:
Requests may slow down or queue, but should not fail with a hard error during streaming. Ideally, retries or backoff would be handled gracefully.
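A minimal sketch of the kind of client-side retry with exponential backoff and jitter described above. It is a generic pattern, not a built-in ADK or Vertex AI feature: `ResourceExhaustedError` is a hypothetical stand-in for whatever exception the client actually raises on a 429, and `call_with_backoff` and its parameters are illustrative names.

```python
import random
import time


class ResourceExhaustedError(Exception):
    """Hypothetical stand-in for the 429 error raised by the real client."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=32.0,
                      sleep=time.sleep):
    """Call fn(), retrying on ResourceExhaustedError with exponential backoff.

    The delay cap doubles on each failed attempt (bounded by max_delay), and
    the actual wait is drawn uniformly from [0, cap] ("full jitter") so that
    concurrent callers do not all retry at the same moment.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except ResourceExhaustedError:
            # Out of attempts: let the caller see the original error.
            if attempt == max_attempts - 1:
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

The agent invocation would be passed in as `fn`. Whether a retry is safe after a stream has already started producing output depends on the application, so this sketch assumes the whole request can be reissued.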
Observed Behavior:
Requests fail with 429 RESOURCE_EXHAUSTED, wrapped as a 500 streaming error by ADK.
Environment Details:
Model Information:
gemini-2.5-pro, gemini-2.5-flash
us-central1
❓ Questions / Clarification Needed
Is this error strictly caused by the Vertex AI quota below?
Will increasing this quota fully resolve the issue, or are there additional ADK-level or Reasoning Engine concurrency limits/bottlenecks?
Does ADK provide any built-in retry, backoff, or queueing mechanism for 429 RESOURCE_EXHAUSTED errors?
Are there recommended production patterns when using ADK + Reasoning Engine behind Cloud Run?
Is it possible to self host the reasoning engine locally inside my server and use ADK, so that I only need to worry about the Gemini LLM Request quotas?
I’m planning to launch this service soon and want to ensure the setup is production-safe under burst traffic.
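One common way to absorb burst traffic on the caller's side is to cap the number of in-flight agent requests so that excess requests wait instead of hitting the quota. The sketch below shows that idea with an `asyncio.Semaphore`. It assumes the agent call can be wrapped in a zero-argument coroutine function; `run_with_limit` and `max_concurrent` are illustrative names, not part of ADK.

```python
import asyncio


async def run_with_limit(tasks, max_concurrent):
    """Run zero-argument coroutine functions with a cap on concurrency.

    At most max_concurrent tasks are in flight at once; the rest wait on the
    semaphore. Results are returned in the same order as the input tasks.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(task):
        # Wait for a free slot before starting the underlying call.
        async with semaphore:
            return await task()

    return await asyncio.gather(*(guarded(task) for task in tasks))
```

A suitable value for `max_concurrent` would have to be derived from the actual quota and typical request duration. This only limits concurrency within a single process, so multiple Cloud Run instances would each apply their own cap.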