feat: add request timeout to load_web_page#4887
feat: add request timeout to load_web_page#4887cchinchilla-dev wants to merge 5 commits intogoogle:mainfrom
Conversation
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Hi @cchinchilla-dev , Thank you for your contribution! We appreciate you taking the time to submit this pull request. Your PR has been received by the team and is currently under review. We will provide feedback as soon as we have an update to share. |
|
Hi @Jacksunwei , can you please review this |
|
@rohityan, @Jacksunwei — Just following up to see if anything else is needed from my end. Happy to make any adjustments. |
|
I independently reproduced this issue and can confirm the behavior described. Tested on Windows with Python 3.11 against three failure modes — non-routable IP, invalid DNS, and a slow endpoint. All three result in unhandled exceptions with connect timeout=None confirmed in the traceback. Happy to share full reproduction logs if helpful for the review. |
…timeout-and-url-validation # Conflicts: # src/google/adk/tools/load_web_page.py # tests/unittests/tools/test_load_web_page.py
Link to Issue or Description of Change
Closes #4886
Update — 2026-04-30
After merging the latest
upstream/main, the SSRF rewrite already covers URL-scheme validation and routes every fetch failure through a unifiedFailed to fetch urlmessage. The unique contribution of this PR is now the request timeout; the body and tests below have been updated to reflect the post-merge scope. No code added by this PR duplicates what upstream already provides.Problem
load_web_page()callsrequests.get()without atimeout. If the target server is unresponsive, the agent hangs indefinitely.Solution
Add
timeout=_DEFAULT_TIMEOUT_SECONDS(10 seconds) to both HTTP entry points in the module:requests.getin_fetch_response(proxy path).session.getin_fetch_direct_response(pinned-IP path).Extend the
exceptinload_web_pageto also catchrequests.RequestException, so timeout and connection errors return the standardFailed to fetch url: {url}message instead of propagating.Design note: the timeout is a module-level constant rather than a function parameter to keep it out of the LLM function-calling schema. It can be overridden via
load_web_page._DEFAULT_TIMEOUT_SECONDS = 30if needed.Testing Plan
Unit Tests
pytest tests/unittests/tools/test_load_web_page.py→ 10 passed).New/updated tests in
tests/unittests/tools/test_load_web_page.py:test_load_web_page_uses_proxy_for_unresolved_public_hostnames— updated to verifytimeout=10is forwarded on the proxy path.test_load_web_page_passes_timeout_to_pinned_session— verifies the timeout reaches the pinned-IP session.test_load_web_page_passes_timeout_to_proxied_get— verifies the timeout is forwarded when a proxy is configured.test_load_web_page_returns_failure_on_timeout— verifiesrequests.exceptions.Timeoutis converted intoFailed to fetch url.Manual E2E
N/A — internal hardening; function signature unchanged.
Checklist
Additional Context
This complements the existing SSRF protection (
allow_redirects=False, hostname/IP validation, pinned-IP adapter) already present in the module after upstream/main was merged.