Skip to content

[security] fix(bot): guard web_fetch remote targets#2051

Open
Hinotoi-agent wants to merge 1 commit into
volcengine:mainfrom
Hinotoi-agent:security/webfetch-network-guard
Open

[security] fix(bot): guard web_fetch remote targets#2051
Hinotoi-agent wants to merge 1 commit into
volcengine:mainfrom
Hinotoi-agent:security/webfetch-network-guard

Conversation

@Hinotoi-agent
Copy link
Copy Markdown
Contributor

Summary

This PR hardens the bot web_fetch tool against private-network SSRF by applying OpenViking's shared network target guard to the server-side fetch path.

web_fetch already checked basic URL shape and scheme, but it then handed the URL to httpx.AsyncClient(follow_redirects=True) without enforcing the same public-network boundary used by other remote-fetch surfaces. This meant a bot-triggered fetch could target loopback, private, link-local, or other non-public addresses, and redirect hops also needed to be validated before the outbound request was sent.

This change:

  • rejects non-public initial web_fetch URLs before constructing the HTTP client
  • installs the shared httpx request-validation hook so redirect hops are checked as well
  • adds focused regression tests for pre-request loopback rejection and redirect/request-hook validation

Security issues covered

Issue Affected surface Impact
Bot web_fetch SSRF bot/vikingbot/agent/tools/web.py A bot-accessible server-side fetch tool could be pointed at internal or otherwise non-public network targets.

Before this PR

  • web_fetch accepted http:// and https:// URLs after basic URL parsing.
  • The tool used httpx.AsyncClient(follow_redirects=True) to fetch content server-side.
  • Non-public targets such as loopback/private/link-local addresses were not rejected by the bot tool before fetching.
  • Redirect destinations were not protected by a per-request target validation hook.

After this PR

  • The initial URL is checked with ensure_public_remote_target() before any HTTP client is created.
  • The httpx client receives build_httpx_request_validation_hooks(ensure_public_remote_target) so each outbound request, including redirects, is validated before it is sent.
  • Blocked targets return a URL validation error instead of creating a server-side request.
  • Regression tests cover both the initial-target gate and the request-hook path.

Why this matters

Bot fetch tools run on the server side. If a user-controlled or model-influenced URL can reach loopback, RFC1918/private, link-local, or reserved ranges, the tool can become an SSRF primitive against services that are reachable from the bot runtime but not from the original user.

That can expose internal admin panels, metadata services, local development services, service credentials, or other network-adjacent resources depending on deployment topology. Redirect validation is also important because an apparently public URL can redirect to a private address after the initial validation step.

How this differs from related issue/PR

This is related to the same trust-boundary family as the earlier resource-ingestion SSRF hardening in #1133, but it covers a separate fetch surface.

  • fix security: feat(resources): harden HTTP resource ingestion against private-network SSRF #1133 hardened HTTP resource ingestion paths and introduced/reused network-guard coverage for resource ingestion.
  • This PR applies that shared boundary to the bot web_fetch tool in bot/vikingbot/agent/tools/web.py.
  • The vulnerable trigger here is the bot tool's server-side HTTP fetch, not resource ingestion.
  • The fix intentionally reuses the existing shared openviking.utils.network_guard helpers instead of adding a second network-policy implementation.

#1596 also touched bot/vikingbot/agent/tools/web.py, but that change focused on HTML parsing/runtime cleanup and did not add network target validation for web_fetch.

Attack flow

  1. A user or model-controlled tool call supplies a URL to the bot web_fetch tool.
  2. The URL points directly to a non-public host, or starts on a public host that redirects to a non-public host.
  3. The server-side bot runtime performs the outbound request from its own network position.
  4. The response content is returned through the tool result, exposing information from an internal network target.

Affected code

  • bot/vikingbot/agent/tools/web.py
    • WebFetchTool.execute()
    • httpx.AsyncClient(..., follow_redirects=True, ...) fetch path

Root cause

  • The bot tool had URL syntax/scheme validation, but not network-boundary validation.
  • The server-side fetch path trusted the caller-provided destination too early.
  • Redirects required validation at the HTTP client request layer, not only before the first request.

CVSS assessment

  • Issue: Bot web_fetch SSRF to non-public network targets
  • CVSS v3.1: 6.5
  • Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N
  • Rationale: The issue is network-reachable through the bot/tool interface and requires the ability to trigger web_fetch. Successful exploitation can expose confidential data reachable from the server-side runtime. Integrity and availability impact are not claimed by this PR.

Safe reproduction steps

On vulnerable code, call the bot web_fetch tool with a loopback URL such as:

http://127.0.0.1:8080/secret

A safe local harness can replace the HTTP client with a failing stub and verify whether the tool attempts to construct a client for the blocked destination.

For redirect coverage, use a public-looking URL that redirects to a private target and verify that the request hook validates the redirect destination before the redirected request is sent.

Expected vulnerable behavior

Before this patch:

  • A loopback/private target is not rejected by the bot tool's network-boundary logic before the HTTP client is created.
  • Redirect hops are not checked by a per-request network guard.

After this patch:

  • Loopback/private targets return a URL validation error before the HTTP client is created.
  • Redirect/request hook validation is installed on the httpx client and validates subsequent request URLs.

Changes in this PR

  • Import and reuse ensure_public_remote_target() and build_httpx_request_validation_hooks() from openviking.utils.network_guard.
  • Validate the initial web_fetch URL before constructing httpx.AsyncClient.
  • Pass httpx request-validation hooks to cover redirect hops and other per-request destinations.
  • Add focused regression tests for the blocked loopback path and redirect/request-hook validation.

Files changed

Category Files What changed
Bot fetch hardening bot/vikingbot/agent/tools/web.py Adds shared network-guard validation before and during web_fetch requests.
Regression tests bot/tests/test_web_fetch_network_guard.py Covers loopback rejection before client construction and request-hook validation for redirects.

Maintainer impact

  • Public http:// and https:// fetches continue to work when they resolve to allowed public network targets.
  • Non-public network destinations are rejected consistently with the shared OpenViking network guard policy.
  • The patch is intentionally narrow and does not change HTML extraction, content conversion, or response formatting behavior.

Fix rationale

The correct boundary is the outbound network target, not only URL syntax. Applying the shared guard at both layers avoids drift:

  • pre-client validation blocks obviously unsafe initial targets before any network request can be attempted
  • request-hook validation protects redirect hops and keeps the enforcement attached to the actual outbound request path
  • reusing openviking.utils.network_guard keeps the bot tool aligned with existing OpenViking SSRF policy instead of creating a duplicate implementation

Type of change

  • Security fix
  • Regression tests
  • Documentation update
  • Refactor only

Test plan

Run locally:

PYTHONPATH=bot:. python3 -m pytest -q -o addopts='' bot/tests/test_web_fetch_network_guard.py tests/misc/test_network_guard.py
python3 -m ruff format --check bot/vikingbot/agent/tools/web.py bot/tests/test_web_fetch_network_guard.py
python3 -m ruff check bot/vikingbot/agent/tools/web.py bot/tests/test_web_fetch_network_guard.py
git diff --check

Result:

68 passed, 4 warnings
2 files already formatted
All checks passed

The targeted pytest command uses -o addopts='' to avoid unrelated local coverage-plugin/addopts issues while exercising the new bot regression tests and the shared network-guard test suite.

Disclosure notes

This PR is intentionally bounded to the bot web_fetch server-side fetch path. It does not claim to audit every remote-fetch surface in the repository. It reuses the existing shared network guard and covers the separate bot tool surface that was not addressed by the earlier resource-ingestion SSRF hardening.

@github-actions
Copy link
Copy Markdown

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

**🎫 Ticket compliance analysis **

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🏅 Score: 90
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Broad Exception Catching

The code uses except Exception: to catch errors from ensure_public_remote_target(), which may swallow unrelated errors. It should catch the specific exception type raised by ensure_public_remote_target() instead of all exceptions, to avoid hiding unexpected failures.

try:
    ensure_public_remote_target(url)
except Exception as e:
    return json.dumps({"error": f"URL validation failed: {e}", "url": url})

@github-actions
Copy link
Copy Markdown

PR Code Suggestions ✨

No code suggestions found for the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

1 participant