Cap HTTP response body size to bound adversarial-target memory use (CWE-770)#161
Open
OffByQuant wants to merge 1 commit into
Open
Cap HTTP response body size to bound adversarial-target memory use (CWE-770)#161OffByQuant wants to merge 1 commit into
OffByQuant wants to merge 1 commit into
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Author
|
@googlebot I signed it! |
The scanner's HTTP egress paths buffered response bodies into memory with
no size ceiling on three independent code paths:
* common/.../OkHttpHttpClient.java :: parseResponse — used by every
detector going through send / sendAsync. Calls
okResponse.body().bytes(), which only enforces a 2 GB cap on bodies
advertising a Content-Length and that ceiling is bypassed entirely
by chunked encoding with no Content-Length header.
* common/.../OkHttpHttpClient.java :: sendAsIs — used by the
crafted-URL probes that fire against hostile targets. Calls
ByteString.readFrom on the raw HttpURLConnection input stream with
no ceiling at all.
* plugin_server/py/.../requests_http_client.py :: send +
_parse_response — uses session.send() with the default
stream=False, which buffers the entire body into the Response
object before returning, then reads res.content unconditionally.
Because Tsunami probes adversarial endpoints by definition, an
attacker-controlled target can detect the TsunamiSecurityScanner
User-Agent on inbound requests and serve an unbounded chunked response.
Each chunk arrives within the read-timeout window so the read timeout
never fires, and the body buffer grows until the JVM heap or Python
process is exhausted (CWE-770 / CWE-400). The result: a compromised
asset can deterministically crash the scanner that is probing it,
producing the false-negative "scanned, no findings" outcome an attacker
wants from a detection tool.
Fix: introduce a configurable max-response-body cap (default 100 MB)
enforced at every body-ingestion point. Reads are drained chunk by
chunk into a bounded buffer; once the cap is exceeded, the underlying
connection is closed and the call fails with IOException / IOError so
the affected probe errors out cleanly while the surrounding scan
continues.
Wiring:
* Java: new HttpClientModule.@MaxResponseBodyBytes provider resolves
--http-client-max-response-body-mb (CLI) → maxResponseBodyMb
(config) → 100 MB default, mirroring the existing timeout flag
pattern. New OkHttpHttpClient constructor + builder setter expose
the cap; the legacy 6-arg constructor is preserved with the
default value to keep external callers source-compatible.
* Python: new --max_response_body_mb absl flag plumbed through
RequestsHttpClientBuilder.set_max_response_body_bytes into
RequestsHttpClient. session.send() now passes stream=True so the
body never lands in the Response object before we get a chance to
cap it.
Tests:
* OkHttpHttpClientTest: send / sendAsIs throw IOException when the
body exceeds the cap; bodies within the cap pass through unchanged.
* requests_http_client_test: send raises IOError on advertised
Content-Length over cap, on actual streamed bytes over cap, and
passes through small bodies unchanged.
Disclosure note: the unbounded reads were originally surfaced by an
external security review. The patch is local to the HTTP-client layer;
no detector code changes and no detector behavior changes for any
response under the cap. Couldn't run gradle/pytest locally (no wrapper
checked in, no project-wide pytest harness configured); relying on CI.
a0035e7 to
42e39b4
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Tsunami's HTTP egress paths buffered response bodies into memory with no size cap on three independent code paths. Because Tsunami probes adversarial endpoints by definition, a compromised target can detect the
TsunamiSecurityScannerUser-Agent on inbound requests and serve an unbounded chunked response — each chunk arrives within the read-timeout window so the read timeout never fires, and the body buffer grows until the JVM heap or the Python plugin server process is exhausted (CWE-770 / CWE-400). The end result: a compromised asset can deterministically crash the scanner that is probing it, producing the exact false-negative"scanned, no findings"outcome an attacker wants from a detection tool.This patch caps response-body reads on all three paths.
Affected sites
common/.../OkHttpHttpClient.java :: parseResponse— backs every detector going throughsend/sendAsync.okResponse.body().bytes()only enforces a 2 GB ceiling on bodies whoseContent-Lengthis advertised; chunked-encoded responses with noContent-Lengthbypass it entirely.common/.../OkHttpHttpClient.java :: sendAsIs— used by the crafted-URL probes (path-traversal probes, percent-encoded edge cases) that are most likely to fire against hostile targets.ByteString.readFromdrains the rawHttpURLConnectionstream until EOF with no ceiling at all.plugin_server/py/.../requests_http_client.py—session.send()defaults tostream=False, which buffers the entire body inside theResponseobject before returning.res.contentthen returns those cached bytes unconditionally; a malicious chunked peer OOMs the Python plugin server before_parse_responseis ever entered.Fix
Introduce a configurable
max_response_body_bytescap (default 100 MB) enforced at every body-ingestion point.Content-Length, when advertised, is checked against the cap up front (cheap pre-flight rejection).IOException(Java) /IOError(Python).Java wiring
HttpClientModule.@MaxResponseBodyBytesGuice qualifier + provider resolves CLI → config → default in the same null-coalescing pattern already used for the timeout fields:--http-client-max-response-body-mb(inHttpClientCliOptions)common.net.http.maxResponseBodyMb(inHttpClientConfigProperties)OkHttpHttpClient.DEFAULT_MAX_RESPONSE_BODY_BYTES)OkHttpHttpClientconstructor +OkHttpHttpClientBuilder.setMaxResponseBodyBytessetter expose the cap. The legacy 6-arg constructor is preserved (it now delegates to the 7-arg form with the default), so external callers stay source-compatible.Python wiring
--max_response_body_mbabsl flag inplugin_server.pyplumbed throughRequestsHttpClientBuilder.set_max_response_body_bytesintoRequestsHttpClient.session.send()now passesstream=Trueon both the sync and async paths so the body never lands in the Response object before we have a chance to cap it.Tests
OkHttpHttpClientTest:send_whenResponseBodyExceedsCap_throwsIOExceptionsendAsIs_whenResponseBodyExceedsCap_throwsIOExceptionsend_whenResponseBodyWithinCap_returnsExpectedHttpResponserequests_http_client_test:test_send_when_response_body_exceeds_cap_raises_io_error(cap hit on streamed bytes)test_send_when_advertised_content_length_exceeds_cap_raises_io_error(cap hit on advertised Content-Length pre-flight)test_send_when_response_body_within_cap_returns_expected_response(golden-path regression)Compatibility
OkHttpHttpClientconstructor preserved with default cap, so any out-of-tree caller (unlikely given the package-private constructor, but listed for completeness) continues to compile.--http-client-max-response-body-mb(Java) or--max_response_body_mb(Python) — or fall to a smaller value to harden further against malicious targets.Test plan
./gradlew testagainst the patched Java sources (I was unable to run gradle locally — no wrapper checked into the repo and no system gradle on my dev box; relying on this PR's CI).requests_http_client_test.py(requests-mock+absl-pynot in my local env; relying on CI).Content-Lengthand verify the affected probe errors out with the cap message rather than crashing the JVM / Python process.References
ResponseBodydocumentation — note on body size and streaming