Fix CWE-190 - DoS could happen with a specially crafted boundary parameter in a multipart request#445
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses a security issue (CWE-190) in multipart request handling by constraining and hardening boundary parsing to prevent integer overflow / DoS scenarios during multipart body parsing.
Changes:
- Add validation to reject multipart boundaries that are empty or exceed RFC 2046’s 70-character limit.
- Change multipart boundary parsing state tracking (
_boundaryPosition) tosize_tand update a related loop to avoid overflow issues.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/WebRequest.cpp | Adds boundary length validation and updates multipart parsing loop indexing to use size_t. |
| src/ESPAsyncWebServer.h | Updates _boundaryPosition type to size_t to safely track boundary parsing progress. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
658afa6 to
ec13b20
Compare
|
@me-no-dev : updated the PR and rebased. |
300d91c to
c7114c0
Compare
3c7ad90 to
4753be6
Compare
_boundaryPosition was declared as uint8_t. A remote attacker could send a
Content-Type header with a boundary string of exactly 256 bytes, causing the
increment in _parseMultipartPostByte() to overflow back to 0. The parser
would then loop indefinitely in the BOUNDARY_OR_DATA state, consuming 100%
CPU and triggering the FreeRTOS watchdog on ESP32/ESP8266.
Fixes:
- ESPAsyncWebServer.h: change _boundaryPosition from uint8_t to size_t,
eliminating the overflow entirely.
- WebRequest.cpp (_parseMultipartPostByte): change all rewind loop variables
from uint8_t to size_t (BOUNDARY_OR_DATA, DASH3_OR_RETURN2, EXPECT_FEED2).
- WebRequest.cpp (_parseMultipartPostByte / EXPECT_BOUNDARY): add a comment
explaining the intentional unsigned underflow when _parsedLength < 2.
- WebRequest.cpp (_parseReqHeader): validate the boundary value extracted
from the Content-Type header before any body parsing begins:
* Media type comparison changed from case-sensitive startsWith() to a
case-insensitive check on a pre-lowercased copy of the value, so that
e.g. 'Multipart/form-data' is correctly recognised (RFC 2045 §5.1).
* Require the 'boundary=' token to be a proper RFC 2045 parameter
(immediately preceded by ';'), preventing false matches on tokens
whose name merely ends in 'boundary=' (e.g. 'x-boundary=').
* Parse the boundary value as either a token or a quoted-string per
RFC 2046 §5.1. Quoted-string form correctly handles backslash-escaped
quotes ("), pairs of backslashes (\), and rejects unterminated
quoted strings. The closing-quote detector counts consecutive
backslashes before the quote so that an even count (backslashes
cancelling each other) correctly identifies the real closing quote.
* Reject boundaries that are empty or longer than 70 characters
(RFC 2046 §5.1 hard limit); drop the connection immediately with
PARSE_REQ_FAIL + abort() instead of continuing.
* Trim _contentType defensively after extraction.
- AsyncWebHeader.cpp (parse): fix OWS handling per RFC 7230 §3.2.3.
The previous code only skipped a single space after the colon; the
correct rule is OWS = *( SP / HTAB ). This affected every parsed
header: Content-Type, Upgrade, Expect, Transfer-Encoding, Host, etc.
The fix is applied at the source so all header branches benefit.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
All good with me @mathieucarbou . Feel free to merge and move this forward |
willmmiles
left a comment
There was a problem hiding this comment.
Sorry I'm late to review, been a bit busy of late. The implementation is technically workable but inefficient in a few ways. A tokenizer might be more practical. std::string_view<> could go a long way here too - it'd substantially reduce the number of allocations.
| // would mishandle the \\" case. | ||
| int endQuote = 1; | ||
| while (true) { | ||
| endQuote = _boundary.indexOf('"', endQuote); |
There was a problem hiding this comment.
This would be better implemented by re-using the quote handling logic above -- here we declare a second parser to do exactly the same thing with a different, less efficient algorithm.
Worse, we could be building the final boundary value and handling escapes as we iterate through the string, instead of doing it over again.
| // BOUNDARY_OR_DATA loop from ever reaching the end of the boundary and | ||
| // causing unbounded CPU consumption (watchdog reset on ESP32/ESP8266). | ||
| // Reject boundaries outside the valid range before any parsing begins. | ||
| if (_boundary.length() == 0 || _boundary.length() > 70) { |
There was a problem hiding this comment.
This would be better implemented in the search loops above, without having to make a copy of the string first if it's out of spec.
There was a problem hiding this comment.
Yes the check can be done while looping but still we need to accumulate the characters into _boundary. But at least we will not accumulate more than 70 doing that.
| // trailing whitespace. The value may be either a token or a | ||
| // quoted-string (RFC 2045 §5.1 / RFC 2046 §5.1). | ||
| _boundary = value.substring(bpos + (int)T_BOUNDARY_LEN); | ||
| _boundary.trim(); |
There was a problem hiding this comment.
The trim here is redundant and wasteful. We're guaranteed there's no leading whitespace because it's already been processed by the search loop above; and any trailing whitespace will be trimmed in the value analysis below. (If it's even trailing to the boundary value -- it could belong to some other parameter entirely!)
| // optional OWS) so that e.g. 'x-boundary=' is not matched. | ||
| int bpos = -1; | ||
| bool inQuotes = false; | ||
| for (int i = 0; i < (int)lowcase.length(); i++) { |
There was a problem hiding this comment.
A little tweak: the loop end condition can be lowcase.length() - (T_BOUNDARY_LEN+2), eg. enough space for the '=' and a data byte. If there aren't enough bytes left for a successful match, we can stop up here, and then we also don't have to length-check on line 513 (in fact we can even use strcmp for greater speed!).
I agree I completely forgot about it. This PR was an iterative improvement with copilot and it grew to the point where indeed perf improvements could be done. I will open a new PR with them and maybe some others. |
Tighten multipart boundary parsing in _parseReqHeader(): - Replace String::charAt() inner loop with a raw C-string pointer and pre-computed length for faster, allocation-free scanning. - Add an early-exit upper bound on the scan loop so that positions where 'boundary=' cannot possibly fit are skipped entirely. - Replace the three-pass quoted-string extraction (find close-quote, substring, then unescape) with a single-pass approach that writes directly into _boundary using a std::string_view window into the original header value — no intermediate heap allocations. - Enforce the RFC 2046 §5.1 70-character limit on boundary length in both token and quoted-string paths; abort with PARSE_REQ_FAIL on violation to prevent CWE-190 integer-overflow / DoS. - Replace strncmp length guard with a position-based early break that is clearer and avoids the redundant cast. Ref: #445
|
@willmmiles : opened a PR hereL #447 |
Tighten multipart boundary parsing in _parseReqHeader(): - Replace String::charAt() inner loop with a raw C-string pointer and pre-computed length for faster, allocation-free scanning. - Add an early-exit upper bound on the scan loop so that positions where 'boundary=' cannot possibly fit are skipped entirely. - Replace the three-pass quoted-string extraction (find close-quote, substring, then unescape) with a single-pass approach that writes directly into _boundary using a raw pointer+length pair — no intermediate heap allocations and no C++17 dependency. - Enforce the RFC 2046 §5.1 70-character limit on boundary length in both token and quoted-string paths; abort with PARSE_REQ_FAIL on violation to prevent CWE-190 integer-overflow / DoS. - Replace strncmp length guard with a position-based early break that is clearer and avoids the redundant cast. - Fix ESP8266 build: String(const char*, size_t) is unavailable; use a 71-byte stack buffer + String(const char*) instead (#ifdef ESP8266). - Fix LibreTiny/C++14 build: remove std::string_view (C++17 only); replaced with plain const char* + size_t throughout. Ref: #445
Tighten multipart boundary parsing in _parseReqHeader(): - Replace String::charAt() inner loop with a raw C-string pointer and pre-computed length for faster, allocation-free scanning. - Add an early-exit upper bound on the scan loop so that positions where 'boundary=' cannot possibly fit are skipped entirely. - Replace the three-pass quoted-string extraction (find close-quote, substring, then unescape) with a single-pass approach that writes directly into _boundary using a raw pointer+length pair — no intermediate heap allocations and no C++17 dependency. - Enforce the RFC 2046 §5.1 70-character limit on boundary length in both token and quoted-string paths; abort with PARSE_REQ_FAIL on violation to prevent CWE-190 integer-overflow / DoS. - Replace strncmp length guard with a position-based early break that is clearer and avoids the redundant cast. - Fix ESP8266 build: String(const char*, size_t) is unavailable; use a 71-byte stack buffer + String(const char*) instead (#ifdef ESP8266). - Fix LibreTiny/C++14 build: remove std::string_view (C++17 only); replaced with plain const char* + size_t throughout. Ref: #445
Tighten multipart boundary parsing in _parseReqHeader(): - Replace String::charAt() inner loop with a raw C-string pointer and pre-computed length for faster, allocation-free scanning. - Add an early-exit upper bound on the scan loop so that positions where 'boundary=' cannot possibly fit are skipped entirely. - Replace the three-pass quoted-string extraction (find close-quote, substring, then unescape) with a single-pass approach that writes directly into _boundary using a raw pointer+length pair — no intermediate heap allocations and no C++17 dependency. - Enforce the RFC 2046 §5.1 70-character limit on boundary length in both token and quoted-string paths; abort with PARSE_REQ_FAIL on violation to prevent CWE-190 integer-overflow / DoS. - Replace strncmp length guard with a position-based early break that is clearer and avoids the redundant cast. - Fix ESP8266 build: String(const char*, size_t) is unavailable; use a 71-byte stack buffer + String(const char*) instead (#ifdef ESP8266). - Fix LibreTiny/C++14 build: remove std::string_view (C++17 only); replaced with plain const char* + size_t throughout. Ref: #445
Tighten multipart boundary parsing in _parseReqHeader(): - Replace String::charAt() inner loop with a raw C-string pointer and pre-computed length for faster, allocation-free scanning. - Add an early-exit upper bound on the scan loop so that positions where 'boundary=' cannot possibly fit are skipped entirely. - Replace the three-pass quoted-string extraction (find close-quote, substring, then unescape) with a single-pass approach that writes directly into _boundary using a raw pointer+length pair — no intermediate heap allocations and no C++17 dependency. - Enforce the RFC 2046 §5.1 70-character limit on boundary length in both token and quoted-string paths; abort with PARSE_REQ_FAIL on violation to prevent CWE-190 integer-overflow / DoS. - Replace strncmp length guard with a position-based early break that is clearer and avoids the redundant cast. - Fix ESP8266 build: String(const char*, size_t) is unavailable; use a 71-byte stack buffer + String(const char*) instead (#ifdef ESP8266). - Fix LibreTiny/C++14 build: remove std::string_view (C++17 only); replaced with plain const char* + size_t throughout. Ref: #445
See advisory: GHSA-4phx-fcj6-46r4
Tested with our UploadFlash example.
Fix CWE-190: integer overflow in multipart boundary parser (DoS)
_boundaryPosition was declared as uint8_t. A remote attacker could send a
Content-Type header with a boundary string of exactly 256 bytes, causing the
increment in _parseMultipartPostByte() to overflow back to 0. The parser
would then loop indefinitely in the BOUNDARY_OR_DATA state, consuming 100%
CPU and triggering the FreeRTOS watchdog on ESP32/ESP8266.
Fixes:
eliminating the overflow entirely.
from uint8_t to size_t (BOUNDARY_OR_DATA, DASH3_OR_RETURN2, EXPECT_FEED2).
explaining the intentional unsigned underflow when _parsedLength < 2.
from the Content-Type header before any body parsing begins:
case-insensitive check on a pre-lowercased copy of the value, so that
e.g. 'Multipart/form-data' is correctly recognised (RFC 2045 §5.1).
(immediately preceded by ';'), preventing false matches on tokens
whose name merely ends in 'boundary=' (e.g. 'x-boundary=').
RFC 2046 §5.1. Quoted-string form correctly handles backslash-escaped
quotes ("), pairs of backslashes (), and rejects unterminated
quoted strings. The closing-quote detector counts consecutive
backslashes before the quote so that an even count (backslashes
cancelling each other) correctly identifies the real closing quote.
(RFC 2046 §5.1 hard limit); drop the connection immediately with
PARSE_REQ_FAIL + abort() instead of continuing.
The previous code only skipped a single space after the colon; the
correct rule is OWS = *( SP / HTAB ). This affected every parsed
header: Content-Type, Upgrade, Expect, Transfer-Encoding, Host, etc.
The fix is applied at the source so all header branches benefit.