parse_mimetype: skip empty-key parameters and segments without '=' #13052
Open
HrachShah wants to merge 3 commits into
added 2 commits
July 4, 2026 19:27
The MIME parameter parser in parse_mimetype treated every segment after the
type/subtype as a valid parameter, even when the segment was malformed. Three
real-world patterns silently produced the spurious empty-key entry
{'': 'value'} in the parsed parameters dict. Downstream callers like
StringPayload read that dict with .get('charset') and fall back to utf-8
without warning:
* 'text/html;' (bare trailing ';') was already skipped via the existing
'if not item: continue' check, but a whitespace-only segment slipped
through because a string such as ' ' is truthy in Python; only its
stripped form is empty.
* 'text/html;charset=utf-8' (no space after ';') was treated as a
parameter named 'charset=utf-8' with an empty value, not as a
charset=utf-8 parameter.
* 'text/html;=value' (empty key, present '=') was treated as a parameter
with an empty key.
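The empty-key cases above come down to two pieces of built-in str behaviour, which the following minimal sketch checks. It uses only plain strings; the names segments and parts are illustrative and do not appear in aiohttp.

```python
# Illustrative only: plain str behaviour, no aiohttp code involved.
segments = [" ", "base64", "=value"]  # hypothetical raw segments after ';'

# A whitespace-only segment is truthy, so 'if not item: continue' lets it through,
# yet its stripped form is the empty string.
assert bool(" ") is True
assert " ".strip() == ""

# partition("=") returns (head, separator, tail); the separator is ''
# when '=' is absent, and the head is '' when the segment starts with '='.
parts = {s: s.partition("=") for s in segments}
for segment, (key, sep, value) in parts.items():
    print(repr(segment), "->", (key, sep, value))
```

Checking the separator therefore distinguishes "no '=' at all" from "'=' present with an empty key", which a bare unpacking of partition() does not.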
Per RFC 9110 section 5.6.6, a parameter is 'parameter-name "=" parameter-value',
where the name is a token and the value is a token or quoted-string, so both
the '=' and a non-empty key are required for a segment to be a valid
parameter. The fix:
* replace the unconditional partition() with a check for the '=' separator
(str.partition returns '' as the separator when the needle is absent);
* strip the key and skip segments whose key is empty after stripping;
* drop the now-redundant outer 'if not item: continue' check (a bare
';' is also a 'no =' segment and is caught by the new check).
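The three steps above can be sketched as follows. This is a simplified illustration, not the code in aiohttp/helpers.py: the function name parse_params_sketch is hypothetical, a plain dict stands in for whatever mapping the library uses, and the lower-casing of keys and trimming of spaces and double quotes from values are assumptions.

```python
def parse_params_sketch(mimetype: str) -> dict[str, str]:
    """Hypothetical stand-in for the parameter loop described above."""
    params: dict[str, str] = {}
    # Everything after the first ';' is a candidate parameter segment.
    for item in mimetype.split(";")[1:]:
        key, sep, value = item.partition("=")
        if not sep:
            # No '=' at all: bare ';', whitespace-only segment, or ';base64'.
            continue
        key = key.strip().lower()  # assumption: keys are lower-cased
        if not key:
            # ';=value' or '; =value': the key is empty after stripping.
            continue
        # Assumption: surrounding spaces and double quotes are trimmed.
        params[key] = value.strip(' "')
    return params


for header in ("text/html; charset=utf-8;", "text/html; ", "text/html;=value"):
    print(repr(header), "->", parse_params_sketch(header))
```

Because a bare ';' yields an empty item whose partition() also has an empty separator, the single separator check covers the case the old 'if not item: continue' handled.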
The pre-existing 'text/plain;base64' case in the parametrize fixture, which
the old code parsed as {'base64': ''}, is removed: per the new behaviour
';base64' is a malformed segment (no '=') and is skipped, so the parsed
parameters are {}. The new behavior matches RFC 9110 and matches the
existing pre-fix behavior for a bare trailing ';'.
New test test_parse_mimetype_skips_empty_param_key_and_missing_equals
covers the trailing-';' case and asserts that the real 'charset' parameter
is still parsed when it precedes the malformed trailing segment, so the
fix does not regress the happy path.
Tests:
python3 -m pytest tests/test_helpers.py -k parse_mimetype
=> 9 passed (8 pre-existing + 1 new)
python3 -m pytest tests/test_helpers.py
=> 1 pre-existing failure (re_chunked_parse) and the rest pass
This was referenced Jul 4, 2026
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #13052      +/-   ##
==========================================
- Coverage   98.96%   98.95%   -0.01%
==========================================
  Files         131      131
  Lines       48156    48165       +9
  Branches     2499     2501       +2
==========================================
+ Hits        47656    47661       +5
- Misses        376      378       +2
- Partials      124      126       +2
Merging this PR will not alter performance
The MIME parameter parser in parse_mimetype treated every segment after the
type/subtype as a valid parameter, even when the segment was malformed. Four
real-world patterns were silently producing spurious entries, such as the
empty-key entry {'': 'value'}, in the parsed parameters dict, which downstream
callers like StringPayload read with .get('charset') before falling back to
utf-8 without warning:
* 'text/html; ' (whitespace-only segment after ';') was treated as a
parameter named '' with value ''. The pre-existing 'if not item: continue'
only skipped a bare trailing ';'.
* 'text/html;charset=utf-8' (no space after ';') was treated as a parameter
named 'charset=utf-8' with an empty value, not as a charset=utf-8 parameter.
* 'text/html;=value' (empty key, present '=') was treated as a parameter
named '' with value 'value'.
* 'text/html;charset' (no '=' at all) was treated as a parameter named
'charset' with an empty value, instead of being skipped as malformed.
The fix in aiohttp/helpers.py:
* uses partition("=") and checks the separator explicitly. A missing '='
causes the segment to be skipped, matching the existing behaviour for the
empty-segment case.
* strips the key and skips segments whose key is empty, so a ';=value' or
'; =value' segment does not produce a {'': 'value'} entry.
Tests in tests/test_helpers.py:
* The 'text/plain;base64' parametrized case (which relied on the old
behaviour of producing {'base64': ''}) is removed. 'text/plain;base64' is a
malformed MIME header per RFC 9110; the new behaviour is to skip it.
* test_parse_mimetype_skips_empty_param_key_and_missing_equals covers the
';charset=utf-8;' (trailing ';' after a real param) case to make sure the
real parameter is still parsed.
Verification:
* python3 -m pytest tests/test_helpers.py -k parse_mimetype passes 9/9.
* [...] ('; charset=utf-8;' segment inserts '' into parameters, and the
pre-existing test still expects {'base64': ''} for the 'text/plain;base64'
case).
Refs: #13009 (whitespace-only segments), #13002 (OWS before semicolon, related).