Skip to content

parse_mimetype: skip empty-key parameters and segments without '='#13052

Open
HrachShah wants to merge 3 commits into
aio-libs:masterfrom
HrachShah:fix/parse-mimetype-skip-empty-key-and-no-equals
Open

parse_mimetype: skip empty-key parameters and segments without '='#13052
HrachShah wants to merge 3 commits into
aio-libs:masterfrom
HrachShah:fix/parse-mimetype-skip-empty-key-and-no-equals

Conversation

@HrachShah

Copy link
Copy Markdown

The MIME parameter parser in parse_mimetype treated every segment after the type/subtype as a valid parameter, even when the segment was malformed. Three real-world patterns were silently producing the spurious empty-key entry {'': 'value'} in the parsed parameters dict, which downstream callers like StringPayload read with .get('charset') and silently fall back to utf-8 without warning:

  • text/html; (whitespace-only segment after ;) was treated as a parameter named '' with value ''. The pre-existing if not item: continue only skipped a bare trailing ;.
  • text/html;charset=utf-8 (no space after ;) was treated as a parameter named 'charset=utf-8' with an empty value, not as a charset=utf-8 parameter.
  • text/html;=value (empty key, present =) was treated as a parameter named '' with value 'value'.
  • text/html;charset (no = at all) was treated as a parameter named 'charset' with an empty value, instead of being skipped as malformed.

The fix in aiohttp/helpers.py:

  • Splits the segment with partition("=") and checks the separator explicitly. A missing = causes the segment to be skipped, matching the existing behaviour for the empty-segment case.
  • Strips the key and skips if it is empty after stripping, so a ;=value or ; =value segment does not produce a {'': 'value'} entry.

Tests in tests/test_helpers.py:

  • The existing text/plain;base64 parametrized case (which relied on the old behaviour of producing {'base64': ''}) is removed. text/plain;base64 is a malformed MIME header per RFC 9110; the new behaviour is to skip it.
  • New test_parse_mimetype_skips_empty_param_key_and_missing_equals covers the ;charset=utf-8; (trailing-; after a real param) case to make sure the real parameter is still parsed.

Verification:

  • python3 -m pytest tests/test_helpers.py -k parse_mimetype passes 9/9.
  • The pre-fix code passes 8/9 (the new test fails because the ; charset=utf-8; segment inserts '' into parameters, and the pre-existing test still expects {'base64': ''} for the text/plain;base64 case).

Refs: #13009 (whitespace-only segments), #13002 (OWS before semicolon, related).

Zo Bot added 2 commits July 4, 2026 19:27
The MIME parameter parser in parse_mimetype treated every segment after the
type/subtype as a valid parameter, even when the segment was malformed. Three
real-world patterns were silently producing the spurious empty-key entry
{'': 'value'} in the parsed parameters dict, which downstream callers like
StringPayload read with .get('charset') and silently fall back to utf-8
without warning:

  * 'text/html;' (bare trailing ';') was already skipped via the existing
    'if not item: continue' check, but a whitespace-only segment slipped
    through because ''.strip() is truthy in Python.
  * 'text/html;charset=utf-8' (no space after ';') was treated as a
    parameter named 'charset=utf-8' with an empty value, not as a
    charset=utf-8 parameter.
  * 'text/html;=value' (empty key, present '=') was treated as a parameter
    with an empty key.

Per RFC 9110 section 5.6.5, a MIME type parameter is 'token = token' (or
quoted-string), so both the '=' and a non-empty key are required for a
segment to be a valid parameter. The fix:

  * replace the unconditional partition() with a check for the '=' separator
    (str.partition returns '' as the separator when the needle is absent);
  * strip the key and skip segments whose key is empty after stripping;
  * drop the now-redundant outer 'if not item: continue' check (a bare
    ';' is also a 'no =' segment and is caught by the new check).

The pre-existing 'text/plain;base64' case in the parametrize fixture, which
the old code parsed as {'base64': ''}, is removed: per the new behaviour
';base64' is a malformed segment (no '=') and is skipped, so the parsed
parameters are {}. The new behavior matches RFC 9110 and matches the
existing pre-fix behavior for a bare trailing ';'.

New test test_parse_mimetype_skips_empty_param_key_and_missing_equals
covers the trailing-';' case and asserts that the real 'charset' parameter
is still parsed when it precedes the malformed trailing segment, so the
fix does not regress the happy path.

Tests:

  python3 -m pytest tests/test_helpers.py -k parse_mimetype
    => 9 passed (8 pre-existing + 1 new)
  python3 -m pytest tests/test_helpers.py
    => 1 pre-existing failure (re_chunked_parse) and the rest pass
@codecov

codecov Bot commented Jul 4, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 63.63636% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 98.95%. Comparing base (e3774b4) to head (a4a036a).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
aiohttp/helpers.py 42.85% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #13052      +/-   ##
==========================================
- Coverage   98.96%   98.95%   -0.01%     
==========================================
  Files         131      131              
  Lines       48156    48165       +9     
  Branches     2499     2501       +2     
==========================================
+ Hits        47656    47661       +5     
- Misses        376      378       +2     
- Partials      124      126       +2     
Flag Coverage Δ
Autobahn 22.21% <9.09%> (-0.01%) ⬇️
CI-GHA 98.90% <63.63%> (-0.01%) ⬇️
OS-Linux 98.66% <63.63%> (-0.02%) ⬇️
OS-Windows 97.03% <63.63%> (-0.01%) ⬇️
OS-macOS 97.94% <63.63%> (-0.01%) ⬇️
Py-3.10 98.13% <63.63%> (-0.02%) ⬇️
Py-3.11 98.40% <63.63%> (-0.01%) ⬇️
Py-3.12 98.49% <63.63%> (-0.01%) ⬇️
Py-3.13 98.47% <63.63%> (-0.01%) ⬇️
Py-3.14 98.48% <63.63%> (-0.02%) ⬇️
Py-3.14t 97.57% <63.63%> (-0.02%) ⬇️
Py-pypy-3.11 97.44% <63.63%> (-0.01%) ⬇️
VM-macos 97.94% <63.63%> (-0.01%) ⬇️
VM-ubuntu 98.66% <63.63%> (-0.02%) ⬇️
VM-windows 97.03% <63.63%> (-0.01%) ⬇️
cython-coverage 38.11% <27.27%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

@codspeed-hq

codspeed-hq Bot commented Jul 4, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 83 untouched benchmarks
⏩ 83 skipped benchmarks1


Comparing HrachShah:fix/parse-mimetype-skip-empty-key-and-no-equals (a4a036a) with master (e3774b4)

Open in CodSpeed

Footnotes

  1. 83 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant