fix: handle non-UTF-8 bytes in redirect Location header#12441

Closed
NIK-TIGER-BILL wants to merge 1 commit into aio-libs:master from NIK-TIGER-BILL:fix-redirect-non-utf8-location

Conversation

@NIK-TIGER-BILL

What do these changes do?

Fixes #10047

When a server sends a Location header containing non-UTF-8 bytes (e.g. latin-1 encoded characters like ø), aiohttp's multidict decodes them as surrogates. This causes yarl.URL() to create a malformed URL, leading to a 404 or other errors on redirect.
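The failure mode described above can be reproduced with the standard library alone. This is a minimal sketch that assumes the header value is decoded as UTF-8 with the `surrogateescape` error handler (the usual way lone surrogates end up in a Python `str`); the path `/b\xf8lle` is a made-up example, not taken from the issue.

```python
# Hypothetical Location value as raw bytes: "ø" encoded as latin-1 is 0xF8,
# which is not a valid UTF-8 start byte.
raw_location = "/bølle".encode("latin-1")

# Assumption: the header is decoded as UTF-8 with surrogateescape, so each
# undecodable byte 0xNN becomes the lone surrogate U+DCNN.
decoded = raw_location.decode("utf-8", "surrogateescape")

# The byte 0xF8 is now the code point U+DCF8 instead of "ø" (U+00F8).
escaped = [hex(ord(ch)) for ch in decoded if 0xDC80 <= ord(ch) <= 0xDCFF]
```

A string containing such a code point cannot be encoded as strict UTF-8, which is why it needs special handling before it reaches URL parsing.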

This PR detects surrogate characters in the decoded Location/URI header and falls back to latin-1 decoding (per RFC 7230) using the raw header bytes before parsing the redirect URL.
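The detect-and-fall-back idea can be sketched as follows. This is not the code from this PR: the helper names `has_escaped_bytes` and `recover_location` are hypothetical, and instead of reading the raw header bytes from the response, the sketch recovers them by re-encoding with `surrogateescape`, which assumes the value was originally decoded that way.

```python
def has_escaped_bytes(value: str) -> bool:
    """Return True if value contains surrogates produced by surrogateescape."""
    # surrogateescape maps each undecodable byte 0x80..0xFF to U+DC80..U+DCFF.
    return any(0xDC80 <= ord(ch) <= 0xDCFF for ch in value)


def recover_location(value: str) -> str:
    """Hypothetical helper: re-decode a surrogate-escaped header as latin-1."""
    if not has_escaped_bytes(value):
        # Valid UTF-8 (or plain ASCII) values pass through unchanged.
        return value
    # Assumption: the value came from bytes.decode("utf-8", "surrogateescape"),
    # so encoding with the same handler restores the original bytes exactly.
    raw = value.encode("utf-8", "surrogateescape")
    # latin-1 maps every byte to a code point, so this decode cannot fail.
    return raw.decode("latin-1")
```

For example, `recover_location("/b\udcf8lle")` yields `"/bølle"`, while an ASCII-only value such as `"/plain"` is returned as-is. One trade-off of this approach is that a header mixing valid UTF-8 sequences with stray latin-1 bytes is re-decoded entirely as latin-1, so the UTF-8 parts would come out garbled.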

Are there changes in behavior for the user?

Yes. Redirects with non-UTF-8 bytes in the Location header will now be followed correctly instead of producing a malformed URL.

Related issue number

Fixes #10047

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
  • Add a news fragment into the CHANGES folder

When a server sends a Location header containing non-UTF-8 bytes (e.g.
latin-1 encoded characters like ø), aiohttp's multidict decodes them as
surrogates, which then causes yarl.URL() to create a malformed URL.

This fix detects surrogate characters in the decoded Location/URI header
and falls back to latin-1 decoding (per RFC 7230) using the raw header
bytes before parsing the redirect URL.

Regression test added for:
aio-libs#10047

Signed-off-by: NIK-TIGER-BILL <nik.tiger.bill@github.com>
@codecov

codecov Bot commented May 3, 2026

Codecov Report

❌ Patch coverage is 88.57143% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 98.91%. Comparing base (8b10afd) to head (b182035).
⚠️ Report is 213 commits behind head on master.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| aiohttp/client.py | 42.85% | 2 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12441      +/-   ##
==========================================
- Coverage   99.10%   98.91%   -0.20%     
==========================================
  Files         130      134       +4     
  Lines       45446    46786    +1340     
  Branches     2398     2434      +36     
==========================================
+ Hits        45040    46279    +1239     
- Misses        275      376     +101     
  Partials      131      131              

| Flag | Coverage Δ |
|---|---|
| CI-GHA | 98.97% <88.57%> (+0.01%) ⬆️ |
| OS-Linux | 98.71% <88.57%> (+<0.01%) ⬆️ |
| OS-Windows | 96.97% <88.57%> (-0.01%) ⬇️ |
| OS-macOS | 97.87% <88.57%> (+<0.01%) ⬆️ |
| Py-3.10.11 | 97.38% <88.57%> (-0.04%) ⬇️ |
| Py-3.10.20 | 97.85% <88.57%> (-0.05%) ⬇️ |
| Py-3.11.15 | 98.10% <88.57%> (-0.01%) ⬇️ |
| Py-3.11.9 | 97.64% <88.57%> (+0.01%) ⬆️ |
| Py-3.12.10 | 97.72% <88.57%> (+<0.01%) ⬆️ |
| Py-3.12.13 | 98.19% <88.57%> (-0.01%) ⬇️ |
| Py-3.13.12 | ? |
| Py-3.13.13 | 98.43% <88.57%> (?) |
| Py-3.14.3 | ? |
| Py-3.14.3t | ? |
| Py-3.14.4 | 98.50% <88.57%> (?) |
| Py-3.14.4t | 97.51% <88.57%> (?) |
| Py-pypy3.11.13-7.3.20 | ? |
| Py-pypy3.11.15-7.3.21 | 97.34% <88.57%> (-0.20%) ⬇️ |
| VM-macos | 97.87% <88.57%> (+<0.01%) ⬆️ |
| VM-ubuntu | 98.71% <88.57%> (+<0.01%) ⬆️ |
| VM-windows | 96.97% <88.57%> (-0.01%) ⬇️ |
| cython-coverage | 38.11% <88.57%> (?) |

Flags with carried forward coverage won't be shown.

@codspeed-hq

codspeed-hq Bot commented May 3, 2026

Merging this PR will not alter performance

✅ 67 untouched benchmarks
⏩ 4 skipped benchmarks [1]

Comparing NIK-TIGER-BILL:fix-redirect-non-utf8-location (b182035) with master (da50f24)

Footnotes

  1. 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@Dreamsorcerer
Member

This looks the same as the existing open PR, which also does not do what I suggested in the issue.

Development

Successfully merging this pull request may close these issues.

On redirects, middle URL with ø char gets parsed wrongly - leading to a 404
