Commit 41c7489
HTML API: Ensure that code points always encode to UTF-8
This was brought up during fuzz testing of the HTML API. After
polyfilling `mb_chr()` and relying on it in the HTML decoder, sites with
a non-UTF-8 charset selected could produce corrupted text, or text
encoded as non-UTF-8 bytes, when creating text from code points while
decoding numeric character references.
While these sites have broader issues with non-UTF-8 support, this
change ensures that code point encoding remains deterministic.
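For illustration only, and not the code from this changeset: the sketch below shows one way to turn a code point into UTF-8 bytes without consulting the internal multibyte encoding. The helper name `example_code_point_to_utf8()` is hypothetical. The closing lines assume the `mbstring` extension is loaded and show that passing an explicit `'UTF-8'` encoding to `mb_chr()` gives the same bytes even when the internal encoding is something else.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): encodes a Unicode code point
 * as UTF-8 bytes using only bit arithmetic, so the result never depends on
 * mb_internal_encoding() or the site charset.
 */
function example_code_point_to_utf8( int $code_point ): string {
	// Surrogates and out-of-range values are not valid scalar values;
	// return U+FFFD REPLACEMENT CHARACTER for them.
	if (
		$code_point < 0 ||
		$code_point > 0x10FFFF ||
		( $code_point >= 0xD800 && $code_point <= 0xDFFF )
	) {
		return "\u{FFFD}";
	}

	if ( $code_point < 0x80 ) {
		// One byte: 0xxxxxxx
		return chr( $code_point );
	}

	if ( $code_point < 0x800 ) {
		// Two bytes: 110xxxxx 10xxxxxx
		return chr( 0xC0 | ( $code_point >> 6 ) )
			. chr( 0x80 | ( $code_point & 0x3F ) );
	}

	if ( $code_point < 0x10000 ) {
		// Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
		return chr( 0xE0 | ( $code_point >> 12 ) )
			. chr( 0x80 | ( ( $code_point >> 6 ) & 0x3F ) )
			. chr( 0x80 | ( $code_point & 0x3F ) );
	}

	// Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
	return chr( 0xF0 | ( $code_point >> 18 ) )
		. chr( 0x80 | ( ( $code_point >> 12 ) & 0x3F ) )
		. chr( 0x80 | ( ( $code_point >> 6 ) & 0x3F ) )
		. chr( 0x80 | ( $code_point & 0x3F ) );
}

// Assumption: mbstring is available. An explicit encoding argument keeps
// mb_chr() independent of the internal encoding.
if ( function_exists( 'mb_chr' ) ) {
	$previous = mb_internal_encoding();
	mb_internal_encoding( 'ISO-8859-1' );

	// Both calls target UTF-8 explicitly, so they agree.
	var_dump( mb_chr( 0xE9, 'UTF-8' ) === example_code_point_to_utf8( 0xE9 ) );

	mb_internal_encoding( $previous );
}
```

When the encoding argument is omitted, `mb_chr()` falls back to the internal encoding, which is why the output could vary from site to site.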
Developed in: WordPress#12155
Discussed in: https://core.trac.wordpress.org/ticket/65372
Follow-up to [62424].
Props dmsnell, jonsurrell.
See #65372.
git-svn-id: https://develop.svn.wordpress.org/trunk@62487 602fd350-edb4-49c9-b593-d223f7449a82
1 parent b5abaff commit 41c7489
1 file changed
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 424 | 424 | |
| 425 | 425 | |
| 426 | 426 | |
| 427 | | - |
| | 427 | + |
| 428 | 428 | |
| 429 | 429 | |
| 430 | 430 | |