
[3.14] gh-74902: Add Unicode Grapheme Cluster Break algorithm (GH-143076) #148247

Draft
StanFromIreland wants to merge 3 commits into python:3.14 from StanFromIreland:backport-bab1d7a-3.14

Conversation

Member

StanFromIreland commented Apr 8, 2026

Add the unicodedata._iter_graphemes() function to iterate over grapheme clusters according to rules defined in Unicode Standard Annex #29.

Add the unicodedata._grapheme_cluster_break(), unicodedata._indic_conjunct_break() and unicodedata._extended_pictographic() functions to look up the character properties that the above algorithm relies on.

(cherry picked from commits bab1d7a, 85013d7 and 58ccf21)
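
For readers unfamiliar with grapheme clusters, here is a minimal sketch of what the iterator is for. It makes several assumptions: it probes for either the public name from the original commit or the private name proposed in this backport, it assumes that converting an emitted item with `str()` yields the cluster's substring, and the helper names (`find_grapheme_iterator`, `naive_clusters`, `clusters`) are hypothetical. The fallback is a deliberately simplified stand-in that only attaches combining marks to the preceding character; it is not an implementation of UAX #29.

```python
import unicodedata


def find_grapheme_iterator():
    """Return the grapheme iterator if this build provides one, else None.

    Assumption: the function is exposed as iter_graphemes (original commit)
    or _iter_graphemes (this backport). Neither exists on older builds.
    """
    for name in ("iter_graphemes", "_iter_graphemes"):
        func = getattr(unicodedata, name, None)
        if callable(func):
            return func
    return None


def naive_clusters(text):
    """Simplified stand-in, NOT UAX #29.

    Attaches every character with a non-zero canonical combining class
    to the preceding character and starts a new cluster otherwise.
    """
    result = []
    for ch in text:
        if result and unicodedata.combining(ch):
            result[-1] += ch
        else:
            result.append(ch)
    return result


def clusters(text):
    """Split text into clusters, preferring the built-in iterator if present."""
    func = find_grapheme_iterator()
    if func is not None:
        # Assumption: str() on an emitted item gives the cluster's substring.
        return [str(item) for item in func(text)]
    return naive_clusters(text)


text = "e\u0301a"  # 'e' + COMBINING ACUTE ACCENT + 'a'
print(len(text))       # number of code points
print(clusters(text))  # user-perceived characters
```

The string holds three code points, but a reader sees two characters; that gap is what iterating by grapheme cluster closes. The naive fallback mishandles emoji sequences, Hangul syllables and Indic conjuncts, which is why the full algorithm needs the extra character properties listed above.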

serhiy-storchaka and others added 3 commits April 8, 2026 13:21
…ythonGH-143076)

Add the unicodedata.iter_graphemes() function to iterate over grapheme
clusters according to rules defined in Unicode Standard Annex #29.

Add unicodedata.grapheme_cluster_break(), unicodedata.indic_conjunct_break()
and unicodedata.extended_pictographic() functions to get the properties
of the character which are related to the above algorithm.
(cherry picked from commit bab1d7a)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Guillaume "Vermeille" Sanchez <guillaume.v.sanchez@gmail.com>
Member Author

StanFromIreland commented Apr 8, 2026

This is an alternative proposal for #148218.

This would also require uploading the test data file for UCD 16 to pythontest.net. Until that is done, the network test fails.

Member

vstinner commented Apr 8, 2026

I don't think it's a good idea to backport so much code to a stable branch, and I don't think that fixing the issue in Python 3.14 is important enough to justify this backport. Python has had this bug for many years.


4 participants