[codex] Improve Unicode width test coverage#13
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 84adab0be5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| assert_str_width("\u{644}\u{605}\u{623}", 1, 1) | ||
| assert_str_width("\u{6B8}\u{605}\u{627}", 1, 1) |
There was a problem hiding this comment.
Correct U+0605 lam-alef expectations
For sequences that contain U+0605 between Lam and Alef, this test locks in the ligature width of 1 even though U+0605 is explicitly generated in non_transparent_zero_widths (tables.mbt) from zero_widths - DerivedJoiningType=T (scripts/unicode.py::load_non_transparent_zero_widths). In that context U+0605 should interrupt the Lam-Alef ligature while contributing zero width itself, so these cases should total 2; leaving them at 1 will make a correct fix to the transparent-character handling fail the regression suite.
Useful? React with 👍 / 👎.
6a70db0 to
84adab0
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 84adab0be5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| assert_str_width("🧑🤝🧑", 2, 2) | ||
| assert_str_width("🇮🇱🕊️🇵🇸", 6, 6) | ||
| assert_str_width("🇵🇸\u{200D}🕊️\u{200D}🇮🇱", 2, 2) | ||
| assert_str_width("#\u{FE0F}\u{20E3}\u{200D}Ⓜ️", 3, 3) |
There was a problem hiding this comment.
Expect keycap ZWJ sequences to collapse
This assertion locks in the current width of #️⃣Ⓜ️ as 3, but the generated state machine already has a dedicated VS16_KEYCAP_ZWJ_EMOJI_PRESENTATION branch for \uFE0F \u20E3 \u200D Emoji_Presentation that should collapse the ASCII keycap base into the ZWJ emoji sequence; it is currently bypassed because # takes the c <= '\u{A0}' early return before that branch. With this expectation in place, making the existing keycap-ZWJ branch reachable for #, *, and digits would fail the regression suite instead of producing the intended width 2.
Useful? React with 👍 / 👎.
35833e7 to
916ac29
Compare
916ac29 to
3e71662
Compare
Summary
Impact
This is test-only. It does not change the public API or runtime behavior.
Validation
moon fmtmoon infomoon checkmoon testmoon coverage analyze -- -f summary->Total: 358/384Split out
The transparent zero-width ligature bug fix and its
U+0605regression tests were moved to #14.