
fix: Unescape Unicode Characters accepts 4-6 hex digits for U+ prefix #2287

Merged
GCHQDeveloper581 merged 3 commits into gchq:master from williballenthin:fix-2242
Jun 20, 2026


@williballenthin

Contributor

The U+ prefix regex was hardcoded to exactly 4 hex digits, rejecting valid astral plane codepoints like U+1F600 and zero-padded forms like U+000041. Widen the quantifier to {4,6} for U+ only; \u and %u retain their fixed 4-digit requirement.

Closes #2242
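The change described above can be illustrated with a minimal sketch. This is not the actual CyberChef source: the `PREFIXES` table and the `unescapeUnicode` helper are hypothetical names chosen for illustration, and the guard against values above U+10FFFF is an added assumption, since six hex digits can encode values outside the Unicode range.

```javascript
// Hypothetical sketch, not the actual CyberChef implementation.
// Each prefix maps to its regex-escaped form and its hex-digit quantifier.
const PREFIXES = {
    "\\u": { pattern: "\\\\u", digits: "{4}" },   // fixed 4 digits
    "%u":  { pattern: "%u",    digits: "{4}" },   // fixed 4 digits
    "U+":  { pattern: "U\\+",  digits: "{4,6}" }, // widened to 4-6 digits
};

function unescapeUnicode(input, prefix) {
    const { pattern, digits } = PREFIXES[prefix];
    const regex = new RegExp(pattern + "([a-f\\d]" + digits + ")", "ig");

    return input.replace(regex, (match, hex) => {
        const codePoint = parseInt(hex, 16);
        // Assumption: leave the escape untouched if it exceeds U+10FFFF,
        // because String.fromCodePoint would throw a RangeError.
        return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
    });
}

console.log(unescapeUnicode("U+1F600", "U+"));   // astral plane codepoint
console.log(unescapeUnicode("U+000041", "U+"));  // zero-padded form of "A"
console.log(unescapeUnicode("\\u0041", "\\u"));  // \u still takes exactly 4 digits
```

Because the `{4,6}` quantifier is greedy, any hex digits that immediately follow a 4-digit `U+` escape would be consumed as part of the codepoint in this sketch, so inputs such as `U+0041FF` are inherently ambiguous.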

AI disclosure
Claude Code Opus 4.6

@GCHQDeveloper581 GCHQDeveloper581 enabled auto-merge (squash) June 20, 2026 07:53

@GCHQDeveloper581 GCHQDeveloper581 left a comment

Contributor


Looks good!

Thank you for your contribution.

@GCHQDeveloper581 GCHQDeveloper581 merged commit 9a8a279 into gchq:master Jun 20, 2026
2 checks passed


Development

Successfully merging this pull request may close these issues.

Bug report: Unescape Unicode Characters only accepts exactly 4 hex digits for U+

2 participants