Skip to content

fix: prevent surrogate pairs from being split by the selected range#1226

Open
Thomaash wants to merge 2 commits intobasecamp:mainfrom
Thomaash:main
Open

fix: prevent surrogate pairs from being split by the selected range#1226
Thomaash wants to merge 2 commits intobasecamp:mainfrom
Thomaash:main

Conversation

@Thomaash
Copy link
Copy Markdown
Contributor

This is already handled in functions like expandSelectionInDirection which won't select a half of emoji etc. but always the whole surrogate pair. The solution here is the same, if the cursor is placed in the middle of a surrogate pair, it will be moved to the end of the pair. Similarly a selection will either select both or neither.

@Thomaash
Copy link
Copy Markdown
Contributor Author

Hi @jorgemanrubia, could I ask for a code review?

@Thomaash
Copy link
Copy Markdown
Contributor Author

Hi @jorgemanrubia or @flavorjones, is there any chance for this to be reviewed and merged/rejected?

This is already handled in functions like `expandSelectionInDirection`
which won't select a half of emoji etc. but always the whole surrogate
pair. The solution here is the same, if the cursor is placed in the
middle of a surrogate pair, it will be moved to the end of the pair.
Similarly a selection will either select both or neither.
@flavorjones
Copy link
Copy Markdown
Member

I've rebased this onto current main and verified that the tests pass locally. Code seems find to me. But I'm not sure I understand the problem that's being solved.

@Thomaash can you say a bit more about how this problem occurs in an application -- how is the cursor ending up in the middle of a multi-byte code point?

@flavorjones
Copy link
Copy Markdown
Member

@Thomaash bump! 🙏

@Thomaash
Copy link
Copy Markdown
Contributor Author

Thomaash commented May 4, 2026

This happens when you have text with surrogate pairs and programmatically move the cursor. Some functions will move over the pair as a single unit (expandSelectionInDirection will select both characters, e.g., it will add to the selection both characters of 🙂) but setSelectedRange selects only one character of the pair which can leave the surrogate pair broken. With this change setSelectedRange no longer allows that and you always have either the whole pair or none of it.

Copilot AI review requested due to automatic review settings May 4, 2026 12:01
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens Composition#setSelectedRange so selection boundaries won’t land inside a UTF-16 surrogate pair (e.g., half an emoji), aligning setSelectedRange behavior with existing cursor/selection expansion logic.

Changes:

  • Normalize setSelectedRange boundaries via UTF-16-aware position translation to avoid splitting surrogate pairs.
  • Add system tests covering collapsed and expanded selections around surrogate pairs in multiple text positions.
  • Register the new system test file in the system test entrypoint.

Tip

If you aren't ready for review, convert to a draft PR.
Click "Convert to draft" or run gh pr ready --undo.
Click "Ready for review" or run gh pr ready to reengage.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
src/trix/models/composition.js Adjusts setSelectedRange to coerce selection endpoints onto surrogate-pair-safe boundaries.
src/test/system/set_selected_range_test.js Adds coverage for selection normalization behavior around surrogate pairs.
src/test/system.js Includes the new system test module in the test suite.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +525 to +529

// Make sure that surrogate pairs are always selected both or neither.
const correctedSelectedRange = [
this.translateUTF16PositionFromOffset(start, 0),
this.translateUTF16PositionFromOffset(end, 0),
assert.selectedRange([ 2, 2 ])
})

test("collapssed selection in the middle of the surrogate pair", () => {
assert.selectedRange([ 5, 5 ])
})

test("collapssed selection in the middle of the surrogate pair", () => {
assert.selectedRange([ 5, 5 ])
})

test("collapssed selection in the middle of the surrogate pair", () => {
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants