fix(diffing): gate paragraph identity on structural depth (SD-3174) #3355

Merged
caio-pizzol merged 1 commit into main from caio/IT-1065-diffing-table-text on May 17, 2026



@caio-pizzol caio-pizzol commented May 17, 2026

Diff replay used to overwrite table-cell content with body text whenever a comparison surfaced both a structural change (table add/remove) and trailing paragraph edits. In the IT-1065 Google Docs fixtures, w14:paraId values are renumbered across the structural change, so cell-paragraph IDs in the source collide with top-level paragraph IDs in the target. The flat sequence diff then paired the colliding paragraphs as the same paragraph and anchored modification ops inside the now-deleted table.
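For reviewers who want the collision in miniature, here is a rough sketch. It is not the project's code: the `FlatParagraph` shape, the `sameByParaIdOnly` helper, the ID value, and the depth numbers are all made up for illustration. It only shows why an identity check based on paraId alone pairs a cell paragraph with an unrelated body paragraph.

```typescript
// Hypothetical flattened paragraph record; field names are assumptions,
// not the shapes used in the diffing extension.
interface FlatParagraph {
  paraId: string;
  depth: number; // 0 = top-level body paragraph, larger = nested (e.g. in a table cell)
  text: string;
}

// Naive identity: paraId alone, with no regard for where the paragraph sits.
function sameByParaIdOnly(a: FlatParagraph, b: FlatParagraph): boolean {
  return a.paraId === b.paraId;
}

// Illustrative values only: after renumbering, a cell paragraph in the source
// and a body paragraph in the target happen to carry the same paraId.
const sourceCell: FlatParagraph = { paraId: "0000000A", depth: 3, text: "cell text" };
const targetBody: FlatParagraph = { paraId: "0000000A", depth: 0, text: "body text" };

// The two paragraphs are unrelated, yet the naive check pairs them, so a
// "modification" would be anchored at the cell paragraph's position.
const collides = sameByParaIdOnly(sourceCell, targetBody);
console.log(collides);
```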

Reported in #3347 with a before/after fixture pair that reproduces on main.

Fixes #3347

  • paragraphComparator and canTreatAsModification now reject pairing across different structural depths before any paraId or content-signature check. Cross-depth content-signature matches were the same hazard for blank paragraphs and short repeated labels, so the guard is hoisted to the top of both functions.
  • Fixture pair added (diff_before_it1065.docx / diff_after_it1065.docx) and wired into the replayDiffs fixture suite.
  • Four unit tests cover paraId, content-signature, and similarity cross-depth rejection.
  • A tracked-mode replay regression test covers the playground path where compare/replay applies tracked changes.
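The guard ordering described in the first bullet can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: the real signatures of paragraphComparator and canTreatAsModification are not shown in this PR description, so the `...Sketch` functions, the `ParagraphInfo` shape, the trimmed-text stand-in for a content signature, the word-overlap similarity, and the 0.5 threshold are all hypothetical.

```typescript
// Hypothetical paragraph summary; the real comparator inputs may differ.
interface ParagraphInfo {
  paraId: string | null;
  depth: number; // structural depth: 0 for body, larger inside tables
  text: string;
}

// Sketch of an identity check with the depth guard hoisted to the top.
function paragraphComparatorSketch(a: ParagraphInfo, b: ParagraphInfo): boolean {
  if (a.depth !== b.depth) return false; // reject before any identity signal
  if (a.paraId !== null && a.paraId === b.paraId) return true;
  // Assumed stand-in for a content signature: trimmed text equality.
  return a.text.trim() === b.text.trim();
}

// Unique lowercase words of a string (helper for the assumed similarity).
function uniqueWords(s: string): string[] {
  const words = s.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  return words.filter((w, i) => words.indexOf(w) === i);
}

// Assumed similarity measure: Jaccard overlap of unique words.
function wordOverlap(x: string, y: string): number {
  const xs = uniqueWords(x);
  const ys = uniqueWords(y);
  if (xs.length === 0 && ys.length === 0) return 1;
  const shared = xs.filter((w) => ys.indexOf(w) !== -1).length;
  return shared / (xs.length + ys.length - shared);
}

// Sketch of the modification check, gated on depth in the same way.
function canTreatAsModificationSketch(
  a: ParagraphInfo,
  b: ParagraphInfo,
  threshold = 0.5, // hypothetical threshold
): boolean {
  if (a.depth !== b.depth) return false;
  return wordOverlap(a.text, b.text) >= threshold;
}

// Blank paragraphs share a signature, so only the depth guard separates them.
const blankCell: ParagraphInfo = { paraId: null, depth: 3, text: "" };
const blankBody: ParagraphInfo = { paraId: null, depth: 0, text: "" };
console.log(paragraphComparatorSketch(blankCell, blankBody));
```

With the guard first, neither a colliding paraId nor a matching signature can pair paragraphs across depths; without it, the blank-paragraph pair above would match on signature alone.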

Review: confirm depth gating is the right granularity vs an ancestor-path key. Depth handles the reporter's fixture and likely sibling cases; a path key would also catch same-depth-different-parent collisions if a fixture ever surfaces one.
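To make the granularity question concrete, here is a small sketch contrasting the two keys. Everything in it is hypothetical: the `LocatedParagraph` shape, the ancestor labels, and both helper functions are invented for the comparison and do not reflect how the diffing code represents position.

```typescript
// Hypothetical record: ancestor labels from the document root down to the
// paragraph's parent. An empty list means a top-level body paragraph.
interface LocatedParagraph {
  ancestors: string[];
}

// Depth gate: only the number of ancestors has to agree.
function sameDepth(a: LocatedParagraph, b: LocatedParagraph): boolean {
  return a.ancestors.length === b.ancestors.length;
}

// Ancestor-path gate: every ancestor has to agree, in order.
function samePath(a: LocatedParagraph, b: LocatedParagraph): boolean {
  return a.ancestors.join("/") === b.ancestors.join("/");
}

// Two paragraphs in neighbouring cells of the same row, plus a body paragraph.
const cellA: LocatedParagraph = { ancestors: ["table[0]", "row[0]", "cell[0]"] };
const cellB: LocatedParagraph = { ancestors: ["table[0]", "row[0]", "cell[1]"] };
const body: LocatedParagraph = { ancestors: [] };

console.log(sameDepth(cellA, body)); // both gates reject cell vs body
console.log(sameDepth(cellA, cellB)); // depth gate still allows this pairing
console.log(samePath(cellA, cellB)); // path gate rejects it
```

A possible downside, not evaluated in this PR: an index-based path like the one above would presumably also change when a sibling table or row is inserted or removed, so a path key might reject pairings that the depth gate correctly allows.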

Verified: pnpm exec vitest run --root packages/super-editor src/editors/v1/extensions/diffing → 247 passed across 26 files. Stashing the source change makes the new fixture test fail with the expected text-content divergence after "Key Milestones", confirming regression coverage.

@caio-pizzol caio-pizzol requested a review from a team as a code owner May 17, 2026 10:44
linear-code Bot commented May 17, 2026

IT-1065

SD-3174

@caio-pizzol caio-pizzol self-assigned this May 17, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 28f88a9866


@caio-pizzol caio-pizzol changed the title from "fix(diffing): gate paragraph identity on structural depth (IT-1065)" to "fix(diffing): gate paragraph identity on structural depth (SD-3174)" May 17, 2026
Diff replay was overwriting table-cell content with body-paragraph text
when a comparison contained both a structural change (table removed)
and trailing text changes. DOCX importers renumber w14:paraId values
across structural changes, so cell-paragraph IDs in the source can
collide with top-level paragraph IDs in the target. The flat sequence
diff then paired them as the same paragraph and emitted modification
ops anchored inside the table.

Gate paragraphComparator and canTreatAsModification on depth equality
before any identity signal. Cross-depth content-signature matches were
the same hazard for blank or repeated-label paragraphs.

Add a fixture pair derived from the IT-1065 reporter's documents and
four unit tests covering paraId, content-signature, and similarity
cross-depth rejection.
@caio-pizzol caio-pizzol force-pushed the caio/IT-1065-diffing-table-text branch from 28f88a9 to ff97a7b May 17, 2026 10:52
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.

@caio-pizzol caio-pizzol merged commit 57a3891 into main May 17, 2026
69 checks passed
@caio-pizzol caio-pizzol deleted the caio/IT-1065-diffing-table-text branch May 17, 2026 12:25
Development

Successfully merging this pull request may close these issues.

Diffing: wrong paragraph removed when diff contains both a table and a text change