fix(converter): preserve tracked-change-wrapped fields during DOCX import (SD-2858)#3175

Merged
luccas-harbour merged 9 commits into main from
luccas/sd-2858-bug-docx-with-tracked-change-wrapped-field-split-across
May 6, 2026

Conversation

@luccas-harbour
Contributor

Summary

Fixes DOCX import handling for fields wrapped in tracked changes when the field spans multiple nodes or paragraphs.

This preserves the original raw OOXML for tracked-change-wrapped fields instead of trying to collapse/process them as normal field references. That avoids import failures for documents where a field begins inside one tracked deletion/insertion wrapper and ends inside another.

Also makes header/footer relationship filtering more defensive by tolerating missing relationship elements or relationship nodes without attributes.

Changes

  • Preserve raw field nodes when field processing crosses w:del or w:ins wrappers.
  • Handle unpaired field ends bubbling up from child nodes without throwing.
  • Add regression coverage for a tracked-deletion-wrapped hyperlink field split across paragraphs.
  • Guard header/footer relationship lookup against missing elements or attributes.
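
A minimal sketch of the first change, not the converter's actual code: it assumes an xml-js-style node shape (`name`, `attributes`, `elements`), and the names `XmlNode`, `isTrackedChangeWrapper`, `containsFieldChar`, and `shouldPreserveRaw` are hypothetical. It only shows how a field boundary inside `w:del` or `w:ins` could be detected so the raw nodes are kept.

```typescript
// Hypothetical node shape, loosely modeled on xml-js "non-compact" output.
interface XmlNode {
  name?: string;
  attributes?: Record<string, string>;
  elements?: XmlNode[];
}

// w:del and w:ins are the tracked-change wrappers named in this PR.
function isTrackedChangeWrapper(node: XmlNode): boolean {
  return node.name === "w:del" || node.name === "w:ins";
}

// True when `node` or any descendant is a w:fldChar (begin/separate/end).
function containsFieldChar(node: XmlNode): boolean {
  if (node.name === "w:fldChar") return true;
  return (node.elements ?? []).some(containsFieldChar);
}

// True when a field boundary sits anywhere inside a tracked-change wrapper.
function shouldPreserveRaw(node: XmlNode): boolean {
  if (isTrackedChangeWrapper(node)) return containsFieldChar(node);
  return (node.elements ?? []).some(shouldPreserveRaw);
}

// Fixture helper: a run holding a single w:fldChar of the given type.
const fieldRun = (type: string): XmlNode => ({
  name: "w:r",
  elements: [{ name: "w:fldChar", attributes: { "w:fldCharType": type } }],
});

// A deleted field whose begin/separate and end live in different paragraphs.
const deletedStart: XmlNode = {
  name: "w:p",
  elements: [{ name: "w:del", elements: [fieldRun("begin"), fieldRun("separate")] }],
};
const deletedEnd: XmlNode = {
  name: "w:p",
  elements: [{ name: "w:del", elements: [fieldRun("end")] }],
};
// An ordinary field with no tracked-change wrapper.
const plainField: XmlNode = {
  name: "w:p",
  elements: [fieldRun("begin"), fieldRun("separate"), fieldRun("end")],
};

console.log(
  shouldPreserveRaw(deletedStart),
  shouldPreserveRaw(deletedEnd),
  shouldPreserveRaw(plainField),
); // true true false
```

Both halves of the split field are flagged, while the unwrapped field still goes through normal processing.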

…aragraphs

When a field (w:fldChar begin/separate/end) is wrapped in a w:del or w:ins
tracked-change element and split across paragraphs, the field pre-processor
would crash on the unpaired end or strip the tracked-change wrapper. Detect
tracked-change wrappers and flag the field as `preserveRaw` so the raw nodes
are emitted untouched, and handle an unpaired end at the top level by
passing the node through with `unpairedEnd = true`.

Guard against a missing `Relationships` element and entries without an
`attributes` map when scanning `document.xml.rels` for headers/footers,
so docs that omit either no longer throw during import.
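
The defensive lookup described above could look roughly like this. It is a sketch under assumptions: the parsed `document.xml.rels` is taken to be an xml-js-style tree, and `XmlNode` and `findHeaderFooterRelationships` are hypothetical names, not the converter's API. The two relationship Type URIs are the standard OOXML header/footer types.

```typescript
// Hypothetical node shape, loosely modeled on xml-js "non-compact" output.
interface XmlNode {
  name?: string;
  attributes?: Record<string, string>;
  elements?: XmlNode[];
}

const HEADER_REL_TYPE =
  "http://schemas.openxmlformats.org/officeDocument/2006/relationships/header";
const FOOTER_REL_TYPE =
  "http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer";

// Returns header/footer relationship entries, tolerating a missing part,
// a missing Relationships element, and entries without an attributes map.
function findHeaderFooterRelationships(rels: XmlNode | undefined): XmlNode[] {
  const roots = (rels?.elements ?? []).filter((el) => el.name === "Relationships");
  const entries = roots.length > 0 ? roots[0].elements ?? [] : [];
  return entries.filter((rel) => {
    const type = rel.attributes?.["Type"];
    return type === HEADER_REL_TYPE || type === FOOTER_REL_TYPE;
  });
}

// Fixture: one entry has no attributes map at all, one is a header.
const sampleRels: XmlNode = {
  elements: [
    {
      name: "Relationships",
      elements: [
        { name: "Relationship" },
        { name: "Relationship", attributes: { Id: "rId7", Type: HEADER_REL_TYPE, Target: "header1.xml" } },
      ],
    },
  ],
};

console.log(findHeaderFooterRelationships(sampleRels).length); // 1
console.log(findHeaderFooterRelationships({ elements: [] }).length); // 0
```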
@luccas-harbour luccas-harbour requested a review from a team as a code owner May 5, 2026 21:17
@linear

linear Bot commented May 5, 2026

@github-actions
Contributor

github-actions Bot commented May 5, 2026

The ecma-spec MCP tools need permission to be granted. Could you approve them? Without that I can still do a manual spec review — let me know which you'd prefer.

In the meantime, I can share what I see from direct spec knowledge:

The diff touches a fairly narrow set of OOXML constructs:

  • w:del / w:ins with w:id, w:author, w:date attributes
  • w:fldChar with w:fldCharType = begin / separate / end
  • w:instrText with xml:space="preserve"
  • w:delText with xml:space="preserve"
  • OPC relationship Type URIs for header/footer

One thing worth checking once the tools are approved: the test places w:instrText inside a w:del > w:r run. Per the spec, deleted instruction text should use w:delInstrText, not w:instrText. That could be a spec deviation (in the test fixture, not in the handling code itself). I'd want to confirm with ooxml_children on w:del and ooxml_attributes on w:instrText / w:delInstrText before calling it a hard failure.

Please approve the ecma-spec tool calls and I'll do the full verification.

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.

… tracked-change wrapper

When a field's `end` fldChar is wrapped in a tracked-change element
(`w:del`/`w:ins`), the field cannot be safely re-emitted from the
collected instruction tokens — round-tripping would drop the tracked
deletion markup. Propagate a new `unpairedEndPreserveRaw` flag up
through nested children so the active field is marked `preserveRaw`
once it finalizes, keeping the original w:r/w:del nodes intact.
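
One way to picture the flag propagation in this commit, as a sketch only: walk a child subtree, count field begins and ends, and report an `end` that has no matching `begin` in that subtree, noting whether it sat under `w:del`/`w:ins`. The names `UnpairedEndScan` and `scanForUnpairedEnd` and the node shape are assumptions for illustration; the real pre-processor may track this differently.

```typescript
// Hypothetical node shape, loosely modeled on xml-js "non-compact" output.
interface XmlNode {
  name?: string;
  attributes?: Record<string, string>;
  elements?: XmlNode[];
}

interface UnpairedEndScan {
  unpairedEnd: boolean;            // an `end` fldChar had no `begin` in this subtree
  unpairedEndPreserveRaw: boolean; // ...and it was inside a tracked-change wrapper
}

function scanForUnpairedEnd(root: XmlNode): UnpairedEndScan {
  let openFields = 0;
  const result: UnpairedEndScan = { unpairedEnd: false, unpairedEndPreserveRaw: false };

  const visit = (node: XmlNode, insideTrackedChange: boolean): void => {
    const tracked =
      insideTrackedChange || node.name === "w:del" || node.name === "w:ins";
    if (node.name === "w:fldChar") {
      const type = node.attributes?.["w:fldCharType"];
      if (type === "begin") {
        openFields += 1;
      } else if (type === "end") {
        if (openFields > 0) {
          openFields -= 1; // closes a field opened within this subtree
        } else {
          result.unpairedEnd = true;
          if (tracked) result.unpairedEndPreserveRaw = true;
        }
      }
    }
    for (const child of node.elements ?? []) visit(child, tracked);
  };

  visit(root, false);
  return result;
}

// Fixture helper: a run holding a single w:fldChar of the given type.
const makeFieldRun = (type: string): XmlNode => ({
  name: "w:r",
  elements: [{ name: "w:fldChar", attributes: { "w:fldCharType": type } }],
});

// The end fldChar is nested under w:sdt > w:sdtContent > w:del.
const endInsideDeletion: XmlNode = {
  name: "w:sdt",
  elements: [
    { name: "w:sdtContent", elements: [{ name: "w:del", elements: [makeFieldRun("end")] }] },
  ],
};
const bareEnd: XmlNode = { name: "w:p", elements: [makeFieldRun("end")] };
const balanced: XmlNode = { name: "w:p", elements: [makeFieldRun("begin"), makeFieldRun("end")] };

console.log(scanForUnpairedEnd(endInsideDeletion)); // both flags true
console.log(scanForUnpairedEnd(bareEnd));           // unpairedEnd only
console.log(scanForUnpairedEnd(balanced));          // both flags false
```

A caller holding the active field would then set its own `preserveRaw` when `unpairedEndPreserveRaw` comes back true, which is the behavior the commit message describes.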
…SD-2858)

When a field's `end` fldChar surfaces through nested wrappers
(e.g. w:sdt/w:sdtContent or tracked-change elements) and there is
no active field still collecting at this level, push the original
`rawNode` rather than the processed `node`. The processed copy can
have its child runs rewritten by the recursive pass, dropping the
fldChar and breaking round-trip; the raw subtree preserves the
input verbatim. Adds a regression test covering the non-collecting
wrapper path.
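
The choice between `rawNode` and the processed `node` can be sketched as below. `ProcessedChild` and `selectNodeForOutput` are hypothetical names, and the case where an active field is still collecting is deliberately left out, since the commit message only covers the non-collecting path.

```typescript
// Hypothetical node shape, loosely modeled on xml-js "non-compact" output.
interface XmlNode {
  name?: string;
  attributes?: Record<string, string>;
  elements?: XmlNode[];
}

// Hypothetical result of recursing into one child.
interface ProcessedChild {
  node: XmlNode;        // copy whose child runs may have been rewritten
  rawNode: XmlNode;     // the input subtree, untouched
  unpairedEnd: boolean; // an `end` fldChar surfaced from inside this child
}

// With no field collecting at this level, an unpaired end means the processed
// copy cannot be trusted to still contain the fldChar, so emit the raw subtree.
function selectNodeForOutput(child: ProcessedChild, hasActiveField: boolean): XmlNode {
  if (child.unpairedEnd && !hasActiveField) return child.rawNode;
  return child.node;
}

// Fixture: the raw wrapper still holds the end fldChar, the processed copy lost it.
const rawWrapper: XmlNode = {
  name: "w:sdt",
  elements: [
    {
      name: "w:sdtContent",
      elements: [
        { name: "w:r", elements: [{ name: "w:fldChar", attributes: { "w:fldCharType": "end" } }] },
      ],
    },
  ],
};
const processedWrapper: XmlNode = {
  name: "w:sdt",
  elements: [{ name: "w:sdtContent", elements: [] }],
};

const chosen = selectNodeForOutput(
  { node: processedWrapper, rawNode: rawWrapper, unpairedEnd: true },
  false,
);
console.log(chosen === rawWrapper); // true
```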
Contributor

@caio-pizzol caio-pizzol left a comment

hey @luccas-harbour! lgtm :)

follow-up: same shape but with an inserted (instead of deleted) field also used to drop on import. now the text shows up but the link is gone - user sees plain text where Word shows a clickable link. fine for deletes and moves since that text goes away on accept, but for inserts it stays visible. tracked in SD-2973.

@luccas-harbour luccas-harbour merged commit c64a05d into main May 6, 2026
68 checks passed
@luccas-harbour luccas-harbour deleted the luccas/sd-2858-bug-docx-with-tracked-change-wrapped-field-split-across branch May 6, 2026 16:42
@superdoc-bot
Contributor

superdoc-bot Bot commented May 6, 2026

🎉 This PR is included in @superdoc-dev/mcp v0.3.0-next.62

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 6, 2026

🎉 This PR is included in @superdoc-dev/react v1.2.0-next.104

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 6, 2026

🎉 This PR is included in vscode-ext v2.3.0-next.106

@superdoc-bot
Contributor

superdoc-bot Bot commented May 6, 2026

🎉 This PR is included in superdoc-cli v0.8.0-next.78

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 6, 2026

🎉 This PR is included in superdoc v1.30.0-next.60

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 6, 2026

🎉 This PR is included in superdoc-sdk v1.8.0-next.60

@superdoc-bot
Contributor

superdoc-bot Bot commented May 7, 2026

🎉 This PR is included in superdoc-cli v0.9.0

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 7, 2026

🎉 This PR is included in superdoc v1.32.0

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 7, 2026

🎉 This PR is included in @superdoc-dev/mcp v0.4.0

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 7, 2026

🎉 This PR is included in @superdoc-dev/react v1.3.0

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 7, 2026

🎉 This PR is included in vscode-ext v2.4.0

3 participants