fix(super-editor): preserve generated line breaks in DOCX export#3630

Open
caio-pizzol wants to merge 1 commit into main from caio/sd-3278-docx-export-collapses-generated-line-breaks-in-word

@caio-pizzol
Contributor

Fixes SD-3278.

Generated multiline text could look correct in SuperDoc but export as raw newlines inside <w:t>, which Word and LibreOffice do not treat as manual line breaks.

This change:

  • converts generated \n / \r\n / \r text into soft line break nodes
  • exports any remaining raw newlines as <w:br/>
  • keeps structural lineBreak edits as soft breaks, not page breaks
  • preserves deleted tracked text as <w:delText> when runs are split around breaks
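
For readers unfamiliar with the export shape, the first two bullets can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `RunChild`, `splitOnNewlines`, and `serializeRun` are hypothetical names, and the sketch ignores marks, tabs, and `xml:space` handling.

```typescript
// Hypothetical helpers for illustration only; not SuperDoc's actual exporter.
type RunChild = { name: "w:t"; text: string } | { name: "w:br" };

// Escape the characters that are unsafe inside XML text content.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Normalize \r\n and \r to \n, then interleave text segments with breaks.
function splitOnNewlines(text: string): RunChild[] {
  const segments = text.replace(/\r\n?/g, "\n").split("\n");
  const children: RunChild[] = [];
  segments.forEach((segment, index) => {
    // Every boundary between two segments was a newline in the input.
    if (index > 0) children.push({ name: "w:br" });
    // Skip empty segments so consecutive newlines yield consecutive breaks.
    if (segment.length > 0) children.push({ name: "w:t", text: segment });
  });
  return children;
}

// Serialize all children inside a single run so formatting stays together.
function serializeRun(text: string): string {
  const body = splitOnNewlines(text)
    .map((child) =>
      child.name === "w:br" ? "<w:br/>" : `<w:t>${escapeXml(child.text)}</w:t>`,
    )
    .join("");
  return `<w:r>${body}</w:r>`;
}

console.log(serializeRun("Alpha\r\nBeta"));
// prints <w:r><w:t>Alpha</w:t><w:br/><w:t>Beta</w:t></w:r>
```

Keeping the pieces inside one `<w:r>` means the run properties would apply to every segment without being duplicated.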

Follow-ups are tracked separately for import-side normalization and tracked inserted breaks inside <w:ins>.

@linear-code

linear-code Bot commented Jun 4, 2026

SD-3278

@github-actions
Contributor

github-actions Bot commented Jun 4, 2026

The ecma-spec tools are unavailable to me this session (permission denied on every call), so I verified against my knowledge of ECMA-376 Part 1 rather than live schema lookups. Flagging that up front so you can re-run with the spec tools if you want the schema citations confirmed. That said, the elements/attributes touched here are well-established and the diff lines up cleanly with the spec.

Status: PASS

Here's what I checked and why it holds:

w:t → w:delText rename inside w:del (del-translator.js:104-121)
w:delText (ECMA-376 §17.3.3.7) is the correct deleted-text counterpart to w:t, and it's the element required for text content inside a <w:del>. Renaming every direct w:t (not just the first) is the right call: w:r's content model (EG_RunInnerContent) permits any number of run-content children in any order, so <w:delText>Alpha</w:delText><w:br/><w:delText>Beta</w:delText> is valid, and a leftover <w:t> inside <w:del> would indeed not be treated as deleted. Leaving w:br/w:tab/w:noBreakHyphen untouched is also correct — the <w:del> wrapper conveys the deletion; those structural atoms have no "deleted" variant. Attributes (e.g. xml:space) carry over fine since w:delText is also CT_Text.
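
A rough sketch of the rename described above, assuming an xml-js style element tree. The `XmlElement` shape and the `markRunAsDeleted` name are assumptions made for this example and may not match del-translator.js.

```typescript
// Assumed element shape for illustration; the real translator may differ.
interface XmlElement {
  name: string;
  attributes?: Record<string, string>;
  elements?: XmlElement[];
  text?: string;
}

// Rename every direct w:t child of a run to w:delText, leaving w:br,
// w:tab and other structural children untouched. Returns a new run
// object instead of mutating the input.
function markRunAsDeleted(run: XmlElement): XmlElement {
  const elements = (run.elements ?? []).map((child) =>
    child.name === "w:t" ? { ...child, name: "w:delText" } : child,
  );
  return { ...run, elements };
}

const splitRun: XmlElement = {
  name: "w:r",
  elements: [
    { name: "w:t", text: "Alpha", attributes: { "xml:space": "preserve" } },
    { name: "w:br" },
    { name: "w:t", text: "Beta" },
  ],
};

const deleted = markRunAsDeleted(splitRun);
console.log((deleted.elements ?? []).map((el) => el.name).join(","));
// prints w:delText,w:br,w:delText
```

Because the child is copied with a spread, attributes such as `xml:space` carry over to the renamed element.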

w:br as a soft line break (translate-text-node.js:62)
w:br with no w:type defaults to textWrapping (ST_BrType), i.e. a soft line break — exactly the intent. The test asserting w:type is absent (rather than emitting w:type="page", which is hardBreak) is spec-correct. w:br is valid as a direct child of w:r interleaved with w:t.

xml:space="preserve" on segment w:t (translate-text-node.js:75)
Valid — xml:space is the lone attribute on CT_Text, and gating it on edge-whitespace segments is appropriate. Word collapses leading/trailing whitespace without it, so this is correct preservation behavior.
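
One way to picture the gating, as a sketch only: `textAttributes` is a hypothetical helper, and the use of JavaScript's `\s` (which matches more than the XML whitespace characters) is an assumption of this example rather than something taken from the diff.

```typescript
// Hypothetical helper: emit xml:space="preserve" only when a segment
// starts or ends with whitespace, since that is where it would be lost.
function textAttributes(segment: string): Record<string, string> {
  return /^\s|\s$/.test(segment) ? { "xml:space": "preserve" } : {};
}

console.log(textAttributes(" Beta"));  // segment with a leading space
console.log(textAttributes("Alpha"));  // segment with no edge whitespace
```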

lineBreak (→ <w:br/>) vs hardBreak (→ <w:br w:type="page"/>) (node-materializer.ts:920)
Preferring lineBreak for a kind: 'lineBreak' item is right — using hardBreak would emit page breaks (w:type="page"), which is a different element semantic. Good catch on the original bug.
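
The distinction can be sketched as below. `pickBreakNode` and `serializeBreak` are hypothetical names, and the fallback to `hardBreak` when the schema has no `lineBreak` is an assumption of this sketch, not something confirmed against node-materializer.ts.

```typescript
type BreakNodeName = "lineBreak" | "hardBreak";

// Prefer the soft break node; fall back to hardBreak only if the schema
// lacks lineBreak (assumed fallback, for illustration).
function pickBreakNode(schemaNodes: Record<string, unknown>): BreakNodeName | null {
  if ("lineBreak" in schemaNodes) return "lineBreak";
  if ("hardBreak" in schemaNodes) return "hardBreak";
  return null;
}

// lineBreak exports as a bare <w:br/> (textWrapping by default);
// hardBreak exports as a page break.
function serializeBreak(name: BreakNodeName): string {
  return name === "hardBreak" ? '<w:br w:type="page"/>' : "<w:br/>";
}

const chosen = pickBreakNode({ lineBreak: {}, hardBreak: {} });
console.log(chosen === null ? "no break node" : serializeBreak(chosen));
// prints <w:br/>
```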

One non-blocking note (not a spec violation): the split path replaces the original nodeAttrs with only the xml:space logic. Since CT_Text carries no other attributes, that's harmless — just calling it out so it doesn't surprise anyone expecting other attrs to survive.

If you'd like the schema citations hardened, re-run with the ecma-spec tools authorized and I'll confirm EG_RunInnerContent's child set and ST_BrType's default directly against the XSD.

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.


@caio-pizzol caio-pizzol force-pushed the caio/sd-3278-docx-export-collapses-generated-line-breaks-in-word branch from 0a2be53 to 8a8d6f5 Compare June 4, 2026 01:13
@caio-pizzol caio-pizzol marked this pull request as ready for review June 4, 2026 01:20
@caio-pizzol caio-pizzol requested a review from a team as a code owner June 4, 2026 01:20

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8a8d6f5379

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +74 to +77
} else if (part === '\n' && lineBreakNodeType) {
  // `lineBreak` disallows marks (`marks: ''` in its schema); a break carries
  // no run formatting, so create it bare.
  nodes.push(lineBreakNodeType.create());

P1: Count lineBreaks in rewrite diffs

When this helper materializes \n as a lineBreak node, later rewrites over that same content use the executor's diff model, which still calls textBetweenWithTabs(..., '', '') and charOffsetToDocPos without counting lineBreak nodes. For example, rewriting an existing Alpha<lineBreak/>Beta range to the same visible text Alpha\nBeta is not detected as a no-op: the prefix/suffix trim maps both sides of the \n to the position before the existing break and inserts another lineBreak. Please update the text/offset accounting for lineBreak at the same time as creating these nodes.
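
The failure mode can be reproduced in a simplified model. This is a sketch under stated assumptions: the `Inline` type, `flatten`, and `trimDiff` are hypothetical stand-ins for the executor's text extraction and prefix/suffix trimming, not its real API.

```typescript
// Simplified inline content model, assumed for illustration.
type Inline = { type: "text"; text: string } | { type: "lineBreak" };

// Flatten inline nodes to a string; breakChar decides how a lineBreak
// leaf is represented ('' reproduces the current behavior).
function flatten(nodes: Inline[], breakChar: string): string {
  return nodes.map((n) => (n.type === "text" ? n.text : breakChar)).join("");
}

// Common prefix/suffix trim; returns null when nothing needs to change.
function trimDiff(
  oldText: string,
  newText: string,
): { from: number; to: number; insert: string } | null {
  if (oldText === newText) return null;
  const shortest = Math.min(oldText.length, newText.length);
  let prefix = 0;
  while (prefix < shortest && oldText[prefix] === newText[prefix]) prefix++;
  let suffix = 0;
  while (
    suffix < shortest - prefix &&
    oldText[oldText.length - 1 - suffix] === newText[newText.length - 1 - suffix]
  ) {
    suffix++;
  }
  return {
    from: prefix,
    to: oldText.length - suffix,
    insert: newText.slice(prefix, newText.length - suffix),
  };
}

const existing: Inline[] = [
  { type: "text", text: "Alpha" },
  { type: "lineBreak" },
  { type: "text", text: "Beta" },
];

// Break ignored: the rewrite looks like inserting "\n" at offset 5.
console.log(trimDiff(flatten(existing, ""), "Alpha\nBeta"));
// Break counted as "\n": the rewrite is detected as a no-op.
console.log(trimDiff(flatten(existing, "\n"), "Alpha\nBeta"));
// prints null
```

Counting the break as one character in both the flattened text and the offset mapping keeps the two sides aligned, which is the accounting change requested above.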

Multi-line text passed into text-mode mutations stored newlines as a raw
\n inside one <w:t>, which Word collapses on open while SuperDoc renders
it as a break. Convert newlines to lineBreak nodes at creation, and split
any residual raw newline into <w:t>/<w:br/> within one run on export, so
the break serializes as a Word-native <w:br/> (ECMA-376 17.3.3.1).

- buildTextWithTabs: normalize \n, \r\n, \r to lineBreak nodes, gated on
  parent admission for text*-only parents (e.g. total-page-number)
- materializeLineBreak: prefer lineBreak over hardBreak so a structural
  kind:'lineBreak' is a soft break, not a page break
- del-translator: rename every <w:t> in a split run to <w:delText>
  (17.3.3.7 requires delText for all deleted text)
@caio-pizzol caio-pizzol force-pushed the caio/sd-3278-docx-export-collapses-generated-line-breaks-in-word branch from 8a8d6f5 to fa210d4 Compare June 4, 2026 01:37