Skip to content

Commit ef9821e

Browse files
authored
fix(converter): preserve theme rFonts on DOCX round-trip (SD-2894) (#3225)
* fix(converter): preserve theme rFonts on DOCX round-trip (SD-2894) The cascade's font-slot dedup only handled the ascii/asciiTheme pair, so a higher-priority style with hAnsiTheme/eastAsiaTheme/cstheme would still leave the lower-priority concrete name in place. Word reads the concrete value as an override that defeats per-script theme resolution — Latin body text mapped to majorBidi rendered as Calibri Light instead of the per-script font (Times New Roman for Hebrew/Arab). Extend the dedup to all four <w:rFonts> slots, and add the same logic to calculateInlineRunPropertiesPlugin so theme refs from the imported run survive the mark-driven recompute that runs on initial load. Verified with a round-trip integration test on the customer fixture: asciiTheme count 65 → 65 (was dropping to ~25 before the fix), no concrete font substitutions on runs that originally carried theme references. * fix(super-editor): respect explicit font changes on themed runs (SD-2894 review) Addresses Luccas's PR review feedback on #3225. P1: when text imported with w:*Theme rFonts is later given a concrete font by the user, mergeFontFamilyPreservingThemeRefs unconditionally restored the old theme reference and dropped the user's choice. setFontFamily('Arial') on themed text appeared to do nothing on export. Fix: gate the merge on whether the mark-derived value still encodes to the same OOXML mark as the run's existing fontFamily. If they match, the marks are a faithful re-derivation of the import and we preserve the theme. If they differ, the user has overridden — drop the theme and keep the new concrete value. DRY: FONT_FAMILY_THEME_PAIRS in the plugin duplicated FONT_SLOT_THEME_PAIRS in style-engine cascade.ts. Export the cascade constant through @superdoc/style-engine/ooxml and consume it from the plugin so the four-slot mapping has a single source of truth. Tests: two new AAA cases in calculateInlineRunPropertiesPlugin.test.js lock both directions of the contract — preserve theme when marks re-encode the import, drop theme on user override. * fix(super-editor): narrow user-override gate to primary fontFamily slot (SD-2894) The previous gate (f046df9) compared the full encoded textStyle attrs between markFromMarks and markFromExisting. That over-fired on imported runs whose existing.fontFamily mixed theme refs with a concrete per-script slot (e.g. { asciiTheme, hAnsiTheme, cstheme, eastAsia: 'Arial' }): the mark re-decode produces a full concrete shape including a `cs:` slot, which the encoder turns into `csFontFamily` on markFromMarks but not on markFromExisting, so JSON.stringify(attrs) mismatched and the theme was dropped. Browser round-trip of the SD-2894 customer fixture: 65/65/65 -> 62/62/62 theme attrs, with 3 `w:ascii="Calibri Light"` substitutions. Regression of the original PR. Fix: compare only `markFromMarks.attrs.fontFamily` against `markFromExisting.attrs.fontFamily`. That is the primary user-override signal; per-script extras don't indicate user intent (the toolbar sets all four slots to the same family). Confirmed in browser: - customer fixture round-trip: 65/65/65, zero CL substitutions - Luccas user-override repro: setFontFamily('Arial') -> exported <w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:eastAsia="Arial" w:cs="Arial"/> Adds a third regression test pinning the mixed theme+concrete-per-script case that the suite previously missed (because the integration test uses isHeadless: true which bypasses the plugin entirely). * test(super-editor): mutation matrix for SD-2894 theme rFonts round-trip 13 parametrized scenarios pin every shape of existing.fontFamily x mark-decoded shape that the plugin must handle on round-trip: Group A — theme preservation when marks re-encode the imported value: A1: all 4 theme slots A2: 3 theme slots (ascii/hAnsi/cs) — SD-2894 customer fixture shape A3: 3 themes + concrete eastAsia — the regression case the gate now handles A4: ascii-only theme A5: hAnsi-only theme A6: cs-only theme (lowercase `cstheme` OOXML quirk) A7: eastAsia-only theme Group B — user override drops theme refs: B1: all 4 themes + Arial B2: customer shape + Arial (Luccas's case) B3: mixed theme+concrete + Arial Group C — no-theme baselines: C1: pure concrete unchanged C2: pure concrete + different concrete C3: empty existing The universal encoder mock mirrors resolveDocxFontFamily's behaviour: any theme ref wins for the primary slot, falling back to concrete ascii/hAnsi/etc. Per-script extras (eastAsiaFontFamily/csFontFamily) attach only when concrete slot value differs from primary fontFamily — same emission rule as the real encoder in super-converter/styles.js. These scenarios are the basis for catching any future drift in the gate logic or merge helper — both the import-preservation path and the user-override path are locked at every shape we ship. * test(style-engine): pin 4-slot theme/concrete dedup; document plugin limitation Addresses two codex review findings on PR #3225: (1) Cascade test coverage: the original SD-2894 bug was that combineRunProperties' slot dedup only handled (ascii, asciiTheme) correctly; hAnsi/eastAsia/cs leaked. Existing cascade tests didn't directly exercise the four-slot behaviour. Adds: - one test per slot for the "theme drops concrete" direction - one test per slot for the "concrete drops theme" direction - an Athenaintelligence-customer-shape test - an export assertion pinning FONT_SLOT_THEME_PAIRS so it stays in sync with the super-editor plugin that re-imports it These prevent any single-slot regression of the original SD-2894 fix. (2) Per-script gate limitation: the plugin's gate + merge pair operate at the fontFamily-key level, not per-slot. A user action that mutates only eastAsiaFontFamily or csFontFamily without changing the primary fontFamily will pass the gate. Documented as an AIDEV-NOTE on mergeFontFamilyPreservingThemeRefs. We accept the trade-off because (a) the toolbar's setFontFamily always changes the primary, (b) no current API or paste path produces this shape against a themed import, (c) the prior attempt at increased precision (f046df9) regressed the customer fixture (3 of 65 runs); narrowing to per-slot would re-open that risk without a concrete reproduction. If a real customer scenario surfaces, file a follow-up with the fixture and rework into per-slot decisions then.
1 parent 1515843 commit ef9821e

7 files changed

Lines changed: 740 additions & 13 deletions

File tree

packages/layout-engine/style-engine/src/cascade.test.ts

Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -152,6 +152,109 @@ describe('cascade - combineRunProperties', () => {
152152
bold: true,
153153
});
154154
});
155+
156+
// SD-2894: each `<w:rFonts>` script slot has both a concrete form (`w:ascii`,
157+
// `w:hAnsi`, `w:eastAsia`, `w:cs`) and a theme form (`w:asciiTheme`,
158+
// `w:hAnsiTheme`, `w:eastAsiaTheme`, `w:cstheme` — note `cstheme` lowercase).
159+
// When a higher-priority source supplies one form for a slot, the cascade must
160+
// drop the other form from lower-priority sources, or Word resolves the concrete
161+
// name as an override and defeats per-script theme resolution.
162+
//
163+
// The original SD-2894 bug was that only the (ascii, asciiTheme) pair was
164+
// dropping correctly; hAnsi/eastAsia/cs leaked through. These tests pin the
165+
// full 4-slot dedup so the bug cannot regress for any single slot.
166+
describe('SD-2894 four-slot theme/concrete dedup', () => {
167+
it('drops concrete `ascii` from lower when higher supplies `asciiTheme`', () => {
168+
const result = combineRunProperties([
169+
{ fontFamily: { ascii: 'Calibri' } },
170+
{ fontFamily: { asciiTheme: 'majorBidi' } },
171+
]);
172+
expect(result.fontFamily).toEqual({ asciiTheme: 'majorBidi' });
173+
});
174+
175+
it('drops concrete `hAnsi` from lower when higher supplies `hAnsiTheme`', () => {
176+
const result = combineRunProperties([
177+
{ fontFamily: { hAnsi: 'Calibri' } },
178+
{ fontFamily: { hAnsiTheme: 'majorHAnsi' } },
179+
]);
180+
expect(result.fontFamily).toEqual({ hAnsiTheme: 'majorHAnsi' });
181+
});
182+
183+
it('drops concrete `eastAsia` from lower when higher supplies `eastAsiaTheme`', () => {
184+
const result = combineRunProperties([
185+
{ fontFamily: { eastAsia: 'SimSun' } },
186+
{ fontFamily: { eastAsiaTheme: 'majorEastAsia' } },
187+
]);
188+
expect(result.fontFamily).toEqual({ eastAsiaTheme: 'majorEastAsia' });
189+
});
190+
191+
it('drops concrete `cs` from lower when higher supplies `cstheme` (lowercase)', () => {
192+
const result = combineRunProperties([
193+
{ fontFamily: { cs: 'Arial' } },
194+
{ fontFamily: { cstheme: 'majorBidi' } },
195+
]);
196+
expect(result.fontFamily).toEqual({ cstheme: 'majorBidi' });
197+
});
198+
199+
it('drops theme `asciiTheme` from lower when higher supplies concrete `ascii`', () => {
200+
const result = combineRunProperties([
201+
{ fontFamily: { asciiTheme: 'majorBidi' } },
202+
{ fontFamily: { ascii: 'Arial' } },
203+
]);
204+
expect(result.fontFamily).toEqual({ ascii: 'Arial' });
205+
});
206+
207+
it('drops theme `hAnsiTheme` from lower when higher supplies concrete `hAnsi`', () => {
208+
const result = combineRunProperties([
209+
{ fontFamily: { hAnsiTheme: 'majorHAnsi' } },
210+
{ fontFamily: { hAnsi: 'Arial' } },
211+
]);
212+
expect(result.fontFamily).toEqual({ hAnsi: 'Arial' });
213+
});
214+
215+
it('drops theme `eastAsiaTheme` from lower when higher supplies concrete `eastAsia`', () => {
216+
const result = combineRunProperties([
217+
{ fontFamily: { eastAsiaTheme: 'majorEastAsia' } },
218+
{ fontFamily: { eastAsia: 'Arial' } },
219+
]);
220+
expect(result.fontFamily).toEqual({ eastAsia: 'Arial' });
221+
});
222+
223+
it('drops theme `cstheme` from lower when higher supplies concrete `cs`', () => {
224+
const result = combineRunProperties([
225+
{ fontFamily: { cstheme: 'majorBidi' } },
226+
{ fontFamily: { cs: 'Arial' } },
227+
]);
228+
expect(result.fontFamily).toEqual({ cs: 'Arial' });
229+
});
230+
231+
it('Athenaintelligence customer shape: all 4 concretes from defaults dropped by inline themes', () => {
232+
// Mirrors the customer fixture: docDefaults supply concrete fonts (Arial),
233+
// inline rPr supplies theme refs on ascii/hAnsi/cs (no eastAsiaTheme). The
234+
// cascade must keep only the theme refs on those three slots; eastAsia
235+
// concrete from defaults is independent.
236+
const result = combineRunProperties([
237+
{ fontFamily: { ascii: 'Arial', hAnsi: 'Arial', cs: 'Arial', eastAsia: 'Arial' } },
238+
{ fontFamily: { asciiTheme: 'majorBidi', hAnsiTheme: 'majorBidi', cstheme: 'majorBidi' } },
239+
]);
240+
expect(result.fontFamily).toEqual({
241+
asciiTheme: 'majorBidi',
242+
hAnsiTheme: 'majorBidi',
243+
cstheme: 'majorBidi',
244+
eastAsia: 'Arial',
245+
});
246+
});
247+
248+
it('exports FONT_SLOT_THEME_PAIRS so callers (super-editor plugin) can stay in sync', async () => {
249+
const { FONT_SLOT_THEME_PAIRS } = await import('./cascade.js');
250+
expect(FONT_SLOT_THEME_PAIRS).toEqual([
251+
['ascii', 'asciiTheme'],
252+
['hAnsi', 'hAnsiTheme'],
253+
['eastAsia', 'eastAsiaTheme'],
254+
['cs', 'cstheme'],
255+
]);
256+
});
257+
});
155258
});
156259

157260
describe('cascade - combineIndentProperties', () => {

packages/layout-engine/style-engine/src/cascade.ts

Lines changed: 39 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -104,20 +104,51 @@ function isObject(item: unknown): item is PropertyObject {
104104
* @param propertiesArray - Ordered list of run property objects.
105105
* @returns Combined run property object.
106106
*/
107+
// SD-2894: `<w:rFonts>` exposes four independent script slots (ascii / hAnsi / eastAsia / cs).
108+
// Each slot can be filled either by a concrete font name (`w:ascii="Calibri"`) or a theme
109+
// reference (`w:asciiTheme="majorBidi"`). When a higher-priority style supplies one form for a
110+
// slot, the other form from lower-priority sources is no longer applicable and must be
111+
// dropped — otherwise the merged run carries both, and Word resolves the concrete name as an
112+
// override that defeats per-script theme resolution (Latin body text mapped to `majorBidi`
113+
// then renders as Calibri Light instead of Times New Roman for Hebrew/Arab).
114+
//
115+
// Pair `<slot>` with `<slot>Theme`, except for the cs slot whose theme attribute is the
116+
// lowercase `cstheme` (OOXML inconsistency, not a typo).
117+
// Exported so super-editor's calculateInlineRunPropertiesPlugin can consume the same
118+
// list instead of duplicating it (SD-2894 follow-up).
119+
export const FONT_SLOT_THEME_PAIRS: Array<[keyof RunFontFamilyProperties, keyof RunFontFamilyProperties]> = [
120+
['ascii', 'asciiTheme'],
121+
['hAnsi', 'hAnsiTheme'],
122+
['eastAsia', 'eastAsiaTheme'],
123+
['cs', 'cstheme'],
124+
];
125+
126+
function dropConflictingFontSlots(
127+
target: RunFontFamilyProperties,
128+
source: RunFontFamilyProperties,
129+
): RunFontFamilyProperties {
130+
const result = { ...target };
131+
for (const [concreteKey, themeKey] of FONT_SLOT_THEME_PAIRS) {
132+
if (source[themeKey] != null) {
133+
delete result[concreteKey];
134+
delete result[themeKey];
135+
} else if (source[concreteKey] != null) {
136+
delete result[themeKey];
137+
}
138+
}
139+
return result;
140+
}
141+
107142
export function combineRunProperties(propertiesArray: RunProperties[]): RunProperties {
108143
return combineProperties(propertiesArray, {
109144
fullOverrideProps: ['color'],
110145
specialHandling: {
111146
fontFamily: (target: Record<string, unknown>, source: Record<string, unknown>): unknown => {
112147
const fontFamilySource = { ...(source.fontFamily as object) } as RunFontFamilyProperties;
113-
const fontFamilyTarget = { ...(target.fontFamily as object) } as RunFontFamilyProperties;
114-
if (fontFamilySource.asciiTheme != null) {
115-
delete fontFamilyTarget.ascii;
116-
delete fontFamilyTarget.asciiTheme;
117-
}
118-
if (fontFamilySource.ascii != null) {
119-
delete fontFamilyTarget.asciiTheme;
120-
}
148+
const fontFamilyTarget = dropConflictingFontSlots(
149+
(target.fontFamily as RunFontFamilyProperties) ?? {},
150+
fontFamilySource,
151+
);
121152
return { ...(fontFamilyTarget as object), ...(fontFamilySource as object) };
122153
},
123154
},

packages/layout-engine/style-engine/src/ooxml/index.ts

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,12 @@
55
* This module is format-aware (docx), but translator-agnostic.
66
*/
77

8-
import { combineIndentProperties, combineProperties, combineRunProperties } from '../cascade.js';
8+
import {
9+
combineIndentProperties,
10+
combineProperties,
11+
combineRunProperties,
12+
FONT_SLOT_THEME_PAIRS,
13+
} from '../cascade.js';
914
import type { PropertyObject } from '../cascade.js';
1015
import type { ParagraphConditionalFormatting, ParagraphProperties, ParagraphTabStop, RunProperties } from './types.ts';
1116
import type { NumberingProperties } from './numbering-types.ts';
@@ -18,7 +23,7 @@ import type {
1823
TableCellProperties,
1924
} from './styles-types.ts';
2025

21-
export { combineIndentProperties, combineProperties, combineRunProperties };
26+
export { combineIndentProperties, combineProperties, combineRunProperties, FONT_SLOT_THEME_PAIRS };
2227
export type { PropertyObject };
2328
export type * from './types.ts';
2429
export type * from './numbering-types.ts';

packages/super-editor/src/editors/v1/extensions/run/calculateInlineRunPropertiesPlugin.js

Lines changed: 96 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,7 @@ import { Plugin, TextSelection } from 'prosemirror-state';
22
import { Fragment } from 'prosemirror-model';
33
import { TableMap } from 'prosemirror-tables';
44
import { decodeRPrFromMarks, encodeMarksFromRPr, resolveRunProperties } from '@converter/styles.js';
5+
import { FONT_SLOT_THEME_PAIRS } from '@superdoc/style-engine/ooxml';
56
import {
67
calculateResolvedParagraphProperties,
78
getResolvedParagraphProperties,
@@ -28,6 +29,85 @@ const RUN_PROPERTIES_DERIVED_FROM_MARKS = new Set([
2829

2930
export const TRANSIENT_HYPERLINK_STYLE_IDS = new Set(['Hyperlink', 'FollowedHyperlink']);
3031

32+
/**
33+
* Merge mark-derived concrete fontFamily slots with theme references preserved on the
34+
* existing run. For each slot where the existing fontFamily has a theme reference, keep
35+
* the theme and drop the concrete value from marks; otherwise take the mark value.
36+
*
37+
* Only safe to call when the caller has already verified that marks are a re-derivation
38+
* of `existing` rather than a user override — otherwise this silently reverts user font
39+
* changes. See `marksMatchExistingFontFamily` for the gate.
40+
*
41+
* SD-2894: when the plugin overrides a run's fontFamily from mark-derived values, the
42+
* mark only carries the resolved CSS font name (`ascii: "Calibri Light"`) and forgets the
43+
* theme reference (`asciiTheme: "majorBidi"`) that came from the imported `<w:rFonts>`.
44+
* Without this merge we'd export concrete font names that defeat Word's per-script theme
45+
* resolution.
46+
*
47+
* AIDEV-NOTE: The gate + merge pair operate at the fontFamily-key level (whole rPr
48+
* fontFamily object), not at the per-slot level. A user action that mutates only a
49+
* per-script mark attr (`eastAsiaFontFamily` or `csFontFamily`) without changing the
50+
* primary `fontFamily` will pass the gate, and this merge will silently restore the
51+
* stale theme on that slot. We accept this trade-off because (a) the toolbar's
52+
* setFontFamily writes all four slots to the same family, so a real user override
53+
* always changes the primary; (b) no current API or paste path produces per-script-only
54+
* mark changes against an imported theme run; (c) a per-slot model adds intricate
55+
* encoder probes that risk regressing the SD-2894 customer fixture again (see commit
56+
* bec7e63ed for the prior per-script regression we just fixed). If a real customer
57+
* reproduction surfaces — e.g. a paste-from-Word flow with per-script fonts being
58+
* silently reverted — file a follow-up ticket with that fixture and rework this helper
59+
* into a per-slot decision then. Codex flagged this on the PR-3225 review.
60+
*
61+
* @param {Record<string, unknown>|null|undefined} fromMarks
62+
* @param {Record<string, unknown>|null|undefined} existing
63+
* @returns {Record<string, unknown>}
64+
*/
65+
function mergeFontFamilyPreservingThemeRefs(fromMarks, existing) {
66+
const merged = { ...(fromMarks || {}) };
67+
if (!existing || typeof existing !== 'object') return merged;
68+
for (const [concreteKey, themeKey] of FONT_SLOT_THEME_PAIRS) {
69+
if (existing[themeKey] != null) {
70+
merged[themeKey] = existing[themeKey];
71+
delete merged[concreteKey];
72+
}
73+
}
74+
return merged;
75+
}
76+
77+
/**
78+
* True when the mark-derived fontFamily value encodes to the same primary ASCII font as
79+
* the run's existing (imported) fontFamily — i.e., marks are a faithful re-derivation,
80+
* not a user override.
81+
*
82+
* The encoder resolves theme references to their CSS font name. So:
83+
* - import case: existing = { asciiTheme: 'majorBidi' }, marks = { ascii: 'Calibri Light' }
84+
* → both encode to fontFamily: 'Calibri Light' → match → preserve theme.
85+
* - user-edit case: existing = { asciiTheme: 'majorBidi' }, marks = { ascii: 'Arial' }
86+
* → existing encodes to 'Calibri Light', marks encode to 'Arial' → mismatch → user
87+
* has overridden, drop the theme and respect the new value.
88+
*
89+
* Compare ONLY the primary `fontFamily` attr, not the full mark attrs object. Per-script
90+
* extras (`csFontFamily`, `eastAsiaFontFamily`) on the mark are derived only when the
91+
* input has a *concrete* per-script slot. Existing run-props can carry mixed shapes
92+
* (e.g. `{ asciiTheme, hAnsiTheme, cstheme, eastAsia: 'Arial' }`) that encode without
93+
* `csFontFamily` while the mark re-decode adds a concrete `cs` and therefore does
94+
* include it. A full-attrs JSON compare would falsely flag this as user override and
95+
* drop the theme — the regression caught on the SD-2894 customer fixture round-trip.
96+
*
97+
* @param {{ attrs?: Record<string, unknown> } | null | undefined} markFromMarks
98+
* @param {Record<string, unknown>|null|undefined} existingFontFamily
99+
* @param {(props: Record<string, unknown>, docx: Record<string, unknown>) => Array<{ attrs?: Record<string, unknown> }>} encode
100+
* @param {Record<string, unknown>} docx
101+
* @returns {boolean}
102+
*/
103+
function marksMatchExistingFontFamily(markFromMarks, existingFontFamily, encode, docx) {
104+
if (!existingFontFamily || typeof existingFontFamily !== 'object') return false;
105+
if (!markFromMarks?.attrs) return false;
106+
const markFromExisting = encode({ fontFamily: existingFontFamily }, docx)?.[0];
107+
if (!markFromExisting?.attrs) return false;
108+
return markFromMarks.attrs.fontFamily === markFromExisting.attrs.fontFamily;
109+
}
110+
31111
const RUN_PROPERTY_PRESERVE_META_KEY = 'sdPreserveRunPropertiesKeys';
32112
const COMPANION_INLINE_KEYS = {
33113
fontSizeCs: 'fontSize',
@@ -536,10 +616,23 @@ function getInlineRunProperties(
536616
const valueFromStyles = runPropertiesFromStyles[key];
537617
if (JSON.stringify(valueFromMarks) !== JSON.stringify(valueFromStyles)) {
538618
if (key === 'fontFamily') {
539-
const markFromStyles = encodeMarksFromRPr({ [key]: valueFromStyles }, editor.converter?.convertedXml ?? {})[0];
540-
const markFromMarks = encodeMarksFromRPr({ [key]: valueFromMarks }, editor.converter?.convertedXml ?? {})[0];
619+
const docx = editor.converter?.convertedXml ?? {};
620+
const markFromStyles = encodeMarksFromRPr({ [key]: valueFromStyles }, docx)[0];
621+
const markFromMarks = encodeMarksFromRPr({ [key]: valueFromMarks }, docx)[0];
541622
if (JSON.stringify(markFromMarks?.attrs) !== JSON.stringify(markFromStyles?.attrs)) {
542-
inlineRunProperties[key] = valueFromMarks;
623+
// SD-2894 follow-up (PR #3225 review): only preserve theme refs from `existing`
624+
// when the marks are a faithful re-derivation of the imported value. If marks
625+
// diverge from existing (user picked a new font), respect the user's choice
626+
// and drop the stale theme reference.
627+
const existingFontFamily = existingRunProperties?.[key];
628+
inlineRunProperties[key] = marksMatchExistingFontFamily(
629+
markFromMarks,
630+
existingFontFamily,
631+
encodeMarksFromRPr,
632+
docx,
633+
)
634+
? mergeFontFamilyPreservingThemeRefs(valueFromMarks, existingFontFamily)
635+
: valueFromMarks;
543636
}
544637
} else {
545638
inlineRunProperties[key] = valueFromMarks;

0 commit comments

Comments
 (0)