
Commit 6436d86

feat: support TIFF images in DOCX rendering (#2284)
* feat: support TIFF images in DOCX rendering

  TIFF images in DOCX files rendered as broken icons because browsers cannot natively display image/tiff. Convert TIFF to PNG at import time using utif2, following the existing EMF/WMF → SVG conversion pattern.

  Closes #2064

* fix: enforce pixel limit before TIFF decode to prevent DoS

  Reject TIFF images exceeding 100M pixels before allocating RGBA buffers or canvas, preventing a malicious TIFF with extreme dimensions from freezing or crashing the tab during import.

* fix: validate TIFF dimensions before decoding and correct .tif MIME type

  Move MAX_PIXEL_COUNT check before UTIF.decodeImage/toRGBA8 so oversized TIFFs are rejected before allocating the RGBA buffer. Map .tif extension to image/tiff in Content_Types.xml generation to avoid emitting the invalid MIME type image/tif.

* fix: read TIFF dimensions from raw IFD tags for pre-decode validation

  UTIF.decode populates raw tag entries (t256/t257) but .width/.height are only set after decodeImage. Read from raw tags so the pixel limit guard works before the expensive decode step without rejecting valid files.

* fix: address PR review feedback for TIFF support

  - Use mimeTypeForExt mapping for .tif data URIs (image/tiff not image/tif)
  - Remove unused size arg from convertTiffToPng call
  - Add happy-path test asserting valid TIFF produces PNG data URI

* fix: clean up convertTiffToPng signature and add wiring test

  - Remove unused Uint8Array/ArrayBufferView branches (only strings are passed)
  - Add handleImageNode test verifying convertTiffToPng is called for .tif files

* test: add integration test for TIFF image loading pipeline

  Adds a Playwright test that loads a minimal DOCX containing a TIFF image and verifies the full pipeline: DocxZipper → convertTiffToPng → rendered PNG.

* refactor: replace utif2 with image-js/tiff and deduplicate DocxZipper constants

  Replace unmaintained utif2 with actively maintained image-js/tiff for TIFF decoding. Extract duplicated IMAGE_EXTS and MIME_TYPE_FOR_EXT mappings in DocxZipper.js to module-level constants.

* fix: pre-decode size guard and 16-bit TIFF normalization

  Use decode(buffer, { ignoreImageData: true }) to check dimensions before allocating pixel data, preventing DoS from small compressed TIFFs with huge dimensions. Normalize Uint16Array and Float32Array pixel data to 8-bit for canvas compatibility.

* test: add TIFF MIME, fallback, and branch coverage tests; extract shared dataUriToArrayBuffer helper

  Address remaining PR review feedback: add tests for .tif → image/tiff MIME mapping (import data URI and export Content_Types), TIFF conversion failure fallback alt text, greyscale/grey+alpha/Uint16/Float32 toRGBA branches, and extract duplicate data-URI-stripping logic from metafile-converter and tiff-converter into shared dataUriToArrayBuffer in helpers.js.

* fix: revert to utif2 for TIFF decoding

  image-js/tiff lacks support for PackBits, JPEG, and CCITT compression formats commonly found in Word documents. utif2 handles all TIFF compression types via its toRGBA8 pipeline. Updated tests to match utif2 API (decode → decodeImage → toRGBA8).

* fix: prefer domEnvironment over global document in createCanvas

  - createCanvas() now checks domEnvironment first, fixing silent failures in JSDOM environments where global document lacks canvas support
  - Add dataUriToArrayBuffer unit tests covering all input branches and both throw paths
  - Add explanatory comment for query-string module re-imports in tests

---------

Co-authored-by: G Pardhiv Varma <gpardhivvarma@gmail.com>
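
The pre-decode size guard described in the message above can be illustrated with a small sketch. This is not the code from this commit: the commit reads the raw `t256`/`t257` entries that `UTIF.decode` produces, whereas the sketch below parses the first IFD by hand with a `DataView` so that it runs without any dependency. The names `readTiffDimensions` and `isWithinPixelLimit` are hypothetical, and only classic (non-BigTIFF) files with inline SHORT or LONG dimension tags are handled.

```javascript
// Assumed to mirror the 100M-pixel cap mentioned in the commit message.
const MAX_PIXEL_COUNT = 100_000_000;

/**
 * Read ImageWidth (tag 256) and ImageLength (tag 257) from the first IFD
 * of a classic TIFF without touching any pixel data.
 * Returns null when the header is malformed or a tag is missing.
 * (Hypothetical helper, for illustration only.)
 */
function readTiffDimensions(buffer) {
  const view = new DataView(buffer);
  if (view.byteLength < 8) return null;

  // Bytes 0-1 select the byte order: "II" = little-endian, "MM" = big-endian.
  const byteOrder = view.getUint16(0, false);
  let little;
  if (byteOrder === 0x4949) little = true;
  else if (byteOrder === 0x4d4d) little = false;
  else return null;

  // Bytes 2-3 hold the magic number 42; bytes 4-7 the offset of the first IFD.
  if (view.getUint16(2, little) !== 42) return null;
  const ifdOffset = view.getUint32(4, little);
  if (ifdOffset + 2 > view.byteLength) return null;

  const entryCount = view.getUint16(ifdOffset, little);
  let width = null;
  let height = null;

  for (let i = 0; i < entryCount; i++) {
    const entry = ifdOffset + 2 + i * 12; // each IFD entry is 12 bytes
    if (entry + 12 > view.byteLength) return null;

    const tag = view.getUint16(entry, little);
    if (tag !== 256 && tag !== 257) continue;

    // Field type 3 = SHORT (16-bit), 4 = LONG (32-bit); both fit inline.
    const type = view.getUint16(entry + 2, little);
    let value;
    if (type === 3) value = view.getUint16(entry + 8, little);
    else if (type === 4) value = view.getUint32(entry + 8, little);
    else return null;

    if (tag === 256) width = value;
    else height = value;
  }

  if (width === null || height === null) return null;
  return { width, height };
}

/** Decide whether a TIFF is safe to hand to the (expensive) decoder. */
function isWithinPixelLimit(buffer) {
  const dims = readTiffDimensions(buffer);
  if (!dims || dims.width === 0 || dims.height === 0) return false;
  return dims.width * dims.height <= MAX_PIXEL_COUNT;
}
```

The point of the ordering is that the check costs a few header reads, while decoding to RGBA allocates four bytes per pixel; a file that fails `isWithinPixelLimit` is rejected before any such allocation happens.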
1 parent 66ad683 commit 6436d86

16 files changed

Lines changed: 545 additions & 73 deletions

packages/super-editor/package.json

Lines changed: 1 addition & 0 deletions
@@ -101,6 +101,7 @@
     "remark-parse": "catalog:",
     "remark-stringify": "catalog:",
     "unified": "catalog:",
+    "utif2": "catalog:",
     "uuid": "catalog:",
     "vue": "catalog:",
     "xml-js": "catalog:"

packages/super-editor/src/core/DocxZipper.js

Lines changed: 13 additions & 8 deletions
@@ -5,6 +5,12 @@ import { ensureXmlString, isXmlLike } from './encoding-helpers.js';
 import { DOCX } from '@superdoc/common';
 import { COMMENT_FILE_BASENAMES } from './super-converter/constants.js';

+/** Image file extensions recognized during import and export. */
+const IMAGE_EXTS = new Set(['png', 'jpg', 'jpeg', 'gif', 'bmp', 'tiff', 'tif', 'emf', 'wmf', 'svg', 'webp']);
+
+/** Map file extensions to correct MIME sub-types where they differ. */
+const MIME_TYPE_FOR_EXT = { tif: 'tiff' };
+
 /**
  * Class to handle unzipping and zipping of docx files
  */
@@ -63,19 +69,18 @@ class DocxZipper {
       const fileBase64 = await zipEntry.async('base64');
       let extension = this.getFileExtension(name)?.toLowerCase();
       // Only build data URIs for images; keep raw base64 for other binaries (e.g., xlsx)
-      const imageTypes = new Set(['png', 'jpg', 'jpeg', 'gif', 'bmp', 'tiff', 'emf', 'wmf', 'svg', 'webp']);
-
       // For unknown extensions (like .tmp), try to detect the image type from content
       let detectedType = null;
-      if (!imageTypes.has(extension) || extension === 'tmp') {
+      if (!IMAGE_EXTS.has(extension) || extension === 'tmp') {
         detectedType = detectImageType(fileBase64);
         if (detectedType) {
           extension = detectedType;
         }
       }

-      if (imageTypes.has(extension)) {
-        this.mediaFiles[name] = `data:image/${extension};base64,${fileBase64}`;
+      if (IMAGE_EXTS.has(extension)) {
+        const mimeSubtype = MIME_TYPE_FOR_EXT[extension] || extension;
+        this.mediaFiles[name] = `data:image/${mimeSubtype};base64,${fileBase64}`;
         const blob = await zipEntry.async('blob');
         const fileObj = new File([blob], name, { type: blob.type });
         const imageUrl = URL.createObjectURL(fileObj);
@@ -105,10 +110,9 @@ class DocxZipper {
    */
   async updateContentTypes(docx, media, fromJson, updatedDocs = {}) {
     const additionalPartNames = Object.keys(updatedDocs || {});
-    const imageExts = new Set(['png', 'jpg', 'jpeg', 'gif', 'bmp', 'tiff', 'emf', 'wmf', 'svg', 'webp']);
     const newMediaTypes = Object.keys(media)
       .map((name) => this.getFileExtension(name))
-      .filter((ext) => ext && imageExts.has(ext));
+      .filter((ext) => ext && IMAGE_EXTS.has(ext));

     const contentTypesPath = '[Content_Types].xml';
     let contentTypesXml;
@@ -131,7 +135,8 @@ class DocxZipper {
       if (defaultMediaTypes.includes(type)) continue;
       if (seenTypes.has(type)) continue;

-      const newContentType = `<Default Extension="${type}" ContentType="image/${type}"/>`;
+      const mime = MIME_TYPE_FOR_EXT[type] || type;
+      const newContentType = `<Default Extension="${type}" ContentType="image/${mime}"/>`;
       typesString += newContentType;
       seenTypes.add(type);
     }
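
As a standalone illustration of the extension-to-MIME lookup introduced in this file, the sketch below applies the same `MIME_TYPE_FOR_EXT[ext] || ext` fallback to both the import-side data URI and the export-side `<Default>` element. The two constants are copied from the diff; the wrapper functions `toImageDataUri` and `contentTypeDefault` are hypothetical names that do not exist in `DocxZipper.js`.

```javascript
// Constants copied from the diff above.
const IMAGE_EXTS = new Set(['png', 'jpg', 'jpeg', 'gif', 'bmp', 'tiff', 'tif', 'emf', 'wmf', 'svg', 'webp']);
const MIME_TYPE_FOR_EXT = { tif: 'tiff' };

/**
 * Build a data URI for an image, or return null for non-image extensions.
 * (Hypothetical wrapper around the logic in getDocxData.)
 */
function toImageDataUri(extension, base64) {
  const ext = String(extension).toLowerCase();
  if (!IMAGE_EXTS.has(ext)) return null;
  // Only extensions whose MIME sub-type differs need an entry in the map.
  const subtype = MIME_TYPE_FOR_EXT[ext] || ext;
  return `data:image/${subtype};base64,${base64}`;
}

/**
 * Build the [Content_Types].xml <Default> element for an image extension.
 * (Hypothetical wrapper around the logic in updateContentTypes.)
 */
function contentTypeDefault(extension) {
  const mime = MIME_TYPE_FOR_EXT[extension] || extension;
  return `<Default Extension="${extension}" ContentType="image/${mime}"/>`;
}

console.log(toImageDataUri('tif', 'AAAA'));
console.log(contentTypeDefault('tif'));
```

Note that the `Extension` attribute keeps the literal file extension (`tif`) while only the content type is normalized, which is what the export test in `DocxZipper.test.js` asserts.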

packages/super-editor/src/core/DocxZipper.test.js

Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -338,6 +338,64 @@ describe('DocxZipper - exportFromCollaborativeDocx media handling', () => {
338338
});
339339
});
340340

341+
describe('DocxZipper - .tif MIME type mapping', () => {
342+
it('produces image/tiff data URI for .tif files on import', async () => {
343+
const zipper = new DocxZipper();
344+
const zip = new JSZip();
345+
346+
const contentTypes = `<?xml version="1.0" encoding="UTF-8"?>
347+
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
348+
<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
349+
<Default Extension="xml" ContentType="application/xml"/>
350+
<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
351+
</Types>`;
352+
zip.file('[Content_Types].xml', contentTypes);
353+
zip.file('word/document.xml', '<w:document/>');
354+
355+
// Arbitrary binary data stored as a .tif file
356+
const tifData = new Uint8Array([0x49, 0x49, 0x2a, 0x00, 0x08, 0x00, 0x00, 0x00]);
357+
zip.file('word/media/image1.tif', tifData);
358+
359+
const buf = await zip.generateAsync({ type: 'arraybuffer' });
360+
await zipper.getDocxData(buf, false);
361+
362+
// Must use image/tiff, not image/tif
363+
expect(zipper.mediaFiles['word/media/image1.tif']).toMatch(/^data:image\/tiff;base64,/);
364+
});
365+
366+
it('writes image/tiff content type in [Content_Types].xml on export', async () => {
367+
const zipper = new DocxZipper();
368+
369+
const contentTypes = `<?xml version="1.0" encoding="UTF-8"?>
370+
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
371+
<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
372+
<Default Extension="xml" ContentType="application/xml"/>
373+
<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
374+
</Types>`;
375+
376+
const docx = [
377+
{ name: '[Content_Types].xml', content: contentTypes },
378+
{ name: 'word/document.xml', content: '<w:document/>' },
379+
];
380+
381+
const result = await zipper.updateZip({
382+
docx,
383+
updatedDocs: {},
384+
media: { 'word/media/image1.tif': 'AAAA' },
385+
fonts: {},
386+
isHeadless: true,
387+
});
388+
389+
const readBack = await new JSZip().loadAsync(result);
390+
const updatedContentTypes = await readBack.file('[Content_Types].xml').async('string');
391+
392+
// Should contain Extension="tif" with ContentType="image/tiff"
393+
expect(updatedContentTypes).toContain('Extension="tif"');
394+
expect(updatedContentTypes).toContain('ContentType="image/tiff"');
395+
expect(updatedContentTypes).not.toContain('ContentType="image/tif"');
396+
});
397+
});
398+
341399
describe('DocxZipper - .tmp image file detection', () => {
342400
it('detects and processes .tmp files with PNG signatures as PNG images', async () => {
343401
const zipper = new DocxZipper();

packages/super-editor/src/core/super-converter/helpers.js

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,33 @@ function base64ToUint8Array(base64) {
3333
return bytes;
3434
}
3535

36+
/**
37+
* Convert a base64 string or data URI to an ArrayBuffer.
38+
* Accepts ArrayBuffer, TypedArray, data URI, or raw base64 string.
39+
*
40+
* @param {string|ArrayBuffer|Uint8Array} data
41+
* @returns {ArrayBuffer}
42+
*/
43+
function dataUriToArrayBuffer(data) {
44+
if (data instanceof ArrayBuffer) return data;
45+
if (ArrayBuffer.isView(data)) return data.buffer.slice(data.byteOffset, data.byteOffset + data.byteLength);
46+
47+
if (typeof data !== 'string') {
48+
throw new Error('Unsupported data type for conversion to ArrayBuffer');
49+
}
50+
51+
let base64 = data;
52+
if (data.startsWith('data:')) {
53+
const commaIndex = data.indexOf(',');
54+
if (commaIndex === -1) {
55+
throw new Error('Invalid data URI: missing base64 content');
56+
}
57+
base64 = data.substring(commaIndex + 1);
58+
}
59+
60+
return base64ToUint8Array(base64).buffer;
61+
}
62+
3663
// CSS pixels per inch; used to convert between Word's inch-based measurements and DOM pixels.
3764
const PIXELS_PER_INCH = 96;
3865

@@ -720,5 +747,6 @@ export {
720747
resolveOpcTargetPath,
721748
computeCrc32Hex,
722749
base64ToUint8Array,
750+
dataUriToArrayBuffer,
723751
detectImageType,
724752
};
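
To show why the typed-array branch of `dataUriToArrayBuffer` slices by `byteOffset` and `byteLength` instead of returning `data.buffer` directly, here is a self-contained sketch. The body of `toArrayBuffer` follows the diff above; `base64ToBytes` is an assumed stand-in for `base64ToUint8Array`, whose implementation is not shown in this commit, and it relies on the global `atob` being available (browsers and recent Node.js versions).

```javascript
// Assumed stand-in for base64ToUint8Array (its real body is not part of this diff).
function base64ToBytes(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Same branching as dataUriToArrayBuffer in the diff above.
function toArrayBuffer(data) {
  if (data instanceof ArrayBuffer) return data;
  if (ArrayBuffer.isView(data)) {
    // A view may cover only part of its backing buffer, so copy just that window.
    return data.buffer.slice(data.byteOffset, data.byteOffset + data.byteLength);
  }
  if (typeof data !== 'string') {
    throw new Error('Unsupported data type for conversion to ArrayBuffer');
  }
  let base64 = data;
  if (data.startsWith('data:')) {
    const commaIndex = data.indexOf(',');
    if (commaIndex === -1) throw new Error('Invalid data URI: missing base64 content');
    base64 = data.substring(commaIndex + 1);
  }
  return base64ToBytes(base64).buffer;
}

// A 3-byte window into a 5-byte buffer: returning window.buffer would leak all 5 bytes.
const backing = new Uint8Array([1, 2, 3, 4, 5]);
const window3 = backing.subarray(1, 4);
console.log(window3.buffer.byteLength);          // size of the whole backing buffer
console.log(toArrayBuffer(window3).byteLength);  // size of the copied window only
```

The data URI and raw base64 branches converge on the same decoder, so a caller can pass whatever form the media map happens to hold.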

packages/super-editor/src/core/super-converter/helpers.test.js

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ import {
77
getArrayBufferFromUrl,
88
computeCrc32Hex,
99
base64ToUint8Array,
10+
dataUriToArrayBuffer,
1011
detectImageType,
1112
} from './helpers.js';
1213

@@ -385,6 +386,43 @@ describe('base64ToUint8Array', () => {
385386
});
386387
});
387388

389+
describe('dataUriToArrayBuffer', () => {
390+
it('returns the same ArrayBuffer when given an ArrayBuffer', () => {
391+
const buf = new ArrayBuffer(4);
392+
expect(dataUriToArrayBuffer(buf)).toBe(buf);
393+
});
394+
395+
it('slices a TypedArray into a new ArrayBuffer', () => {
396+
const bytes = new Uint8Array([10, 20, 30, 40]);
397+
const result = dataUriToArrayBuffer(bytes);
398+
expect(result).toBeInstanceOf(ArrayBuffer);
399+
expect(Array.from(new Uint8Array(result))).toEqual([10, 20, 30, 40]);
400+
});
401+
402+
it('decodes a data URI string', () => {
403+
const bytes = new Uint8Array([11, 22, 33]);
404+
const base64 = Buffer.from(bytes).toString('base64');
405+
const result = dataUriToArrayBuffer(`data:image/tiff;base64,${base64}`);
406+
expect(Array.from(new Uint8Array(result))).toEqual([11, 22, 33]);
407+
});
408+
409+
it('decodes a raw base64 string', () => {
410+
const bytes = new Uint8Array([55, 66, 77]);
411+
const base64 = Buffer.from(bytes).toString('base64');
412+
const result = dataUriToArrayBuffer(base64);
413+
expect(Array.from(new Uint8Array(result))).toEqual([55, 66, 77]);
414+
});
415+
416+
it('throws on a data URI missing the comma', () => {
417+
expect(() => dataUriToArrayBuffer('data:image/png;base64')).toThrow('Invalid data URI');
418+
});
419+
420+
it('throws on unsupported data types', () => {
421+
expect(() => dataUriToArrayBuffer(12345)).toThrow('Unsupported data type');
422+
expect(() => dataUriToArrayBuffer({})).toThrow('Unsupported data type');
423+
});
424+
});
425+
388426
describe('detectImageType', () => {
389427
it('detects PNG from magic bytes', () => {
390428
// PNG signature: 89 50 4E 47 0D 0A 1A 0A

packages/super-editor/src/core/super-converter/v3/handlers/w/p/helpers/legacy-handle-paragraph-node.test.js

Lines changed: 10 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -13,12 +13,16 @@ vi.mock('@converter/v2/importer/index.js', () => ({
1313
}));
1414

1515
// Simple and predictable conversion for positions
16-
vi.mock('@converter/helpers.js', () => ({
17-
twipsToPixels: (twips) => (twips === undefined ? undefined : Number(twips) / 20),
18-
twipsToInches: (twips) => (twips === undefined ? undefined : Number(twips) / 10),
19-
twipsToLines: (twips) => (twips === undefined ? undefined : Number(twips) / 240),
20-
pixelsToTwips: (pixels) => (pixels === undefined ? undefined : Math.round(Number(pixels) * 20)),
21-
}));
16+
vi.mock('@converter/helpers.js', async (importOriginal) => {
17+
const actual = await importOriginal();
18+
return {
19+
...actual,
20+
twipsToPixels: (twips) => (twips === undefined ? undefined : Number(twips) / 20),
21+
twipsToInches: (twips) => (twips === undefined ? undefined : Number(twips) / 10),
22+
twipsToLines: (twips) => (twips === undefined ? undefined : Number(twips) / 240),
23+
pixelsToTwips: (pixels) => (pixels === undefined ? undefined : Math.round(Number(pixels) * 20)),
24+
};
25+
});
2226

2327
import { handleParagraphNode } from './legacy-handle-paragraph-node.js';
2428
import { parseMarks, mergeTextNodes } from '@converter/v2/importer/index.js';

packages/super-editor/src/core/super-converter/v3/handlers/wp/helpers/decode-image-node-helpers.test.js

Lines changed: 16 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -5,18 +5,22 @@ import {
55
import * as helpers from '@converter/helpers.js';
66
import * as annotationHelpers from '@converter/v3/handlers/w/sdt/helpers/translate-field-annotation.js';
77

8-
vi.mock('@converter/helpers.js', () => ({
9-
emuToPixels: vi.fn((v) => v / 9525), // 1 emu ≈ 1/9525 px
10-
pixelsToEmu: vi.fn((v) => v * 9525),
11-
getTextIndentExportValue: vi.fn((v) => v),
12-
inchesToTwips: vi.fn((v) => v),
13-
linesToTwips: vi.fn((v) => v),
14-
pixelsToEightPoints: vi.fn((v) => v),
15-
pixelsToTwips: vi.fn((v) => v),
16-
ptToTwips: vi.fn((v) => v),
17-
rgbToHex: vi.fn(() => '#000000'),
18-
degreesToRot: vi.fn((v) => v),
19-
}));
8+
vi.mock('@converter/helpers.js', async (importOriginal) => {
9+
const actual = await importOriginal();
10+
return {
11+
...actual,
12+
emuToPixels: vi.fn((v) => v / 9525), // 1 emu ≈ 1/9525 px
13+
pixelsToEmu: vi.fn((v) => v * 9525),
14+
getTextIndentExportValue: vi.fn((v) => v),
15+
inchesToTwips: vi.fn((v) => v),
16+
linesToTwips: vi.fn((v) => v),
17+
pixelsToEightPoints: vi.fn((v) => v),
18+
pixelsToTwips: vi.fn((v) => v),
19+
ptToTwips: vi.fn((v) => v),
20+
rgbToHex: vi.fn(() => '#000000'),
21+
degreesToRot: vi.fn((v) => v),
22+
};
23+
});
2024

2125
vi.mock('@converter/v3/handlers/w/sdt/helpers/translate-field-annotation.js', () => ({
2226
prepareTextAnnotation: vi.fn(() => ({ type: 'text', text: 'annotation' })),

packages/super-editor/src/core/super-converter/v3/handlers/wp/helpers/encode-image-node-helpers.js

Lines changed: 19 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ import {
88
extractCustomGeometry,
99
} from './vector-shape-helpers';
1010
import { convertMetafileToSvg, isMetafileExtension, setMetafileDomEnvironment } from './metafile-converter.js';
11+
import { convertTiffToPng, isTiffExtension, setTiffDomEnvironment } from './tiff-converter.js';
1112
import {
1213
collectTextBoxParagraphs,
1314
preProcessTextBoxContent,
@@ -405,6 +406,22 @@ export function handleImageNode(node, params, isAnchor) {
405406
}
406407
}
407408

409+
// Convert TIFF images to PNG for display (browsers cannot render TIFF natively)
410+
if (!wasConverted && isTiffExtension(extension)) {
411+
const mediaData = converter?.media?.[path];
412+
if (mediaData) {
413+
if (converter?.domEnvironment) {
414+
setTiffDomEnvironment(converter.domEnvironment);
415+
}
416+
const conversionResult = convertTiffToPng(mediaData);
417+
if (conversionResult?.dataUri) {
418+
finalSrc = conversionResult.dataUri;
419+
finalExtension = conversionResult.format || 'png';
420+
wasConverted = true;
421+
}
422+
}
423+
}
424+
408425
// For converted metafile images (EMF+/WMF+ placeholders), we want them to render
409426
// as block-level images, not inline. We use the original wrap type if available,
410427
// otherwise default to the original wrap settings.
@@ -416,8 +433,8 @@ export function handleImageNode(node, params, isAnchor) {
416433
// originalXml: carbonCopy(node),
417434
src: finalSrc,
418435
alt:
419-
isMetafileExtension(extension) && !wasConverted
420-
? 'Unable to render EMF/WMF image'
436+
(isMetafileExtension(extension) || isTiffExtension(extension)) && !wasConverted
437+
? 'Unable to render image'
421438
: docPr?.attributes?.name || 'Image',
422439
extension: finalExtension,
423440
// Store original path and extension for potential round-tripping
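
The convert-or-fall-back flow added to `handleImageNode` can be exercised in isolation with the sketch below. It is a simplified model, not the handler itself: `resolveTiffImage` is a hypothetical function, the `isTiffExtension` body is an assumption (its real implementation lives in `tiff-converter.js`, which is not shown here), and the converter is passed in as a parameter so the example runs without a canvas.

```javascript
// Assumed implementation; the real one is in tiff-converter.js (not shown in this diff).
const isTiffExtension = (ext) => ext === 'tif' || ext === 'tiff';

/**
 * Simplified model of the TIFF branch in handleImageNode.
 * `convert` plays the role of convertTiffToPng and is expected to return
 * { dataUri, format } on success or null on failure.
 */
function resolveTiffImage({ extension, mediaData, name, convert }) {
  let src = mediaData;
  let finalExtension = extension;
  let wasConverted = false;

  if (isTiffExtension(extension) && mediaData) {
    const result = convert(mediaData);
    if (result?.dataUri) {
      src = result.dataUri;
      finalExtension = result.format || 'png';
      wasConverted = true;
    }
  }

  // An unconverted TIFF gets a generic alt text so the failure is visible.
  const alt = isTiffExtension(extension) && !wasConverted ? 'Unable to render image' : name || 'Image';
  return { src, alt, extension: finalExtension };
}

// Successful conversion: src and extension switch to the PNG result.
console.log(
  resolveTiffImage({
    extension: 'tif',
    mediaData: 'data:image/tiff;base64,AAAA',
    name: 'Photo',
    convert: () => ({ dataUri: 'data:image/png;base64,fake', format: 'png' }),
  }),
);

// Failed conversion: the original extension is kept and the fallback alt text is used.
console.log(
  resolveTiffImage({
    extension: 'tif',
    mediaData: 'data:image/tiff;base64,AAAA',
    name: 'Photo',
    convert: () => null,
  }),
);
```

Keeping the original extension on failure matches the test in `encode-image-node-helpers.test.js`, which expects `extension` to remain `'tif'` when `convertTiffToPng` returns `null`.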

packages/super-editor/src/core/super-converter/v3/handlers/wp/helpers/encode-image-node-helpers.test.js

Lines changed: 38 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,7 @@ import { describe, it, expect, vi, beforeEach } from 'vitest';
22
import { handleImageNode, getVectorShape } from './encode-image-node-helpers.js';
33
import { emuToPixels, polygonToObj, rotToDegrees } from '@converter/helpers.js';
44
import { extractFillColor, extractStrokeColor, extractStrokeWidth, extractLineEnds } from './vector-shape-helpers.js';
5+
import { convertTiffToPng } from './tiff-converter.js';
56

67
vi.mock('@converter/helpers.js', async (importOriginal) => {
78
const actual = await importOriginal();
@@ -21,6 +22,14 @@ vi.mock('./vector-shape-helpers.js', () => ({
2122
extractCustomGeometry: vi.fn(),
2223
}));
2324

25+
vi.mock('./tiff-converter.js', async (importOriginal) => {
26+
const actual = await importOriginal();
27+
return {
28+
...actual,
29+
convertTiffToPng: vi.fn(actual.convertTiffToPng),
30+
};
31+
});
32+
2433
describe('handleImageNode', () => {
2534
beforeEach(() => {
2635
vi.clearAllMocks();
@@ -221,6 +230,34 @@ describe('handleImageNode', () => {
221230
expect(result.attrs.size).toEqual({ width: 5, height: 6 }); // emuToPixels mocked
222231
});
223232

233+
it('calls convertTiffToPng for .tif images', () => {
234+
convertTiffToPng.mockReturnValue({ dataUri: 'data:image/png;base64,fake', format: 'png' });
235+
const node = makeNode();
236+
const params = {
237+
...makeParams('media/photo.tif'),
238+
converter: { media: { 'word/media/photo.tif': 'data:image/tiff;base64,AAAA' } },
239+
};
240+
const result = handleImageNode(node, params, false);
241+
242+
expect(convertTiffToPng).toHaveBeenCalledWith('data:image/tiff;base64,AAAA');
243+
expect(result.attrs.src).toBe('data:image/png;base64,fake');
244+
expect(result.attrs.extension).toBe('png');
245+
});
246+
247+
it('returns alt text when convertTiffToPng returns null', () => {
248+
convertTiffToPng.mockReturnValue(null);
249+
const node = makeNode();
250+
const params = {
251+
...makeParams('media/photo.tif'),
252+
converter: { media: { 'word/media/photo.tif': 'data:image/tiff;base64,AAAA' } },
253+
};
254+
const result = handleImageNode(node, params, false);
255+
256+
expect(convertTiffToPng).toHaveBeenCalledWith('data:image/tiff;base64,AAAA');
257+
expect(result.attrs.alt).toBe('Unable to render image');
258+
expect(result.attrs.extension).toBe('tif');
259+
});
260+
224261
it('captures unhandled drawing children for passthrough preservation', () => {
225262
const node = makeNode();
226263
node.elements.push({
@@ -292,7 +329,7 @@ describe('handleImageNode', () => {
292329
const node = makeNode();
293330
const params = makeParams('media/pic.emf');
294331
const result = handleImageNode(node, params, false);
295-
expect(result.attrs.alt).toBe('Unable to render EMF/WMF image');
332+
expect(result.attrs.alt).toBe('Unable to render image');
296333
expect(result.attrs.extension).toBe('emf');
297334
});
298335
