Pr fix docx pdf and drop zone modes #23

Merged
MrChengLen merged 4 commits into main from pr-fix-docx-pdf-and-drop-zone-modes on May 8, 2026
Conversation

@MrChengLen
Owner

No description provided.

MrChengLen and others added 4 commits May 8, 2026 15:01
The previous DOCX → PDF converter imported docx2pdf, which is not in
requirements.txt and crashed at runtime on every container deployment
(docx2pdf requires Microsoft Word installed locally on Windows or
LibreOffice on Linux — neither is available in the FileMorph container).

Replaces it with a fully-Python pipeline:
  mammoth.convert_to_html(docx) → weasyprint.HTML(...).write_pdf()

mammoth inlines images as data: URIs and reduces footnotes/headers/OLE
to simplified blocks; WeasyPrint renders the resulting HTML to PDF
with the same `_deny_url_fetcher` SSRF guard the markdown→PDF path uses.
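
A minimal sketch of the pipeline described above, not the project's actual code. `deny_remote_fetcher` and `docx_to_pdf` are hypothetical names; the real `_deny_url_fetcher` is not shown in this PR and may behave differently. The sketch assumes that `data:` URIs must stay resolvable (mammoth inlines images that way) and that every other scheme is refused. The dict return shape follows WeasyPrint's custom `url_fetcher` contract as documented for recent releases; newer versions may expect something else.

```python
import urllib.request


def deny_remote_fetcher(url, *args, **kwargs):
    """Hypothetical stand-in for the project's `_deny_url_fetcher`.

    Resolves inline data: URIs locally and refuses everything else,
    so rendering cannot trigger outbound network requests.
    """
    if url.lower().startswith("data:"):
        # urllib decodes data: URIs in-process; no socket is opened.
        with urllib.request.urlopen(url) as response:
            return {
                "string": response.read(),
                "mime_type": response.headers.get_content_type(),
            }
    raise ValueError(f"blocked fetch of non-data URL: {url!r}")


def docx_to_pdf(docx_path, pdf_path):
    """Convert a DOCX file to PDF via mammoth (DOCX to HTML) and WeasyPrint."""
    # Imported lazily so this module still loads where the native
    # WeasyPrint dependencies are missing.
    import mammoth
    from weasyprint import HTML

    with open(docx_path, "rb") as docx_file:
        result = mammoth.convert_to_html(docx_file)

    # mammoth returns an HTML fragment; wrap it in a full document.
    html = (
        "<!doctype html><html><head><meta charset='utf-8'></head>"
        f"<body>{result.value}</body></html>"
    )
    HTML(string=html, url_fetcher=deny_remote_fetcher).write_pdf(pdf_path)
```

Raising from the fetcher is a deliberate choice in this sketch: WeasyPrint logs a failed resource fetch and continues rendering, so a blocked remote image degrades to a missing image rather than a failed conversion.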

Tests cover happy path (text content survives), table layout (rows
render), and SSRF block (no outbound network during conversion). Tests
are skipped when WeasyPrint native deps (libgobject/pango) are missing
— typical on Windows dev hosts; CI and the Linux container do install
them so the gates fire there.
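
One way to express the skip condition, as a sketch only: the PR does not show how the real tests detect missing native dependencies. It assumes that importing `weasyprint` raises `OSError` when libgobject/pango cannot be loaded, and it uses `unittest` skip decorators, which pytest also honors. `weasyprint_usable` and the test class are hypothetical names.

```python
import unittest


def weasyprint_usable() -> bool:
    """Return True if WeasyPrint and its native libraries can be imported."""
    try:
        import weasyprint  # noqa: F401
    except (ImportError, OSError):
        # ImportError: package not installed.
        # OSError: package installed, but a native library failed to load.
        return False
    return True


WEASYPRINT_USABLE = weasyprint_usable()


@unittest.skipUnless(
    WEASYPRINT_USABLE, "WeasyPrint native dependencies are not installed"
)
class DocxToPdfSmokeTest(unittest.TestCase):
    def test_html_renders_to_pdf_bytes(self):
        from weasyprint import HTML

        pdf_bytes = HTML(string="<p>hello</p>").write_pdf()
        self.assertTrue(pdf_bytes.startswith(b"%PDF"))
```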

Adds `mammoth>=1.6` to requirements.txt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The homepage drop-zone advertised "Supported: HEIC · JPG · PNG · WebP
· BMP · TIFF · GIF · DOCX · PDF · TXT · MD · XLSX · CSV · JSON · MP4
· MOV · AVI · MKV · WebM · MP3 · WAV · FLAC · OGG · M4A" regardless
of the mode toggle, even though the compress path only accepts a much
narrower subset (JPG/PNG/WebP/TIFF + MP4/AVI/MOV/MKV/WebM). Users would
upload an MP3 in compress mode, see a 422, and bounce.

Adds two side-by-side hint blocks in index.html (`supported-convert`,
`supported-compress`) and toggles `hidden` from app.js::setMode based
on the active mode. Server-side validation is unchanged — this is
purely a UI affordance to set expectations correctly before upload.

New tests/test_homepage_drop_zone_modes.py asserts both blocks are
present with the expected `hidden` defaults so a future template
refactor can't silently revert this.
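
A sketch of what such a regression test could look like, using only the standard library. It is not the contents of tests/test_homepage_drop_zone_modes.py. It assumes `supported-convert` and `supported-compress` are element `id`s, that convert is the default mode (so the compress hint starts hidden), and it parses an invented HTML sample rather than the rendered template.

```python
from html.parser import HTMLParser


class HiddenAttrCollector(HTMLParser):
    """Record, per wanted id, whether the element carries `hidden`."""

    def __init__(self, wanted_ids):
        super().__init__()
        self.wanted_ids = set(wanted_ids)
        self.hidden_by_id = {}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        element_id = attr_map.get("id")
        if element_id in self.wanted_ids:
            # Boolean attributes arrive as ("hidden", None).
            self.hidden_by_id[element_id] = "hidden" in attr_map


def hidden_defaults(html, wanted_ids):
    collector = HiddenAttrCollector(wanted_ids)
    collector.feed(html)
    return collector.hidden_by_id


# Invented markup for illustration; the real index.html may differ.
SAMPLE_HTML = """
<div id="drop-zone">
  <p id="supported-convert">Supported: HEIC · JPG · PNG</p>
  <p id="supported-compress" hidden>Supported: JPG · PNG · WebP · TIFF</p>
</div>
"""


def test_both_hint_blocks_present_with_expected_defaults():
    defaults = hidden_defaults(
        SAMPLE_HTML, ["supported-convert", "supported-compress"]
    )
    # A missing block drops its key, so equality also checks presence.
    assert defaults == {"supported-convert": False, "supported-compress": True}
```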

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… troubleshooting

Pairs with the converter rewrite in fb7da9f^..ab796ef. The previous
formats.md row promised "Requires Microsoft Word on Windows, or
LibreOffice on Linux" which is no longer true — the pure-Python
pipeline runs the same on every host. Replaces the row text and
the long-form notes section, and removes the now-stale Linux
LibreOffice troubleshooting block from installation.md so a self-
hoster doesn't waste an apt install on a dead path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The drop-zone-mode commit (453cc96) added two new format-list strings
to index.html — `_('Supported: JPG · PNG · WebP · TIFF')` and
`_('MP4 · MOV · AVI · MKV · WebM')` — but the locale catalogs were
not regenerated. CI's i18n drift-check caught it: pot still mirrored
main's 608-line state without the new keys.

This commit:
- Runs `pybabel extract` to refresh `locale/messages.pot` (now 624 lines).
- Runs `pybabel update` to add the two msgids to `de.po` and `en.po`.
- Fixes Babel's fuzzy auto-match (it pulled the long-format DE string
  into the short-format slot) and writes the correct translations:
  "MP4 · MOV · AVI · MKV · WebM" → "MP4 · MOV · AVI · MKV · WebM"
  "Supported: JPG · PNG · WebP · TIFF" → "Unterstützt: JPG · PNG · WebP · TIFF"
- Re-compiles `.mo` files (185/185 DE translated).
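
The fuzzy auto-match in the third step is easy to miss in review. A small standard-library sketch for listing entries that Babel flagged as fuzzy might look like this. `fuzzy_msgids` is a hypothetical helper, not part of Babel or of this repository; the sample catalog text is invented, and the parser ignores string escapes and obsolete (`#~`) entries.

```python
import re


def fuzzy_msgids(po_text: str) -> list[str]:
    """Return the msgids of all entries flagged `fuzzy` in a .po file."""
    found = []
    # Entries in a .po file are separated by blank lines.
    for block in re.split(r"\n\s*\n", po_text.strip()):
        lines = block.splitlines()

        flags = set()
        for line in lines:
            if line.startswith("#,"):
                flags.update(flag.strip() for flag in line[2:].split(","))
        if "fuzzy" not in flags:
            continue

        # Collect the msgid, including continuation lines.
        parts = []
        in_msgid = False
        for line in lines:
            if line.startswith("msgid "):
                in_msgid = True
                parts.append(line[len("msgid "):])
            elif in_msgid and line.startswith('"'):
                parts.append(line)
            else:
                in_msgid = False

        msgid = "".join(part.strip()[1:-1] for part in parts)
        if msgid:  # the catalog header has an empty msgid; skip it
            found.append(msgid)
    return found


# Invented sample: one fuzzy entry with a mismatched translation.
SAMPLE_PO = '''#: index.html
#, fuzzy
msgid "MP4 · MOV · AVI · MKV · WebM"
msgstr "Unterstützt: ..."

#: index.html
msgid "Supported: JPG · PNG · WebP · TIFF"
msgstr "Unterstützt: JPG · PNG · WebP · TIFF"
'''

print(fuzzy_msgids(SAMPLE_PO))
```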

Verified: drift-check exit 0, pytest 465 passed, ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MrChengLen merged commit e54f726 into main on May 8, 2026
4 checks passed