security: always-on attribute + URL hardening; cap inline nesting by dereuromark · Pull Request #253 · php-collective/djot-php

dereuromark · 2026-06-24T23:25:56Z

Ports the confirmed default-output security hardening from the downstream carve-php fork.

Problem

All dangerous-content sanitization is gated on safe mode (if ($this->safeMode !== null) in HtmlRenderer), and safe mode is off by default. So a plain new DjotConverter() / new HtmlRenderer() emits, unfiltered:

javascript: / vbscript: / data: URLs in link href and image src
event-handler attributes (onclick, ...) and the injection sinks srcdoc / formaction
CSS expression() in style

Separately, deeply nested inline constructs (a bomb of nested links / emphasis) rescan balanced brackets at each recursion level - roughly O(n^2). A few thousand levels of nesting took ~10s of CPU, a parse-time DoS.

Change

HtmlRenderer - always-on baseline, independent of safe mode. Strip on* / srcdoc / formaction attribute names; blank a value carrying a dangerous URL scheme or a script-bearing CSS construct (expression() / url() / @import / behavior / -moz-binding). href / src run the dangerous-scheme denylist before the safe-mode allowlist. Scheme detection strips C0 controls + spaces first, so java\tscript: cannot evade. Safe mode still layers its stricter filtering (raw-HTML policy, style, allowlists) on top - this only changes the default (no-safe-mode) output, which previously did nothing.
InlineParser - cap inline recursion depth (100, far deeper than any real document). Beyond the cap the remaining text is emitted literally instead of recursing, bounding the O(n^2) blowup.
New AlwaysOnHardeningTest; two SafeModeTest cases that asserted dangerous URLs pass through with safe mode off were updated to the new baseline (the off-by-default distinction is now shown via raw-HTML passthrough).

Compatibility

This shifts the default rendering output: dangerous schemes / handlers that previously passed through are now neutralized even without safe mode. There is no legitimate reason to emit javascript: or onclick from untrusted markup, so the baseline is unconditional; callers who genuinely need a permissive renderer still control the rest via safe-mode configuration.

Scope notes

Confirmed-not-applicable and excluded: djot-php already drops a colliding {href=...} attribute-block override on links/images (no change needed). The HTML importer (HtmlToDjot) carrying javascript: through on import is a separate surface, left for a follow-up.

Dangerous content was only neutralized when safe mode was explicitly enabled, but safe mode is off by default - so a plain HtmlRenderer emitted javascript: hrefs, on* handlers, srcdoc/formaction and CSS expression() unfiltered. Apply a baseline that runs regardless of safe mode; safe mode still layers stricter filtering (raw-HTML policy, style, allowlists) on top. - HtmlRenderer: strip on*/srcdoc/formaction attribute names and blank values carrying a dangerous URL scheme (javascript/vbscript/data/file) or CSS expression()/url()/@import/behavior, always-on. href/src get the dangerous scheme denylist before the safe-mode allowlist. Scheme detection normalizes C0 controls + spaces to defeat java\tscript: evasion. - InlineParser: cap inline recursion (deeply nested links/emphasis rescanned balanced brackets at each level, ~O(n^2) - a few thousand levels took ~10s); beyond the cap the remaining text is emitted literally. - SafeModeTest: the two cases that asserted dangerous URLs pass through with safe mode off now assert the always-on baseline, demonstrating the off-by- default distinction via raw-HTML passthrough instead.

codecov · 2026-06-24T23:27:22Z

Codecov Report

❌ Patch coverage is 87.93103% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.32%. Comparing base (78f8aec) to head (445ec64).

Files with missing lines	Patch %	Lines
src/Renderer/HtmlRenderer.php	86.00%	7 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master     #253      +/-   ##
============================================
- Coverage     92.34%   92.32%   -0.02%     
- Complexity     3593     3617      +24     
============================================
  Files           107      107              
  Lines         10165    10221      +56     
============================================
+ Hits           9387     9437      +50     
- Misses          778      784       +6

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

dereuromark merged commit f066e5f into master Jun 26, 2026
4 of 6 checks passed

dereuromark deleted the security/always-on-hardening branch June 26, 2026 01:59

dereuromark mentioned this pull request Jun 27, 2026

security: harden URL schemes in non-HTML renderers #255

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

security: always-on attribute + URL hardening; cap inline nesting#253

security: always-on attribute + URL hardening; cap inline nesting#253
dereuromark merged 1 commit into
masterfrom
security/always-on-hardening

dereuromark commented Jun 24, 2026

Uh oh!

codecov Bot commented Jun 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Uh oh!

Conversation

dereuromark commented Jun 24, 2026

Problem

Change

Compatibility

Scope notes

Uh oh!

codecov Bot commented Jun 24, 2026

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant