Skip to content

security: always-on attribute + URL hardening; cap inline nesting#253

Merged
dereuromark merged 1 commit into
masterfrom
security/always-on-hardening
Jun 26, 2026
Merged

security: always-on attribute + URL hardening; cap inline nesting#253
dereuromark merged 1 commit into
masterfrom
security/always-on-hardening

Conversation

@dereuromark

Copy link
Copy Markdown
Contributor

Ports the confirmed default-output security hardening from the downstream carve-php fork.

Problem

All dangerous-content sanitization is gated on safe mode (if ($this->safeMode !== null) in HtmlRenderer), and safe mode is off by default. So a plain new DjotConverter() / new HtmlRenderer() emits, unfiltered:

  • javascript: / vbscript: / data: URLs in link href and image src
  • event-handler attributes (onclick, ...) and the injection sinks srcdoc / formaction
  • CSS expression() in style

Separately, deeply nested inline constructs (a bomb of nested links / emphasis) rescan balanced brackets at each recursion level - roughly O(n^2). A few thousand levels of nesting took ~10s of CPU, a parse-time DoS.

Change

  • HtmlRenderer - always-on baseline, independent of safe mode. Strip on* / srcdoc / formaction attribute names; blank a value carrying a dangerous URL scheme or a script-bearing CSS construct (expression() / url() / @import / behavior / -moz-binding). href / src run the dangerous-scheme denylist before the safe-mode allowlist. Scheme detection strips C0 controls + spaces first, so java\tscript: cannot evade. Safe mode still layers its stricter filtering (raw-HTML policy, style, allowlists) on top - this only changes the default (no-safe-mode) output, which previously did nothing.
  • InlineParser - cap inline recursion depth (100, far deeper than any real document). Beyond the cap the remaining text is emitted literally instead of recursing, bounding the O(n^2) blowup.
  • New AlwaysOnHardeningTest; two SafeModeTest cases that asserted dangerous URLs pass through with safe mode off were updated to the new baseline (the off-by-default distinction is now shown via raw-HTML passthrough).

Compatibility

This shifts the default rendering output: dangerous schemes / handlers that previously passed through are now neutralized even without safe mode. There is no legitimate reason to emit javascript: or onclick from untrusted markup, so the baseline is unconditional; callers who genuinely need a permissive renderer still control the rest via safe-mode configuration.

Scope notes

Confirmed-not-applicable and excluded: djot-php already drops a colliding {href=...} attribute-block override on links/images (no change needed). The HTML importer (HtmlToDjot) carrying javascript: through on import is a separate surface, left for a follow-up.

Dangerous content was only neutralized when safe mode was explicitly enabled,
but safe mode is off by default - so a plain HtmlRenderer emitted javascript:
hrefs, on* handlers, srcdoc/formaction and CSS expression() unfiltered. Apply a
baseline that runs regardless of safe mode; safe mode still layers stricter
filtering (raw-HTML policy, style, allowlists) on top.

- HtmlRenderer: strip on*/srcdoc/formaction attribute names and blank values
  carrying a dangerous URL scheme (javascript/vbscript/data/file) or CSS
  expression()/url()/@import/behavior, always-on. href/src get the dangerous
  scheme denylist before the safe-mode allowlist. Scheme detection normalizes
  C0 controls + spaces to defeat java\tscript: evasion.
- InlineParser: cap inline recursion (deeply nested links/emphasis rescanned
  balanced brackets at each level, ~O(n^2) - a few thousand levels took ~10s);
  beyond the cap the remaining text is emitted literally.
- SafeModeTest: the two cases that asserted dangerous URLs pass through with
  safe mode off now assert the always-on baseline, demonstrating the off-by-
  default distinction via raw-HTML passthrough instead.
@codecov

codecov Bot commented Jun 24, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 87.93103% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.32%. Comparing base (78f8aec) to head (445ec64).

Files with missing lines Patch % Lines
src/Renderer/HtmlRenderer.php 86.00% 7 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #253      +/-   ##
============================================
- Coverage     92.34%   92.32%   -0.02%     
- Complexity     3593     3617      +24     
============================================
  Files           107      107              
  Lines         10165    10221      +56     
============================================
+ Hits           9387     9437      +50     
- Misses          778      784       +6     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@dereuromark dereuromark merged commit f066e5f into master Jun 26, 2026
4 of 6 checks passed
@dereuromark dereuromark deleted the security/always-on-hardening branch June 26, 2026 01:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant