Upstream sanitizer api #12395

Open
noamr wants to merge 71 commits into whatwg:main from noamr:zcorpan/upstream-sanitizer-api

@noamr
Collaborator

@noamr noamr commented Apr 21, 2026

Convert the incubated spec in https://wicg.github.io/sanitizer-api/ to the HTML format and make it part of the HTML standard.

(See WHATWG Working Mode: Changes for more details.)


/canvas.html ( diff )
/comms.html ( diff )
/dom.html ( diff )
/dynamic-markup-insertion.html ( diff )
/edits.html ( diff )
/embedded-content-other.html ( diff )
/form-elements.html ( diff )
/forms.html ( diff )
/grouping-content.html ( diff )
/iframe-embed-object.html ( diff )
/image-maps.html ( diff )
/imagebitmap-and-animations.html ( diff )
/index.html ( diff )
/indices.html ( diff )
/infrastructure.html ( diff )
/interaction.html ( diff )
/interactive-elements.html ( diff )
/microdata.html ( diff )
/parsing.html ( diff )
/references.html ( diff )
/rendering.html ( diff )
/scripting.html ( diff )
/sections.html ( diff )
/semantics.html ( diff )
/system-state.html ( diff )
/tables.html ( diff )
/text-level-semantics.html ( diff )
/timers-and-user-prompts.html ( diff )
/web-messaging.html ( diff )
/webstorage.html ( diff )
/workers.html ( diff )

@noamr noamr marked this pull request as draft April 21, 2026 13:16
@noamr noamr changed the base branch from zcorpan/upstream-sanitizer-api to main April 21, 2026 13:17
@noamr noamr changed the title WIP upstream sanitizer api Upstream sanitizer api Apr 21, 2026
@noamr noamr force-pushed the zcorpan/upstream-sanitizer-api branch from 223a4d1 to d2034e5 Compare April 21, 2026 19:42
@noamr
Collaborator Author

noamr commented Apr 21, 2026

@zcorpan @evilpie @mozfreddyb @otherdaniel
initial review? :)
this is quite a big PR...

@evilpie
Member

evilpie commented Apr 22, 2026

Amazing, thanks for working on this.

The built-in safe default configuration is pretty integral to the API, where did it go?

For anyone else looking at this, the gist of the changes is in dynamic-markup-insertion.html.

@noamr
Collaborator Author

noamr commented Apr 22, 2026

> Amazing, thanks for working on this.
>
> The built-in safe default configuration is pretty integral to the API, where did it go?

Oh you're right I had it on my todo list and forgot. Getting to it. Thanks!

Comment thread source
Comment thread source Outdated
@noamr
Collaborator Author

noamr commented Apr 22, 2026

> Amazing, thanks for working on this.
> The built-in safe default configuration is pretty integral to the API, where did it go?
>
> Oh you're right I had it on my todo list and forgot. Getting to it. Thanks!

Done.

@noamr noamr marked this pull request as ready for review April 22, 2026 10:56
Comment thread source Outdated
@annevk
Member

annevk commented Apr 22, 2026

I thought as part of moving this into the HTML standard we'd also address the parser integration issue?

@noamr
Collaborator Author

noamr commented Apr 23, 2026

> I thought as part of moving this into the HTML standard we'd also address the parser integration issue?

This is a huge PR, so I thought doing it in two stages, the first one being a purely technical upstream, would be easier to review.

I'm open to, and happy with, incorporating the stream-while-parsing changes in this PR if you and @zcorpan are ok to review that in one go.

@noamr
Collaborator Author

noamr commented Apr 23, 2026

@zcorpan @annevk can we align on whether we upstream the sanitizer as is and then change it to stream-while-parsing, or do it in one go? I'm perfectly happy with both options.

@noamr noamr added the agenda+ To be discussed at a triage meeting label Apr 23, 2026
@zcorpan
Member

zcorpan commented Apr 23, 2026

I prefer doing the parser integration in a follow-up PR.

Comment thread source Outdated
@noamr noamr removed the agenda+ To be discussed at a triage meeting label Apr 23, 2026
@evilpie
Member

evilpie commented Apr 23, 2026

@noamr
Collaborator Author

noamr commented Apr 23, 2026

> I think these three PRs would be good to merge before merging into the HTML standard:

Since some security sensitive changes rely on "sanitizing while parsing", and that in turn relies on the current post-processing sanitizer being upstreamed, I don't think we should delay upstreaming any further.

Can we race it? If any of these go in before the upstream PR is in I'll incorporate them into the HTML PR.
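
For readers following the "post-processing" versus "sanitizing while parsing" distinction in this thread, here is a minimal Python sketch of the post-processing shape only: the tree is fully built first and then walked. The `Node` class, the `sanitize` helper and the allow-list are assumptions made for illustration and are not the algorithm in the PR.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Tiny stand-in for a DOM node: just a name and child nodes."""
    name: str
    children: list["Node"] = field(default_factory=list)


def sanitize(node: Node, allowed: set[str]) -> None:
    """Post-processing pass: runs on a tree the parser has already built."""
    kept = []
    for child in node.children:
        if child.name not in allowed:
            continue  # drop the element together with its subtree
        sanitize(child, allowed)
        kept.append(child)
    node.children = kept


# The "parser output" is constructed by hand here.
fragment = Node("#fragment", [Node("p", [Node("script"), Node("b")]), Node("iframe")])
sanitize(fragment, allowed={"p", "b"})
print([c.name for c in fragment.children])              # ['p']
print([c.name for c in fragment.children[0].children])  # ['b']
```

A stream-while-parsing design would make the same keep-or-drop decision at the moment the parser creates each element, rather than in a second pass over the finished tree.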

@noamr noamr closed this Apr 23, 2026
@noamr noamr reopened this Apr 23, 2026
Comment thread source Outdated
data-x="dom-SanitizerProcessingInstruction-target">target</code> member.</p>
</div>

<div algorithm>
Collaborator Author

These algorithms look like they belong in infra... would people be open to adding an optional comparator predicate to those, or to the definition of list/ordered set?

@annevk @zcorpan

Member

Moving this to Infra SGTM.

Collaborator Author

Opened whatwg/infra#709 for now.
I'm not sure about the comparator thing - infra doesn't really say what it means that two items in a list are the same. Would it be enough to mention here what makes items of attributes/elements/... lists "equal"?
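
As a rough illustration of the "optional comparator predicate" idea floated in this thread, here is a minimal Python sketch. It is not Infra's wording: the helper names `contains`, `append_if_absent` and `same_element`, and the dictionary shape with `name` and `namespace` keys, are assumptions made for this example only.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def contains(items: list[T], item: T, same: Callable[[T, T], bool]) -> bool:
    """Return True if any entry is 'the same' as item under the predicate."""
    return any(same(existing, item) for existing in items)


def append_if_absent(items: list[T], item: T, same: Callable[[T, T], bool]) -> bool:
    """Ordered-set style append: add item only if no equal entry exists."""
    if contains(items, item, same):
        return False
    items.append(item)
    return True


def same_element(a: dict, b: dict) -> bool:
    # Hypothetical equality rule: compare name and namespace only,
    # ignoring any per-element attribute lists.
    return a["name"] == b["name"] and a["namespace"] == b["namespace"]


HTML_NS = "http://www.w3.org/1999/xhtml"
elements = [{"name": "p", "namespace": HTML_NS, "attributes": ["id"]}]
added = append_if_absent(elements, {"name": "p", "namespace": HTML_NS}, same_element)
print(added, len(elements))  # False 1
```

The alternative raised in the comment, stating once what makes two list items "equal", corresponds to fixing `same_element` in the prose instead of passing it as a parameter.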

Contributor

@otherdaniel otherdaniel left a comment

Thank you, and I'm super happy to see this happening!


I wonder if we can link to the "Security Considerations" section in the current spec; or have them in a supplementary document somewhere?

@noamr noamr force-pushed the zcorpan/upstream-sanitizer-api branch from ea79a5b to 1e065df Compare April 28, 2026 13:07
@noamr
Collaborator Author

noamr commented Apr 28, 2026

> Thank you, and I'm super happy to see this happening!
>
> I wonder if we can link to the "Security Considerations" section in the current spec; or have them in a supplementary document somewhere?

I've upstreamed them instead into a security considerations subsection.

@noamr
Collaborator Author

noamr commented Apr 30, 2026

I've refactored some of the sanitization constants to go into each element's definition instead of being in one huge table. I think that makes it less error-prone when we add new elements in the future. If that's undesirable, I'm happy to revert.

Comment thread source Outdated
<dd>« »</dd>

<dt><code data-x="dom-SanitizerConfig-attributes">attributes</code></dt>
<dd>all <span>global attributes</span>, alongside the MathML and SVG presentation attributes</dd>
Member

Comment thread source Outdated
@@ -16164,6 +16266,7 @@ interface <dfn interface>HTMLTitleElement</dfn> : <span>HTMLElement</span> {
data-x="concept-element-accessibility-considerations">Accessibility considerations</span>:</dt>
<dd><a href="https://w3c.github.io/html-aria/#el-base">For authors</a>.</dd>
<dd><a href="https://w3c.github.io/html-aam/#el-base">For implementers</a>.</dd>
<dd><span>Navigating URL attributes</span>: <code data-x="attr-base-href">href</code>.</dd>
Member

Missing
<dt><span data-x="concept-element-sanitization">Safe sanitization</span>:</dt>

Same issue for some other elements.

Collaborator Author

@noamr noamr May 20, 2026

Do we need that for all elements? I only have it for removed/included by default elements

and there is this phrase:

  <p>When specified, the <dfn data-x="concept-element-sanitization">safe sanitization</dfn> criteria
  for each element defines whether the element is <dfn data-x="sanitizer-removed">removed</dfn> or
  <dfn data-x="sanitizer-included-by-default">included by default</dfn> when performing safe
  sanitization. When unspecified, the element is not included by default, but can still be added by
  a <code>SanitizerConfig</code></p>

Should I define "allowed" as a 3rd option and define all of these?
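
One way to picture the outcomes discussed in this comment (removed, included by default, or unspecified but still addable through a `SanitizerConfig`) is the small Python model below. It is a simplified sketch: the enum, the `SAFE_SANITIZATION` table with its two sample entries, and the `is_kept` helper are assumptions for illustration, not the classification or algorithm in the PR.

```python
from enum import Enum
from typing import Optional


class SafeSanitization(Enum):
    REMOVED = "removed"
    INCLUDED_BY_DEFAULT = "included by default"


# Hypothetical sample entries; per the PR, the real values are stated in
# each element's own definition.
SAFE_SANITIZATION: dict[str, SafeSanitization] = {
    "script": SafeSanitization.REMOVED,
    "p": SafeSanitization.INCLUDED_BY_DEFAULT,
}


def is_kept(name: str, config_elements: Optional[set[str]] = None) -> bool:
    """Decide whether safe sanitization keeps an element.

    config_elements models an author-supplied allow-list; None means
    "use the built-in default".
    """
    criteria = SAFE_SANITIZATION.get(name)  # None means "unspecified"
    if criteria is SafeSanitization.REMOVED:
        return False  # never kept, even if the author's list names it
    if config_elements is None:
        return criteria is SafeSanitization.INCLUDED_BY_DEFAULT
    return name in config_elements


print(is_kept("p"))                   # True
print(is_kept("dialog"))              # False: unspecified, no config
print(is_kept("dialog", {"dialog"}))  # True: added by the author's config
print(is_kept("script", {"script"}))  # False
```

In this model an explicit "allowed" value would be a third enum member that behaves like the unspecified case, which is why the question of whether to spell it out is largely editorial.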

Member

Not needed for all elements, but as is now this line is under "accessibility considerations" which is not correct.

Collaborator Author

Fixed

Comment thread source Outdated
steps">sanitize</span> <var>child</var>'s <span>shadow root</span> given
<var>configuration</var> and <var>handleJavascriptNavigationUrls</var>.</p></li>

<li><p>Let <var>elementWithLocalAttributes</var> be « ».</p></li>
Member

Should be «[ ]», right?

Collaborator Author

Fixed
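
For context on the notation in this thread: in Infra, « » denotes an empty list and «[ ]» an empty ordered map. A rough Python analogy follows, assuming (as the review comment implies) that the variable is later read by key. The key name `"attributes"` is used here only as an illustration.

```python
empty_list: list = []   # Infra « »: positional items only
empty_map: dict = {}    # Infra «[ ]»: key/value entries

# Key lookups are well-defined on the map...
local_attributes = empty_map.get("attributes", [])
print(local_attributes)  # []

# ...but not on the list, which only supports integer positions.
try:
    empty_list["attributes"]
except TypeError as error:
    print(type(error).__name__)  # TypeError
```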

Comment thread source Outdated
Comment on lines +23173 to +23175
<dd><span>Navigating URL attributes</span>: <code data-x="attr-hyperlink-href">href</code>, <code
data-x="attr-hyperlink-hreflang">hreflang</code>, <code
data-x="attr-hyperlink-type">type</code>.</dd>
Member

hreflang and type should not be listed here.

Collaborator Author

Fixed
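
The reason only `href` belongs in that list can be sketched as follows: a `javascript:` check is meaningful only for attributes whose value is a URL that can be navigated to. This Python sketch is a simplified stand-in, since real user agents use the URL Standard parser. The `NAVIGATING_URL_ATTRIBUTES` table holds sample entries only, and both helper names are assumptions for this example.

```python
# Leading/trailing C0 control characters and space (U+0000..U+0020) are
# stripped, and tabs/newlines are removed, before looking at the scheme.
C0_CONTROL_OR_SPACE = "".join(chr(c) for c in range(0x21))
TAB_OR_NEWLINE = str.maketrans("", "", "\t\n\r")

# Hypothetical sample table: attributes that can trigger navigation.
NAVIGATING_URL_ATTRIBUTES = {"a": {"href"}, "area": {"href"}, "base": {"href"}}


def is_javascript_url(value: str) -> bool:
    """Approximate check for a javascript: scheme (not a full URL parse)."""
    cleaned = value.strip(C0_CONTROL_OR_SPACE).translate(TAB_OR_NEWLINE)
    return cleaned.lower().startswith("javascript:")


def should_drop_attribute(element: str, attribute: str, value: str) -> bool:
    """Drop only navigating URL attributes that carry a javascript: URL."""
    if attribute not in NAVIGATING_URL_ATTRIBUTES.get(element, set()):
        return False
    return is_javascript_url(value)


print(should_drop_attribute("a", "href", " JavaScript:alert(1)"))      # True
print(should_drop_attribute("a", "hreflang", "javascript:alert(1)"))   # False
print(should_drop_attribute("a", "href", "https://example.com/"))      # False
```

`hreflang` and `type` hold a language tag and a MIME type rather than a URL, so they never reach the scheme check.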

Comment thread source
Comment on lines +128651 to +128652
<h5 id="sanitizer-security-script-gadgets">XSS with script gadgets</h5>

Member

Non-normative?

Collaborator Author

Fixed

Comment thread source Outdated
framework which then performs the execution of JavaScript based on that input.</p>

<p>The Sanitizer API can not prevent these attacks, but requires page authors to explicitly allow
unknown elements in general, and authors required to additionally explicitly configure unknown
Member

"required" is a normative keyword (and "authors required" is not grammatically correct).

Collaborator Author

Reworded

Comment thread source Outdated
Comment thread source Outdated
<tr>
<td><code data-x="SVG defs">defs</code>
<td><span data-x="SVG namespace">SVG</span>
<td> <tr>
Member

Missing linebreak (also in more places in this table)

Collaborator Author

Fixed

@noamr noamr force-pushed the zcorpan/upstream-sanitizer-api branch from 5adaf4c to 6ba99b9 Compare May 20, 2026 20:40

5 participants