Skip to content

Icons Registry: improve SVG sanitizer#75550

Open
t-hamano wants to merge 7 commits into
trunkfrom
icon-registry-sanitizer
Open

Icons Registry: improve SVG sanitizer#75550
t-hamano wants to merge 7 commits into
trunkfrom
icon-registry-sanitizer

Conversation

@t-hamano
Copy link
Copy Markdown
Contributor

@t-hamano t-hamano commented Feb 15, 2026

What?

In the experimental icon registry, any string can be registered as SVG content. Unfortunately, there is still no core function to properly sanitize SVGs, so we've enhanced the private methods for sanitization and defined as many SVG-friendly tags and attributes as possible.

Testing Instructions

Please refer to the unit test to see what the function expects.

@t-hamano t-hamano self-assigned this Feb 15, 2026
@github-project-automation github-project-automation Bot moved this to 🔎 Needs Review in WordPress 7.0 Editor Tasks Feb 15, 2026
@t-hamano t-hamano added [Type] Experimental Experimental feature or API. [Block] Icon Affects the Icon block labels Feb 15, 2026
@t-hamano t-hamano force-pushed the icon-registry-sanitizer branch from e537f45 to 7bdd4c1 Compare February 15, 2026 08:58
Comment on lines +748 to +750
if ( ! preg_match( '/<svg\b/i', $filtered_content ) ) {
return '';
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wp_kses() removes disallowed tags but not the text inside them, so I added a logic that returns an empty string if the tag is not an SVG element.

Comment on lines +66 to +67
'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="24" height="24" class="icon" aria-hidden="true"><path d="M0 0" fill="currentColor" /></svg>',
'<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24" width="24" height="24" class="icon" aria-hidden="true"><path d="M0 0" fill="currentColor" /></svg>',
Copy link
Copy Markdown
Contributor Author

@t-hamano t-hamano Feb 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was surprised to see wp_kses() convert camel case attribute names (viewBox) to lower case (viewbox). Lower-case attribute names should be recognized correctly by most browsers, but I'm not sure if this should be allowed.

cc @dmsnell @sirreal

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I've tested, lowercase viewbox is automatically recognized as viewBox by major browsers, so as long as we're using inline SVG as HTML content, we shouldn't have any problems.

<!DOCTYPE html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Document</title>
</head>
<body>

<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100" width="100">
  <rect x="0" y="0" width="100%" height="100%" />
  <circle cx="50%" cy="50%" r="4" fill="white" />
</svg>
<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 100 100" width="100">
  <rect x="0" y="0" width="100%" height="100%" />
  <circle cx="50%" cy="50%" r="4" fill="white" />
</svg>

</body>
</html>
viewbox

@t-hamano t-hamano changed the title Icons Regstry: improveme SVG sanitizer Icons Regstry: improve SVG sanitizer Feb 15, 2026
@t-hamano t-hamano marked this pull request as ready for review February 15, 2026 09:09
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Feb 15, 2026

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: t-hamano <wildworks@git.wordpress.org>
Co-authored-by: tyxla <tyxla@git.wordpress.org>
Co-authored-by: dmsnell <dmsnell@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@mcsf mcsf changed the title Icons Regstry: improve SVG sanitizer Icons Registry: improve SVG sanitizer Feb 17, 2026
@dmsnell
Copy link
Copy Markdown
Member

dmsnell commented Feb 20, 2026

@t-hamano this seems like a good improvement since it retains the structure of the existing sanitization while adding extra attributes. did you verify that none of the newly-allowed attributes allows for script execution?

one note here is that SVG is complicated to parse, and even the search for <svg\b seems a bit suspicious. the HTML Processor can parse all valid inline SVGs, and one thing that might be useful here is to prepare the input using that and then dump the output into kses after normalization, where things that seem sneaky might be less prone to issues in kses.

$svg = '';
$processor = WP_HTML_Processor::create_fragment( $icon_content );
if ( ! $processor->next_token() || 'SVG' !== $processor->get_tag() ) {
	return '';
}

$svg .= $processor->serialize_token();
$svg_depth = $processor->get_current_depth();
while ( $processor->next_token() && $processor->get_current_depth() > $svg_depth ) {
	$svg .= $processor->serialize_token();
}

// Parsing has stopped even though there might be additional content.
$filtered = wp_kses( ... )

This could also be done after filtering with wp_kses() as it’s likely to corrupt the SVG, but normalizing first should help avoid a lot of problems. For instance, wp_kses() will be unaware of this but a single <p> terminates the SVG element. In fact, many tags that appear within inline SVG will end it and jump back into HTML. If we first extract the SVG in its entirety with the HTML Processor then we wouldn’t have to consider this.

Unfortunately there could be issues with self-closing tags, which exist within the SVG content but otherwise not in the HTML. In HTML, <svg/> is an entire SVG element, even though HTML contains no self-closing tags (the <svg/> tag is actually an SVG tag so the self-closing rule applies).

sticking with wp_kses() is appropriate apart from all of this. I have proposed dmsnell/wordpress-develop#20 and several other changes that will feed up into wp_kses() (WordPress/wordpress-develop#7409, WordPress/wordpress-develop#9270, and WordPress/wordpress-develop#9851 for example), so ultimately that should become more reliable.

for gradual improvement I think this is good as long as we have a thorough review of the attributes and tags we’re allowing.

* @return string The sanitized icon SVG content.
*/
private function sanitize_icon_content( $icon_content ) {
// Core attributes applicable to most elements.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be using the elements and attributes from my earlier comment, can you confirm?

Would be nice to validate those, as I haven't gone in-depth in that, and there are plenty of elements and attributes that SVGs can support. Also we know that any KSES-related changes shouldn't be taken lightly 😉

@t-hamano
Copy link
Copy Markdown
Contributor Author

Sorry for the late reply. I'm working with @mcsf to build the infrastructure to allow the Icon API to be extended on the Gutenberg plugin. If that works out, I'll come back to this PR.

@t-hamano t-hamano added [Type] Enhancement A suggestion for improvement. and removed [Type] Experimental Experimental feature or API. labels Feb 25, 2026
@t-hamano t-hamano added [Feature] Icons Related to Icon registration API and Icon REST API and removed [Block] Icon Affects the Icon block labels Mar 3, 2026
@t-hamano t-hamano moved this from 🔎 Needs Review to 🦵 Punted to 7.1 in WordPress 7.0 Editor Tasks Mar 3, 2026
@t-hamano t-hamano force-pushed the icon-registry-sanitizer branch from 7bdd4c1 to 26a51db Compare March 12, 2026 08:51
@t-hamano
Copy link
Copy Markdown
Contributor Author

Ok, I implemented the following two things based on your feedback:

  • Parse SVG through WP_HTML_Processor before sanitizing it with wp_kses
  • Add missing elements to $allowed_tags: switch, a, view, animateMotion, set
  • Enhance unit tests

There are three things that concern me about the operation of the sanitizer:

  • If HTML-like tags are present, they will be converted to empty SVG. Perhaps they should be converted to empty strings. (ref)
  • The xlink namespace is removed, I think this is because xmlns:xlink is converted to xlink by HTML_Tag_Processor and then stripped by wp_kses. (ref)
  • Attribute names are converted from camel case to lower case, e.g. viewBox, attributeName.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid duplicate work, I would like to create a core backport PR after this PR is approved.

@dmsnell
Copy link
Copy Markdown
Member

dmsnell commented Mar 12, 2026

There are three things that concern me about the operation of the sanitizer:

Let’s discuss one aspect here that I think partially addresses these concerns: the HTML Processor is parsing SVGs as they would appear inline in HTML. It is not attempting to process external SVGs as they would be loaded in a context like <img src="/assets/logo.svg">.

There’s a big distinction there because browsers handle inline and external SVGs in extremely different ways due to the legacy of browser parsers and the ongoing challenge of securing the web. In this mix is a third context, which is SVGs as part of data URIs and I don’t know off the top of my head where that lies, but it’s closer to external assets.

These distinctions appear as:

  • whether or not scripts run, or have access to the parent window
  • whether or not other references load, or can load from cross-origin domains
  • parsing as HTML vs. parsing as XML

A sanitization pipeline for external assets needs to be almost entirely different than those for assets bound to be included in the HTML, and unfortunately due to the extensible nature of WordPress, we cannot guarantee that external SVGs will not be inlined into the HTML via some plugin.

Therefore, we are basically left where it’s probably mostly appropriate to assume these are all bound for HTML.

  • If HTML-like tags are present, they will be converted to empty SVG. Perhaps they should be converted to empty strings. (ref)

I’m not sure I follow. If you are looking at something like <svg><p>Content!</p></svg> and seeing an empty SVG, that’s how an SVG inside HTML would parse. Any SVG content up until one of these HTML tags, however, would be present still in the SVG. Also, there are times where HTML is allowed, such as inside <desc> or <foreignobject> elements. You can see this in the linked PHP playground replacing $html with <svg><desc><p>test. With the <desc> tag there the <p>test</p> is contained within the SVG, but without it, the SVG closes and the <p>test</p> belongs as a sibling node to the SVG>

  • The xlink namespace is removed, I think this is because xmlns:xlink is converted to xlink by HTML_Tag_Processor and then stripped by wp_kses. (ref)

This relates entirely to HTML parsing of inline SVG, but the SVGs should continue working regardless. HTML elements and attributes don’t have namespacing like XML elements do. There are a few exceptions in HTML where these are faked for the sake of inline SVG and MathML, but that is why it might appear to unexpectedly strip out the xmlns part.

The rules are adjusting SVG attributes in the HTML spec. The function WP_HTML_Processor::get_qualified_attribute_name() can reveal some of this, but it must be used carefully since it separates the namespace from the attribute name with a space instead of a :.

  • Attribute names are converted from camel case to lower case, e.g. viewBox, attributeName.

See above, the same rules apply.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[Feature] Icons Related to Icon registration API and Icon REST API [Type] Enhancement A suggestion for improvement.

Projects

Status: 🦵 Punted to 7.1

Development

Successfully merging this pull request may close these issues.

3 participants