Expand comment introducing modifiable text.
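
Note [1] in the expanded comment describes a two-step rule for CDATA lookalikes: the token ends at the first `>`, and only afterwards is it re-classified as a CDATA section if it ends in `]]>`. A minimal standalone sketch of that rule follows. It is an illustration only, not the Tag Processor's actual code: the function name `sketch_classify_cdata_lookalike()` and the returned array shape are hypothetical and do not exist in the HTML API.

```php
<?php
/**
 * Hypothetical helper (not part of the HTML API): classifies the token
 * that starts with `<![CDATA[` at the very beginning of $html.
 *
 * Returns null when $html does not start with the opener or when the
 * token is incomplete (no `>` found).
 */
function sketch_classify_cdata_lookalike( string $html ): ?array {
	$opener = '<![CDATA[';
	if ( 0 !== strncmp( $html, $opener, strlen( $opener ) ) ) {
		return null;
	}

	// Step 1: in HTML the token ends at the first `>`, making it a bogus comment.
	$closer_at = strpos( $html, '>' );
	if ( false === $closer_at ) {
		return null;
	}

	// Step 2: re-classify as CDATA only if `]]` directly precedes that `>`
	// and those brackets come after the opener.
	$text_starts_at = strlen( $opener );
	$text_ends_at   = $closer_at - 2;
	if ( $text_ends_at >= $text_starts_at && ']]' === substr( $html, $text_ends_at, 2 ) ) {
		return array(
			'type' => 'cdata',
			'text' => substr( $html, $text_starts_at, $text_ends_at - $text_starts_at ),
		);
	}

	// Otherwise it stays a bogus comment whose text is everything between `<!` and `>`.
	return array(
		'type' => 'comment',
		'text' => substr( $html, 2, $closer_at - 2 ),
	);
}

// 'cdata' with text "some content".
$section = sketch_classify_cdata_lookalike( '<![CDATA[some content]]>' );

// 'comment' with text "[CDATA[a ": the first `>` ends the token early.
$truncated = sketch_classify_cdata_lookalike( '<![CDATA[a > b]]>' );
```

The second call shows the restriction the note refers to: because the first `>` always ends the token, a CDATA lookalike can never contain `>` in its text.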

dmsnell · dmsnell · commit 283df465f130 · 2024-01-16T09:36:15.000-06:00
diff --git a/src/wp-includes/html-api/class-wp-html-tag-processor.php b/src/wp-includes/html-api/class-wp-html-tag-processor.php
@@ -307,23 +307,34 @@
  *
  * #### Other tokens with modifiable text.
  *
- * There are also non-elements which are atomic in nature and contain modifiable text.
+ * There are also non-elements which are void/self-closing in nature and contain
+ * modifiable text that is part of that individual syntax token itself.
  *
  *  - `#text` nodes, whose entire token _is_ the modifiable text.
- *  - Comment nodes and nodes that became comments because of some syntax error. The
- *    text for these nodes is the portion of the comment inside of the syntax. E.g. for
- *    `<!-- comment -->` the text is `" comment "` (note that the spaces are part of it).
+ *  - HTML comments and tokens that become comments due to some syntax error. The
+ *    text for these tokens is the portion of the comment inside of the syntax.
+ *    E.g. for `<!-- comment -->` the text is `" comment "` (note the spaces are included).
  *  - `CDATA` sections, whose text is the content inside of the section itself. E.g. for
- *    `<![CDATA[some content]]>` the text is `"some content"`.
+ *    `<![CDATA[some content]]>` the text is `"some content"` (with restrictions [1]).
  *  - "Funky comments," which are a special case of invalid closing tags whose name is
  *    invalid. The text for these nodes is the text that a browser would transform into
- *    an HTML when parsing. E.g. for `</%post_author>` the text is `%post_author`.
- *
- * And there are non-elements which are atomic in nature but have no modifiable text.
- *  - `DOCTYPE` nodes like `<DOCTYPE html>` which have no closing tag.
- *  - XML Processing instruction nodes like `<?xml charset="utf8"?>` (with restrictions).}
- *  - The empty end tag `</>` which is ignored in the browser and DOM but exposed
- *    to the HTML API.
+ *    an HTML comment when parsing. E.g. for `</%post_author>` the text is `%post_author`.
+ *  - `DOCTYPE` declarations like `<!DOCTYPE html>` which have no closing tag.
+ *  - XML Processing instruction nodes like `<?wp __( "Like" ); ?>` (with restrictions [2]).
+ *  - The empty end tag `</>` which is ignored in the browser and DOM.
+ *
+ * [1]: There are no CDATA sections in HTML. When encountering `<![CDATA[`, everything
+ *      until the next `>` becomes a bogus HTML comment, meaning that no CDATA section
+ *      containing `>` can appear in an HTML document. The Tag Processor will first find
+ *      all valid and bogus HTML comments, and then if a comment _would_ have been a CDATA
+ *      section _were such sections to exist_, it will re-classify the bogus comment as such.
+ *
+ * [2]: XML allows a broader range of characters in a processing instruction's target name
+ *      and disallows "xml" as a name, since it's special. The Tag Processor only recognizes
+ *      target names with an ASCII-representable subset of characters. It also exhibits the
+ *      same constraint as with CDATA sections, in that `>` cannot exist within the token
+ *      since Processing Instructions do not exist within HTML and their syntax transforms
+ *      into a bogus comment in the DOM.
  *
  * ## Design and limitations
  *