|
307 | 307 | * |
308 | 308 | * #### Other tokens with modifiable text. |
309 | 309 | * |
310 | | - * There are also non-elements which are atomic in nature and contain modifiable text. |
| 310 | + * There are also non-elements which are void/self-closing in nature and contain |
| 311 | + * modifiable text that is part of that individual syntax token itself. |
311 | 312 | * |
312 | 313 | * - `#text` nodes, whose entire token _is_ the modifiable text. |
313 | | - * - Comment nodes and nodes that became comments because of some syntax error. The |
314 | | - * text for these nodes is the portion of the comment inside of the syntax. E.g. for |
315 | | - * `<!-- comment -->` the text is `" comment "` (note that the spaces are part of it). |
| 314 | + * - HTML comments and tokens that become comments due to some syntax error. The |
| 315 | + * text for these tokens is the portion of the comment inside of the syntax. |
| 316 | + * E.g. for `<!-- comment -->` the text is `" comment "` (note the spaces are included). |
316 | 317 | * - `CDATA` sections, whose text is the content inside of the section itself. E.g. for |
317 | | - * `<![CDATA[some content]]>` the text is `"some content"`. |
| 318 | + * `<![CDATA[some content]]>` the text is `"some content"` (with restrictions [1]). |
318 | 319 | * - "Funky comments," which are a special case of invalid closing tags whose name is |
319 | 320 | * invalid. The text for these nodes is the text that a browser would transform into |
320 | | - * an HTML when parsing. E.g. for `</%post_author>` the text is `%post_author`. |
321 | | - * |
322 | | - * And there are non-elements which are atomic in nature but have no modifiable text. |
323 | | - * - `DOCTYPE` nodes like `<DOCTYPE html>` which have no closing tag. |
324 | | - * - XML Processing instruction nodes like `<?xml charset="utf8"?>` (with restrictions).} |
325 | | - * - The empty end tag `</>` which is ignored in the browser and DOM but exposed |
326 | | - * to the HTML API. |
| 321 | + * an HTML comment when parsing. E.g. for `</%post_author>` the text is `%post_author`. |
| 322 | + * - `DOCTYPE` declarations like `<!DOCTYPE html>` which have no closing tag.
| 323 | + * - XML Processing instruction nodes like `<?wp __( "Like" ); ?>` (with restrictions [2]). |
| 324 | + * - The empty end tag `</>` which is ignored in the browser and DOM. |
| 325 | + * |
| 326 | + * [1]: There are no CDATA sections in HTML. When encountering `<![CDATA[`, everything |
| 327 | + * until the next `>` becomes a bogus HTML comment, meaning there can be no CDATA |
| 328 | + * section in an HTML document containing `>`. The Tag Processor will first find |
| 329 | + * all valid and bogus HTML comments, and then if the comment _would_ have been a |
| 330 | + * CDATA section _were they to exist_, it will re-classify the bogus comment as such. |
| 331 | + * |
| 332 | + * [2]: XML allows a broader range of characters in a processing instruction's target name |
| 333 | + * and disallows "xml" as a name, since it's special. The Tag Processor only recognizes |
| 334 | + * target names with an ASCII-representable subset of characters. It also exhibits the |
| 335 | + * same constraint as with CDATA sections, in that `>` cannot exist within the token |
| 336 | + * since Processing Instructions do not exist within HTML and their syntax transforms
| 337 | + * into a bogus comment in the DOM. |
327 | 338 | * |
328 | 339 | * ## Design and limitations |
329 | 340 | * |
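The re-classification rule described in footnote [1] can be illustrated with a minimal standalone sketch. It is not the Tag Processor's actual implementation: the function name `sketch_classify_cdata_lookalike` and the returned array shape are hypothetical, chosen only for illustration. It assumes the rule as stated in the docblock: the token ends at the first `>`, and it counts as a CDATA section only if `]]` immediately precedes that `>`.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): classifies a token that
 * starts with `<![CDATA[` the way footnote [1] describes.
 *
 * @param string $html Input document.
 * @param int    $at   Byte offset where the token is expected to start.
 * @return array|null  Token details, or null if there is no complete token here.
 */
function sketch_classify_cdata_lookalike( string $html, int $at = 0 ): ?array {
	$opener = '<![CDATA[';

	// Only tokens that begin with the CDATA opener are of interest.
	if ( substr( $html, $at, strlen( $opener ) ) !== $opener ) {
		return null;
	}

	// In HTML this syntax is a bogus comment, which ends at the first `>`.
	$closer_at = strpos( $html, '>', $at + strlen( $opener ) );
	if ( false === $closer_at ) {
		return null; // Unterminated: treated here as incomplete input.
	}

	$token = substr( $html, $at, $closer_at - $at + 1 );

	// If `]]` directly precedes the `>`, the bogus comment would have been
	// a CDATA section, so it is re-classified and only the inner text is kept.
	if ( substr( $token, -3 ) === ']]>' ) {
		return array(
			'type'  => 'cdata',
			'token' => $token,
			'text'  => substr( $token, strlen( $opener ), -3 ),
		);
	}

	// Otherwise it stays a bogus comment whose text is everything
	// between `<!` and the terminating `>`.
	return array(
		'type'  => 'bogus-comment',
		'token' => $token,
		'text'  => substr( $token, 2, -1 ),
	);
}
```

Under these assumptions, `<![CDATA[some content]]>` is classified as `cdata` with the text `some content`, while `<![CDATA[a > b]]>` stops at the first `>` and remains a bogus comment whose token is `<![CDATA[a >`. This is why a CDATA section containing `>` cannot be represented.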