HTML API: Visit implied body at EOF after head #46

Open
sirreal wants to merge 7 commits into trunk from html-api-fuzz-fiz/implicit-body
sirreal (Owner) commented Jun 10, 2026

Summary

  • Continue full-document EOF processing through the selected insertion mode once at EOF.
  • Visit the implied <body> and virtual EOF closers when a document ends in the "in head" or "after head" insertion mode.
  • Mark EOF-driven stack events as virtual even when the last concrete token was ignored.

Testing

  • HTML API and html5lib PHPUnit groups pass.
  • PHPCS pass for the changed HTML API files.
  • codex review --base trunk.

Trac ticket: https://core.trac.wordpress.org/ticket/65372

Use of AI Tools

AI assistance: Yes
Tool(s): Codex
Model(s): GPT-5.5
Used for: PR description cleanup and code review.


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

sirreal added 2 commits June 10, 2026 11:06
[P2] Mark EOF-implied BODY nodes as virtual — src/wp-includes/html-api/class-wp-html-processor.php:1072-1072

When EOF is processed after the last real token was an ignored BODY token, such as a full document ending with <template><body> or <noscript></body>, this branch leaves state->current_token pointing at that stale BODY. The implied BODY inserted at EOF is then classified by the stack push handler as the same real token, so next_token() exposes a null token and serialization emits </body> without a matching <body>. Please ensure EOF-created nodes are always treated as virtual, or clear/use a synthetic EOF token before implied nodes are inserted.
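The failure mode described above can be sketched with a toy model. This is not the `WP_HTML_Processor` implementation: `classify_pushed_node()` and its parameters are hypothetical names, and the name-matching rule is an assumption based only on the review's description of how a pushed node gets classified against the current token.

```php
<?php
/**
 * Toy model of the provenance decision described in the review.
 * All names are hypothetical; this is not WP_HTML_Processor code.
 *
 * @param string      $pushed_name        Name of the node being pushed, e.g. 'BODY'.
 * @param string|null $current_token_name Name of the last concrete token, or null if none.
 * @param bool        $at_eof             Whether the push happens while processing EOF.
 * @param bool        $use_eof_guard      Whether to treat every EOF-created node as virtual.
 * @return string Either 'real' or 'virtual'.
 */
function classify_pushed_node( string $pushed_name, ?string $current_token_name, bool $at_eof, bool $use_eof_guard ): string {
	// Suggested rule: nothing created while processing EOF comes from the input.
	if ( $use_eof_guard && $at_eof ) {
		return 'virtual';
	}

	// Assumed baseline rule: a pushed node is real when it matches the current token.
	return $pushed_name === $current_token_name ? 'real' : 'virtual';
}

// The last concrete token was an ignored BODY token, e.g. from "<template><body>".
$stale_token_name = 'BODY';

// Without the guard, the implied BODY created at EOF is mistaken for the stale token.
$without_guard = classify_pushed_node( 'BODY', $stale_token_name, true, false );

// With the guard, the implied BODY is classified as virtual regardless of the stale token.
$with_guard = classify_pushed_node( 'BODY', $stale_token_name, true, true );

echo $without_guard, "\n"; // real
echo $with_guard, "\n";    // virtual
```

The sketch only shows why a stale token can cause a misclassification. Whether the real fix guards on EOF or clears the current token is a design choice for the patch itself.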
@sirreal sirreal marked this pull request as ready for review June 10, 2026 10:10
github-actions Bot commented Jun 10, 2026


The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props jonsurrell, sergeybiryukov.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

sirreal (Owner, Author) commented Jun 10, 2026

Code review

Found 1 issue:

  1. The new $has_processed_eof property and is_eof_token() method are annotated @since 7.0.0, but trunk is at 7.1-alpha (src/wp-includes/version.php#L19) and Trac #65372 is milestoned 7.1. These should be @since 7.1.0 (same issue previously flagged on PR #44).

/**
 * Whether the end-of-file token has been processed through the insertion modes.
 *
 * @since 7.0.0
 *
 * @var bool
 */
private $has_processed_eof = false;

/**
 * Indicates if the Tag Processor has consumed all input.
 *
 * @since 7.0.0
 *
 * @return bool Whether the current token is the end-of-file token.
 */
private function is_eof_token(): bool {

Also verified (no action needed): the EOF reprocess chains terminate (each "in template" EOF pass shrinks the template insertion-mode stack before reprocessing); the insert_virtual_node() bookmark special case is needed because WP_HTML_Tag_Processor::set_bookmark() refuses to allocate at STATE_COMPLETE/STATE_INCOMPLETE_INPUT; and the is_eof_token() provenance guard in the push/pop handlers only fires past the end of input, so real tokens cannot be mis-marked as virtual.

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

sirreal added a commit that referenced this pull request Jun 10, 2026
# Conflicts:
#	src/wp-includes/html-api/class-wp-html-processor.php
SergeyBiryukov and others added 4 commits June 12, 2026 23:42
This updates the `@param` and `@return` descriptions to state that `build_query()` does **not** URL-encode, unlike PHP's native `http_build_query()`, and that callers are responsible for encoding the values beforehand or late-escaping the output with `esc_url()`.

Follow-up to [8215].

Props nimeshatxecurify, johnbillion.
Fixes #65453.

git-svn-id: https://develop.svn.wordpress.org/trunk@62497 602fd350-edb4-49c9-b593-d223f7449a82
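
The encoding difference this commit message documents can be illustrated in standalone PHP. The sketch below does not call WordPress's `build_query()` or `esc_url()`, since it runs outside WordPress. `join_without_encoding()` is a hypothetical stand-in for any builder that concatenates pairs without encoding them, and the sample arguments are made up.

```php
<?php
/**
 * Hypothetical stand-in for a query builder that does not URL-encode.
 *
 * @param array $args Key/value pairs to join.
 * @return string Pairs joined as key=value with "&" separators.
 */
function join_without_encoding( array $args ): string {
	$pairs = array();
	foreach ( $args as $key => $value ) {
		$pairs[] = $key . '=' . $value;
	}
	return implode( '&', $pairs );
}

$args = array(
	'q'    => 'cats & dogs',
	'page' => 2,
);

// PHP's native function URL-encodes the values for the caller.
$encoded = http_build_query( $args, '', '&' );

// A non-encoding builder passes the raw "&" through, which splits the value.
$unencoded = join_without_encoding( $args );

// Encoding the values beforehand keeps the query string intact.
$pre_encoded = join_without_encoding( array_map( 'rawurlencode', $args ) );

echo $encoded, "\n";     // q=cats+%26+dogs&page=2
echo $unencoded, "\n";   // q=cats & dogs&page=2
echo $pre_encoded, "\n"; // q=cats%20%26%20dogs&page=2
```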
…zip.php`.

This replaces `Chr()` and `Ord()` with their canonical lowercase forms `chr()` and `ord()`.

This is flagged as a case-sensitivity violation by the upcoming [https://wiki.php.net/rfc/case_sensitive_php PHP 8.6 RFC], which will emit `E_DEPRECATED` for function references that don't match their declared casing. Fixing it now keeps WordPress ahead of the deprecation.

Props Soean.
See #64897.

git-svn-id: https://develop.svn.wordpress.org/trunk@62498 602fd350-edb4-49c9-b593-d223f7449a82
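
As a small illustration of the change above: PHP currently resolves function names case-insensitively, so `Chr()` and `chr()` call the same function and only the spelling differs. The sketch assumes a PHP version that still accepts mixed-case calls. The future deprecation behavior is taken from the commit message, not verified here.

```php
<?php
// Canonical lowercase forms, as used after this change.
$char = chr( 65 );  // Byte value 65 is 'A'.
$code = ord( 'A' ); // The first byte of 'A' is 65.

// The mixed-case spelling resolves to the same function today.
$same_result = ( Chr( 65 ) === chr( 65 ) );

// Round trip over a single byte value.
$round_trip = ord( chr( 200 ) );

echo $char, "\n";       // A
echo $code, "\n";       // 65
echo $round_trip, "\n"; // 200
```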
This removes an unused variable in `WP_Interactivity_API::data_wp_class_processor()`.

Follow-up to [57563], [61020].

Props Soean.
See #64897.

git-svn-id: https://develop.svn.wordpress.org/trunk@62499 602fd350-edb4-49c9-b593-d223f7449a82