Editor: Preserve empty object attributes through block parse/serialize#1
Draft
margolisj wants to merge 4 commits into
Draft
Editor: Preserve empty object attributes through block parse/serialize#1margolisj wants to merge 4 commits into
margolisj wants to merge 4 commits into
Conversation
Block attribute JSON objects (`{}`) and arrays (`[]`) both decode to PHP
arrays via `json_decode( ..., true )`, so an empty object would serialize
back out as `[]`, corrupting markup on round-trip paths such as Block Hooks
and KSES filtering.
Add an opt-in `preserve_empty_object_attributes` parse option that tags
object-originated arrays with a hidden marker during parsing, then restore
them to objects in `serialize_block_attributes()` via the new
`wp_restore_block_attribute_object_types()`. The default parse path is
unchanged, so render output is byte-for-byte identical.
See #63325.
Document the contract of the `preserve_empty_object_attributes` parse option:
attributes tagged with WP_Block_Parser::OBJECT_ATTRIBUTE_MARKER are not plain
data and must be passed through serialize_block_attributes() or
wp_restore_block_attribute_object_types() before being read or re-encoded.
Tighten the DocBlocks on parse_blocks(), WP_Block_Parser::parse(),
serialize_block_attributes(), and wp_restore_block_attribute_object_types().
Add coverage for:
- the default parse path still collapsing `{}` to `[]` (back-compat lock);
- wp_restore_block_attribute_object_types() no-op vs. re-cast behavior;
- the marker not leaking through the KSES (filter_block_content) and Block
Hooks (apply_block_hooks_to_content) round-trip paths;
- parser-level marker placement (nested objects tagged, arrays and the
attribute root untagged) and null attributes on invalid JSON.
See #63325.
Correct the @SInCE tags from 6.10.0 to 7.1.0 to match the current release cycle (trunk is 7.1-alpha). Guard wp_restore_block_attribute_object_types() so it only treats the marker key as its own when it carries the exact boolean the parser sets. Because the restore runs on every serialize_block_attributes() call (not just on object-preserving parses), this prevents a genuine attribute that merely shares the reserved key name from being silently dropped on the default serialize path. Add a test locking in that behavior. See #63325.
Normalize a non-array $options argument to WP_Block_Parser::parse() to an empty array, so a misuse falls back to default behavior instead of erroring on offset access at a public entry point. Add tests locking in two documented behaviors: a top-level empty object attribute set still serializes away (the attribute root is never tagged) even with object preservation enabled, and a non-array $options is tolerated. See #63325.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Block attribute JSON objects (
{}) and arrays ([]) both decode to PHP arrays viajson_decode( ..., true ), so an empty object is indistinguishable from an empty array after parsing and serializes back out as[]. On round-trip paths that parse and then re-serialize existing post content — Block Hooks (apply_block_hooks_to_content()) and KSES filtering (filter_block_content()) — this silently rewrites{}to[], corrupting block markup whose attributes rely on the object shape.This is the well-known PHP limitation that an empty associative array and an empty list are the same type, combined with
json_encode()always emitting[]for an empty array. The canonical behavior is defined by the JavaScript parser/serializer, which preserves{}vs[]natively;serialize_block_attributes()is documented to remain in sync with the JSserializeAttributesfunction, so PHP was the divergent side. This change brings PHP in line with that contract rather than introducing new behavior.Approach
An opt-in
preserve_empty_object_attributesparse option:WP_Block_Parser::parse()accepts an$optionsarray. When the option is set, attribute JSON is decoded withjson_decode( ..., false )and each object-originated value is converted to an array tagged with an internal marker (WP_Block_Parser::OBJECT_ATTRIBUTE_MARKER). The top-level attribute container is intentionally not tagged, preserving the existing behavior of dropping empty top-level attributes.serialize_block_attributes()calls the newwp_restore_block_attribute_object_types(), which strips the marker and re-casts tagged arrays back to objects sowp_json_encode()emits{}/{…}for objects and[]/[…]for arrays.parse_blocks()path is unchanged, so render output is byte-for-byte identical.Design decisions / trade-offs
The intent here is to be explicit about why it is shaped this way, since each choice trades one risk for another.
Opt-in option vs. changing the default. Chosen: opt-in.
parse_blocks()return shape is byte-for-byte unchanged, so the entire ecosystem that relies on the documented "all attributes are arrays" contract (since 4.0.0) is unaffected. Only the two internal callers that actually round-trip stored markup opt in.[].In-band marker vs. returning
stdClassvs. out-of-band tracking. Chosen: in-band sentinel key.stdClassfor JSON objects was rejected: it breaks the 4.0.0 "returns arrays not objects" contract; essentially every$block['attrs'][...]access in core and plugins assumes arrays, and would need->propaccess instead.hooked_blockcallback) can observe the marker on nested object attributes. This is mitigated by the obscure key name, by tagging only nested values (never the attribute root), by stripping automatically insideserialize_block_attributes(), and by limiting opt-in to two internal callers.Where the marker is stripped. Chosen: in
serialize_block_attributes(), the single chokepoint all block-markup serialization flows through.serialize_block_attributes()now performs an unconditional recursive walk of the attribute array on every call, including the default (no-marker) path. This is bounded to serialization contexts (save, REST, and the round-trip paths) and does not touch the front-endrender_block()path, but it is added work — most visible for blocks with large, deeply nested attribute payloads.Top-level empty objects still collapse. A block written as
<!-- wp:x {} /-->still serializes to<!-- wp:x /-->, because the attribute root is deliberately not tagged. This preserves the historical "drop empty attributes" behavior and avoids churn; the trade-off is that the fix targets nested empty objects (the reported case) rather than a top-level empty-object attribute set, which is semantically equivalent to having no attributes anyway.Marker key collision.
wp_restore_block_attribute_object_types()runs on everyserialize_block_attributes()call, not only on object-preserving parses, so the collision surface is the global serialize path. To keep that safe, the marker is only honored when it carries the exact boolean the parser sets (true === $value[ OBJECT_ATTRIBUTE_MARKER ]); a genuine attribute that merely shares the reserved key name with any other value is left intact. The only residual collision is an attribute literally named__wpBlockAttributeIsObjectwhose value is booleantrue, which is treated as effectively reserved. This was preferred over escaping every key, which would add cost and complexity to a hot path.Compatibility / ramifications
{}. This is strictly more correct: it reduces editor block-validation invalidations (the editor's JS parser already expects the object) and fixestype: objectattributes that previously arrived as[]and failed schema validation in render callbacks. The only code that could be negatively surprised is code that had adapted to the[]corruption — which was already inconsistent with the JS side.hooked_block/hooked_block_types/ block-visitor callbacks, and KSES filtering) may observe the marker on nested object attributes. The contract is documented onparse_blocks()andwp_restore_block_attribute_object_types(): attributes parsed with the option must be passed through restoration before being read, iterated, compared, or re-encoded. It has been verified that core's own callbacks use targeted key access and are unaffected, and that the marker never leaks into_wp_ignored_hooked_blockspost meta or the REST response (those encode lists of block-type names, not attribute objects).block_parser_classfilter and whoseparse()accepts only$documentwill not error (PHP ignores the extra positional argument), but it silently ignorespreserve_empty_object_attributes, so the Block Hooks / KSES paths fall back to the historical collapsing behavior under that parser. This is accepted graceful degradation.Testing
serialize.phptest_serialize_identity_with_preserved_object_types: round-trip identity for empty objects, empty arrays, nested mixes, the reported attribute shape, arrays of empty objects, and numeric-string-keyed objects.test_serialize_collapses_empty_object_without_option: back-compat lock — the default parse path still collapses{}to[].test_restore_object_types_is_noop_without_marker/test_restore_object_types_recasts_tagged_arrays: unit coverage ofwp_restore_block_attribute_object_types().test_restore_object_types_ignores_marker_key_with_non_true_value: a genuine attribute sharing the marker key name (non-truevalue) survives the default serialize path.test_serialize_drops_top_level_empty_object_even_with_option: locks in the intentional scoping — a top-level empty-object attribute set still serializes away even with the option enabled.test_filter_block_content_preserves_empty_object_attributes: the KSES path preserves{}and never leaks the marker.wpBlockParser.phptest_parse_tolerates_non_array_options: a non-array$optionsis normalized to no options rather than erroring.applyBlockHooksToContent.phptest_apply_block_hooks_to_content_preserves_empty_object_attributes: the Block Hooks path preserves{}while a hooked-block callback runs over the marked attributes, and never leaks the marker.Full
--group blockssuite: 1054 tests / 3180 assertions pass (no regressions). PHPCS clean (WordPress standard) on all changed files.Trac ticket: https://core.trac.wordpress.org/ticket/63325