Commit 6eea3d3

deepclone_from_array: hydrate closure-bearing nodes as native lazy ghosts
On PHP 8.4+, object nodes whose payload slots carry a named-closure or const-expr-closure marker are created as uninitialized lazy ghosts whose hydration, closure resolution included, is deferred until the engine first touches them. Resolving closures (fake-closure creation, attribute-args re-evaluation) is the measurably expensive part of hydration; nodes without such markers keep hydrating eagerly, as does everything on PHP 8.2/8.3. Also fixes two pre-existing issues that lazy hydration would have amplified: binding a shared &-reference to a typed declared property now registers the property as a type source instead of tripping the engine's deref assertion, and object-ref markers resolved against ref slots are now order-independent by-value snapshots.
1 parent 0b0c2ad commit 6eea3d3

11 files changed

Lines changed: 1529 additions & 20 deletions

CHANGELOG.md

Lines changed: 47 additions & 0 deletions
@@ -5,6 +5,53 @@ All notable changes to this extension will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added

- On PHP 8.4+, `deepclone_from_array()` now creates object nodes whose
  payload slots carry a named-closure or const-expr-closure marker as
  native lazy ghosts: all object identities (back-references, shared `&`
  references, `===`) exist when the call returns, but those nodes' property
  hydration, closure resolution included, is deferred until the engine
  first touches each of them. Resolving closures (fake-closure creation,
  attribute-args re-evaluation) is the measurably expensive part of
  hydration, so deferral is restricted to the nodes that carry them; plain
  value slots hydrate eagerly as before (copy-on-write makes them cheaper
  to hydrate than to ghost), as do internal classes, `stdClass`,
  zero-declared-property classes and `__wakeup`/`__unserialize` nodes, all
  mixing freely with lazy ones. On PHP 8.2/8.3 everything keeps hydrating
  eagerly. Structural validation and `$allowed_classes` enforcement
  (including the const-expr-closure gate) remain eager; only value-level
  resolution errors (e.g. a stale const-expr closure line, a named-closure
  target that no longer exists) surface at first access instead of inside
  `deepclone_from_array()`, where the engine reverts the ghost and keeps it
  retryable. The shared hydration state lives in the new internal-only
  `DeepClone\HydrationContext` class, which
  `ReflectionClass::getLazyInitializer()` exposes as the initializer;
  abandoned half-hydrated graphs are reclaimed by the cycle collector. One
  documented deferral residue: type sources for shared `&` references bound
  to typed properties are registered per node as it hydrates, so a write
  through such a reference is only checked against the already-hydrated
  holders (see README).

### Fixed

- Binding a shared `&` reference to a *typed* declared property aborted
  debug builds (engine deref assertion) and skipped type-source registration
  on release builds, so later writes through the reference bypassed the
  property type. `deepclone_from_array()` and
  `deepclone_hydrate(..., DEEPCLONE_HYDRATE_PRESERVE_REFS)` now mirror
  `unserialize()`: the referenced value is verified against the property
  type and the property is registered as a type source of the reference.
- Resolving an object-ref marker (`true`) against a *ref id* returned either
  an alias or a by-value snapshot of the shared slot depending on which
  consumer resolved first. It is now always a by-value snapshot (deref
  before copy), making the result independent of hydration order, a
  prerequisite for lazy mode, where that order is the user's touch order.
  Such payloads are only ever hand-crafted: `deepclone_to_array()` never
  emits object-ref markers with negative ids.
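
For readers unfamiliar with type sources, the behaviour that the first fix restores can be observed in plain PHP, without the extension. The sketch below is only an illustration of the engine rule being mirrored: the `Counter` class is hypothetical, and nothing here calls `deepclone_from_array()` or `deepclone_hydrate()`.

```php
<?php
// Hypothetical class, used only to illustrate the engine rule.
final class Counter
{
    public int $value = 0;
}

$counter = new Counter();

// Binding a reference to a typed property registers that property as a
// type source of the reference.
$ref = &$counter->value;

$ref = 42; // accepted: an int satisfies Counter::$value

$rejected = false;
try {
    // Checked against the type source, even though the write goes
    // through $ref rather than through the property itself.
    $ref = 'not a number';
} catch (TypeError $e) {
    $rejected = true;
}
// The rejected write leaves the property at its last valid value.
```

Before the fix, release builds skipped the registration step, so a write like the second one above would have gone through unchecked for references bound during hydration.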

## [0.7.2] - 2026-06-10

### Fixed

README.md

Lines changed: 62 additions & 0 deletions
@@ -87,6 +87,68 @@ function deepclone_hydrate(object|string $object_or_class, array $vars = [], int
(`null` = allow all, `[]` = allow none). Case-insensitive, matching
`unserialize()`'s `allowed_classes` option.

### Lazy hydration of closure-bearing nodes (PHP 8.4+)

`deepclone_from_array()` creates the object nodes that are expensive to
hydrate as
[native lazy ghosts](https://www.php.net/manual/en/language.oop5.lazy-objects.php):
nodes whose payload slots carry a named-closure or (PHP 8.5)
const-expr-closure marker, since resolving those (fake-closure creation,
attribute-args re-evaluation) is where hydration time actually goes. Every
object identity exists when the call returns (back-references, shared `&`
references and `===` behave exactly as for eager nodes), but a ghost's
property hydration, closure resolution included, is deferred until the
engine first touches it.

```php
$clone = deepclone_from_array($payload);
// closure-bearing nodes are uninitialized ghosts; reading any property
// of such a node hydrates that node only.
```

All other nodes hydrate eagerly: nodes without closure markers (plain value
slots are cheaper to hydrate than to ghost, since copy-on-write makes them
refcount bumps), internal classes (and classes inheriting one, `stdClass`
descendants excepted), `stdClass` itself and other classes without declared
properties, and nodes replaying `__wakeup`/`__unserialize` (their call
order is observable and preserved). A graph without closure markers is
hydrated fully eagerly and carries zero lazy-mode overhead, and on PHP
older than 8.4 (no native lazy objects) everything hydrates eagerly.
Mixing lazy and eager nodes in one graph is the normal mode of operation.
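
Since lazy and eager nodes mix in one graph, it can be useful to check which state a given node is in. A minimal sketch, assuming only the core reflection API: `is_pending_ghost()` and the `Node` class are hypothetical names, and the ghost is created with `ReflectionClass::newLazyGhost()` as a stand-in for one produced by `deepclone_from_array()`, so the snippet runs without the extension.

```php
<?php
// Hypothetical helper, not provided by the extension.
function is_pending_ghost(object $node): bool
{
    if (PHP_VERSION_ID < 80400) {
        return false; // no native lazy objects: every node is eager
    }
    // Constructing a ReflectionClass from the object does not initialize it.
    return (new ReflectionClass($node))->isUninitializedLazyObject($node);
}

// Hypothetical node class standing in for a closure-bearing node.
final class Node
{
    public ?Closure $callback = null;
    public string $label = '';
}

$states = ['eager' => is_pending_ghost(new Node())];

if (PHP_VERSION_ID >= 80400) {
    // Stand-in for a ghost created by deepclone_from_array().
    $ghost = (new ReflectionClass(Node::class))->newLazyGhost(
        static function (Node $node): void {
            $node->callback = strlen(...);
            $node->label = 'hydrated';
        }
    );

    $states['before'] = is_pending_ghost($ghost);
    $label = $ghost->label; // first touch runs the initializer
    $states['after'] = is_pending_ghost($ghost);
}
```

The check itself never triggers hydration, so it is safe to use in diagnostics or tests that need to confirm a node was left untouched.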

Semantics of deferred nodes (the usual native lazy-object rules):

- Whole-graph operations (`serialize()`, `json_encode()`, `foreach`, `==`,
  `clone`, `var_export()`) initialize every node they visit; `var_dump()`,
  `===`, `spl_object_id()` and `instanceof` do not initialize.
- Structural payload errors (unknown ids, bad scopes, unknown declared
  properties) and `$allowed_classes` violations still throw inside
  `deepclone_from_array()`. Value-level resolution errors (a class or enum
  case that no longer exists, a stale const-expr closure line, a type
  mismatch) surface at first access instead; the failing ghost is rolled
  back by the engine, stays uninitialized, and rethrows on every retry.
- A never-initialized ghost's destructor is not called.
- Type enforcement on a shared `&` reference is registered per node as it
  hydrates. While some holders of the reference are still uninitialized, a
  write through it is checked only against the already-hydrated ones; if the
  written value violates a pending node's property type, that node's first
  touch throws instead (eager mode rejects such a write at the assignment).
- The payload and every object of the graph stay pinned in memory until the
  last ghost initializes or dies; a shared `DeepClone\HydrationContext`
  object, also returned by `ReflectionClass::getLazyInitializer()`, holds
  them. Abandoned graphs are reclaimed by the cycle collector.
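
Because these are the engine's native lazy-object rules, several of them can be reproduced with a hand-made ghost. The sketch below is an illustration under stated assumptions, not the extension's code: `Item` is a hypothetical class, the throwing closure stands in for a value-level resolution error, and the ghost comes from `ReflectionClass::newLazyGhost()` rather than from `deepclone_from_array()`.

```php
<?php
// Hypothetical class standing in for a deferred node.
final class Item
{
    public int $id = 0;
}

$log = ['errors' => []];

if (PHP_VERSION_ID >= 80400) {
    $class = new ReflectionClass(Item::class);
    $attempts = 0;

    // Stand-in for hydration that hits a value-level resolution error.
    $initializer = static function (Item $item) use (&$attempts): void {
        $attempts++;
        throw new RuntimeException('stale target');
    };
    $ghost = $class->newLazyGhost($initializer);

    // Identity-level operations leave the ghost uninitialized.
    $same = $ghost === $ghost;
    $id = spl_object_id($ghost);
    $isItem = $ghost instanceof Item;
    $log['untouched'] = $class->isUninitializedLazyObject($ghost);
    $log['attempts_before'] = $attempts;

    // The initializer can be inspected without running it.
    $log['same_initializer'] = $class->getLazyInitializer($ghost) === $initializer;

    // Each failed touch is rolled back; the next touch retries and rethrows.
    for ($i = 0; $i < 2; $i++) {
        try {
            $unused = $ghost->id;
        } catch (RuntimeException $e) {
            $log['errors'][] = $e->getMessage();
        }
    }
    $log['still_ghost'] = $class->isUninitializedLazyObject($ghost);
    $log['attempts_after'] = $attempts;
}
```

For ghosts created by the extension, the value returned by `getLazyInitializer()` would be the shared `DeepClone\HydrationContext` object described above rather than a user closure.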

Cost model, measured on 20k-node graphs (PHP 8.4, release build): creating
ghosts instead of resolving closures cuts `deepclone_from_array()` time by
3-5x on closure-rich nodes, and the saved work never runs at all for nodes
that are never touched. The trade-offs apply to closure-bearing nodes only:
a ghost that ends up touched replays all its slots then (the total cost of
a fully traversed graph stays comparable, just paid at first touch plus the
ghost bookkeeping), a graph dropped with uninitialized ghosts is reclaimed
by a cycle-collector sweep instead of plain refcounting, and the payload
plus the per-slot index stay pinned until the last ghost initializes or
dies. Graphs without closure markers are not affected by any of this.

`deepclone_hydrate()` accepts either an object to hydrate in place or a class
name to instantiate without calling its constructor. By default, PHP `&`
references in `$vars` are dropped on write; pass `DEEPCLONE_HYDRATE_PRESERVE_REFS`
