Commit 66cce22

deepclone_from_array: hydrate closure-bearing nodes as native lazy ghosts

On PHP 8.4+, object nodes whose payload slots or replayed __unserialize state carry a named-closure or const-expr-closure marker are created as uninitialized lazy ghosts: every object identity exists when the call returns (back-references, shared &-references and === behave as for eager nodes), but per-node hydration, closure resolution included, runs on first engine access. Closure-bearing __wakeup/__unserialize nodes replay their hook at the end of their own initialization; per-entry validation stays inside the call. Nodes without closure markers hydrate eagerly as before (copy-on-write makes plain value slots cheaper to hydrate than to ghost), as does everything on PHP 8.2/8.3.

Shared state lives in the internal-only DeepClone\HydrationContext; ReflectionClass::getLazyInitializer() returns a Closure bound to it, shared by all ghosts of one call. Structural validation and allowed_classes enforcement (including the const-expr gate) stay eager; only value-level resolution errors surface at first access, where the engine reverts the ghost and keeps it retryable.

Measured against v0.7.2 (20k-node graphs, release 8.4): closure-rich graphs hydrate 4-6x faster on creation and partial consumption, occupy 2-3x less memory while untouched, and tear down about 2x faster when dropped untouched; fully traversed graphs pay a comparable total; scalar graphs are unchanged.

Also fixes two pre-existing issues lazy hydration would have amplified: shared &-references bound to typed declared properties now register the property as a type source instead of tripping the engine's deref assertion, and object-ref markers resolved against ref slots are order-independent by-value snapshots.
1 parent 0b0c2ad commit 66cce22

12 files changed

Lines changed: 1949 additions & 20 deletions

CHANGELOG.md

Lines changed: 51 additions & 0 deletions
@@ -5,6 +5,57 @@ All notable changes to this extension will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added

- On PHP 8.4+, `deepclone_from_array()` now creates object nodes whose
  payload slots or replayed `__unserialize` state carry a named-closure or
  const-expr-closure marker as
  native lazy ghosts: all object identities (back-references, shared `&`
  references, `===`) exist when the call returns, but those nodes' property
  hydration, closure resolution included, is deferred until the engine
  first touches each of them. Resolving closures (fake-closure creation,
  attribute-args re-evaluation) is the measurably expensive part of
  hydration, so deferral is restricted to the nodes that carry them; plain
  value slots hydrate eagerly as before (copy-on-write makes them cheaper
  to hydrate than to ghost), as do internal classes, `stdClass` and
  zero-declared-property classes, all mixing freely with lazy ones.
  Closure-bearing `__wakeup`/`__unserialize` nodes defer too: their hook
  runs at the end of their own initialization instead of in the global
  children-first replay sequence, while per-entry validation stays inside
  the call. On PHP 8.2/8.3 everything keeps hydrating eagerly. Structural
  validation and `$allowed_classes` enforcement (including the
  const-expr-closure gate) remain eager; only value-level resolution errors
  (e.g. a stale const-expr closure line, a named-closure target that no
  longer exists) surface at first access instead of inside
  `deepclone_from_array()`, where the engine reverts the ghost and keeps it
  retryable. The shared hydration state lives in the new internal-only
  `DeepClone\HydrationContext` class;
  `ReflectionClass::getLazyInitializer()` returns a Closure bound to it.
  Abandoned half-hydrated graphs are reclaimed by the cycle collector. One
  documented deferral residue: type sources for shared `&` references bound
  to typed properties are registered per node as it hydrates, so a write
  through such a reference is only checked against the already-hydrated
  holders (see README).

### Fixed

- Binding a shared `&` reference to a *typed* declared property aborted
  debug builds (engine deref assertion) and skipped type-source registration
  on release builds, so later writes through the reference bypassed the
  property type. `deepclone_from_array()` and
  `deepclone_hydrate(..., DEEPCLONE_HYDRATE_PRESERVE_REFS)` now mirror
  `unserialize()`: the referenced value is verified against the property
  type and the property is registered as a type source of the reference.
- Resolving an object-ref marker (`true`) against a *ref id* returned either
  an alias or a by-value snapshot of the shared slot depending on which
  consumer resolved first. It is now always a by-value snapshot (deref
  before copy), making the result independent of hydration order, a
  prerequisite for lazy mode, where that order is the user's touch order.
  Such payloads are only ever hand-crafted: `deepclone_to_array()` never
  emits object-ref markers with negative ids.
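
The "type source" behavior that the first fix restores is ordinary PHP reference semantics, and it can be observed without the extension. The following is a minimal sketch in plain PHP; the `Counter` class is a hypothetical example, and the snippet shows only what the engine does once a typed property is registered as a type source of a reference, not how the extension registers it.

```php
<?php
// Hypothetical class for illustration: one typed declared property.
final class Counter
{
    public int $count = 0;
}

$counter = new Counter();

// Taking a reference to a typed property registers that property as a
// type source of the reference.
$shared = &$counter->count;

// A write that satisfies the property type goes through.
$shared = 5;

// A write that violates the property type is rejected at the assignment.
$rejected = false;
try {
    $shared = 'not a number';
} catch (TypeError $e) {
    $rejected = true;
}
```

After the failed write the property still holds `5`: the check happens before the value is stored.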

## [0.7.2] - 2026-06-10

### Fixed

README.md

Lines changed: 67 additions & 0 deletions
@@ -87,6 +87,73 @@ function deepclone_hydrate(object|string $object_or_class, array $vars = [], int
(`null` = allow all, `[]` = allow none). Case-insensitive, matching
`unserialize()`'s `allowed_classes` option.

### Lazy hydration of closure-bearing nodes (PHP 8.4+)

`deepclone_from_array()` creates the object nodes that are expensive to
hydrate as
[native lazy ghosts](https://www.php.net/manual/en/language.oop5.lazy-objects.php):
nodes whose payload slots or replayed `__unserialize` state carry a
named-closure or (PHP 8.5) const-expr-closure marker, since resolving those
(fake-closure creation, attribute-args re-evaluation) is where hydration
time actually goes. Every
object identity exists when the call returns (back-references, shared `&`
references and `===` behave exactly as for eager nodes), but a ghost's
property hydration, closure resolution included, is deferred until the
engine first touches it.

```php
$clone = deepclone_from_array($payload);
// closure-bearing nodes are uninitialized ghosts; reading any property
// of such a node hydrates that node only.
```
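
The deferral follows the engine's native lazy-ghost rules, so it can be reproduced without the extension. The sketch below assumes PHP 8.4+ and uses only the `ReflectionClass` lazy-object API; the `Node` class and the initializer body are hypothetical stand-ins for a closure-bearing node and for the extension's internal hydration step.

```php
<?php
// Hypothetical node class; requires PHP 8.4+ for native lazy objects.
final class Node
{
    public int $value;
}

$reflector = new ReflectionClass(Node::class);
$hydrations = 0;

// The initializer stands in for per-node hydration.
$ghost = $reflector->newLazyGhost(function (Node $node) use (&$hydrations): void {
    $hydrations++;
    $node->value = 42;
});

// Identity checks work on the uninitialized ghost and do not hydrate it.
$alias = $ghost;
$sameIdentity = $alias === $ghost;
$isNode = $ghost instanceof Node;
$lazyBeforeRead = $reflector->isUninitializedLazyObject($ghost);

// The first property read runs the initializer, once.
$value = $ghost->value;
$again = $ghost->value;
$lazyAfterRead = $reflector->isUninitializedLazyObject($ghost);
```

Only the touched ghost is initialized; other ghosts created from the same `ReflectionClass` stay uninitialized until they are accessed themselves.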

All other nodes hydrate eagerly: nodes without closure markers (plain value
slots are cheaper to hydrate than to ghost, since copy-on-write makes them
refcount bumps), internal classes (and classes inheriting one, `stdClass`
descendants excepted), and `stdClass` itself and other classes without
declared properties. A graph without closure markers is hydrated fully
eagerly and carries zero lazy-mode overhead, and on PHP older than 8.4 (no
native lazy objects) everything hydrates eagerly. Mixing lazy and eager
nodes in one graph is the normal mode of operation.

Closure-bearing nodes that replay `__wakeup`/`__unserialize` are deferred
too: their hook runs at the end of their own initialization instead of in
the global, children-first replay sequence (each entry is still validated
inside the call; only the hook calls move). State-replaying nodes without
closure markers keep their eager, ordered replay.
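
The deferred hook replay can be pictured with a native ghost whose initializer resolves the closure and then calls the hook as its last step. This is an illustrative sketch only, assuming PHP 8.4+: the `Formatter` class, the `'strtoupper'` string standing in for a named-closure marker, and the initializer body are all hypothetical and do not reproduce the extension's payload format or internal code.

```php
<?php
// Hypothetical state-replaying class; requires PHP 8.4+.
final class Formatter
{
    public Closure $transform;
    public bool $woken = false;

    public function __unserialize(array $data): void
    {
        $this->transform = $data['transform'];
        $this->woken = true;
    }
}

// Stand-in for replayed state that carries a named-closure marker.
$state = ['transform' => 'strtoupper'];

$reflector = new ReflectionClass(Formatter::class);
$ghost = $reflector->newLazyGhost(function (Formatter $f) use ($state): void {
    // Resolve the closure first, then replay the hook at the end of this
    // node's own initialization.
    $state['transform'] = Closure::fromCallable($state['transform']);
    $f->__unserialize($state);
});

// Nothing has run yet: the hook is deferred together with hydration.
$lazyAfterCreation = $reflector->isUninitializedLazyObject($ghost);

// The first touch resolves the closure and runs the hook.
$result = ($ghost->transform)('ghost');
$woken = $ghost->woken;
```

In this model the hook sees a fully resolved closure, but it runs when the node is first touched rather than at a fixed point inside the call.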

Semantics of deferred nodes (the usual native lazy-object rules):

- Whole-graph operations (`serialize()`, `json_encode()`, `foreach`, `==`,
  `clone`, `var_export()`) initialize every node they visit; `var_dump()`,
  `===`, `spl_object_id()` and `instanceof` do not initialize.
- Structural payload errors (unknown ids, bad scopes, unknown declared
  properties) and `$allowed_classes` violations still throw inside
  `deepclone_from_array()`. Value-level resolution errors (a class or enum
  case that no longer exists, a stale const-expr closure line, a type
  mismatch) surface at first access instead; the failing ghost is rolled
  back by the engine, stays uninitialized, and rethrows on every retry.
- A never-initialized ghost's destructor is not called.
- Type enforcement on a shared `&` reference is registered per node as it
  hydrates. While some holders of the reference are still uninitialized, a
  write through it is checked only against the already-hydrated ones; if the
  written value violates a pending node's property type, that node's first
  touch throws instead (eager mode rejects such a write at the assignment).
- The payload and every object of the graph stay pinned in memory until the
  last ghost initializes or dies: `ReflectionClass::getLazyInitializer()`
  returns a `Closure` bound to a shared internal
  `DeepClone\HydrationContext` object that holds them. Abandoned graphs are
  reclaimed by the cycle collector.
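
Two of the rules above, rollback on a failed first access and the skipped destructor, are engine behavior that a plain lazy ghost reproduces. The sketch below assumes PHP 8.4+; the `Listener` class and the exception thrown from the initializer are hypothetical stand-ins for a node whose closure target can no longer be resolved.

```php
<?php
// Hypothetical node class with a destructor; requires PHP 8.4+.
final class Listener
{
    public static int $destructed = 0;
    public string $target;

    public function __destruct()
    {
        self::$destructed++;
    }
}

$reflector = new ReflectionClass(Listener::class);
$attempts = 0;

// An initializer that always fails, like a value-level resolution error.
$failing = $reflector->newLazyGhost(function (Listener $l) use (&$attempts): void {
    $attempts++;
    $l->target = 'half-written'; // reverted by the engine when we throw
    throw new RuntimeException('named-closure target no longer exists');
});

// Each access retries the initializer and rethrows.
$errors = 0;
foreach ([1, 2] as $try) {
    try {
        $unused = $failing->target;
    } catch (RuntimeException) {
        $errors++;
    }
}
$stillLazy = $reflector->isUninitializedLazyObject($failing);

// Dropping a never-initialized ghost does not call its destructor.
unset($failing);
$destructedAfterDrop = Listener::$destructed;

// An initialized ghost is destructed like any other object.
$ok = $reflector->newLazyGhost(function (Listener $l): void {
    $l->target = 'strlen';
});
$target = $ok->target;
unset($ok);
$destructedAfterInit = Listener::$destructed;
```

Code that relies on destructors for cleanup should therefore not assume they run for nodes that were never touched.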

Cost model, measured against the previous fully-eager implementation
(20k-node graphs, PHP 8.4 release build): closure-rich graphs hydrate 4-6x
faster on creation and partial consumption, occupy 2-3x less memory while
untouched (lazy shells plus the slot index weigh less than materialized
closures), and tear down about 2x faster when dropped untouched. A fully
traversed graph pays a comparable total (+12% in the worst measured case),
at first touch instead of inside the call. Graphs without closure markers
take the eager path bit for bit.

`deepclone_hydrate()` accepts either an object to hydrate in place or a class
name to instantiate without calling its constructor. By default, PHP `&`
references in `$vars` are dropped on write; pass `DEEPCLONE_HYDRATE_PRESERVE_REFS`
