Skip to content

[pull] main from react:main#505

Merged
pull[bot] merged 2 commits into
LoadsAForks:mainfrom
react:main
Jun 16, 2026
Merged

[pull] main from react:main#505
pull[bot] merged 2 commits into
LoadsAForks:mainfrom
react:main

Conversation

@pull

@pull pull Bot commented Jun 16, 2026

Copy link
Copy Markdown

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

poteto and others added 2 commits June 15, 2026 23:43
…36731)

Last of the boundary trilogy, after #36729 and #36730 (both merged);
rebased onto main as a standalone 3-commit change.

JS strings are WTF-16: a lone surrogate in source (`"\uD83E"`) is a
legal string value, but Rust `String` is UTF-8 and cannot hold it. The
compiler previously leaned on the napi bridge's `__SURROGATE_XXXX__`
marker encoding end to end, so the core compiler compared, concatenated,
and constant-folded marker text as if it were the actual string value.

`JsString` (in `react_compiler_diagnostics`, our lowest layer) holds the
common well-formed case as a plain UTF-8 `String` with zero overhead and
falls back to UTF-16 code units only for ill-formed values.
`StringLiteral.value` and `PrimitiveValue::String` carry it through the
pipeline; constant folding concatenates via code units so split
surrogate halves re-pair exactly as they do in JS; markers are emitted
only at the napi edge, where the babel bridge requires them (serde_json
can neither parse nor emit a lone `\uXXXX` escape).

The representation is encapsulated behind an opaque struct (private
`Repr` enum, borrowed `JsStringRef` view via `as_ref`) so the
"well-formed values are always UTF-8" invariant that makes the derived
`PartialEq`/`Hash` sound holds by construction, per review feedback.

The marker decoder scans byte-wise (an earlier draft range-sliced at
fixed offsets and panicked on multibyte UTF-8 following `__SURROGATE_`),
validates hex digits, and accepts uppercase only, exactly mirroring what
the bridge emits; lowercase marker-shaped user text survives verbatim.
The HIR debug printer renders unpaired surrogates as `\uXXXX` escapes
byte-identical to the TS printer.

This closes the lone-surrogate divergence on the e2e harness. Verified
on the rebased branch: cargo workspace tests, Rust snap channel
1804/1804, and HIR + Code parity on the lone-surrogate fixture through
the comparison harness.
Move the toJSON handling out of the JSON.stringify replacer path and
into an explicit recursive resolution step that uses v8's optimized
single-arg JSON.stringify call.

This has been pulled out of the original implementation in
#36053 on the advice of
#36181 (thanks @unstubbable)

### NB: \_\_proto\_\_

The only difference is surrounding the treatment of `{}` vs
`Object.create(null)`. #36053 uses
the latter, which avoids `__proto__` issues, but is slower in
microbenchmarks due to v8 semantics.
#36181 uses the former (`{}`) which
is faster in micro benchmarks but doesn't special-case `__proto__`
according to spec. This PR does both (`{}` and special case), with no
measurable performance difference that I could produce.

---

The rest of this PR's description is reproduced from
#36181:

---

### Problem

When serializing a Flight chunk, `emitChunk` currently calls
`JSON.stringify(value, task.toJSON)`. The `task.toJSON` replacer is
called for every key-value pair in the serialized JSON. While the logic
inside the replacer is lightweight, the C++ to JavaScript boundary
crossing on every node adds up — V8's `JSON.stringify` is implemented in
C++, and calling back into JavaScript for every property incurs overhead
that scales with the number of keys in the output.

### Change

Replace the replacer with a two-step process:

1. `resolveModel()` recursively walks the rendered value, calling
`renderModel()` on each child — doing the same transformation the
replacer used to do, but entirely in JavaScript without C++ boundary
crossings.
2. `JSON.stringify()` is called with no replacer, staying entirely in
C++.

The `resolveModel` walk also replicates `JSON.stringify`'s `toJSON`
semantics for `Date` objects.

### Results

Measured using the Flight SSR benchmark fixture (#36180) on a dashboard
app with ~25 components, 200 product rows (~325KB Flight payload).
Tested across Node 20, 22, and 24.

- **`bench:bare`** (in-process, no script injection): Flight+Fizz sync
median improves by **~4-5%** consistently across all three Node
versions.
- **`bench:server`** (HTTP, c=1): Flight+Fizz sync throughput improves
by **~3-6%** across Node versions. Async results vary between runs but
trend positive.

### Future opportunity

While the immediate performance improvement is moderate, this change
also sets up a potential future optimization: a Flight mode that renders
to an object instead of a stream ([#36143
(comment)](#36143 (comment))).
Since `resolveModel()` already produces a plain JS object tree before
`JSON.stringify` is called, this intermediate representation could
potentially be passed to the SSR client without the
serialization-deserialization roundtrip that the current stream-based
approach requires.

closes #36181
@pull pull Bot locked and limited conversation to collaborators Jun 16, 2026
@pull pull Bot added the ⤵️ pull label Jun 16, 2026
@pull pull Bot merged commit ad78e25 into LoadsAForks:main Jun 16, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants