[pull] main from react:main#505
Merged
Merged
Conversation
…36731) Last of the boundary trilogy, after #36729 and #36730 (both merged); rebased onto main as a standalone 3-commit change. JS strings are WTF-16: a lone surrogate in source (`"\uD83E"`) is a legal string value, but Rust `String` is UTF-8 and cannot hold it. The compiler previously leaned on the napi bridge's `__SURROGATE_XXXX__` marker encoding end to end, so the core compiler compared, concatenated, and constant-folded marker text as if it were the actual string value. `JsString` (in `react_compiler_diagnostics`, our lowest layer) holds the common well-formed case as a plain UTF-8 `String` with zero overhead and falls back to UTF-16 code units only for ill-formed values. `StringLiteral.value` and `PrimitiveValue::String` carry it through the pipeline; constant folding concatenates via code units so split surrogate halves re-pair exactly as they do in JS; markers are emitted only at the napi edge, where the babel bridge requires them (serde_json can neither parse nor emit a lone `\uXXXX` escape). The representation is encapsulated behind an opaque struct (private `Repr` enum, borrowed `JsStringRef` view via `as_ref`) so the "well-formed values are always UTF-8" invariant that makes the derived `PartialEq`/`Hash` sound holds by construction, per review feedback. The marker decoder scans byte-wise (an earlier draft range-sliced at fixed offsets and panicked on multibyte UTF-8 following `__SURROGATE_`), validates hex digits, and accepts uppercase only, exactly mirroring what the bridge emits; lowercase marker-shaped user text survives verbatim. The HIR debug printer renders unpaired surrogates as `\uXXXX` escapes byte-identical to the TS printer. This closes the lone-surrogate divergence on the e2e harness. Verified on the rebased branch: cargo workspace tests, Rust snap channel 1804/1804, and HIR + Code parity on the lone-surrogate fixture through the comparison harness.
Move the toJSON handling out of the JSON.stringify replacer path and into an explicit recursive resolution step that uses v8's optimized single-arg JSON.stringify call. This has been pulled out of the original implementation in #36053 on the advice of #36181 (thanks @unstubbable) ### NB: \_\_proto\_\_ The only difference is surrounding the treatment of `{}` vs `Object.create(null)`. #36053 uses the latter, which avoids `__proto__` issues, but is slower in microbenchmarks due to v8 semantics. #36181 uses the former (`{}`) which is faster in micro benchmarks but doesn't special-case `__proto__` according to spec. This PR does both (`{}` and special case), with no measurable performance difference that I could produce. --- The rest of this PR's description is reproduced from #36181: --- ### Problem When serializing a Flight chunk, `emitChunk` currently calls `JSON.stringify(value, task.toJSON)`. The `task.toJSON` replacer is called for every key-value pair in the serialized JSON. While the logic inside the replacer is lightweight, the C++ to JavaScript boundary crossing on every node adds up — V8's `JSON.stringify` is implemented in C++, and calling back into JavaScript for every property incurs overhead that scales with the number of keys in the output. ### Change Replace the replacer with a two-step process: 1. `resolveModel()` recursively walks the rendered value, calling `renderModel()` on each child — doing the same transformation the replacer used to do, but entirely in JavaScript without C++ boundary crossings. 2. `JSON.stringify()` is called with no replacer, staying entirely in C++. The `resolveModel` walk also replicates `JSON.stringify`'s `toJSON` semantics for `Date` objects. ### Results Measured using the Flight SSR benchmark fixture (#36180) on a dashboard app with ~25 components, 200 product rows (~325KB Flight payload). Tested across Node 20, 22, and 24. - **`bench:bare`** (in-process, no script injection): Flight+Fizz sync median improves by **~4-5%** consistently across all three Node versions. - **`bench:server`** (HTTP, c=1): Flight+Fizz sync throughput improves by **~3-6%** across Node versions. Async results vary between runs but trend positive. ### Future opportunity While the immediate performance improvement is moderate, this change also sets up a potential future optimization: a Flight mode that renders to an object instead of a stream ([#36143 (comment)](#36143 (comment))). Since `resolveModel()` already produces a plain JS object tree before `JSON.stringify` is called, this intermediate representation could potentially be passed to the SSR client without the serialization-deserialization roundtrip that the current stream-based approach requires. closes #36181
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )