Tier 1 #2 per docs/memory-perf-roadmap.md. Small String Optimization
lets strings of 0 to 5 bytes (inclusive) encode inline in the 48-bit NaN-box
payload instead of allocating a StringHeader.
INFRASTRUCTURE-ONLY landing. No creation sites migrated yet — see
docs/sso-migration-plan.md for the 6-step roll-out sequence with
per-step ship criteria.
Why infrastructure-first: a single-commit flip of
DirectParser::parse_string_value to emit SSO immediately regressed
3 test_json_lazy_*.ts tests. The consumer surface for strings in
Perry is large — json.rs alone has 20+ `== STRING_TAG` dispatches,
and the broader fan-out covers object.rs property-get helpers,
string.rs methods, regex.rs, set.rs / map.rs key equality, stdlib
HTTP/DB paths, and codegen string-literal emission. Landing the
infrastructure without producers is safe (the new tag value is
allocated but unused) and unblocks incremental per-site migration.
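
The hazard described above can be sketched as follows. This is an illustration only, not Perry's code: `HEAP_STRING_TAG` is a placeholder value assumed for the example, both function names are hypothetical, and the `"null"` fall-through mirrors the symptom the CLAUDE.md entry in this commit reports for the rolled-back producer flip.

```rust
const TAG_MASK: u64 = 0xFFFF_0000_0000_0000; // assumed: top 16 bits select the tag
const SHORT_STRING_TAG: u64 = 0x7FF9_0000_0000_0000; // value from this commit
const HEAP_STRING_TAG: u64 = 0x7FFF_0000_0000_0000; // placeholder, not Perry's real tag

/// Unmigrated consumer: recognizes only the heap string tag, so any
/// SSO value falls through to the default arm.
fn stringify_legacy(bits: u64) -> &'static str {
    if (bits & TAG_MASK) == HEAP_STRING_TAG {
        "<heap string>"
    } else {
        "null"
    }
}

/// Migrated consumer: dispatches on both string tags.
fn stringify_migrated(bits: u64) -> &'static str {
    match bits & TAG_MASK {
        HEAP_STRING_TAG => "<heap string>",
        SHORT_STRING_TAG => "<short string>",
        _ => "null",
    }
}
```

As long as no producer emits `SHORT_STRING_TAG`, both functions behave identically, which is why landing the consumer arms first is safe.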
Added:
- SHORT_STRING_TAG = 0x7FF9_0000_0000_0000 (value.rs)
- JSValue::{try_short_string, short_string_to_buf, short_string_len,
short_string_unchecked}
- JSValue::{is_short_string, is_any_string} — legacy is_string()
stays strict (heap pointer only) so the existing ~50 callers
that follow is_string() with as_string_ptr() don't need to be
audited yet
- js_string_new_sso(ptr, len) -> f64 (string.rs) — SSO-aware
creation, falls back to heap on len > 5
- str_bytes_from_jsvalue(value, &mut scratch) (string.rs) —
central decoder producing (*const u8, u32) for either form
- js_string_materialize_to_heap(value) (string.rs) — compatibility
shim for callers that truly need *mut StringHeader
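
A minimal sketch of the inline encoding, assuming the layout given in the CLAUDE.md entry of this commit (8-bit length at bits 40..47, up to 5 data bytes at bits 0..39, first byte in the LSB). The functions below are free-standing stand-ins named after the `JSValue` methods listed above; they operate on raw `u64` bits, and `TAG_MASK` is an assumption, not a constant taken from the source.

```rust
/// Tag value quoted in this commit.
const SHORT_STRING_TAG: u64 = 0x7FF9_0000_0000_0000;
/// Assumed mask: the top 16 bits select the tag band.
const TAG_MASK: u64 = 0xFFFF_0000_0000_0000;
const MAX_SSO_LEN: usize = 5;

/// Pack up to 5 bytes into a tagged 64-bit word; longer input is rejected.
fn try_short_string(bytes: &[u8]) -> Option<u64> {
    if bytes.len() > MAX_SSO_LEN {
        return None;
    }
    let mut bits = SHORT_STRING_TAG | ((bytes.len() as u64) << 40);
    for (i, &b) in bytes.iter().enumerate() {
        bits |= (b as u64) << (8 * i); // byte 0 lands in the LSB
    }
    Some(bits)
}

fn is_short_string(bits: u64) -> bool {
    (bits & TAG_MASK) == SHORT_STRING_TAG
}

/// The length field is authoritative, so embedded NUL bytes are plain data.
fn short_string_len(bits: u64) -> usize {
    ((bits >> 40) & 0xFF) as usize
}

/// Copy the payload bytes into `buf` and return how many are valid.
fn short_string_to_buf(bits: u64, buf: &mut [u8; MAX_SSO_LEN]) -> usize {
    for (i, slot) in buf.iter_mut().enumerate() {
        *slot = (bits >> (8 * i)) as u8;
    }
    short_string_len(bits).min(MAX_SSO_LEN)
}
```

Because unused payload bytes are always zero, two equal short strings produce identical bits under this scheme; the empty string encodes as the bare tag.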
Consumer-side dispatch already wired:
- typeof (builtins.rs) accepts both tags
- js_jsvalue_equals (value.rs) — SSO fast path when both operands
are SSO (canonical encoding ⇒ same bytes ⇒ same bits), decode via
scratch buffers otherwise
- js_jsvalue_compare (value.rs) — lexicographic comparison via
decoded byte slices
- js_value_length_f64 (value.rs) — direct bit extraction for SSO,
no heap access
- js_jsvalue_to_string (value.rs) — materializes SSO to heap when
caller needs *mut StringHeader
- Three stringify arms in json.rs (stringify_value,
stringify_object_inner field dispatch, stringify_array_depth
element dispatch) — the remaining 15+ arms are Step 1 of the
migration plan
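
The equality and comparison dispatch listed above might look roughly like this. It is a sketch under stated assumptions: the `Str` enum is a hypothetical stand-in for Perry's NaN-boxed value (with `Heap` standing in for a `StringHeader` allocation), and `new_str`, `str_bytes`, `str_equals`, and `str_compare` are illustrative names loosely modeled on `js_string_new_sso`, `str_bytes_from_jsvalue`, `js_jsvalue_equals`, and `js_jsvalue_compare`.

```rust
use std::cmp::Ordering;

const SHORT_STRING_TAG: u64 = 0x7FF9_0000_0000_0000;

/// Hypothetical model of the two string representations.
enum Str<'a> {
    Short(u64),     // inline: tag | len << 40 | up to 5 bytes
    Heap(&'a [u8]), // stands in for a heap StringHeader
}

/// SSO-aware creation: inline when the bytes fit, heap otherwise.
fn new_str(bytes: &[u8]) -> Str<'_> {
    if bytes.len() > 5 {
        return Str::Heap(bytes);
    }
    let mut bits = SHORT_STRING_TAG | ((bytes.len() as u64) << 40);
    for (i, &b) in bytes.iter().enumerate() {
        bits |= (b as u64) << (8 * i);
    }
    Str::Short(bits)
}

/// Central decoder: a byte view of either form. Inline values are
/// unpacked into the caller-provided scratch buffer.
fn str_bytes<'a>(value: &Str<'a>, scratch: &'a mut [u8; 5]) -> &'a [u8] {
    match *value {
        Str::Heap(bytes) => bytes,
        Str::Short(bits) => {
            let len = (((bits >> 40) & 0xFF) as usize).min(5);
            for (i, slot) in scratch.iter_mut().enumerate() {
                *slot = (bits >> (8 * i)) as u8;
            }
            &scratch[..len]
        }
    }
}

fn str_equals(a: &Str<'_>, b: &Str<'_>) -> bool {
    // Fast path: the encoding is canonical, so same bytes means same bits.
    if let (Str::Short(x), Str::Short(y)) = (a, b) {
        return x == y;
    }
    // Mixed or heap operands: decode both and compare bytes.
    let (mut sa, mut sb) = ([0u8; 5], [0u8; 5]);
    str_bytes(a, &mut sa) == str_bytes(b, &mut sb)
}

fn str_compare(a: &Str<'_>, b: &Str<'_>) -> Ordering {
    let (mut sa, mut sb) = ([0u8; 5], [0u8; 5]);
    str_bytes(a, &mut sa).cmp(str_bytes(b, &mut sb))
}
```

The slow path matters during the migration: a short string produced by a not-yet-migrated site still lives on the heap, so `str_equals(&Str::Heap(b"abc"), &new_str(b"abc"))` has to compare bytes rather than bits.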
6 new unit tests in value::tests (total 130 → 136):
- roundtrip across 0, 1, 2, 3, 4, 5-byte inputs
- rejection of 6+ byte inputs (returns None from try_short_string)
- embedded-NUL handling (length is authoritative, NULs are data)
- tag-band distinctness from POINTER / INT32 / NUMBER / UNDEFINED
- empty-string roundtrip
- byte-order stability (first byte lands in LSB — invariant for
any future SIMD bulk-decoder)
Full regression sweep verifies infrastructure-only is safe:
- All 10 test_json_*.ts match Node byte-for-byte
- Runtime tests 136/136
- Workspace cargo test unaffected
docs/sso-migration-plan.md sequences the roll-out:
Step 1: stringify consumers (json.rs, ~15 sites)
Step 2: DirectParser emits SSO
Step 3: object key storage (object.rs, PARSE_KEY_CACHE, shape cache)
Step 4: string methods (string.rs)
Step 5: codegen string literals (compile-time constants)
Step 6: stdlib HTTP / DB paths
+ decision gate after Step 2 to re-evaluate vs jumping to tier 2/3
CLAUDE.md: 2 additions & 1 deletion
@@ -8,7 +8,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
Perry is a native TypeScript compiler written in Rust that compiles TypeScript source code directly to native executables. It uses SWC for TypeScript parsing and LLVM for code generation.
-**Current Version:** 0.5.212
+**Current Version:** 0.5.213
## TypeScript Parity Status
@@ -148,6 +148,7 @@ First-resolved directory cached in `compile_package_dirs`; subsequent imports re
Keep entries to 1-2 lines max. Full details in CHANGELOG.md.
- **v0.5.205** — Fix #183: `perry compile --target web` on a real-world app (Bloom Jump built on the Bloom engine) produced a WASM binary the browser refused to load — `Compiling function #687 failed: expected 0 elements on the stack for fallthru, found 103` (count varies with engine state). Root cause in `crates/perry-codegen-wasm/src/emit.rs`: the four direct-`Call`-instruction code paths — `Expr::Call` FuncRef arm (~4302), `Expr::Call` ExternFuncRef arm (~4324), `Expr::New` user-class ctor (~5844), `Expr::SuperCall` parent-ctor (~5894), `Expr::StaticMethodCall` direct-static path (~5979) — each emit `emit_expr(arg)` per source arg and pad up with `TAG_UNDEFINED` when `args.len() < expected`, but had no matching drop-excess branch when `args.len() > expected`. WASM `call` consumes exactly the callee's declared param count, so when JS's "extra args evaluated for side effects, then silently ignored" semantics met Perry's WASM codegen, every extra evaluated arg leaked past the call and accumulated on the enclosing block's operand stack — 103 values by the time `_start`'s final `end` hit the validator. The shape that triggered it in jump/bloom was `bloom/src/core/colors.ts`'s `Colors = new __AnonShape_2(...24 PropertyGets...)` landing on a Phase-3-synthesized ctor with lower declared arity, multiplied across bloom's 10 submodules. Fix: after each existing `for _ in args.len()..expected { I64Const(TAG_UNDEFINED) }` pad-up loop, add the mirror `for _ in expected..args.len() { Drop }` — matches JS semantics (extras evaluated for side effects but discarded) and keeps the operand stack aligned with the callee's WASM type at every direct-Call site. 
Verified end-to-end against the exact issue repro cloned fresh from `github.com/Bloom-Engine/jump` + `github.com/Bloom-Engine/engine`: both path A `file:./vendor/bloom/` and path B `file:../engine/` now compile to a WebAssembly-validating `.wasm` (416,923 / 413,780 bytes respectively, 140 FFI imports intact, `WebAssembly.compile` resolves clean on node 20+); a synthetic `takesFive(mc(),mc(),1,2,3,4)` minimal case that previously failed `Compiling function #213 failed: ... found 1` also validates. `cargo test --release -p perry-runtime -p perry-hir -p perry-codegen-wasm -p perry`: 262/262 passed. Note: issue #183 also claimed path A found only 1 module and emitted 9 FFI imports — could not reproduce in a fresh clone (both paths find 10 modules identically); most likely an artifact of the reporter's local `vendor/bloom` snapshot predating the `exports` map, and the "runGame silently no-ops" symptom the user actually observed was the browser refusing to instantiate the invalid WASM with the surrounding JS glue swallowing the error — fixed here.
+
- **v0.5.213** — Small String Optimization (SSO) infrastructure (tier 1 #2 per `docs/memory-perf-roadmap.md`). **Infrastructure-only landing**; no creation sites migrated yet. New tag `SHORT_STRING_TAG = 0x7FF9_0000_0000_0000` encoding strings of length 0..=5 inline in the 48-bit NaN-box payload (8-bit length at bits 40..47 + 5 bytes of data at bits 0..39). Zero heap allocation for short strings when emitted — the value IS the data. Added: `JSValue::try_short_string(&[u8])` (constructor), `short_string_to_buf` / `short_string_len` (decoders), `is_short_string` / `is_any_string` (predicates, with `is_string` kept strict for legacy call sites that rely on `as_string_ptr` returning a real heap pointer), `js_string_new_sso(ptr, len) -> f64` (SSO-aware creation that falls back to heap for long inputs), `str_bytes_from_jsvalue(value, &mut scratch)` (central decoder producing `(*const u8, u32)` view for either representation), `js_string_materialize_to_heap(value)` (compatibility shim that allocates a heap StringHeader from an SSO value). Consumer-side dispatch already wired in: `typeof` (builtins.rs, accepts both tags), `js_jsvalue_equals` + `js_jsvalue_compare` (value.rs — SSO fast path when both operands are SSO because encoding is canonical, otherwise decode via scratch buffers and byte-compare), `js_value_length_f64` (direct bit extraction for SSO, no heap access), `js_jsvalue_to_string` (materializes SSO to heap when caller needs `*mut StringHeader`), three stringify arms in json.rs (top-level `stringify_value`, object field inline dispatch in `stringify_object_inner`, array element inline dispatch in `stringify_array_depth`). 6 new unit tests in `value::tests` cover roundtrip, rejection of 6+ byte inputs, embedded-NUL handling (length is authoritative), tag-band distinctness from POINTER/INT32/NUMBER/UNDEFINED, empty-string roundtrip, and byte-order stability (first byte lands in LSB of payload — invariant relied on by any future SIMD bulk-decoder). 
**Why infrastructure-only:** flipping `DirectParser::parse_string_value` to emit SSO without first auditing every consumer produces regressions — `grep "== STRING_TAG" crates/perry-runtime/src/json.rs` alone shows 20+ sites, and the broader consumer surface spans object.rs property-get helpers, string.rs methods (split/replace/slice/indexOf/etc.), regex.rs match extractors, set.rs/map.rs key equality, stdlib HTTP/DB paths, and codegen string-literal emission. Attempting the flip in-session reproduced the hazard: 3 `test_json_lazy_*.ts` tests diffed from Node with stringify emitting `"null"` where SSO values should have decoded. Rolled back the producer flip; kept every consumer arm already added so Step 1 of the migration is ~50% complete. New doc `docs/sso-migration-plan.md` sequences the 6-step roll-out (stringify consumers → DirectParser emit → object key storage → string methods → codegen literals → stdlib) with per-step ship criteria and a decision gate after Step 2 to re-evaluate whether Steps 3-6 are worth the effort vs jumping to tier 2/3 (escape analysis + generational GC). Runtime tests 130 → 136 (added 6 SSO tests). All 10 existing `test_json_*` regressions green under infrastructure-only landing.
- **v0.5.212** — Three lazy-correctness fixes + comprehensive external-auditor document. (1) `Array.isArray(parsed)` returned `false` on lazy `JSON.parse` results. Root cause: `Expr::ArrayIsArray` in `crates/perry-codegen/src/expr.rs` was a pure compile-time check that emitted `TAG_FALSE` whenever the operand's static HIR type wasn't definitively `Array`/`Tuple`. For `JSON.parse` results (typed `any`) this meant the static check always failed, so even a runtime array always returned false. Fixed by routing indeterminate static types (`Any`, `Unknown`, no annotation) through a new `js_array_is_array` runtime dispatch in the same codegen arm; the fast-path for definitively-array statics still emits `TAG_TRUE` without a runtime call, and definitively-scalar types (Number/String/Boolean/Null/Void/BigInt/Symbol) still emit `TAG_FALSE` statically. (2) `parsed instanceof Array` returned `false` on lazy arrays. Root cause: `js_instanceof` in `crates/perry-runtime/src/object.rs:2864-2890` (the `CLASS_ID_ARRAY` branch) only matched `GC_TYPE_ARRAY`, not `GC_TYPE_LAZY_ARRAY`. Added the lazy-array obj_type to the OR. (3) `Array.isArray` runtime (`crates/perry-runtime/src/array.rs:js_array_is_array`) already had the same eager-only check; extended to accept `GC_TYPE_LAZY_ARRAY` too (now reachable via the new codegen dispatch path). Net effect: every reasonable type-predicate against a lazy array now matches Node byte-for-byte. Pre-existing limitation `parsed.constructor === Array` (returns false on Perry regardless of lazy vs eager — it's a property-lookup limitation in Perry's class system) remains unchanged. The `Array.isArray` codegen change is also a bug-fix for ANY `any`-typed value — before v0.5.212 `Array.isArray(anyTyped)` returned false even when the runtime value was a real eager array; pre-existing latent bug unrelated to lazy work. 
Verified: `test_json_lazy_per_element.ts` and the new `test_json_lazy_predicates.ts` match Node; all other `test_json_*` regressions clean. New document `docs/audit-lazy-json.md` — **~1400-line comprehensive external-auditor reference** covering the tape-based lazy JSON system end-to-end: layout (TapeEntry, LazyArrayHeader offset invariants, arena layout), dispatch (every user-observable array operation mapped to its runtime entry point), garbage collection (trace_lazy_array, cache zero-on-alloc invariant, tracer safety), correctness proof (14-row case analysis against every JS array operation), performance characteristics (measured numbers vs Node + Bun on all three main benches), edge cases (15 distinct cases with handling), test coverage (every file + gate), known limitations, and a 20-item audit checklist with file:line references. Written so a reviewer with no prior Perry knowledge can verify the entire system against source.
- **v0.5.211** — Fix pre-existing linker regression from v0.5.204: `js_json_stringify` in `crates/perry-runtime/src/json.rs` was missing `#[no_mangle]`. Rust compiled it as a mangled symbol (`__ZN13perry_runtime4json17js_json_stringify17h…E`) which `perry-stdlib::fastify::context::jsvalue_to_json_string` (and `perry-stdlib::fastify::server::build_response_body`) couldn't find when linking against a statically-built `libperry_runtime.a` — surfaced as `ld: Undefined symbols: _js_json_stringify` when running `scripts/run_fastify_tests.sh` or `cargo test -p perry-stdlib`. Tracking down via `git log -L "/pub unsafe extern \"C\" fn js_json_stringify(/,/^}/"` identified v0.5.204 (`feat(json): lazy parse + lazy stringify`) as the commit that inadvertently removed the attribute when inserting `try_stringify_lazy_array` directly above. All 7 fastify integration tests now pass (GET /hello, GET /users/:id, POST /echo status + body, GET /does-not-exist → 404) — this was blocking end-to-end Fastify coverage since v0.5.204. One-line fix. Also: comprehensive test sweep at v0.5.211: `cargo test --release --workspace` with CI-matching exclusions = **44/44 test runs, 0 failed**, gap tests 24/28 (unchanged), parity tests 106 pass / 12 fail (same 12 as pre-v0.5.208 baseline — no regressions), thread tests 4/4, cache tests PASS, fastify tests 5/5 (newly unblocked), doc tests 95/115 (7 tvos-simulator cross-compile failures are env-dependent — `libperry_ui_tvos.a` built for macOS host not tvOS-sim; pre-existing, unrelated to JSON work).
- **v0.5.210** — Issue #179 Step 2 completion: **lazy JSON parse is now the default** for top-level array blobs ≥ 1024 bytes. After v0.5.208 per-element sparse materialization + v0.5.209 walk cursor + adaptive materialize threshold, there is no measured access pattern where lazy loses to direct on non-tiny blobs. The `PERRY_JSON_TAPE` env var changed from "opt-in" (`=1` to enable, unset = direct) to "escape hatch" (`=0`/`off`/`false` forces direct, `=1`/`on`/`true` forces tape-on even for small blobs, otherwise auto). Lookup cached via `OnceLock` so the env-var check is amortized to once per process (was per-parse; 100k tight-loop parses saw a ~5 ms difference just from the lookup on macOS). Size threshold 1024 bytes chosen because a small-array bench (`'[1,2,3,4,5,6,7,8,9,10]'` × 100k iters) measured tape overhead ~10% over direct when the tape build + LazyArrayHeader cache/bitmap allocation cost didn't amortize over enough lookups; above the threshold the overhead is swamped by the parse-time savings. `@perry-lazy` JSDoc pragma still forces tape at the codegen level (via `js_json_parse_lazy`), unconditional of size. **Measured impact of flip on the three main benches (best-of-5, macOS ARM64)**: `bench_json_roundtrip` 400 ms / 137 MB → **90 ms / 130 MB** (4.4× faster). `bench_json_readonly` 290 ms / 120 MB → **80 ms / 90 MB** (3.6× faster, 25% less RSS). `bench_json_readonly_indexed` 300 ms / 120 MB → **90 ms / 90 MB** (3.3× faster, 25% less RSS). **vs Node 25.8.0** on the same benches: roundtrip Perry 90 ms vs Node 520 ms (5.8×), readonly Perry 80 ms vs Node 450 ms (5.6×), indexed Perry 90 ms vs Node 450 ms (5.0×). **vs Bun 1.3.12**: roundtrip Perry 90 ms vs Bun 290 ms (3.2×), readonly Perry 80 ms vs Bun 200 ms (2.5×), indexed Perry 90 ms vs Bun 210 ms (2.3×). Perry now leads every measured JSON workload. RSS gap to Bun (~50%) remains — closing that requires tier-3 generational GC per `docs/memory-perf-roadmap.md`. 
Gap test sweep unchanged at 24/28 (same pre-existing non-lazy failures). All existing `test_json_*.ts` pass byte-for-byte vs Node under default + forced-off + forced-on modes. Runtime tests 130/130. The original plan to implement static HIR analysis for lazy-safety auto-detection (v0.5.209 task) is closed without implementation — the runtime adaptive handling makes compile-time analysis unnecessary since lazy is now a strict improvement in the measured workload matrix.