
Commit fe3b3ad

docs(bench): add Kotlin to JSON polyglot, refresh all numbers (v0.5.242)
Adds Kotlin (kotlinx.serialization) to the JSON polyglot suite as the 8th language. Re-runs every benchmark suite on this commit and updates benchmarks/README.md as the canonical single source for every measurement.

JSON polyglot (8 languages, 15 rows including idiomatic + optimized):

  perry (gen-gc + lazy tape)      67 ms / 85 MB  ← LEAD
  rust serde_json (LTO+1cgu)      183 ms / 11 MB
  rust serde_json                 193 ms / 11 MB
  bun                             240 ms / 81 MB
  perry (mark-sweep, no lazy)     341 ms / 102 MB
  node                            361 ms / 180 MB
  kotlin -server -Xmx512m         446 ms / 423 MB
  kotlin (idiomatic)              460 ms / 606 MB
  c++ -O3 -flto (nlohmann/json)   774 ms / 25 MB
  go (encoding/json)              783 ms / 22 MB
  c++ -O2 (nlohmann/json)         840 ms / 25 MB
  swift -O -wmo (Foundation)      3665 ms / 34 MB
  swift -O (Foundation)           3674 ms / 33 MB

Perry leads the entire field on time: 3.6× over Bun, 5.4× over Node, 2.7× over Rust serde_json LTO, 6.7× over Kotlin server JIT, 11.6× over C++, 11.7× over Go, 54.7× over Swift.

Compute polyglot re-run surfaces two honest regressions vs the v0.5.164 baseline:
- nested_loops: 8 → 17 ms
- accumulate: 24 → 33 ms

Both caused by the v0.5.237 generational-GC default flip: per-allocation gen-GC machinery is overhead that compute benches don't recoup. PERRY_GEN_GC=0 recovers baseline. Trade-off was deliberate: gen-GC's wins on long-running workloads (test_memory_json_churn 115 → 91 MB) outweigh the small compute regression. Documented in the README without softening; defensibility requires showing the losses, not just the wins.

Memory stability suite: 18/18 PASS. Perry-suite RSS history updated through v0.5.241.

benchmarks/README.md is now the canonical single page referenced everywhere: every benchmark, every comparison, every flag, every honest disclaimer in one GitHub-renderable page.
1 parent 17fafdb commit fe3b3ad

9 files changed

Lines changed: 338 additions & 112 deletions

File tree

CHANGELOG.md

Lines changed: 57 additions & 0 deletions
Original file line number · Diff line number · Diff line change
@@ -2,6 +2,63 @@
22

33
Detailed changelog for Perry. See CLAUDE.md for concise summaries.
44

5+
## v0.5.242 — Kotlin added to JSON polyglot, full benchmark sweep refresh, `benchmarks/README.md` updated as the canonical single source for every measurement. Touchpoints:
6+
7+
**`benchmarks/json_polyglot/bench.kt`** — new Kotlin implementation of the identical 10k-record / ~1 MB blob / 50-iteration parse + stringify workload. Uses `kotlinx.serialization-json` 1.9.0 (the official Kotlin serialization library; compile-time-generated (de)serializers via the kotlinx-serialization compiler plugin, no runtime reflection). Compiles via `kotlinc -Xplugin=...kotlinx-serialization-compiler-plugin.jar` against the JARs shipped with `brew install gradle` (the brew kotlin formula doesn't bundle the runtime libs but gradle does). Listed twice in `RESULTS.md` — idiomatic JVM defaults and `-server -Xmx512m`.
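For orientation, the parse + stringify workload described above can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the repository's benchmark source: the `BenchRecord` shape, the helper names, and the checksum are assumptions, and the generated blob is not tuned to the ~1 MB size the real suite uses.

```typescript
// Hypothetical record shape; the real benchmark's fields are not shown in this commit.
interface BenchRecord {
  id: number;
  name: string;
  active: boolean;
  score: number;
  tags: string[];
}

// Build `count` synthetic records (10k in the workload described above).
function makeRecords(count: number): BenchRecord[] {
  const records: BenchRecord[] = [];
  for (let i = 0; i < count; i++) {
    records.push({
      id: i,
      name: `user_${i}`,
      active: i % 2 === 0,
      score: i * 1.5,
      tags: ["a", "b", "c"],
    });
  }
  return records;
}

// One iteration = parse the blob, then stringify the parsed value again.
// The checksum only exists so the work cannot be optimized away.
function runWorkload(blob: string, iterations: number): number {
  let checksum = 0;
  for (let i = 0; i < iterations; i++) {
    const parsed = JSON.parse(blob) as BenchRecord[];
    const out = JSON.stringify(parsed);
    checksum += parsed.length + out.length;
  }
  return checksum;
}

const jsonBlob = JSON.stringify(makeRecords(10000));
const workloadChecksum = runWorkload(jsonBlob, 50);
```

Each language's implementation would have to perform the same parse and stringify steps on the same blob for the timings to be comparable; the Kotlin version does this with generated serializers rather than a dynamic `JSON.parse`.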
8+
9+
**`benchmarks/json_polyglot/run.sh`** — extended to detect kotlinc, use the gradle-bundled kotlinx-serialization JARs, compile the .kt to a JAR, and run under both flag profiles. Other languages unchanged.
10+
11+
**Full sweep re-run on this commit (M1 Max, macOS 26.4, best of 5):**
12+
13+
JSON polyglot (15 rows, sorted by time):

| Implementation | Profile | Time (ms) | Peak RSS (MB) |
|---|---|---:|---:|
| **perry (gen-gc + lazy tape)** | optimized | **67** | 85 |
| rust serde_json (LTO+1cgu) | optimized | 183 | 11 |
| rust serde_json | idiomatic | 193 | 11 |
| bun | idiomatic | 240 | 81 |
| perry (mark-sweep, no lazy) | idiomatic | 341 | 102 |
| node | idiomatic | 361 | 180 |
| node --max-old=4096 | optimized | 364 | 182 |
| kotlin -server -Xmx512m | optimized | 446 | 423 |
| kotlin (kotlinx.serialization) | idiomatic | 460 | 606 |
| c++ -O3 -flto (nlohmann/json) | optimized | 774 | 25 |
| go (encoding/json) | optimized | 783 | 22 |
| go (encoding/json) | idiomatic | 785 | 23 |
| c++ -O2 (nlohmann/json) | idiomatic | 840 | 25 |
| swift -O -wmo (Foundation) | optimized | 3665 | 34 |
| swift -O (Foundation) | idiomatic | 3674 | 33 |

33+
Perry leads the entire field on time: 3.6× over Bun, 5.4× over Node, 2.7× over Rust serde_json (LTO), 6.7× over Kotlin (server JIT), 11.6× over C++ -O3 -flto, 11.7× over Go encoding/json, 54.7× over Swift Foundation. RSS is mid-pack — beats Node and Kotlin (JVM heap reservation is enormous), comparable to Bun, 8× higher than typed-struct languages (Rust 11 MB, Go 22 MB, C++ 25 MB).
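The ratios quoted above can be re-derived from the table. The sketch below (names are illustrative, not from the repository) divides each competitor's time by Perry's 67 ms and Perry's 85 MB RSS by each competitor's RSS, rounding to one decimal place. Note that on these figures the RSS factor ranges from about 3.4× (C++) to about 7.7× (Rust), so the "8×" figure is closest to the Rust comparison.

```typescript
interface Row {
  name: string;
  timeMs: number;
  rssMb: number;
}

// Values copied from the JSON polyglot table in this changelog entry.
const perryRow: Row = { name: "perry (gen-gc + lazy tape)", timeMs: 67, rssMb: 85 };

const competitors: Row[] = [
  { name: "bun", timeMs: 240, rssMb: 81 },
  { name: "node", timeMs: 361, rssMb: 180 },
  { name: "rust serde_json (LTO+1cgu)", timeMs: 183, rssMb: 11 },
  { name: "kotlin -server -Xmx512m", timeMs: 446, rssMb: 423 },
  { name: "c++ -O3 -flto (nlohmann/json)", timeMs: 774, rssMb: 25 },
  { name: "go (encoding/json)", timeMs: 783, rssMb: 22 },
  { name: "swift -O -wmo (Foundation)", timeMs: 3665, rssMb: 34 },
];

// Ratio rounded to one decimal place, matching the precision used in the prose.
function roundedRatio(numerator: number, denominator: number): number {
  return Math.round((numerator / denominator) * 10) / 10;
}

const comparisons = competitors.map((row) => ({
  name: row.name,
  // How many times slower than Perry this implementation is.
  speedup: roundedRatio(row.timeMs, perryRow.timeMs),
  // How many times more memory Perry uses than this implementation.
  rssFactor: roundedRatio(perryRow.rssMb, row.rssMb),
}));
```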
34+
35+
Compute polyglot (8 microbenches × 9 runtimes, refreshed):

| Benchmark      | Perry | Rust  | C++   | Go    | Swift | Java  | Node  | Bun   | Python  |
|----------------|------:|------:|------:|------:|------:|------:|------:|------:|--------:|
| fibonacci      |   302 |   314 |   304 |   440 |   394 |   276 |   991 |   510 |   15661 |
| loop_overhead  |    12 |    95 |    94 |    94 |    94 |    96 |    52 |    40 |    2934 |
| array_write    |     3 |     7 |     2 |     8 |     2 |     6 |     8 |     5 |     389 |
| array_read     |     4 |     9 |     9 |    10 |     9 |    10 |    12 |    15 |     337 |
| math_intensive |    14 |    46 |    49 |    47 |    47 |    50 |    48 |    50 |    2204 |
| object_create  |     0 |     0 |     0 |     0 |     0 |     4 |     8 |     6 |     158 |
| nested_loops   |    17 |     8 |     8 |     9 |     8 |    10 |    16 |    19 |     470 |
| accumulate     |    33 |    94 |    94 |    94 |    95 |    96 |   585 |    96 |    4916 |

48+
**Two regressions vs the v0.5.164 baseline** documented honestly in both the polyglot RESULTS.md and the consolidated README:
- `nested_loops` 8 → 17 ms (+9 ms)
- `accumulate` 24 → 33 ms (+9 ms)
51+
52+
Both caused by the v0.5.237 generational-GC default flip — per-allocation gen-GC machinery (write-barrier potential, age-bump pass) is overhead that allocation-heavy compute benches can't recoup. `PERRY_GEN_GC=0` recovers the 8 / 24 ms baseline. The trade-off was deliberate; gen-GC's wins on long-running and RSS-sensitive workloads (`test_memory_json_churn` 115 → 91 MB in v0.5.237) outweigh the small compute regression. All other compute cells unchanged or slightly faster. Listed unapologetically in the docs because the point of the consolidated benchmark page is to be defensible, not to win.
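A regression check of the kind described above could look like the following sketch. It is only an illustration: `findRegressions` and its `thresholdMs` parameter are hypothetical names, and only the two baseline values stated in this entry (8 ms and 24 ms) are used, since the other v0.5.164 cells are not listed here.

```typescript
type Timings = { [bench: string]: number };

interface Regression {
  bench: string;
  beforeMs: number;
  afterMs: number;
  deltaMs: number;
}

// Baseline (v0.5.164) and current (v0.5.242) Perry timings quoted in this entry.
const baselineMs: Timings = { nested_loops: 8, accumulate: 24 };
const currentMs: Timings = { nested_loops: 17, accumulate: 33 };

// Report every benchmark that got slower by more than `thresholdMs`.
// The threshold is an assumed noise allowance, not a documented project setting.
function findRegressions(before: Timings, after: Timings, thresholdMs: number): Regression[] {
  const found: Regression[] = [];
  for (const bench of Object.keys(before)) {
    const beforeMs = before[bench];
    const afterMs = after[bench];
    if (beforeMs === undefined || afterMs === undefined) continue;
    const deltaMs = afterMs - beforeMs;
    if (deltaMs > thresholdMs) {
      found.push({ bench, beforeMs, afterMs, deltaMs });
    }
  }
  return found;
}

const regressions = findRegressions(baselineMs, currentMs, 2);
```

Re-running the same comparison with `PERRY_GEN_GC=0` set in the environment should, per the text above, bring both cells back to their baseline values and leave the list empty.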
53+
54+
Memory-stability suite: 18/18 PASS under default / mark-sweep escape hatch / gen-gc + write barriers; 6/6 PASS under gen-gc + evacuate. RSS values per test: long_lived_loop 54 MB, json_churn 91 MB, string_churn 48 MB, closure_churn 13 MB, gc_aggressive_forced 9 MB, gc_deep_recursion 6 MB.
55+
56+
Perry-suite RSS (best of 5, with `/usr/bin/time -l`): bench_json_roundtrip default 66 ms / 85 MB, direct (no lazy) 375 ms / 102 MB; bench_json_readonly default 67 ms / 81 MB, direct 279 ms / 103 MB; 07_object_create 0 ms / 6 MB; 12_binary_trees 0 ms / 6 MB; bench_gc_pressure 16 ms / 25 MB; 04_array_read 4 ms / 211 MB; 05_fibonacci 309 ms / 6 MB; 08_string_concat 0 ms / 6 MB.
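The "best of 5" convention used for these numbers can be sketched as below, assuming it means taking the minimum elapsed time over five runs. The helper names are illustrative, and this in-process timer only covers wall-clock time; the peak-RSS figures come from the external `/usr/bin/time -l` wrapper, which this sketch does not reproduce.

```typescript
// Wall-clock time of a single run, in milliseconds.
function timeMs(work: () => void): number {
  const start = Date.now();
  work();
  return Date.now() - start;
}

// Take `runs` measurements and keep the smallest one. The minimum is the
// run least disturbed by scheduler noise, cache misses, and similar effects.
function bestOf(runs: number, measure: () => number): number {
  if (runs < 1) {
    throw new RangeError("runs must be at least 1");
  }
  let best = Infinity;
  for (let i = 0; i < runs; i++) {
    best = Math.min(best, measure());
  }
  return best;
}

// Example: best of 5 for a small JSON roundtrip (a stand-in workload).
const sampleBlob = JSON.stringify({ values: [1, 2, 3] });
const bestElapsedMs = bestOf(5, () => timeMs(() => {
  JSON.stringify(JSON.parse(sampleBlob));
}));
```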
57+
58+
**`benchmarks/README.md` is now the canonical single source** — every benchmark, every comparison runtime, every flag profile, every honest disclaimer in one GitHub-renderable page. The page is the file every other doc reference points at when discussing performance. Page-internal sections: TL;DR (JSON + compute), how-to-read, JSON polyglot full data + library choices + per-cell disclaimers, compute polyglot full data + regression footnotes, memory tests, RSS-history table, strengths (4 wins with current ratios), weaknesses (8 known gaps), what-this-doesn't-measure honesty section, reproducing instructions, design-doc links. Closing line: "the point of this page is to be defensible, not to win".
59+
60+
`brew install kotlin gradle` added as dependencies for the Kotlin JSON bench. nlohmann-json (3.12.0) for C++ already a dependency from v0.5.241.
61+
562
## v0.5.241 — Polyglot JSON benchmark suite + consolidated `benchmarks/README.md`. The repo previously had two benchmark sources — `benchmarks/polyglot/` (8 compute microbenches across 10 runtimes, last refreshed at v0.5.164) and `benchmarks/suite/` (Perry-only roundtrip / readonly / GC-pressure benches). What was missing was a JSON encoding/decoding comparison against native runtimes (Go, Rust, C++, Swift) and a single consolidated page that a skeptical reader could open, see every benchmark in context, and verify Perry's claims. This commit fixes both.
663

764
**`benchmarks/json_polyglot/`** — new directory, 6 benchmark implementations of the identical 10k-record / ~1 MB blob / 50-iteration parse + stringify workload:

CLAUDE.md

Lines changed: 2 additions & 1 deletion
@@ -8,7 +8,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
88

99
Perry is a native TypeScript compiler written in Rust that compiles TypeScript source code directly to native executables. It uses SWC for TypeScript parsing and LLVM for code generation.
1010

11-
**Current Version:** 0.5.241
11+
**Current Version:** 0.5.242
1212

1313
## TypeScript Parity Status
1414

@@ -149,6 +149,7 @@ First-resolved directory cached in `compile_package_dirs`; subsequent imports re
149149

150150
Keep entries to 1-2 lines max. Full details in CHANGELOG.md.
151151

152+
- **v0.5.242** — Add Kotlin (kotlinx.serialization) to JSON polyglot, re-run every benchmark suite, refresh `benchmarks/README.md` with all current numbers. Kotlin compiles via `kotlinc -Xplugin=...kotlinx-serialization-compiler-plugin.jar` against `kotlinx-serialization-{core,json}-jvm-1.9.0.jar` (shipped with `brew install gradle`); listed twice in JSON results — idiomatic JVM (460 ms / 606 MB) and `-server -Xmx512m` (446 ms / 423 MB). RSS for Kotlin is JVM heap reservation, not working-set; honestly framed in the page. **Compute polyglot re-run** surfaced two regressions vs the v0.5.164 baseline: `nested_loops` 8 → 17 ms, `accumulate` 24 → 33 ms — both caused by the v0.5.237 generational-GC default flip (per-allocation overhead on tight allocation loops). All other compute cells unchanged or slightly faster. Documented unapologetically in both `polyglot/RESULTS.md` and the consolidated README — the trade-off was deliberate (gen-GC's wins on long-running and RSS-sensitive workloads outweigh the small compute regression; `PERRY_GEN_GC=0` recovers baseline). New JSON polyglot table (8 languages × 15 rows including Kotlin) shows Perry leading time across the entire field: 3.6× over Bun, 5.4× over Node, 2.7× over Rust serde_json LTO, 6.7× over Kotlin (server JIT), 11.6× over C++ -O3 -flto, 11.7× over Go, 54.7× over Swift Foundation. Memory-stability suite: 18/18 PASS. **`benchmarks/README.md` is the canonical single page** referenced by every other doc — every benchmark, every comparison, every flag, every honest-disclaimer is on the one page.
152153
- **v0.5.241** — Polyglot JSON benchmark suite + consolidated `benchmarks/README.md`. New `benchmarks/json_polyglot/` runs identical 10k-record / ~1 MB blob / 50-iteration parse + stringify workload across Perry / Bun / Node / Go / Rust serde_json / Swift Foundation / C++ nlohmann. Each language listed twice (idiomatic + optimized flag profile) so skeptics see both the default-build floor and the aggressive-tuning ceiling. **Perry leads on time at 65 ms** vs Rust LTO 180 ms / Bun 242 ms / Node 359 ms / C++ 778 ms / Go 785 ms / Swift 3706 ms. Perry's RSS (85 MB) is mid-pack — beats Node (182 MB), comparable to Bun (80 MB), 8× higher than typed-struct languages (Rust 11 MB, Go 22 MB). Honest framing called out: dynamic-typing fundamental cost, lazy-tape workload-specificity, library-choice (nlohmann vs simdjson) trade-offs. New `benchmarks/README.md` consolidates everything into ONE GitHub-renderable page: TL;DR tables, methodology, full compiler-flags table, JSON polyglot results, compute microbench results (linked from existing `benchmarks/polyglot/`), memory + GC stability suite, strengths, weaknesses, "what this doesn't measure" honesty section, reproducing instructions, design-doc links. Single page for defensibility — every implementation, every flag, every methodology decision in one place.
153154
- **v0.5.240** — Gen-GC docs: academic + industry lineage appendix added to `docs/generational-gc-plan.md`. Maps each phase (A/B/C/C4/C4b/D) to its canonical paper and lists shipping VMs that use the same techniques (V8, JSC, HotSpot, SpiderMonkey, .NET, OCaml, Mono, Go, LuaJIT). Single strongest reference: **Bartlett 1988 *Mostly Copying Garbage Collection*** (DEC SRC TN-13) — Perry's `CONS_PINNED` + `GC_FLAG_FORWARDED` + reference rewriting + conservative-pin policy is essentially Bartlett's algorithm in Rust extended to the generational case (per Ungar 1984). 8-paper bibliography + textbook reference (Jones/Hosking/Moss *Garbage Collection Handbook*). Defensibility material: every design decision traces to a paper or shipping VM doing the same thing. No code changes.
154155
- **v0.5.239** — Gen-GC **roadmap complete (architectural)**: `docs/generational-gc-plan.md` Log table filled in with 21 commits across Phases A→D. The original Phase D scope listed a conservative-scanner shrink ("scan only the C stack below JS frames") as a sub-goal; **deferred** with rationale documented in plan §Deferred-follow-ups. The naive simple-shrink (skip ranges by SP-at-push only) is unsafe — Rust runtime frames sandwiched between JS frames (e.g., `js_array_map` between caller-JS and callback-JS) hold JSValue locals that need conservative coverage; skipping them prematurely frees live objects. A correct implementation requires platform-specific frame-pointer chain walking on entry to `js_shadow_frame_push`, with deep alternating-call test coverage. Conservative-scan time is sub-1% of every measured benchmark, so the optimization deferred is genuinely marginal. Phase D's three primary ship criteria — `PERRY_GEN_GC=1` default, escape hatch retained, docs updated — are all met by v0.5.237/238/239.

Cargo.lock

Lines changed: 27 additions & 27 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -109,7 +109,7 @@ opt-level = "s" # Optimize for size in stdlib
109109
opt-level = 3
110110

111111
[workspace.package]
112-
version = "0.5.241"
112+
version = "0.5.242"
113113
edition = "2021"
114114
license = "MIT"
115115
repository = "https://github.com/PerryTS/perry"
