Commit 7d39ec1

chore: update benchmark results
1 parent 5393483 commit 7d39ec1

1 file changed

Lines changed: 48 additions & 48 deletions

benchmarks/results.md
@@ -1,29 +1,29 @@
 # execbox Performance Benchmark Results
 
-**Date:** 2026-04-10
-**Environment:** Node v25.9.0 | darwin arm64 (Apple Silicon)
+**Date:** 2026-04-11
+**Environment:** Node v24.13.1 | darwin arm64 (Apple Silicon)
 **Configuration:** iterations=15, warmups=3
 **Command:** `npm run benchmark -- --iterations=15 --warmups=3`
 
-This file is a snapshot from one local run. Treat the tables as measured data for this environment, not universal product guarantees.
+This file records the current local benchmark run. Treat the tables as measured data for this environment, not universal product guarantees.
 
 ---
 
 ## 1. Single-Execution Latency
 
 | Executor | No Tools (median) | 1 Tool Call (median) | 2 Tool Calls (median) |
 | -------------------- | ----------------- | -------------------- | --------------------- |
-| QuickJS (in-process) | **1.81ms** | **3.13ms** | **5.09ms** |
-| Worker (ephemeral) | 97.37ms | 98.43ms | 105.44ms |
-| Worker (pooled) | 1.84ms | 3.26ms | 4.80ms |
-| Process (ephemeral) | 222.33ms | 222.68ms | 335.43ms |
-| Process (pooled) | 2.27ms | 3.66ms | 5.38ms |
+| QuickJS (in-process) | 2.23ms | **2.91ms** | **4.13ms** |
+| Worker (ephemeral) | 122.59ms | 122.77ms | 126.03ms |
+| Worker (pooled) | **1.72ms** | 3.02ms | 4.38ms |
+| Process (ephemeral) | 342.67ms | 347.26ms | 354.62ms |
+| Process (pooled) | 2.11ms | 3.47ms | 4.86ms |
 
 ### Notes
 
 - On this machine, warmed pooled executors stayed close to QuickJS for trivial scripts.
 - Ephemeral executors remained far slower than pooled executors because each execution still pays worker or process startup cost.
-- Process pooled stayed competitive for median latency, but its variance was visibly wider than worker pooled in the later throughput and contention suites.
+- Process pooled stayed low-latency in median terms, but its variability was still visibly wider than worker pooled on the no-tool case.
 
 ---

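The latency tables in this hunk report medians for a run configured with `iterations=15, warmups=3`. As a rough illustration of what that summary could mean, the sketch below drops the leading warmup samples and takes the median of the rest. The helper names `median` and `medianAfterWarmups` are hypothetical, and whether the execbox harness discards warmups in exactly this way is an assumption, not something the diff confirms.

```typescript
// Hypothetical helpers for illustration; not taken from the execbox harness.

// Median of a non-empty list of samples (milliseconds).
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of an empty sample set is undefined");
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Assumption: the first `warmups` samples are discarded before summarizing.
function medianAfterWarmups(samplesMs: number[], warmups: number): number {
  return median(samplesMs.slice(warmups));
}
```

Under that assumption, slow startup samples at the front of a run would not pull the reported median upward, which would be consistent with pooled executors reporting low-single-digit medians here.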
@@ -33,54 +33,54 @@ Only pooled executors expose explicit `prewarm()`. QuickJS and ephemeral executo
 
 | Executor | Cold First-Run (median) | Warm First-Run (median) | Speedup |
 | -------------------- | ----------------------- | ----------------------- | ------- |
-| QuickJS (in-process) | 1.72ms | N/A | N/A |
-| Worker (ephemeral) | 133.79ms | N/A | N/A |
-| Worker (pooled) | 116.67ms | 2.47ms | 97.9% |
-| Process (ephemeral) | 307.29ms | N/A | N/A |
-| Process (pooled) | 318.78ms | 4.38ms | 98.6% |
+| QuickJS (in-process) | 1.90ms | N/A | N/A |
+| Worker (ephemeral) | 123.82ms | N/A | N/A |
+| Worker (pooled) | 126.61ms | 2.51ms | 98.0% |
+| Process (ephemeral) | 352.19ms | N/A | N/A |
+| Process (pooled) | 350.72ms | 3.44ms | 99.0% |
 
 ### Notes
 
-- True `prewarm()` materially changed first-request behavior in this run: both pooled worker and pooled process executors dropped from shell-plus-guest startup latency to low-single-digit warm execution latency.
-- That is the intended effect of prewarm after the benchmark harness and executor updates: pay the host-shell and guest-startup path before live traffic arrives.
+- True `prewarm()` delivered the intended first-request behavior in this run: pooled worker and pooled process executors dropped from shell-plus-guest startup latency to low-single-digit warm execution latency.
+- `prewarm()` pays the host-shell and guest-startup path before live traffic arrives.
 
 ---
 
 ## 3. Tool-Call Overhead Scaling
 
 | Tool Calls | QuickJS (median) | Worker Pooled (median) |
 | ---------- | ---------------- | ---------------------- |
-| 0 | 1.65ms | 2.00ms |
-| 1 | 3.00ms | 3.87ms |
-| 5 | 8.35ms | 11.72ms |
-| 10 | 14.36ms | 15.99ms |
+| 0 | 1.58ms | 1.81ms |
+| 1 | 2.83ms | 3.23ms |
+| 5 | 7.83ms | 8.60ms |
+| 10 | 13.67ms | 14.44ms |
 
 **Marginal cost per tool call**
 
 | | QuickJS | Worker (pooled) |
 | ------------- | ----------- | --------------- |
-| From 1 call | 1.34ms/call | 1.87ms/call |
-| From 5 calls | 1.34ms/call | 1.94ms/call |
-| From 10 calls | 1.27ms/call | 1.40ms/call |
+| From 1 call | 1.25ms/call | 1.42ms/call |
+| From 5 calls | 1.25ms/call | 1.36ms/call |
+| From 10 calls | 1.21ms/call | 1.26ms/call |
 
 ### Notes
 
 - Tool-call cost still scaled roughly linearly in this run.
-- Worker pooled carried a slightly higher per-call overhead than QuickJS, but the absolute delta stayed small compared with the startup gap between pooled and ephemeral execution modes.
+- QuickJS and worker pooled stayed close enough that real tool work is still likely to dominate end-to-end latency once tools do anything non-trivial.
 
 ---
 
 ## 4. Schema Validation Overhead
 
-| Executor | With Schema (median) | Without Schema (median) | Overhead |
-| -------------------- | -------------------- | ----------------------- | --------------- |
-| QuickJS (in-process) | 3.06ms | 3.24ms | -0.17ms (-5.4%) |
-| Worker (pooled) | 4.25ms | 4.07ms | 0.17ms (4.3%) |
+| Executor | With Schema (median) | Without Schema (median) | Overhead |
+| -------------------- | -------------------- | ----------------------- | ------------- |
+| QuickJS (in-process) | 2.79ms | 2.72ms | 0.08ms (2.8%) |
+| Worker (pooled) | 3.19ms | 2.95ms | 0.24ms (8.1%) |
 
 ### Notes
 
-- The small negative QuickJS delta is measurement noise, not evidence that schema validation makes execution faster.
-- On the worker pooled path, schema validation remained a small absolute cost compared with overall execution time.
+- Schema validation remained a small absolute cost in this run.
+- The QuickJS delta was near zero, which reinforces the earlier guidance that schema validation is not the place to chase meaningful latency gains.
 
 ---

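The derived columns in the prewarm and tool-call tables of this hunk appear to follow simple arithmetic on the medians. The sketch below assumes that "Speedup" is the percentage reduction from cold to warm first-run latency, and that the marginal cost is the median at N calls minus the median at 0 calls, divided by N. The function names are hypothetical and the formulas are inferred from the published numbers, not taken from the benchmark source.

```typescript
// Hypothetical helpers; the formulas are inferred from the tables, not from the harness.

// Assumed meaning of the "Speedup" column: percent latency reduction from cold to warm.
function latencyReductionPct(coldMs: number, warmMs: number): number {
  return (1 - warmMs / coldMs) * 100;
}

// Assumed meaning of "marginal cost": extra median latency per tool call
// relative to the zero-call baseline.
function marginalCostMs(medianAtN: number, medianAtZero: number, calls: number): number {
  return (medianAtN - medianAtZero) / calls;
}

// Medians copied from the "+" rows of the tables in this hunk.
const workerPooledReduction = latencyReductionPct(126.61, 2.51).toFixed(1);
const processPooledReduction = latencyReductionPct(350.72, 3.44).toFixed(1);
const workerCostAt5 = marginalCostMs(8.60, 1.81, 5).toFixed(2);
const quickjsCostAt10 = marginalCostMs(13.67, 1.58, 10).toFixed(2);

console.log({ workerPooledReduction, processPooledReduction, workerCostAt5, quickjsCostAt10 });
```

With these inputs the sketch reproduces the published 98.0% and 99.0% figures and the 1.36ms/call and 1.21ms/call entries to the precision shown. That supports the inferred formulas but does not prove the harness computes them this way.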
@@ -90,34 +90,34 @@ The pooled benchmark factories in this suite use a fixed `pool.maxSize: 2`.
 
 | Executor | Conc=1 (exec/s) | Conc=2 | Conc=4 | Conc=8 |
 | -------------------- | --------------- | ------ | ------ | ---------- |
-| QuickJS (in-process) | 301.3 | 549.8 | 914.9 | **1365.2** |
-| Worker (pooled) | 257.9 | 506.1 | 490.0 | 222.4 |
-| Process (pooled) | 175.0 | 105.5 | 273.4 | 290.9 |
+| QuickJS (in-process) | 361.6 | 683.8 | 1178.7 | **1759.0** |
+| Worker (pooled) | 323.2 | 637.6 | 654.7 | 653.3 |
+| Process (pooled) | 322.6 | 573.8 | 564.0 | 542.3 |
 
 ### Notes
 
 - QuickJS was the highest-throughput path in this run for trusted, in-process workloads.
 - Worker pooled tracked closely through concurrency 2, then paid visible queueing once demand moved past the benchmark pool size.
-- Process pooled results were materially noisier in this suite, so use the exact ordering here as a local data point rather than a stable ranking.
+- Process pooled stayed competitive, but it still trailed worker pooled at every tested concurrency level above 2.
 
 ---
 
 ## 6. Pool Contention
 
 | Executor | Pool Size | Throughput (exec/s) | Median Latency | P95 Latency | Max Latency |
 | -------- | --------- | ------------------- | -------------- | ----------- | ----------- |
-| Worker | 1 | 196.6 | 39.51ms | 43.84ms | 44.00ms |
-| Worker | 2 | 271.7 | 25.14ms | 41.91ms | 46.19ms |
-| Worker | 4 | **400.7** | 15.55ms | 37.65ms | 37.84ms |
-| Process | 1 | 123.5 | 63.71ms | 81.57ms | 82.85ms |
-| Process | 2 | **286.5** | 26.13ms | 34.02ms | 34.14ms |
-| Process | 4 | 236.8 | 29.69ms | 61.02ms | 61.20ms |
+| Worker | 1 | 326.2 | 24.40ms | 25.73ms | 26.11ms |
+| Worker | 2 | 633.1 | 12.17ms | 13.81ms | 15.50ms |
+| Worker | 4 | **1117.0** | 6.51ms | 8.86ms | 8.92ms |
+| Process | 1 | 309.5 | 25.58ms | 27.46ms | 27.83ms |
+| Process | 2 | 546.4 | 13.44ms | 16.39ms | 17.81ms |
+| Process | 4 | 793.5 | 8.14ms | 12.03ms | 12.45ms |
 
 ### Notes
 
-- Worker pooled improved steadily as pool size increased in this run.
-- Process pooled improved sharply from pool size 1 to 2, but did not hold that benefit at pool size 4 on this machine.
-- Pool size is still the main throughput control for out-of-process executors, but process sizing looks more workload-sensitive than worker sizing.
+- Increasing pool size improved both worker and process executors in this run.
+- Worker pooled still kept the better mix of throughput and tail latency at every pool size tested.
+- Process pooled also scaled up well here, but its tails remained wider than worker pooled at the same pool size.
 
 ---

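One way to read the updated Pool Contention rows is as a fraction of ideal linear scaling: throughput at pool size N divided by N times the throughput at pool size 1. This metric is not part of the benchmark report; the sketch below is an illustrative calculation over the "+" rows, and the name `scalingEfficiency` is hypothetical.

```typescript
// Hypothetical helper; "scaling efficiency" is an illustrative metric, not one the report defines.
// 1.0 would mean throughput grew exactly in proportion to pool size.
function scalingEfficiency(baseThroughput: number, throughput: number, poolSize: number): number {
  return throughput / (baseThroughput * poolSize);
}

// Throughput values (exec/s) copied from the "+" rows of the Pool Contention table.
const contention = {
  worker: { 1: 326.2, 2: 633.1, 4: 1117.0 },
  process: { 1: 309.5, 2: 546.4, 4: 793.5 },
};

const efficiencies = {
  worker2: scalingEfficiency(contention.worker[1], contention.worker[2], 2).toFixed(2),
  worker4: scalingEfficiency(contention.worker[1], contention.worker[4], 4).toFixed(2),
  process2: scalingEfficiency(contention.process[1], contention.process[2], 2).toFixed(2),
  process4: scalingEfficiency(contention.process[1], contention.process[4], 4).toFixed(2),
};

console.log(efficiencies);
```

On these numbers worker pooled keeps roughly 0.97 and 0.86 of linear scaling at pool sizes 2 and 4, against roughly 0.88 and 0.64 for process pooled. That is in line with the note that both improved with pool size while worker pooled held the better mix.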
@@ -127,9 +127,9 @@ This suite only measures the parent Node process. It does not attempt to attribu
 
 | Executor | Heap Delta | RSS Delta | External Delta |
 | -------------------- | ---------- | --------- | -------------- |
-| QuickJS (in-process) | -0.49MB | +3.61MB | -0.00MB |
-| Worker (ephemeral) | +0.02MB | +21.34MB | 0.00MB |
-| Worker (pooled) | +0.01MB | -1.33MB | 0.00MB |
+| QuickJS (in-process) | -0.49MB | +2.48MB | -0.02MB |
+| Worker (ephemeral) | +0.01MB | -8.42MB | 0.00MB |
+| Worker (pooled) | +0.01MB | +0.52MB | 0.00MB |
 
 ### Notes

@@ -143,9 +143,9 @@ This suite only measures the parent Node process. It does not attempt to attribu
 ### High-value takeaways from this snapshot
 
 - QuickJS remained the lowest-latency and highest-throughput option for trusted, in-process workloads on this machine.
-- True `prewarm()` now delivered the intended first-request benefit for pooled worker and pooled process executors in this run.
+- True `prewarm()` delivered the intended first-request benefit for pooled worker and pooled process executors.
 - Worker pooled remained the strongest general-purpose local trade-off between isolation, throughput, and tail latency.
-- Process pooled stayed viable when process isolation matters, but its concurrency and contention behavior was more workload-sensitive in this snapshot.
+- Process pooled stayed viable when process isolation matters, but it still trailed worker pooled on throughput and tail latency.
 - Ephemeral modes remained dramatically slower than pooled modes and are best reserved for cases that need a fresh host boundary per execution.
 
 ### What this snapshot does not prove
