fair-bench: adapt to upstream API changes after master merge

Master changed several APIs and removed files that fair-bench depended on:
- Connectors are now factories taking a ConnectorRuntimeConfig (clockworklabs#4647)
- runOne now requires runtimeConfig: RunnerRuntimeConfig
- The SpacetimeDB connector exposes .call('seed', ...) (no more .reducer)
- templates/keynote-2/src/helpers.ts was deleted; poolMaxFromEnv lives
  on getSharedRuntimeDefaults().poolMax now
- USE_SPACETIME_METRICS_ENDPOINT is no longer read anywhere
- The Rust client (clockworklabs#4753), warmup (clockworklabs#4757), and confirmedReads default
  (clockworklabs#4682) asymmetries documented in FAIR-BENCHMARK.md were already
  fixed upstream

Updates:
- fair-bench.ts: use parseBenchOptions, pass runtimeConfig to runOne,
  factory(config) instead of factory(), unified .call('seed', ...)
  seeding path for both Spacetime and RPC connectors
- postgres-storedproc-rpc-server.ts: import getSharedRuntimeDefaults
  from ../config.ts instead of the deleted helpers.ts
- FAIR-BENCHMARK.md: rewritten to credit master for the asymmetries it
  has resolved and focus on what this PR still adds (sequential
  pipelining, read_committed, synchronous_commit=on, stored procedure
  variant)

-This is an alternative benchmark configuration that levels the playing field between SpacetimeDB and traditional database stacks. The original benchmark (`demo.ts`) has several asymmetries that compound to inflate SpacetimeDB's advantage far beyond what the architecture alone provides.
-
-## What This Changes
-
-| Factor | Original Benchmark | Fair Benchmark |
-|--------|-------------------|----------------|
-| **SpacetimeDB client** | Custom Rust client with 16,384 in-flight ops | Same TypeScript client as everyone else |
-| **TPS counting** | Server-side Prometheus metrics (fire-and-forget) | Client-side round-trip counting for ALL systems |
-| **Durability** | `confirmedReads=false` (no durability guarantee) | `confirmedReads=true` (durable commits, like Postgres fsync) |
-| **Concurrency model** | 16,384 in-flight for SpacetimeDB vs 8 for competitors | Sequential (non-pipelined) for all systems |
-| **Postgres isolation** | `serializable` (non-default, worst-case for contention) | `read_committed` (Postgres actual default) |
-| **Postgres transfer** | 5 ORM round-trips via Drizzle | Also tested with stored procedure (single DB call) |
-| **Warmup** | 5s warmup for Rust client only | No warmup for any system (equal cold start) |
+This is an alternative benchmark configuration that levels the playing field
+between SpacetimeDB and traditional database stacks. It runs alongside the
+standard `demo`/`bench` commands without modifying their behaviour.
+
+## What's Already Fair in `master`
+
+When this fork was first opened (Feb 2026), the standard benchmark had
+several asymmetries between SpacetimeDB and competitors. Most have since been
+addressed upstream:
+
+| Asymmetry | Status on master |
+|-----------|------------------|
+| SpacetimeDB used a hand-tuned Rust client; competitors used TypeScript | **Fixed (#4753):** Rust client removed; everyone uses the TypeScript client. |
+| `STDB_CONFIRMED_READS` defaulted to `false` | **Fixed (#4682):** confirmed reads is now the default. |
+| 5s warmup applied only to the SpacetimeDB Rust client | **Fixed (#4757):** warmup removed everywhere. |
+| TypeScript client was missing some optimizations the Rust client had | **Fixed (#4494):** TS client brought to parity. |
+| Compression unspecified | **Fixed (#4743):** compression mode is now an explicit knob. |
+
+What this PR additionally enforces:
+
+| Factor | Standard `bench` | Fair Benchmark |
+|--------|------------------|----------------|
+| **Pipelining** | SpacetimeDB pipelines up to `maxInflightPerWorker` (128) per connection; HTTP RPC connectors do not opt in to pipelining | **Sequential** for all systems (`BENCH_PIPELINED=0`) so per-connection concurrency is identical |
+| **Postgres isolation** | `serializable` (forced via the standard `docker-compose.yml`) | `read_committed` (Postgres' actual default); `SELECT … FOR UPDATE` already provides row locking |
+These remaining differences matter because the **architectural** advantage
+of SpacetimeDB (colocated compute + storage, no network hop for data access)
+is what we want to isolate. Sequential mode and a single-call stored
+procedure remove the most obvious confounds between "platform architecture"
+and "client/protocol/ORM choices."
 
 ## Why These Changes Matter
 
-### 1. Same Client Language (TypeScript for All)
-
-The original benchmark uses a hand-tuned **Rust client** for SpacetimeDB that sends 16,384 concurrent operations per connection via binary WebSocket, while all competitors use a TypeScript client with HTTP/JSON and 8 in-flight operations. This alone is a ~2000x difference in concurrency per connection.
-
-The README justifies this by saying "we were bottlenecked on our test TypeScript client", but then no competitor gets the same optimization. A fair comparison uses the same client for all.
+which the runner picks up to pipeline 128 in-flight reducer calls per
+connection. RPC connectors (Postgres, CockroachDB, SQLite) leave this unset,
+so their per-connection concurrency is effectively 1. Forcing
+`BENCH_PIPELINED=0` makes both sides sequential per connection, isolating
+the per-call latency comparison.
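
The per-connection concurrency difference described above can be made concrete with a minimal sketch of a dispatch loop that has a configurable in-flight cap. `runConnection`, `fakeCall`, and the default 5 ms latency are hypothetical names and values chosen for illustration; they are not part of the benchmark runner. Only the two cap values (1 for sequential, 128 for pipelined) come from the text.

```typescript
// Hypothetical stand-in for one network call to a system under test.
const fakeCall = (latencyMs: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, latencyMs));

// Issue `totalOps` calls on one simulated connection while keeping at most
// `maxInflight` of them outstanding. Reports how many calls completed and
// the highest number of calls that were in flight at the same time.
async function runConnection(
  totalOps: number,
  maxInflight: number,
  latencyMs = 5,
): Promise<{ completed: number; peakInflight: number }> {
  let started = 0;
  let completed = 0;
  let inflight = 0;
  let peakInflight = 0;

  // Each lane issues calls back to back; `maxInflight` lanes run concurrently.
  const lane = async (): Promise<void> => {
    while (started < totalOps) {
      started++;
      inflight++;
      peakInflight = Math.max(peakInflight, inflight);
      await fakeCall(latencyMs);
      inflight--;
      completed++;
    }
  };

  const lanes = Array.from({ length: Math.min(maxInflight, totalOps) }, lane);
  await Promise.all(lanes);
  return { completed, peakInflight };
}
```

With `maxInflight = 1` every call waits for the previous one, so a connection's throughput is bounded by one call per round-trip latency. With `maxInflight = 128` the same connection can have up to 128 calls outstanding, which is the asymmetry that sequential mode removes.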
 
-The original benchmark defaults `STDB_CONFIRMED_READS` to `false`, meaning SpacetimeDB doesn't wait for durable commits before reporting success. Meanwhile Postgres runs with `fsync=on`. This is comparing "maybe durable" vs "definitely durable", not a fair durability comparison.
+### 2. Postgres Isolation Level
 
-### 3. Client-Side TPS Counting
+The standard `docker-compose.yml` sets
+`default_transaction_isolation=serializable` for Postgres. That is **not**
+Postgres' default (`read_committed`). Under the Zipf contention workload,
+serializable causes large transaction-abort/retry storms that
+disproportionately hurt Postgres. The benchmark already uses
+`SELECT … FOR UPDATE` for row-level locking, so serializable is unnecessary
+to get correct results.
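
To see why the Zipf exponent drives contention, here is a small sketch that computes how much traffic lands on the hottest accounts. It assumes the workload picks the account of rank k with probability proportional to 1/k^alpha; the benchmark's actual sampler may differ. `zipfProbabilities`, `hotShare`, and the 1,000-account table size are hypothetical.

```typescript
// Probability of each rank 1..n under a Zipf distribution with exponent alpha:
// P(k) = (1 / k^alpha) / sum over j = 1..n of (1 / j^alpha).
function zipfProbabilities(n: number, alpha: number): number[] {
  const weights = Array.from({ length: n }, (_, i) => 1 / Math.pow(i + 1, alpha));
  const total = weights.reduce((sum, w) => sum + w, 0);
  return weights.map((w) => w / total);
}

// Share of all accesses that land on the `top` most popular accounts.
function hotShare(n: number, alpha: number, top: number): number {
  return zipfProbabilities(n, alpha)
    .slice(0, top)
    .reduce((sum, p) => sum + p, 0);
}

// Compare a mild and a high-contention exponent (assumed table of 1000 accounts).
for (const alpha of [0.5, 1.5]) {
  const share = hotShare(1000, alpha, 10);
  console.log(`alpha=${alpha}: top 10 of 1000 accounts get ${(share * 100).toFixed(1)}% of accesses`);
}
```

A higher alpha concentrates more transfers on the same few rows, so more transactions conflict with each other. Under that assumption, that is the regime in which serializable's abort-and-retry behaviour costs the most.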
 
-The original `demo.ts` sets `USE_SPACETIME_METRICS_ENDPOINT=1`, which counts committed transactions **on the server** via Prometheus. Combined with the fire-and-forget Rust client, this counts transactions that completed server-side but whose acknowledgments may not have reached the client. All other systems count only after the full round-trip completes.
+A SpacetimeDB reducer is a **single atomic call** that runs inside the
+database. The standard Postgres comparison uses Drizzle ORM, which sends
+roughly:
 
-The original forces `default_transaction_isolation=serializable` on Postgres, which is **not** the default (`read_committed`). Under the Zipf contention workload, serializable causes massive transaction aborts and retries, dramatically hurting Postgres performance. The benchmark already uses `SELECT ... FOR UPDATE` for row-level locking, making serializable unnecessary.
-
-### 5. Stored Procedure vs ORM
-
-SpacetimeDB's reducer executes as a single atomic operation inside the database. The original Postgres benchmark uses Drizzle ORM which requires:
 - `BEGIN`
-- `SELECT ... FOR UPDATE` (fetch both accounts)
+- `SELECT … FOR UPDATE` (fetch both accounts)
 - `UPDATE` (debit)
 - `UPDATE` (credit)
 - `COMMIT`
 
-That's 5 round-trips between the Node.js process and Postgres. A stored procedure (`do_transfer()`) does the same work in a single call, which is the fair equivalent of SpacetimeDB's reducer model.
+i.e. ~5 round-trips between Node and Postgres per transfer. A PL/pgSQL
+stored procedure (`do_transfer`) does the same work in a single round-trip,
+architecturally the same shape as a reducer. Comparing
+`postgres_storedproc_rpc` against `spacetimedb` cleanly isolates the
+"platform architecture" gap from the "ORM round-trip overhead" gap.
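
The round-trip arithmetic above can be illustrated with a toy client that only counts calls. This is a sketch, not the benchmark's code: `CountingClient`, the `accounts` table, the SQL strings, and the argument list passed to `do_transfer` are all assumptions, and nothing here talks to a real database or uses Drizzle.

```typescript
// Toy client: each `query` call stands for one network round-trip.
class CountingClient {
  roundTrips = 0;
  async query(sql: string, _params: unknown[] = []): Promise<void> {
    this.roundTrips++;
    void sql; // a real client would send this statement to the server
  }
}

// Statement-by-statement transfer: every statement is its own round-trip.
async function transferStatementByStatement(
  db: CountingClient, from: number, to: number, amount: number,
): Promise<void> {
  await db.query("BEGIN");
  await db.query("SELECT id, balance FROM accounts WHERE id IN ($1, $2) FOR UPDATE", [from, to]);
  await db.query("UPDATE accounts SET balance = balance - $1 WHERE id = $2", [amount, from]);
  await db.query("UPDATE accounts SET balance = balance + $1 WHERE id = $2", [amount, to]);
  await db.query("COMMIT");
}

// Stored-procedure transfer: the same logic runs server-side in one call.
async function transferViaStoredProc(
  db: CountingClient, from: number, to: number, amount: number,
): Promise<void> {
  await db.query("SELECT do_transfer($1, $2, $3)", [from, to, amount]);
}

// Network time spent per transfer for a given round-trip time.
const networkMs = (roundTrips: number, rttMs: number): number => roundTrips * rttMs;
```

At an assumed 0.2 ms round-trip time, five round-trips cost about 1.0 ms of network time per transfer and one round-trip costs 0.2 ms, before any database work is counted. That fixed overhead is what the stored procedure variant takes out of the comparison.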
+
+### 4. `synchronous_commit=on`
+
+SpacetimeDB with confirmed reads waits for durable acknowledgement before
+returning. Postgres should match: `synchronous_commit=on` (the default;
+the standard compose file overrides it to `off` for raw throughput).
 
 ## Running the Fair Benchmark
 
@@ -54,33 +87,32 @@ That's 5 round-trips between the Node.js process and Postgres. A stored procedur
 # Install dependencies
 pnpm install
 
-# Start services
+# Start services with fair config (Postgres tuned to defaults; stored
+# procedure RPC server included)
 docker compose -f docker-compose-fair.yml up -d
 
-# Start SpacetimeDB
+# If running SpacetimeDB locally instead of in Docker
 spacetime start
-
-# Publish the SpacetimeDB module
 spacetime publish --server local test-1 --module-path ./spacetimedb
 ```
 
 ### Run
 
 ```bash
 # Default: SpacetimeDB vs Postgres (ORM) vs Postgres (stored proc)
-npm run fair-bench
+pnpm run fair-bench
 
-# With options
-npm run fair-bench -- --seconds 10 --concurrency 50 --alpha 0.5
+# With options (these mirror the standard bench CLI; --skip-prep is fair-bench-only)
+pnpm run fair-bench -- --seconds 10 --concurrency 50 --alpha 0.5
 
 # High contention
-npm run fair-bench -- --alpha 1.5
+pnpm run fair-bench -- --alpha 1.5
 
 # Include more systems
-npm run fair-bench -- --systems spacetimedb,postgres_rpc,postgres_storedproc_rpc,sqlite_rpc
+pnpm run fair-bench -- --systems spacetimedb,postgres_rpc,postgres_storedproc_rpc,sqlite_rpc
 
 # Skip seeding (if already seeded)
-npm run fair-bench -- --skip-prep
+pnpm run fair-bench -- --skip-prep
 ```
 
 ### Start the Stored Procedure RPC Server (non-Docker)
@@ -90,22 +122,34 @@ npm run fair-bench -- --skip-prep
+Now that `master` has removed the SpacetimeDB Rust client, the original
+"Rust binary protocol vs Node.js HTTP+JSON" gap is no longer present in the
+standard benchmark: both sides use TypeScript. The
+`postgres-rust-client/` directory in this PR is therefore a **standalone
+reference**: it lets you measure how much of any remaining gap is due to
+Node.js client overhead vs. the database itself, by driving Postgres with a
+Rust binary client.
 
-With a leveled playing field, SpacetimeDB's genuine architectural advantage (colocated compute+storage, no network hop for data access) should still show a meaningful speedup, likely **2-5x** rather than the claimed **14x**. The remaining advantage is real and architectural:
+```bash
+cd postgres-rust-client
+cargo run --release -- --seconds 10 --concurrency 50 --alpha 0.5
+```
+
+## What the Numbers Should Show
+
+With the playing field leveled, SpacetimeDB's genuine architectural
+advantage (colocated compute+storage, no network hop for data access)
+should still show a meaningful speedup. The remaining gap reflects:
 
 - Zero-copy in-process data access vs TCP round-trips
-- Rust execution vs Node.js JavaScript
+- Rust execution vs Node.js JavaScript on the server side
 - Binary BSATN protocol vs JSON serialization
 
-The factors that are **not** architectural and were removed: