Commit 919e31e

Use many connections on the benchmarks server (#7852)
## Summary

Test website: http://ec2-18-219-54-101.us-east-2.compute.amazonaws.com:3000/

Turns out we can just clone the connection and have each of them write (with retry). Now all threads can read and write concurrently instead of being serialized through a lock.

You can look at this [file](https://github.com/vortex-data/vortex/blob/8afd24b705cae0bd6ebe8fd09b1ab7cc93e5961f/benchmarks-website/server/ARCHITECTURE.md) to see the architecture of the new server.

## Testing

What's that?!

---------

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
1 parent cab6036 commit 919e31e

29 files changed

Lines changed: 2918 additions & 776 deletions

Cargo.lock

Lines changed: 5 additions & 0 deletions
Some generated files are not rendered by default.

benchmarks-website/README.md

Lines changed: 7 additions & 6 deletions
@@ -36,12 +36,13 @@ local DB file. Five fact tables (`query_measurements`, `compression_times`,
 `compression_sizes`, `random_access_times`, `vector_search_runs`) plus a
 `commits` dim table — see [`server/src/schema.rs`](server/src/schema.rs) for
 the column contracts. Three HTML routes (`/`, `/chart/{slug}`,
-`/group/{slug}`) and four JSON routes (`GET /api/groups`,
-`GET /api/chart/{slug}`, `GET /api/group/{slug}`, `GET /health`), plus a
-bearer-gated `POST /api/ingest`. Charts render inline on the landing page via
-SSR + lazy hydration; visual downsampling (LTTB at most
-`MAX_VISIBLE_POINTS = 500`) is client-side in
-[`server/static/chart-init.js`](server/static/chart-init.js).
+`/group/{slug}`) and four stable JSON routes (`GET /api/groups`,
+`GET /api/chart/{slug}`, `GET /api/group/{slug}`, `GET /health`), plus
+versioned group shard artifacts and bearer-gated `POST /api/ingest`. The hot
+website path serves precomputed, precompressed latest-100 artifacts from an
+in-memory read model; pages render chart shells and hydrate groups via shard
+artifacts, while full history warms in the background. See
+[`server/ARCHITECTURE.md`](server/ARCHITECTURE.md).

 For the per-module crate map and the request-flow walkthrough, see the
 `//!` doc on [`server/src/lib.rs`](server/src/lib.rs). The producer side of
benchmarks-website/server/ARCHITECTURE.md

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
<!--
SPDX-License-Identifier: Apache-2.0
SPDX-FileCopyrightText: Copyright the Vortex contributors
-->

# Benchmark Server Architecture

The benchmark website is optimized around a materialized latest-100 read path.
DuckDB is the source of truth, but normal landing-page and group-open
hydration does not run SQL, serialize JSON, or compress responses per request.

## Hot Read Path

On startup the server builds a `ReadGeneration` from one DuckDB snapshot. That
generation contains precomputed JSON artifacts for:

- `/api/groups`
- default `/api/chart/{slug}` latest-100 payloads
- default `/api/group/{slug}` latest-100 compatibility payloads
- versioned group shard payloads under
`/api/artifacts/{generation}/groups/{group_slug}/shards/{index}`

Each artifact is stored in memory as identity, gzip, and brotli bytes. Request
handlers negotiate `Accept-Encoding` and serve those bytes directly with
`ETag`, `Vary: Accept-Encoding`, `Content-Length`, and cache headers.
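
The negotiation step described above can be sketched with the standard library alone. This is a minimal illustration, not the server's actual code: the names `Encoding`, `EncodedArtifact`, `negotiate`, and `select` are hypothetical, the brotli-then-gzip preference order is an assumption, and the `*` wildcard and weighted `q` ordering are ignored for brevity.

```rust
/// Encodings this sketch keeps for every artifact.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Encoding {
    Identity,
    Gzip,
    Brotli,
}

/// One artifact held as three ready-to-send byte buffers (hypothetical type).
/// The compressed buffers are built once, ahead of time, not per request.
pub struct EncodedArtifact {
    pub identity: Vec<u8>,
    pub gzip: Vec<u8>,
    pub brotli: Vec<u8>,
}

/// Pick an encoding from an `Accept-Encoding` header value.
/// Assumed preference: brotli, then gzip, then identity.
/// A coding listed with `q=0` is treated as refused.
pub fn negotiate(accept_encoding: &str) -> Encoding {
    let mut gzip = false;
    let mut brotli = false;
    for item in accept_encoding.split(',') {
        let mut parts = item.split(';');
        let coding = parts.next().unwrap_or("").trim().to_ascii_lowercase();
        // Any remaining part of the form `q=0` marks the coding as refused.
        let refused = parts.any(|p| {
            p.trim()
                .strip_prefix("q=")
                .and_then(|q| q.trim().parse::<f32>().ok())
                .map_or(false, |q| q == 0.0)
        });
        if refused {
            continue;
        }
        match coding.as_str() {
            "br" => brotli = true,
            "gzip" => gzip = true,
            _ => {}
        }
    }
    if brotli {
        Encoding::Brotli
    } else if gzip {
        Encoding::Gzip
    } else {
        Encoding::Identity
    }
}

impl EncodedArtifact {
    /// Return the stored bytes and the `Content-Encoding` value to send, if any.
    pub fn select(&self, accept_encoding: &str) -> (&[u8], Option<&'static str>) {
        match negotiate(accept_encoding) {
            Encoding::Brotli => (self.brotli.as_slice(), Some("br")),
            Encoding::Gzip => (self.gzip.as_slice(), Some("gzip")),
            Encoding::Identity => (self.identity.as_slice(), None),
        }
    }
}
```

Because every variant is already in memory, a handler built this way only chooses a slice and sets headers; no compression work happens on the request path.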

## Page Hydration

The landing page and `/group/{slug}` render group metadata plus chart shells,
not inline chart payloads. Each group carries the active read generation, shard
count, and shard URL prefix. `chart-init.js` fetches shard 0 on intent or group
open so charts paint quickly, then queues the remaining latest-100 shards with
bounded per-tab concurrency.
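
The fetch order above can be illustrated with a small planning function. This is a sketch under stated assumptions: the real logic lives in `chart-init.js`, whereas this example is written in Rust for consistency with the server crate; `shard_url` and `fetch_plan` are hypothetical names; the generation is treated as an opaque string; and fixed-size batches stand in for a true bounded-concurrency queue, which would start a new request as soon as one finishes.

```rust
/// Build the URL for one shard, following the path template
/// `/api/artifacts/{generation}/groups/{group_slug}/shards/{index}`.
/// Treating `generation` as an opaque string is an assumption.
pub fn shard_url(generation: &str, group_slug: &str, index: usize) -> String {
    format!("/api/artifacts/{generation}/groups/{group_slug}/shards/{index}")
}

/// Plan the fetch order for one group: shard 0 on its own so charts can
/// paint early, then the remaining shards in batches of at most
/// `max_in_flight` requests.
pub fn fetch_plan(
    generation: &str,
    group_slug: &str,
    shard_count: usize,
    max_in_flight: usize,
) -> Vec<Vec<String>> {
    let mut plan = Vec::new();
    if shard_count == 0 {
        return plan;
    }
    // First step: shard 0 alone.
    plan.push(vec![shard_url(generation, group_slug, 0)]);
    // Remaining shards, grouped so no batch exceeds the concurrency bound.
    let rest: Vec<String> = (1..shard_count)
        .map(|i| shard_url(generation, group_slug, i))
        .collect();
    for batch in rest.chunks(max_in_flight.max(1)) {
        plan.push(batch.to_vec());
    }
    plan
}
```

For a group with four shards and a bound of two, the plan is shard 0, then shards 1 and 2 together, then shard 3.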

Latest-100 chart payloads include additive `history` metadata:

- `total_commits`: full x-axis length for the chart
- `start_index`: where this payload starts in the full x-axis
- `loaded_commits`: number of loaded commits
- `complete`: whether the payload covers the full x-axis

The client normalizes incomplete latest-100 payloads onto the full virtual
x-axis. Older unloaded commits are represented by blank labels and null series
values, so the range strip, zoom limits, and slider bounds behave as if the
whole history is present without fabricating data.
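
The normalization onto the virtual x-axis can be sketched as a padding step. The field names come from the `history` metadata listed above; the `History` struct, the `normalize` function, and the single-series shape of the payload are assumptions made for illustration (the actual client code is JavaScript in `chart-init.js`).

```rust
/// The `history` metadata fields named in the text.
pub struct History {
    pub total_commits: usize,
    pub start_index: usize,
    pub loaded_commits: usize,
    pub complete: bool,
}

/// Expand a partial payload onto the full x-axis (hypothetical helper).
/// Slots outside the loaded window keep a blank label and a `None` value,
/// so no data is invented for commits that have not been loaded.
pub fn normalize(
    history: &History,
    labels: &[String],
    values: &[Option<f64>],
) -> (Vec<String>, Vec<Option<f64>>) {
    let mut full_labels = vec![String::new(); history.total_commits];
    let mut full_values: Vec<Option<f64>> = vec![None; history.total_commits];
    // Clamp to the shortest input so malformed metadata cannot index out of bounds.
    let loaded = history.loaded_commits.min(labels.len()).min(values.len());
    for i in 0..loaded {
        let slot = history.start_index + i;
        if slot >= history.total_commits {
            break;
        }
        full_labels[slot] = labels[i].clone();
        full_values[slot] = values[i];
    }
    (full_labels, full_values)
}
```

With `total_commits = 5`, `start_index = 3`, and two loaded commits, the result has three blank leading slots followed by the two real points, which is what lets zoom limits and slider bounds span the whole history.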

## Full-History Warmup

Opening a group queues `/api/chart/{slug}?n=all` for that group's charts in a
separate low-concurrency priority queue. A later-opened group gets higher
priority than queued work for older groups. If the user pans or zooms into an
unloaded virtual range before warmup finishes, that chart's queued full-history
request is promoted. In-flight requests are not cancelled.

When the full payload arrives, the client replaces the virtual latest-100
payload in place and preserves the current x-range when possible.
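
The queueing rules above (newer groups first, promotion on pan or zoom, no cancellation of in-flight work) can be modeled with a stamp-based queue. This is an illustrative sketch only: `WarmupQueue` and its methods are hypothetical, the real queue is client-side JavaScript, and in-flight requests are modeled simply as entries that have already been popped.

```rust
/// Pending full-history requests ordered by a monotonically increasing stamp.
/// A higher stamp means higher priority (hypothetical model).
#[derive(Default)]
pub struct WarmupQueue {
    next_stamp: u64,
    pending: Vec<(u64, String)>, // (priority stamp, chart slug)
}

impl WarmupQueue {
    /// Queue every chart of a newly opened group. A later call gets a larger
    /// stamp, so its charts outrank work queued for older groups.
    pub fn open_group(&mut self, chart_slugs: &[&str]) {
        self.next_stamp += 1;
        let stamp = self.next_stamp;
        for slug in chart_slugs {
            let already_queued = self.pending.iter().any(|entry| entry.1 == *slug);
            if !already_queued {
                self.pending.push((stamp, slug.to_string()));
            }
        }
    }

    /// Promote one chart's queued request to the front. Returns `false` when
    /// the chart is not pending (for example, already popped and in flight),
    /// in which case nothing is cancelled or restarted.
    pub fn promote(&mut self, chart_slug: &str) -> bool {
        self.next_stamp += 1;
        let stamp = self.next_stamp;
        match self.pending.iter_mut().find(|entry| entry.1 == chart_slug) {
            Some(entry) => {
                entry.0 = stamp;
                true
            }
            None => false,
        }
    }

    /// Take the next chart to fetch: highest stamp first, and among equal
    /// stamps the chart that was queued earliest.
    pub fn pop(&mut self) -> Option<String> {
        let mut best: Option<usize> = None;
        for (i, entry) in self.pending.iter().enumerate() {
            if best.map_or(true, |b| entry.0 > self.pending[b].0) {
                best = Some(i);
            }
        }
        best.map(|i| self.pending.remove(i).1)
    }
}
```

A linear scan keeps the sketch short; with only a handful of charts per group, a heap is unlikely to matter.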

## Fallback Paths

`?n=all` and non-default `?n=` windows still use the DB-backed fallback path.
Those reads go through `QueryCache` single-flight entries and the DB read
semaphore so cold or unusual requests do not stampede DuckDB. Ingest writes do
not consume read permits.
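
The single-flight idea can be shown with the standard library's `OnceLock`: concurrent callers for the same key share one computation instead of each running the query. This is a minimal sketch, not the `QueryCache` implementation; `SingleFlight`, `get_or_compute`, and `invalidate_all` are hypothetical names, and the read semaphore is left out.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, Mutex, OnceLock};

/// A keyed cache in which each value is computed at most once, even when
/// many threads ask for the same key at the same time (hypothetical type).
pub struct SingleFlight<K, V> {
    entries: Mutex<HashMap<K, Arc<OnceLock<V>>>>,
}

impl<K: Eq + Hash + Clone, V: Clone> SingleFlight<K, V> {
    pub fn new() -> Self {
        Self { entries: Mutex::new(HashMap::new()) }
    }

    /// Return the cached value for `key`, computing it if needed.
    pub fn get_or_compute(&self, key: &K, compute: impl FnOnce() -> V) -> V {
        // Hold the map lock only long enough to find or create the cell,
        // so a slow computation for one key does not block other keys.
        let cell = {
            let mut map = self.entries.lock().unwrap();
            map.entry(key.clone()).or_default().clone()
        };
        // `OnceLock::get_or_init` runs the closure in exactly one caller and
        // blocks the other callers for this key until the value is ready.
        cell.get_or_init(compute).clone()
    }

    /// Drop every entry, for example after a successful ingest.
    pub fn invalidate_all(&self) {
        self.entries.lock().unwrap().clear();
    }
}
```

Callers that already hold a cell keep their `Arc` across `invalidate_all`, so an invalidation never pulls a value out from under a request that is in progress.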

## Ingest And Rebuild

Successful ingest invalidates `QueryCache` and schedules a read-model rebuild.
The active generation remains live while rebuilding. Repeated rebuild requests
coalesce, and a failed rebuild keeps serving the old generation. The server
keeps the active generation plus the most recent previous generation so already
loaded pages can continue resolving immutable shard URLs across a swap.
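
The swap rules above can be sketched as a two-slot holder. This is a simplified model under stated assumptions: `Generation`, `Generations`, and their methods are hypothetical, the numeric `id` and `String` error type are placeholders, and rebuild coalescing and locking are omitted.

```rust
use std::sync::Arc;

/// A stand-in for one materialized read generation (hypothetical shape).
pub struct Generation {
    pub id: u64,
    pub groups_json: Vec<u8>,
}

/// Holds the active generation plus the most recent previous one.
pub struct Generations {
    active: Arc<Generation>,
    previous: Option<Arc<Generation>>,
}

impl Generations {
    pub fn new(initial: Generation) -> Self {
        Self { active: Arc::new(initial), previous: None }
    }

    pub fn active_id(&self) -> u64 {
        self.active.id
    }

    /// Apply the outcome of a rebuild. On success the new generation becomes
    /// active and the old one moves to `previous`; on failure nothing changes,
    /// so the old generation keeps serving. Returns whether a swap happened.
    pub fn apply_rebuild(&mut self, rebuilt: Result<Generation, String>) -> bool {
        match rebuilt {
            Ok(next) => {
                let old = std::mem::replace(&mut self.active, Arc::new(next));
                self.previous = Some(old);
                true
            }
            Err(_) => false,
        }
    }

    /// Resolve a generation id taken from a shard URL. Only the active and
    /// the immediately previous generation are still reachable.
    pub fn resolve(&self, id: u64) -> Option<Arc<Generation>> {
        if self.active.id == id {
            return Some(Arc::clone(&self.active));
        }
        self.previous.as_ref().filter(|g| g.id == id).cloned()
    }
}
```

Handing out `Arc` clones means a request that resolved a generation just before a swap can finish reading it even after the holder has moved on.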

## Main Files

- `src/read_model.rs`: materialized generation and encoded artifact serving
- `src/api/mod.rs`: API routing between materialized artifacts and fallbacks
- `src/api/charts.rs`: chart DTO construction and `history` metadata
- `src/html/mod.rs`, `src/html/landing.rs`: shell/shard HTML rendering
- `static/chart-init.js`: virtual-axis normalization, shard hydration, and
full-history priority warmup
- `src/query_cache.rs`: single-flight fallback cache
- `src/db.rs`: DuckDB connection cloning and read backpressure

benchmarks-website/server/Cargo.toml

Lines changed: 7 additions & 2 deletions
@@ -26,10 +26,15 @@ path = "src/main.rs"
 anyhow = { workspace = true }
 axum = "0.8"
 base64 = "0.22"
+brotli = "8.0.2"
+bytes = "1.11"
+dashmap = { workspace = true }
 # track vortex-duckdb's bundled engine version (build.rs)
 duckdb = { version = "1.10502", features = ["bundled"] }
+flate2 = "1"
 maud = { version = "0.27", features = ["axum"] }
-serde = { workspace = true, features = ["derive"] }
+parking_lot = { workspace = true }
+serde = { workspace = true, features = ["derive", "rc"] }
 serde_json = { workspace = true }
 subtle = "2.6"
 thiserror = { workspace = true }
@@ -39,9 +44,9 @@ tower-http = { version = "0.6", features = ["compression-br", "compression-gzip"
 tracing = { workspace = true, features = ["std"] }
 tracing-subscriber = { workspace = true, features = ["env-filter", "fmt"] }
 twox-hash = "2.1"
+vortex-utils = { workspace = true }

 [dev-dependencies]
-flate2 = "1"
 insta = { workspace = true }
 reqwest = { workspace = true, features = ["json"] }
 tempfile = { workspace = true }
