|`/data/static/`| 20 static assets (CSS, JS, HTML, fonts, images) — 15 ship with `.gz` and `.br` sibling files for precompression-aware frameworks |
|`/certs/server.crt`| TLS certificate for HTTPS/H2/H3 |
|`/certs/server.key`| TLS private key for HTTPS/H2/H3 |
All data mounts are provided unconditionally — your container always has access to all files regardless of which profiles it participates in.
Postgres (profiles `async-db`, `crud`, `api-4`, `api-16`, and the compose-orchestrated gateway + production-stack) is provided by a separate sidecar container, reachable via the `DATABASE_URL` environment variable — not a mount. Redis (profile `crud`) is similarly reachable via `REDIS_URL`. See [Configuration](../../running-locally/configuration/) for the full env var list.
`site/content/docs/add-framework/implementation-rules/tuned.md`:

Tuned entries have more freedom. They can use non-default configurations, experi…
- Alternative JSON serializers (simd-json, sonic-json, etc.)
- Custom buffer sizes and TCP socket options
- Experimental or unstable framework flags
- Memory-mapped files and in-memory static file caching
- Custom thread pools and worker configurations
- Non-default GC settings without documentation requirement
- Framework-specific performance flags not recommended for production
- Any compression approach for static files — custom compression, pre-compressed file serving, alternative compression libraries

## What is NOT allowed
- **Pre-computed response bodies** — serializing a fixed response at startup and returning the same bytes per request (e.g. caching a JSON blob and writing it back unchanged). The serialization + compression work is the workload; bypassing it defeats the measurement.
- **Response caching** — memoizing the full HTTP response body keyed by URL/params and replaying it. This is distinct from upstream data caching (DB query results, JWT verification, etc.), which remains allowed where the profile calls for it (e.g. the CRUD profile's read cache).
## What is still required
- Must use the framework's HTTP server (not a raw socket replacement)
|`production-stack`| HTTP/2 | Compose stack: edge + JWT auth sidecar + Redis + server (TLS, port 8443) |
|`unary-grpc`| gRPC |`BenchmarkService/GetSum` (h2c, port 8080) |
|`unary-grpc-tls`| gRPC |`BenchmarkService/GetSum` (TLS, port 8443) |
|`stream-grpc`| gRPC |`BenchmarkService/StreamSum` (h2c, port 8080) |
|`stream-grpc-tls`| gRPC |`BenchmarkService/StreamSum` (TLS, port 8443) |
|`echo-ws`| WebSocket |`/ws` echo (port 8080) |

Only include profiles your framework supports. Frameworks missing a profile simply don't appear in that profile's leaderboard.

### async-db

The `async-db` profile requires an async PostgreSQL driver. The benchmark script starts a Postgres sidecar with 100K rows and passes `DATABASE_URL=postgres://bench:bench@localhost:5432/benchmark` to your container. Your framework must:

1. Connect to Postgres using the `DATABASE_URL` environment variable
2. Implement `GET /async-db?min=X&max=Y&limit=N` that queries: `SELECT id, name, category, price, quantity, active, tags, rating_score, rating_count FROM items WHERE price BETWEEN $1 AND $2 LIMIT $3`
3. Return JSON: `{"items": [...], "count": N}` with nested `rating: {score, count}` and `tags` as a JSON array
4. Return `{"items":[],"count":0}` if the database is unavailable
5. Use lazy connection initialization — retry connecting if Postgres isn't ready at startup
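Steps 2–4 pin down an exact response shape; the flat-row-to-nested-JSON step can be sketched as a pure function (a minimal sketch — `rows_to_payload` is an illustrative helper name, and the column order follows the SELECT in step 2):

```python
import json

# Column order matches the SELECT in step 2.
COLS = ("id", "name", "category", "price", "quantity",
        "active", "tags", "rating_score", "rating_count")

def rows_to_payload(rows):
    """Shape flat DB rows into {"items": [...], "count": N},
    nesting rating_score/rating_count under "rating"."""
    items = []
    for row in rows:
        rec = dict(zip(COLS, row))
        rec["rating"] = {"score": rec.pop("rating_score"),
                         "count": rec.pop("rating_count")}
        items.append(rec)
    return {"items": items, "count": len(items)}

# An unavailable database degrades to the step-4 fallback shape:
print(json.dumps(rows_to_payload([])))  # → {"items": [], "count": 0}
```

The same function covers the success and fallback cases, so the handler only decides whether it has rows to pass in.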
### gateway-64

The `gateway-64` profile tests your framework as part of a complete deployment stack over HTTP/2 with TLS. Unlike other tests that run a single container, this test uses **Docker Compose** to orchestrate multi-container deployments — typically a reverse proxy in front of an application server, but any architecture is allowed.

**Quick start:**

1. Create a `compose.gateway.yml` in your framework directory
2. Define your services (proxy, server, cache — whatever you need)
3. Pin each service to specific CPUs using `cpuset` — total must be exactly 64 logical CPUs (0-31 + 64-95), always in physical+SMT pairs (core N and N+64 together)
4. All services must use `network_mode: host`, `security_opt: [seccomp:unconfined]`, and appropriate ulimits
5. Use `${CERTS_DIR}`, `${DATA_DIR}`, and `${DATABASE_URL}` env vars — they are exported by the benchmark script
6. Port **8443** must serve HTTPS/H2 — this is where the load generator sends requests
7. The stack must implement `/static/*`, `/json`, `/async-db`, and `/baseline2` endpoints

**What makes this different from other tests:**

- You control the full architecture via Docker Compose
- Multiple containers compete for a shared 64-CPU budget
- The proxy, caching layer, and internal protocol choices are all part of the benchmark
- Static files can be served directly by the proxy (e.g., Nginx) instead of the application server
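A minimal two-tier sketch of such a compose file (service names, images, and the CPU split are illustrative assumptions — only the constraints from the checklist above are load-bearing):

```yaml
# compose.gateway.yml — hypothetical two-tier layout (proxy + app).
# cpusets must sum to exactly 64 logical CPUs in physical+SMT pairs.
services:
  proxy:
    image: nginx:alpine          # illustrative; any proxy is allowed
    network_mode: host           # required
    security_opt: [seccomp:unconfined]
    ulimits:
      nofile: 1048576
    cpuset: "0-7,64-71"          # 16 logical CPUs: cores 0-7 + SMT siblings
    volumes:
      - ${CERTS_DIR}:/certs:ro   # exported by the benchmark script
      - ${DATA_DIR}:/data:ro
    # terminates TLS/H2 on 8443 and forwards to the app

  app:
    image: my-framework:latest   # illustrative
    network_mode: host
    security_opt: [seccomp:unconfined]
    ulimits:
      nofile: 1048576
    cpuset: "8-31,72-95"         # remaining 48 logical CPUs
    environment:
      - DATABASE_URL=${DATABASE_URL}
```

The two cpusets here sum to 64 logical CPUs (16 + 48) and keep each core with its SMT sibling (N with N+64), matching rule 3.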
See the [Gateway-64 implementation guide](/docs/test-profiles/gateway/gateway-h2/implementation) for detailed documentation, three complete compose examples (two-tier, three-tier, and single-tier), CPU topology rules, and proxy configuration options.
Per-profile endpoint contracts, request/response shapes, and validation rules live under the [Test Profiles](/docs/test-profiles/) section — link to the specific profile's Implementation page from your PR description when adding a new framework.
HttpArena drives the `echo-ws` profile with **gcannon in `--ws` mode**. The same io_uring engine documented under [HTTP/1.1 → gcannon](../h1/gcannon/) is reused here — worker threads, per-thread provided-buffer rings, multishot receives, per-connection state — with a frame-aware send/recv loop layered on top. Using one tool across transports keeps the client-side ceiling, threading model, and CPU-pinning behavior consistent so differences in the measurement land on the server, not the generator.
## Handshake
Each worker opens TCP connections and issues an HTTP/1.1 upgrade request to the target URL (typically `http://localhost:8080/ws`). The server must respond with `HTTP/1.1 101 Switching Protocols` and the correct `Sec-WebSocket-Accept` value derived from the client's `Sec-WebSocket-Key`. Connections that fail the handshake are reported as reconnects; the validator ([WebSocket validation](../../test-profiles/ws/echo/validation/)) checks the handshake path separately and catches framework-side bugs before benchmarks run.
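The accept value in the 101 response follows the standard RFC 6455 derivation — SHA-1 over the client key concatenated with a fixed GUID, base64-encoded. A minimal sketch of the server-side computation:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def ws_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value for a 101 response."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# Worked example from RFC 6455 section 1.3:
print(ws_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A server returning anything else here fails the handshake and shows up in the reconnect count.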
## Echo loop
Once upgraded, each connection runs the steady-state loop:

1. Build a masked client-to-server text frame with a short payload
2. Send the frame via `io_uring_prep_send`
3. Wait for the server to echo it back (matched server-to-client frame)
4. On receipt, increment the per-thread frame counter and immediately send the next frame
Pipeline depth is 1 for the `echo-ws` profile — one message in flight per connection — so the measurement is effectively a back-to-back request/response loop rather than a batched burst. With thousands of concurrent connections each running this loop in parallel, the steady-state throughput reflects the server's ability to multiplex WebSocket frames across a large connection count without head-of-line blocking.
Both text frames (opcode `0x1`) and binary frames (opcode `0x2`) are exercised against the server during validation; benchmark runs use the text shape for simplicity. Framing follows RFC 6455: masked from client to server, unmasked from server to client, FIN bit set on every frame (no fragmented messages in the benchmark path).
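The client-to-server framing just described can be sketched for the benchmark's case (a minimal sketch: single frame, FIN set, payload under 126 bytes; the all-zero mask is fixed for reproducibility — a real client uses a random mask per frame):

```python
def build_masked_text_frame(payload: bytes, mask: bytes) -> bytes:
    """Build one masked client-to-server text frame (RFC 6455).

    Covers only payloads < 126 bytes: FIN=1, opcode=0x1, MASK bit set.
    """
    assert len(payload) < 126 and len(mask) == 4
    header = bytes([
        0x80 | 0x1,           # FIN=1, opcode 0x1 (text)
        0x80 | len(payload),  # MASK=1, 7-bit payload length
    ])
    masked = bytes(b ^ mask[i % 4] for i, b in enumerate(payload))
    return header + mask + masked

frame = build_masked_text_frame(b"ping", b"\x00\x00\x00\x00")
print(frame.hex())  # → 81840000000070696e67
```

The server's echo carries the same payload back unmasked, which is what step 3 of the loop matches against.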
## Command-line usage
```bash
gcannon http://localhost:8080/ws --ws \
  -c <connections> -t <threads> -d <duration> -p 1
```
| Flag | Description |
|------|-------------|
|`<url>`| The WebSocket endpoint served over HTTP/1.1 (uses `http://` scheme; the upgrade is implicit) |
|`--ws`| Switches gcannon from HTTP request mode into WebSocket echo mode |
|`-c`| Total concurrent connections (distributed evenly across `-t` threads) |
|`-t`| Worker threads (each owns an io_uring and a slice of connections; defaults to `$THREADS=64`) |
|`-d`| Test duration — `5s` for `echo-ws` |
|`-p`| Pipeline depth — fixed at `1` for `echo-ws` (one message in flight per connection) |
The profile dispatcher (`scripts/lib/tools/gcannon.sh:ws-echo`) wires all of this automatically when you invoke `./scripts/benchmark.sh <framework> echo-ws`.
## Output shape
gcannon reports WebSocket results with the same layout as HTTP requests, except the summary line reads "frames sent / frames received" instead of "requests / responses":
```
2400000 frames sent in 5.00s, 2400000 frames received
Throughput: 480.00K frames/s
WS frames: 2400000
```
The parser (`gcannon_parse ws-echo`) records `frames received` as the `status_2xx` equivalent and divides by the measured duration to produce the headline RPS number shown on the [WebSocket leaderboard](/leaderboards/websocket/). One echo round-trip counts as one unit — the frames-received count from the client side, not frames-sent, because the metric is "how many echoes the framework completed," not "how many messages the benchmarker pushed into the socket."
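The headline arithmetic on that summary line can be sketched as follows (a minimal sketch; the real parser lives in `scripts/lib/tools/gcannon.sh`, and this regex is illustrative, not its actual implementation):

```python
import re

def headline_rps(summary: str) -> float:
    """Derive the leaderboard number from a gcannon summary line:
    frames *received* divided by the measured duration."""
    m = re.search(r"frames sent in ([\d.]+)s, (\d+) frames received", summary)
    duration, received = float(m.group(1)), int(m.group(2))
    return received / duration

print(headline_rps("2400000 frames sent in 5.00s, 2400000 frames received"))
# → 480000.0
```

Using the received count rather than the sent count means slow or dropped echoes lower the score even when the generator kept pushing frames.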
## Why not a dedicated WebSocket tool
The two common alternatives — `wrk2` with a Lua WebSocket plugin, or `artillery` — either can't saturate the server at 64-core scale (GC + per-connection Lua overhead becomes the bottleneck) or produce non-deterministic per-thread CPU pinning that makes cross-framework comparison unreliable. Reusing gcannon means the generator's tuning story is the same one already vetted against the HTTP/1.1 profiles, and the operator-side flags (`$GCANNON_CPUS`, cpuset pinning, provided buffer ring sizing) compose identically.
|`H3THREADS`|`64`| h2load-h3 worker threads (HTTP/3 over QUIC). |

In `benchmark-lite.sh`, `THREADS` defaults to `max(nproc / 2, 1)` and `H2THREADS` / `H3THREADS` mirror `$THREADS`. Pass `--load-threads N` to override all three in one shot.
|`PORT`|`8080`| HTTP/1.1 plaintext (all `h1*` profiles + `echo-ws`); also h2c for gRPC (`unary-grpc`, `stream-grpc` — prior-knowledge on the same socket). |
|`H2PORT`|`8443`| HTTPS / HTTP/2 over TLS (`baseline-h2`, `static-h2`, gateway + production-stack), HTTP/3 over QUIC (`baseline-h3`, `static-h3`, `gateway-h3`), and gRPC-TLS (`unary-grpc-tls`, `stream-grpc-tls`). |
|`H1TLS_PORT`|`8081`| HTTP/1.1 + TLS, used only by the `json-tls` profile (ALPN `http/1.1`). |
|`H2C_PORT`|`8082`| HTTP/2 cleartext prior-knowledge for the `baseline-h2c` and `json-h2c` profiles. Must be a dedicated listener that refuses HTTP/1.1 — the validator checks this explicitly. |
Every framework `Dockerfile` reads the same defaults from its env, so you rarely need to change these.