Commit 76f7f87

Merge pull request #7 from linksplatform/issue-6-469804e9e1db
fix: add PR quick benchmark mode and timeout to resolve CI timeout issue
2 parents 4a2fc19 + e92b45a commit 76f7f87

5 files changed

Lines changed: 1833 additions & 4 deletions

.github/workflows/rust-benchmark.yml

Lines changed: 126 additions & 4 deletions
@@ -20,7 +20,7 @@ jobs:
   test:
     name: Test (${{ matrix.os }})
     runs-on: ${{ matrix.os }}
-    timeout-minutes: 360
+    timeout-minutes: 30
     strategy:
       fail-fast: false
       matrix:
@@ -89,11 +89,125 @@ jobs:
         # Run tests sequentially to avoid parallel interference with shared SpacetimeDB state.
         run: cargo test -- --test-threads=1
 
+  # Quick benchmark validation for pull requests.
+  # Runs benchmarks with reduced scale to verify they work and produce results
+  # in well under 10 minutes. Results are not committed but uploaded as artifacts.
+  #
+  # Parameters chosen to keep total benchmark time under 5 minutes:
+  # BENCHMARK_LINK_COUNT=10, BACKGROUND_LINK_COUNT=30: reduces SpacetimeDB
+  # round trips per iteration from ~8000 to ~80, making each iteration ~0.1s.
+  # --sample-size 10: collect 10 samples per benchmark (instead of 100 default).
+  # --warm-up-time 1: 1s warm-up instead of 3s default.
+  # --measurement-time 2: 2s measurement instead of 5s default.
+  # Expected runtime: ~3-5 minutes total for all 35 benchmarks.
+  benchmark-pr:
+    name: Benchmark (PR validation)
+    runs-on: ubuntu-latest
+    needs: [test]
+    if: github.event_name == 'pull_request'
+    timeout-minutes: 20
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Rust (nightly)
+        uses: dtolnay/rust-toolchain@master
+        with:
+          toolchain: nightly
+          targets: wasm32-unknown-unknown
+
+      - name: Setup Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Install Python dependencies
+        run: pip install matplotlib numpy
+
+      - name: Cache cargo registry
+        uses: Swatinem/rust-cache@v2
+        with:
+          workspaces: rust -> target
+          cache-on-failure: "true"
+
+      - name: Install SpacetimeDB CLI
+        run: |
+          curl -sSf https://install.spacetimedb.com | sh -s -- -y
+          echo "$HOME/.local/bin" >> $GITHUB_PATH
+        working-directory: .
+
+      - name: Build SpacetimeDB module (WASM)
+        run: cargo build --release --target wasm32-unknown-unknown
+        working-directory: rust/spacetime-module
+
+      - name: Start SpacetimeDB server
+        run: |
+          spacetime start &
+          for i in $(seq 1 30); do
+            if curl -sf http://localhost:3000/ > /dev/null 2>&1; then
+              echo "SpacetimeDB server is ready"
+              break
+            fi
+            sleep 1
+          done
+        working-directory: .
+
+      - name: Publish SpacetimeDB module
+        run: |
+          spacetime publish \
+            --server http://localhost:3000 \
+            --bin-path target/wasm32-unknown-unknown/release/spacetime_module.wasm \
+            --yes \
+            benchmark-links
+        working-directory: rust/spacetime-module
+
+      - name: Build benchmark
+        run: cargo build --release
+
+      - name: Run benchmark (quick mode for PR validation)
+        env:
+          # Reduced scale: 10 links instead of 1000, 30 background instead of 3000.
+          # This reduces SpacetimeDB round trips per iteration from ~8000 to ~80,
+          # keeping each iteration under 0.1s and total benchmark time under 5 minutes.
+          BENCHMARK_LINK_COUNT: 10
+          BACKGROUND_LINK_COUNT: 30
+          SPACETIMEDB_URI: http://localhost:3000
+          SPACETIMEDB_DB: benchmark-links
+        run: |
+          cargo bench --bench bench -- \
+            --output-format bencher \
+            --sample-size 10 \
+            --warm-up-time 1 \
+            --measurement-time 2 \
+            --nresamples 1000 \
+            | tee out.txt
+
+      - name: Generate charts
+        run: python3 out.py
+
+      - name: Upload PR benchmark artifacts
+        uses: actions/upload-artifact@v4
+        with:
+          name: benchmark-results-pr
+          path: |
+            rust/out.txt
+            rust/bench_rust.png
+            rust/bench_rust_log_scale.png
+
+  # Full benchmark run for commits to main/master.
+  # Uses full scale (1000 links, 3000 background) with reduced sample count
+  # to produce statistically meaningful results while fitting within 1 hour.
+  #
+  # Parameters:
+  # BENCHMARK_LINK_COUNT=1000, BACKGROUND_LINK_COUNT=3000: realistic scale.
+  # --sample-size 20: 20 samples per benchmark (down from 100 default).
+  # --nresamples 10000: 10k bootstrap resamples (down from 100k default).
+  # Expected runtime: ~30-45 minutes total for all 35 benchmarks.
   benchmark:
-    name: Benchmark
+    name: Benchmark (full)
     runs-on: ubuntu-latest
     needs: [test]
     if: github.event_name == 'push' && (github.ref == 'refs/heads/main' || github.ref == 'refs/heads/master')
+    timeout-minutes: 180
     steps:
       - uses: actions/checkout@v4
         with:
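
Review note on the `Start SpacetimeDB server` step in the hunk above: the polling loop gives up after 30 attempts without failing, so the step still exits successfully when the server never answers and the error only surfaces later, in the publish step. The following is a minimal Python sketch of the same polling pattern with an explicit failure. It is not part of the commit; `http_probe` and `wait_until_ready` are hypothetical names, and the URL in the usage note is the one the workflow already polls.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout_s=2.0):
    """Return True if `url` answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error status: not ready yet.
        return False


def wait_until_ready(probe, attempts=30, delay_s=1.0, sleep=time.sleep):
    """Call `probe` up to `attempts` times; raise if it never succeeds.

    Returns the 1-based number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay_s)
    raise TimeoutError(f"server not ready after {attempts} attempts")
```

A call such as `wait_until_ready(lambda: http_probe("http://localhost:3000/"))` would then stop the job at the point where the server failed to start. The `sleep` parameter is injectable only so the function can be exercised without real delays.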
@@ -154,13 +268,21 @@ jobs:
       - name: Build benchmark
         run: cargo build --release
 
-      - name: Run benchmark
+      - name: Run benchmark (full mode for main branch)
         env:
+          # Full scale: 1000 links, 3000 background for realistic results.
+          # --sample-size 20 reduces total runtime from ~2h (default 100) to ~25-40 min
+          # while still providing statistically valid measurements.
           BENCHMARK_LINK_COUNT: 1000
           BACKGROUND_LINK_COUNT: 3000
           SPACETIMEDB_URI: http://localhost:3000
           SPACETIMEDB_DB: benchmark-links
-        run: cargo bench --bench bench -- --output-format bencher | tee out.txt
+        run: |
+          cargo bench --bench bench -- \
+            --output-format bencher \
+            --sample-size 20 \
+            --nresamples 10000 \
+            | tee out.txt
 
       - name: Generate charts
         run: python3 out.py
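
Both benchmark jobs pipe `--output-format bencher` output into `out.txt`, which `out.py` then turns into charts. The contents of `out.py` are not shown in this commit, so the following is only a hedged sketch of how such bencher-style lines could be read. The function name `parse_bencher_output` and the benchmark names in `SAMPLE` are hypothetical, and the line shape (`test <name> ... bench: <n> ns/iter (+/- <d>)`) is assumed.

```python
import re

# One bencher-style result line, e.g.
#   test some/name ... bench:       1,234 ns/iter (+/- 56)
BENCH_LINE = re.compile(
    r"^test\s+(?P<name>\S+)\s+\.\.\.\s+bench:\s+(?P<ns>[\d,]+)\s+ns/iter"
    r"\s+\(\+/-\s+(?P<dev>[\d,]+)\)"
)


def parse_bencher_output(text):
    """Map benchmark name -> (ns per iteration, deviation in ns)."""
    results = {}
    for line in text.splitlines():
        match = BENCH_LINE.match(line.strip())
        if match is None:
            continue  # skip cargo noise and blank lines
        ns = int(match.group("ns").replace(",", ""))
        dev = int(match.group("dev").replace(",", ""))
        results[match.group("name")] = (ns, dev)
    return results


# Hypothetical sample input; the names are illustrative only.
SAMPLE = """running 2 tests
test create/in_memory ... bench:       1,234 ns/iter (+/- 56)
test create/spacetimedb ... bench:    80123456 ns/iter (+/- 7890)
"""

if __name__ == "__main__":
    for name, (ns, dev) in parse_bencher_output(SAMPLE).items():
        print(f"{name}: {ns} ns/iter (+/- {dev})")
```

Lines that do not match the pattern are skipped, so compiler output that lands in the same file is ignored rather than treated as an error.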
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# Fix benchmark CI timing: add PR quick mode and full mode with timeout
+
+## Problem
+
+The `Benchmark` job in `rust-benchmark.yml` exceeded GitHub Actions' 6-hour limit
+when pushed to `main`. Root cause: Criterion's default settings (100 samples, 5s
+measurement) combined with SpacetimeDB's synchronous round-trip per operation
+(~8000 round trips × ~1ms each per iteration) caused each SpacetimeDB benchmark
+to run for ~13 minutes, totalling ~2+ hours for all 7 SpacetimeDB benchmarks,
+and the cleanup `delete_all` overhead pushed it past 6 hours.
+
+Additionally, there was no benchmark validation for pull requests at all.
+
+## Solution
+
+- **PR quick mode** (`benchmark-pr` job): runs on `pull_request` events with reduced
+  scale (`BENCHMARK_LINK_COUNT=10`, `BACKGROUND_LINK_COUNT=30`) and tighter Criterion
+  settings (`--sample-size 10 --warm-up-time 1 --measurement-time 2`). Expected
+  runtime: 3–5 minutes total for all 35 benchmarks. Results uploaded as artifacts
+  but not committed to the repository.
+
+- **Full mode** (`benchmark` job): runs on `push` to `main`/`master` with full scale
+  (`BENCHMARK_LINK_COUNT=1000`, `BACKGROUND_LINK_COUNT=3000`) and reduced sample count
+  (`--sample-size 20 --nresamples 10000`) to finish in ~30–45 minutes (well under
+  3 hours) while still producing statistically meaningful results.
+
+- **Safety timeout**: `timeout-minutes: 180` added to the `benchmark` job and
+  `timeout-minutes: 30` to `test` jobs (was `360` = 6 hours).
+
+- **Case study**: Deep analysis of the root cause documented in
+  `docs/case-studies/issue-6/README.md`.
+
+Fixes #6.
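
The timing figures in the description above can be sanity-checked with a rough lower-bound model. The sketch below is not taken from the repository. It assumes that every sample needs at least one full iteration, that warm-up also runs at least one iteration, and that one iteration costs round trips times latency. It ignores setup, cleanup (`delete_all`), and analysis time, so real runs take longer. The function name `estimate_benchmark_ms` is hypothetical.

```python
def estimate_benchmark_ms(round_trips, latency_ms, sample_size,
                          warm_up_ms, measurement_ms):
    """Lower-bound wall time of one benchmark, in milliseconds.

    Simplified model (an assumption, not Criterion's exact scheduling):
    warm-up and measurement each last at least their configured time,
    and never less than the iterations they must complete.
    """
    iteration_ms = round_trips * latency_ms
    warm_up = max(warm_up_ms, iteration_ms)
    measurement = max(measurement_ms, sample_size * iteration_ms)
    return warm_up + measurement


# Settings quoted in the description: ~8000 round trips at ~1 ms each.
default_ms = estimate_benchmark_ms(8000, 1, 100, 3000, 5000)
full_ms = estimate_benchmark_ms(8000, 1, 20, 3000, 5000)
quick_ms = estimate_benchmark_ms(80, 1, 10, 1000, 2000)

if __name__ == "__main__":
    print(f"default: {default_ms / 60000:.1f} min per SpacetimeDB benchmark")
    print(f"full:    {full_ms / 60000:.1f} min per SpacetimeDB benchmark")
    print(f"quick:   {quick_ms / 1000:.1f} s per benchmark")
```

Under these assumptions the default settings come to about 13.5 minutes per SpacetimeDB benchmark, in line with the ~13 minutes quoted in the Problem section. `--sample-size 20` brings the same benchmark down to about 2.8 minutes, and the quick mode is bounded by its configured warm-up and measurement times, about 3 seconds per benchmark.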
