Commit 7469437

chore(sqlite-vfs-v2): release-build bench comparison + new coverage stories (US-060, US-061, US-062)
1 parent a5675a3 commit 7469437

2 files changed: 89 additions & 3 deletions

scripts/ralph/prd.json

Lines changed: 59 additions & 3 deletions
@@ -164,7 +164,7 @@
         "cargo test -p sqlite-storage passes",
         "cargo test -p rivetkit-sqlite-native passes"
       ],
-      "priority": 3,
+      "priority": 5,
       "passes": false,
       "notes": "Adversarial agent decided Definition 1 (raw dirty page bytes). Reasons: (1) user-predictable ('my 5 MiB blob \u2248 5 MiB of dirty pages'), (2) post-compression definition would let 64 MiB of zeros through a 16 MiB LZ4-bytes cap, defeating memory/wire pressure goals, (3) matches existing dirty_pages_raw_bytes invariant used elsewhere in commit.rs. LZ4 interaction is a FEATURE under Definition 1: a 16 MiB text commit still compresses to a small DELTA on disk, but the commit is still capped at 16 MiB of raw uncompressed data. Slow path must mirror-check accumulated raw bytes across stages to prevent bypass via chunking.",
       "dependsOn": [
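The Definition 1 semantics in the notes above (cap on raw dirty page bytes, mirror-checked across stage chunks so chunked staging cannot bypass it) can be sketched as follows. This is an illustrative sketch, not the commit.rs implementation; `StagedCommit` and `stage_chunk` are hypothetical names:

```rust
/// Hypothetical sketch of the US-054 commit cap under Definition 1:
/// raw dirty page bytes, counted BEFORE LZ4 compression.
const MAX_COMMIT_RAW_BYTES: u64 = 16 * 1024 * 1024; // 16 MiB

#[derive(Default)]
struct StagedCommit {
    raw_bytes: u64, // accumulated across ALL stage chunks of one commit
}

#[derive(Debug)]
enum StageError {
    CommitTooLarge { raw_bytes: u64 },
}

impl StagedCommit {
    /// Called once per stage chunk. The check runs against the running
    /// total, so splitting a commit into many small chunks cannot bypass
    /// the cap (the "mirror-check" from the notes). Compression ratio
    /// never enters the check.
    fn stage_chunk(&mut self, raw_page_bytes: u64) -> Result<(), StageError> {
        let total = self.raw_bytes + raw_page_bytes;
        if total > MAX_COMMIT_RAW_BYTES {
            return Err(StageError::CommitTooLarge { raw_bytes: total });
        }
        self.raw_bytes = total;
        Ok(())
    }
}

fn main() {
    let mut commit = StagedCommit::default();
    // Four 4 MiB chunks land exactly at the 16 MiB cap...
    for _ in 0..4 {
        assert!(commit.stage_chunk(4 * 1024 * 1024).is_ok());
    }
    // ...and one more byte is rejected even though the chunk itself is tiny.
    assert!(commit.stage_chunk(1).is_err());
}
```

A 16 MiB commit of compressible text still stores a small LZ4 delta on disk; only the admission check uses raw bytes.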
@@ -186,9 +186,9 @@
         "Document the feature in website/src/content/docs/self-hosting/configuration.mdx if operators need to disable it (and in CLAUDE.md for WebSocket conventions)",
         "cargo test for affected crates passes"
       ],
-      "priority": 4,
+      "priority": 3,
       "passes": false,
-      "notes": "Review agent flagged: pegboard-envoy uses hyper-tungstenite 0.17, not tokio-tungstenite. Both paths must be covered. If hyper-tungstenite 0.17 does not support deflate, we need to upgrade (verify compatibility with hyper version first). Keep no_context_takeover on both sides to bound memory \u2014 each active connection otherwise holds ~32 KB zlib state. For large actor fleets this matters."
+      "notes": "Review agent flagged: pegboard-envoy uses hyper-tungstenite 0.17, not tokio-tungstenite. Both paths must be covered. If hyper-tungstenite 0.17 does not support deflate, we need to upgrade (verify compatibility with hyper version first). Keep no_context_takeover on both sides to bound memory \u2014 each active connection otherwise holds ~32 KB zlib state. For large actor fleets this matters. Promoted to priority 3 on 2026-04-16 per user directive \u2014 compression bumped ahead of US-054 (commit cap) to shrink wire bytes sooner."
     },
     {
       "id": "US-059",
@@ -224,6 +224,62 @@
       "priority": 1,
       "passes": true,
       "notes": "Three complementary surfaces: Prometheus histograms on engine /metrics (port 6430), per-actor VFS counters on /inspector/metrics, and debug-level tracing spans. Baseline captured in commit 514007256a: 5 MB commit at 717 ms info-level (~6.97 MB/s), udb_write 45.7% + pidx_read 36.3% of server wall time, transport 98.5% of VFS-side time. Use these numbers as the regression floor for US-048."
+    },
+    {
+      "id": "US-061",
+      "title": "commit_finalize writes PIDX entries so slow-path reads bypass recover_page_from_delta_history",
+      "description": "As a developer, I need commit_finalize to insert or update PIDX entries for the staged delta chunks so that post-finalize reads resolve pages through the normal PIDX -> delta lookup path, not the legacy full-history scan. The US-048 implementation made commit_finalize metadata-only (good for FDB txn size), but it skipped PIDX writes. That leaves get_pages depending on recover_page_from_delta_history for any page whose PIDX entry is missing, which is exactly the scan the US-047 story wants to remove. Fix commit_finalize so the standard read path works and US-047 can remove the fallback cleanly.",
+      "acceptanceCriteria": [
+        "commit_finalize writes one PIDX entry per page touched by the staged delta so that post-finalize reads find the correct (actor_id, pgno) -> txid mapping through pidx_delta_key(...) and never fall back to scanning delta history.",
+        "If PIDX writes would exceed the 2 KiB finalize-mutations budget documented by US-048, chunk the PIDX writes across stage chunks at commit_stage time instead of inside finalize. Make finalize strictly metadata-only. Document which option was chosen in the engine/CLAUDE.md sqlite-storage section.",
+        "Update the engine-side `commit_finalize` integration test so it exercises get_pages on each touched page AFTER finalize and asserts zero calls land in recover_page_from_delta_history. Add a counter or test hook in recover_page_from_delta_history to prove it is untouched during this test.",
+        "Add a stress test: N = 4096 dirty pages across 4 chunks, finalize, then read all 4096 pages and assert they match staged payloads AND recover_page_from_delta_history was not invoked.",
+        "No regression on the US-048 'finalize FDB txn writes fewer than 2 KB of mutations' assertion. If chunk-time PIDX writes are used, update the assertion to reflect the new split.",
+        "dependsOn: US-048 (the finalize split it introduced), blocks US-047 (the fallback removal).",
+        "cargo test -p sqlite-storage passes",
+        "cargo test -p rivetkit-sqlite-native passes"
+      ],
+      "priority": 4,
+      "passes": false,
+      "notes": "Coverage gap from the US-048 review (reviews/US-048-review.txt). US-048 shipped commit_finalize as metadata-only, which skipped PIDX writes entirely. Post-finalize reads now depend on recover_page_from_delta_history for any page whose PIDX entry is missing (the full-history scan). US-047 is supposed to remove that fallback; US-047 cannot land until this story does. Order of ops: US-061 first, then US-047 can delete recover_page_from_delta_history cleanly.",
+      "dependsOn": [
+        "US-048"
+      ]
+    },
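The chunk-time PIDX option from the US-061 acceptance criteria (write PIDX entries alongside each stage chunk so finalize stays strictly metadata-only) can be sketched like this. All names here (`pidx_delta_key`, `Mutation`, `pidx_mutations_for_chunk`) and the key encoding are hypothetical stand-ins, not the engine's real API:

```rust
// Hypothetical sketch: emit (actor_id, pgno) -> txid PIDX mutations at
// commit_stage time, one per touched page, so commit_finalize never has
// to write them and stays under its 2 KiB mutation budget.

type ActorId = u64;
type Pgno = u32;
type Txid = u64;

/// Hypothetical PIDX key encoding for (actor_id, pgno).
fn pidx_delta_key(actor: ActorId, pgno: Pgno) -> Vec<u8> {
    let mut key = b"pidx/".to_vec();
    key.extend_from_slice(&actor.to_be_bytes());
    key.extend_from_slice(&pgno.to_be_bytes());
    key
}

struct Mutation {
    key: Vec<u8>,
    value: Vec<u8>,
}

/// One PIDX mutation per page in this stage chunk, mapping the page to
/// the staged txid. Post-finalize, get_pages resolves pages through these
/// entries and never falls back to recover_page_from_delta_history.
fn pidx_mutations_for_chunk(actor: ActorId, txid: Txid, pgnos: &[Pgno]) -> Vec<Mutation> {
    pgnos
        .iter()
        .map(|&pgno| Mutation {
            key: pidx_delta_key(actor, pgno),
            value: txid.to_be_bytes().to_vec(),
        })
        .collect()
}

fn main() {
    let muts = pidx_mutations_for_chunk(7, 42, &[1, 2, 3]);
    assert_eq!(muts.len(), 3); // one entry per touched page
    assert_eq!(muts[0].key, pidx_delta_key(7, 1));
    assert_eq!(muts[2].value, 42u64.to_be_bytes().to_vec());
}
```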
+    {
+      "id": "US-062",
+      "title": "Actually implement compaction performance fix (US-040 coverage backfill)",
+      "description": "As a developer, I need the compaction performance optimization described in US-040 to actually land in code. US-040 was flipped to passes=true on 2026-04-16 without an implementing commit: the review agent confirmed compact_shard still rescans PIDX+delta per invocation and default_compaction_worker still constructs a throwaway SqliteEngine with empty page_indices (engine/packages/sqlite-storage/src/compaction/{shard.rs,mod.rs}). Deliver the actual optimization and its test.",
+      "acceptanceCriteria": [
+        "compact_worker scans PIDX and delta entries ONCE per batch and passes the pre-scanned set to each compact_shard call. compact_shard accepts them as parameters instead of rescanning.",
+        "CompactionCoordinator passes a reference to the shared SqliteEngine (or its db+subspace+Arc<page_indices>) to default_compaction_worker in engine/packages/sqlite-storage/src/compaction/mod.rs so compaction writes update the live engine cache instead of a throwaway.",
+        "After compaction, the shared page_indices cache reflects updated PIDX entries (not discarded with the throwaway engine).",
+        "Add test compact_worker_performs_single_pidx_scan_per_batch: 8-shard batch triggers exactly 1 PIDX scan total, not 9. Instrument via an op counter increment or a scan-count metric; prefer a dedicated test hook over parsing metrics output.",
+        "Validate that the existing compaction correctness tests still pass after the hoisting: compact_worker_folds_five_deltas_into_one_shard, compact_shard_skips_stale_meta_without_rewinding_head, concurrent_reads_during_compaction_keep_returning_expected_pages.",
+        "Do NOT re-flip the US-040 passes flag. Ship this as US-062.",
+        "cargo test -p sqlite-storage passes"
+      ],
+      "priority": 6,
+      "passes": false,
+      "notes": "US-040 was marked passes=true without an implementing commit (see reviews/US-040-review.txt). Rather than re-flipping US-040, this story exists so Ralph has a clear target for the real code change. Code sites to edit: engine/packages/sqlite-storage/src/compaction/shard.rs (compact_shard signature + remove redundant scans), engine/packages/sqlite-storage/src/compaction/worker.rs (add scan hoisting), engine/packages/sqlite-storage/src/compaction/mod.rs (CompactionCoordinator engine sharing)."
+    },
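The single-scan-per-batch shape US-062 asks for looks roughly like the following. Every name is illustrative, not the compaction module's real API; the scan counter is the kind of test hook the acceptance criteria prefer over parsing metrics output:

```rust
// Hypothetical sketch of the US-062 scan hoisting: the worker scans PIDX
// once per batch and each compact_shard call receives the pre-scanned
// entries by reference instead of rescanning.
use std::collections::BTreeMap;

type Pgno = u32;
type Txid = u64;

/// Result of one full PIDX range read (stand-in for the real entry set).
struct PidxScan {
    entries: BTreeMap<Pgno, Txid>,
}

/// Test hook: counts how many full PIDX scans the worker performs.
struct ScanCounter(u32);

fn scan_pidx(counter: &mut ScanCounter) -> PidxScan {
    counter.0 += 1; // real impl: one range read over the PIDX subspace
    PidxScan { entries: BTreeMap::new() }
}

/// AFTER the fix: compact_shard takes the shared scan as a parameter.
fn compact_shard(_shard: u32, scan: &PidxScan) {
    // fold this shard's deltas using scan.entries; no rescan here
    let _ = scan.entries.len();
}

fn compact_worker(shards: &[u32], counter: &mut ScanCounter) {
    let scan = scan_pidx(counter); // exactly one scan per batch
    for &shard in shards {
        compact_shard(shard, &scan);
    }
}

fn main() {
    let mut counter = ScanCounter(0);
    compact_worker(&[0, 1, 2, 3, 4, 5, 6, 7], &mut counter);
    // 8-shard batch, 1 scan total (the current code would do 9).
    assert_eq!(counter.0, 1);
}
```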
+    {
+      "id": "US-060",
+      "title": "Backfill US-048 test, metric, and documentation coverage",
+      "description": "As a developer, I need the acceptance criteria that US-048 skipped to actually ship, so the story's guardrails against regressions exist. The US-048 review (reviews/US-048-review.txt) identified five concrete gaps. These are mechanical follow-ups; no architecture change.",
+      "acceptanceCriteria": [
+        "Add test sqlite-storage::commit::tests::commit_finalize_writes_fewer_than_2kib_of_mutations: issues a 16 MiB staged commit via commit_stage_begin + commit_stage, calls commit_finalize, and asserts the finalize FDB transaction's cumulative write bytes are < 2 KiB. Use the op_counter + raw-write-bytes counter that already exists in UniversalDB, or add a test hook if needed.",
+        "Add bench rivetkit-sqlite-native::v2::vfs::tests::bench_large_tx_insert_16mb: exercises the full commit path on a 16 MiB commit and asserts no FDB error 2101 (TransactionTooLarge) or 1007 (TransactionTooOld) surfaces in the result.",
+        "Add bench rivetkit-sqlite-native::v2::vfs::tests::bench_large_tx_insert_commit_finalize_metadata_only_under_2kb: mirrors the engine-side finalize budget assertion at the VFS layer end-to-end.",
+        "Add an engine-side regression check that neither FDB error 2101 nor 1007 is observed across the new 16 MiB tests. Counter sqlite_commit_fdb_error_total{code} exposed via /metrics, with _sum asserted == 0 at test end.",
+        "Add metric sqlite_orphan_chunk_bytes_reclaimed_total to engine/packages/sqlite-storage/src/metrics.rs. Increment it in takeover's build_recovery_plan whenever a `txid > head_txid` chunk is deleted, by the size of the deleted chunk value. Document in docs-internal/engine/SQLITE_METRICS.md.",
+        "Update the DBHead doc comment in engine/packages/sqlite-storage/src/types.rs to state the head_txid / next_txid invariant from US-048: 'head_txid is the latest committed txid (visible). next_txid is the next txid allocatable by commit_stage_begin. head_txid == next_txid - 1 immediately after a clean commit; head_txid < next_txid - 1 during or after aborted stages.'",
+        "cargo test -p sqlite-storage passes",
+        "cargo test -p rivetkit-sqlite-native passes"
+      ],
+      "priority": 7,
+      "passes": false,
+      "notes": "Mechanical backfill for US-048. Do not touch architecture (commit_stage_begin/commit_finalize/delta_chunk_key are correct per review). Focus on: (1) 2 KiB finalize-budget assertion, (2) the two 16 MiB benches, (3) FDB error 2101/1007 regression counter, (4) orphan-chunk metric, (5) DBHead doc comment. Keep scope tight; resist adding unrelated cleanups here."
     }
   ]
 }
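The 2 KiB finalize-budget assertion that US-060 backfills can be sketched with a hypothetical write meter. The real test would read the raw-write-bytes counter that already exists in UniversalDB; `TxnWriteMeter` below just shows the shape of the check, with made-up keys:

```rust
// Hypothetical sketch of the "finalize FDB txn writes fewer than 2 KiB of
// mutations" assertion. A metadata-only finalize writes a head pointer and
// a commit marker, not page data, so its cumulative write bytes stay tiny.
const FINALIZE_BUDGET_BYTES: usize = 2 * 1024;

#[derive(Default)]
struct TxnWriteMeter {
    bytes: usize, // cumulative key + value bytes written in one txn
}

impl TxnWriteMeter {
    fn record_set(&mut self, key: &[u8], value: &[u8]) {
        self.bytes += key.len() + value.len();
    }
}

fn main() {
    let mut meter = TxnWriteMeter::default();
    // Illustrative metadata-only finalize writes (keys are made up):
    meter.record_set(b"dbhead/actor-123", &[0u8; 64]);
    meter.record_set(b"commit/txid-42", &[0u8; 32]);
    assert!(
        meter.bytes < FINALIZE_BUDGET_BYTES,
        "finalize wrote {} bytes, budget is {}",
        meter.bytes,
        FINALIZE_BUDGET_BYTES
    );
}
```

If US-061 moves PIDX writes to commit_stage time, this assertion keeps holding unchanged; if PIDX writes land in finalize instead, the budget assertion has to be updated to reflect the new split, as the US-061 criteria note.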

scripts/ralph/progress.txt

Lines changed: 30 additions & 0 deletions
@@ -259,3 +259,33 @@ Follow-up levers (NOT part of US-048 or US-055):

 Per-variant engine-side commit phase histograms could not be cleanly attributed because the `/metrics` histogram has been accumulating across the full engine run (88 commits total in the current window, most from earlier work). For a clean per-variant Prometheus attribution, scrape `/metrics` before and after each run.
 ---
+
+### Debug vs release build comparison (captured 2026-04-16 ~16:35 PDT)
+
+Both the engine and rivetkit-native's `.node` default to DEBUG builds. Re-ran the exact same bench variants against release builds (`./target/release/rivet-engine start` + `pnpm build:force:release` for rivetkit-native).
+
+| Variant | Debug E2E | Release E2E | Release server | Speedup |
+|---------|-----------|-------------|----------------|---------|
+| Tiny 1 MiB (4096 × 256 B) | 334.6 ms | 97.1 ms | 92.8 ms | 3.4x |
+| Medium 1 MiB (256 × 4 KiB) | 158.2 ms | 27.1 ms | 22.5 ms | 5.8x |
+| One row 1 MiB (1 × 1 MiB) | 132.6 ms | 20.7 ms | 16.7 ms | 6.4x |
+| 5 MiB (1280 × 4 KiB) | 706-1128 ms | 112.9 ms | 107.7 ms | 6.3-10x |
+| Baseline RTT (noop) | 14 ms | 2.5 ms | - | 5.6x |
+
+Engine release per-phase speedup (debug avg / release avg, from `/metrics` sum/count):
+- `decode_request`: 5.7x
+- `meta_read`: 7.3x
+- `ltx_encode`: **19x** (CPU-heavy Rust work)
+- `pidx_read`: **21x** (tight FDB-read loop in Rust)
+- `udb_write`: 6.1x
+
+Key conclusions:
+- Release builds deliver 3.4-10x across the board. The earlier 30-50% estimate was low by an order of magnitude.
+- `ltx_encode` and `pidx_read` see the biggest gains because they run Rust-heavy loops that the Rust debug profile punishes most.
+- Per-statement NAPI overhead shrinks from ~50-100 µs (debug) to ~18-22 µs (release). Still a real cost proportional to statement count, but much smaller.
+- **A 5 MiB transactional commit now takes ~113 ms E2E on release**, which is production-viable. Debug numbers made the system look much worse than it is.
+
+IMPORTANT: run all perf baselines and Ralph-level benches against release binaries. Debug numbers will mislead future decisions on where optimization work is warranted. Consider updating `scripts/run/engine-rocksdb.sh` to default to `cargo run --release` when `RIVET_RELEASE=1` is set, or adding a `scripts/run/engine-rocksdb-release.sh` variant.
+
+Also confirmed (earlier assumption corrected): the 5 MB bench does NOT do mid-transaction spills. One `BEGIN...COMMIT` block produces ONE big commit (`le=4096` dirty-page bucket). The 9 extra commits observed in the metrics window are unrelated actor/lifecycle writes (noop warmup, migrations, metadata). SQLite's xSync-at-COMMIT behavior holds.
+---
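The per-phase speedups above are just windowed Prometheus histogram means: scrape `/metrics` before and after each run (as the accumulation note earlier in this file advises), then divide debug mean by release mean. A minimal sketch of that arithmetic, with made-up numbers for illustration:

```rust
// Mean phase latency over one scrape window: delta(sum) / delta(count)
// from a Prometheus histogram's _sum and _count series.
fn mean_from_window(sum_before: f64, count_before: u64, sum_after: f64, count_after: u64) -> f64 {
    (sum_after - sum_before) / (count_after - count_before) as f64
}

// Speedup = debug-window mean / release-window mean for the same phase.
fn speedup(debug_mean: f64, release_mean: f64) -> f64 {
    debug_mean / release_mean
}

fn main() {
    // Hypothetical ltx_encode windows: 10 commits per run in each build.
    let debug_mean = mean_from_window(0.0, 0, 9.5, 10); // 0.95 s mean
    let release_mean = mean_from_window(0.0, 0, 0.5, 10); // 0.05 s mean
    assert!((speedup(debug_mean, release_mean) - 19.0).abs() < 1e-6);
}
```

Using window deltas rather than raw `_sum / _count` is what avoids attributing the 88 pre-existing commits in the accumulated histogram to the current variant.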
