Draft
194 commits
fc84418
feat(sqlite): pitr & forking
NathanFlurry Apr 29, 2026
e73c6eb
feat: US-001 - Add new identifier types and branch records to pump/ty…
NathanFlurry Apr 29, 2026
930fa8e
feat: US-002 - Add namespace tier state, pin status, resolved version…
NathanFlurry Apr 29, 2026
c5c0b84
feat: US-003 - Add cold-tier and pointer-snapshot data types
NathanFlurry Apr 29, 2026
57b5f82
feat: US-004 - Add new FDB key-builder partitions in pump/keys.rs
NathanFlurry Apr 30, 2026
77a7e29
feat: US-005 - Add tunable constants for PITR + forking
NathanFlurry Apr 30, 2026
18fe2d3
feat: US-006 - Add error variants for PITR + fork operations
NathanFlurry Apr 30, 2026
b5e2a79
feat: US-007 - Migrate commit path to versioned SHARD writes
NathanFlurry Apr 30, 2026
0438924
feat: US-008 - Write VTX index entry on every commit
NathanFlurry Apr 30, 2026
54c7d16
feat: US-009 - Allocate root branch on first actor commit and migrate…
NathanFlurry Apr 30, 2026
6b07ec5
feat: US-010 - Allocate root namespace branch and write NSPTR on name…
NathanFlurry Apr 30, 2026
4e7ed9f
feat: US-011 - Implement derive_branch_at primitive
NathanFlurry Apr 30, 2026
6288c76
feat: US-012 - Implement derive_namespace_branch_at primitive
NathanFlurry Apr 30, 2026
1f7b9e2
feat: US-013 - Implement ensure_tier_at_least CAS helper for T0 -> T1…
NathanFlurry Apr 30, 2026
80c34b3
feat: US-014 - Implement fork_actor operation
NathanFlurry Apr 30, 2026
144855b
feat: US-015 - Implement fork_namespace operation
NathanFlurry Apr 30, 2026
de5dee1
feat: US-066 - Flatten tier model
NathanFlurry Apr 30, 2026
c9a5389
feat: US-016 - Implement resolve_actor_pointer with lazy ns-branch pa…
NathanFlurry Apr 30, 2026
e9a0208
feat: US-017 - Implement rollback_actor operation with branch freezing
NathanFlurry Apr 30, 2026
dfb5f4c
feat: US-018 - Implement rollback_namespace operation
NathanFlurry Apr 30, 2026
0f2f3a7
feat: US-019 - Add bookmark wire format parsing/formatting helpers
NathanFlurry Apr 30, 2026
9769d18
feat: US-020 - Implement create_bookmark (ephemeral) and bookmark_sta…
NathanFlurry Apr 30, 2026
ac01f6b
feat: US-021 - Implement resolve_bookmark with two-level fork-chain b…
NathanFlurry Apr 30, 2026
a09817f
feat: US-022 - Implement create_pinned_bookmark (Pending) with bk_pin…
NathanFlurry Apr 30, 2026
904d3c7
feat: US-023 - Implement delete_pinned_bookmark with pin recompute sc…
NathanFlurry Apr 30, 2026
5bc46bb
feat: US-024 - Implement restore_to_bookmark wrapper (rollback then a…
NathanFlurry Apr 30, 2026
dc3120e
feat: US-025 - Implement flattened ancestry cache on Db
NathanFlurry Apr 30, 2026
939b7b8
feat: US-026 - Implement cache invalidation contract on commit and ge…
NathanFlurry Apr 30, 2026
5da8496
feat: US-027 - Implement get_pages walking ancestry chain with PIDX/S…
NathanFlurry Apr 30, 2026
c9e822c
feat: US-028 - Implement /META/head_at_fork read on fresh forks until…
NathanFlurry Apr 30, 2026
2db5865
feat: US-029 - Implement list_databases with NSCAT walk + cap-by-vers…
NathanFlurry Apr 30, 2026
2e1f12a
feat: US-031 - Hot compactor enforces MAX_SHARD_VERSIONS_PER_SHARD wi…
NathanFlurry Apr 30, 2026
f3dae0b
feat: US-032 - Hot compactor writes last_hot_pass_txid sub-key
NathanFlurry Apr 30, 2026
cd395a4
feat: US-033 - Hot compactor universal hot-tier retention sweep
NathanFlurry Apr 30, 2026
c19d501
feat: US-034 - Implement throttled access-touch on ActorDb
NathanFlurry Apr 30, 2026
e2cdd5e
feat: US-035 - Add ColdTier trait with Filesystem/S3 implementations
NathanFlurry Apr 30, 2026
ee452d4
feat: US-036 - Cold compactor scaffold (Standalone service + UPS subs…
NathanFlurry Apr 30, 2026
43d3361
feat: US-037 - Cold compactor Phase A (pre-tx pending marker registra…
NathanFlurry Apr 30, 2026
b1d1d59
feat: US-038 - Cold compactor Phase B
NathanFlurry Apr 30, 2026
0385da1
feat: US-039 - Cold compactor Phase C (FDB write tx with OCC fence)
NathanFlurry Apr 30, 2026
65dbb1e
feat: US-040 - Cold compactor handles pinned-bookmark create messages…
NathanFlurry Apr 30, 2026
318607c
feat: US-041 - Cold compactor follow-up sweep deletes orphaned S3 obj…
NathanFlurry Apr 30, 2026
3392a7c
feat: US-042 - Cold-tier read fall-through on FDB miss
NathanFlurry Apr 30, 2026
c997944
feat: US-043 - Eviction compactor scaffold (Standalone service + glob…
NathanFlurry Apr 30, 2026
7194950
feat: US-044 - Implement eviction predicate (3-gate: hot window + des…
NathanFlurry Apr 30, 2026
886170b
feat: US-045 - Eviction OCC fence on last_hot_pass_txid
NathanFlurry Apr 30, 2026
05b3f4b
feat: US-046 - Eviction removes branch from index on full evict
NathanFlurry Apr 30, 2026
526cb66
feat: US-047 - Implement GC pin formula and pass logic
NathanFlurry Apr 30, 2026
4891dd6
feat: US-051 - Implement burst-mode hot quota cap
NathanFlurry Apr 30, 2026
234452b
feat: US-053 - Cold compactor handles fork-warmup UPS messages
NathanFlurry Apr 30, 2026
6307f17
feat: US-054 - Add debug APIs
NathanFlurry Apr 30, 2026
a8bb9d9
feat: US-055 - Add Prometheus metrics for PITR + forking
NathanFlurry Apr 30, 2026
66145db
feat: US-056 - Add fork_actor and fork_namespace integration tests
NathanFlurry Apr 30, 2026
aa2bb1f
feat: US-058 - Add bookmarks integration tests (ephemeral, pinned, pa…
NathanFlurry Apr 30, 2026
791f67a
feat: US-059 - Add cold compactor integration tests (Phase A/B/C orch…
NathanFlurry Apr 30, 2026
73057e1
feat: US-060 - Add eviction compactor integration tests
NathanFlurry Apr 30, 2026
f3a50db
feat: US-061 - Add GC + list_actors integration tests
NathanFlurry Apr 30, 2026
79bb29f
feat: US-062 - Add critical fault-injection tests from spec section 21
NathanFlurry Apr 30, 2026
5b38a4f
feat: US-063 - Create docs-internal/engine/sqlite/ folder with five docs
NathanFlurry Apr 30, 2026
a8afdb3
feat: US-064 - Update sqlite-storage CLAUDE.md and engine CLAUDE.md t…
NathanFlurry Apr 30, 2026
9d33aca
feat: US-065 - Rename 'actor' storage entity to 'database' across sql…
NathanFlurry Apr 30, 2026
fe2ae7e
chore: rename to depot
NathanFlurry May 1, 2026
08d1762
chore: sqlite comapctor wf
NathanFlurry May 1, 2026
65906b6
feat: US-001 - Add workflow compaction constants and FDB key helpers
NathanFlurry May 1, 2026
ae7d4fd
feat: US-002 - Add versioned payload types for compaction metadata
NathanFlurry May 1, 2026
f9f6b04
feat: US-003 - Teach readers the new published manifest fallback
NathanFlurry May 1, 2026
225ac8e
feat: US-004 - Add workflow signal and durable state types
NathanFlurry May 1, 2026
a6286e9
feat: US-005 - Create persistent branch workflow skeletons
NathanFlurry May 1, 2026
08cb0c1
feat: US-006 - Wire dirty marker admission and DeltasAvailable
NathanFlurry May 1, 2026
beab95b
feat: US-007 - Implement manager FDB refresh and hot-job planning
NathanFlurry May 1, 2026
37b8d1e
feat: US-008 - Build staged hot SHARD output
NathanFlurry May 1, 2026
3bec703
feat: US-009 - Publish latest-head hot compaction output
NathanFlurry May 1, 2026
3bab453
feat: US-010 - Add direct DB history pins to hot planning
NathanFlurry May 1, 2026
7e94f57
feat: US-011 - Implement FDB DELTA reclaim after coverage proof
NathanFlurry May 1, 2026
ff5947c
feat: US-012 - Resolve namespace-derived pins before deletion
NathanFlurry May 1, 2026
28bbb56
feat: US-013 - Implement cold upload and manager publish
NathanFlurry May 1, 2026
7f16b14
feat: US-014 - Add retired cold-object grace deletion
NathanFlurry May 1, 2026
5141a22
feat: US-015 - Add branch destruction workflow lifecycle
NathanFlurry May 1, 2026
1898729
feat: US-016 - Add repair logging and orphan cleanup paths
NathanFlurry May 1, 2026
3f17cc5
feat: US-017 - Add force-compaction and wait-idle workflow test hook
NathanFlurry May 1, 2026
926bdf7
feat: US-018 - Add depot workflow end-to-end tests using force compac…
NathanFlurry May 1, 2026
5b616ae
feat: US-019 - Remove legacy compactor coordination mechanisms
NathanFlurry May 1, 2026
b6a9d1f
feat: US-020 - Add end-to-end workflow compaction coverage
NathanFlurry May 1, 2026
91ce776
feat: US-021 - Move SQLite and depot inline Rust tests into tests dir…
NathanFlurry May 1, 2026
0b9212c
feat: US-022 - Resolve pinned bookmark Ready semantics
NathanFlurry May 1, 2026
5473c9c
feat: US-023 - Delete unused BranchStopState Stopping variant
NathanFlurry May 1, 2026
8381009
feat: US-024 - Delete unused BranchState Deleted variant
NathanFlurry May 1, 2026
81a2609
feat: US-025 - Remove gas vbare wrappers from workflow payloads
NathanFlurry May 1, 2026
b4b7ce8
feat: US-026 - Remove or implement udb scan_prefix_values stub
NathanFlurry May 1, 2026
945efd0
feat: US-027 - Fix empty bk_pin recompute sentinel
NathanFlurry May 1, 2026
f6c2dd0
feat: US-028 - Add fork-vs-GC retention OCC fence
NathanFlurry May 1, 2026
b64b213
feat: US-029 - Prevent stale boundary SHARD pages after truncate
NathanFlurry May 1, 2026
654c0b4
feat: US-030 - Use exact-value cleanup for truncate PIDX and SHARD rows
NathanFlurry May 1, 2026
eca57fb
feat: US-031 - Decrement parent refcount when reaping branch
NathanFlurry May 1, 2026
3e75897
feat: US-032 - Gate cold tier by Rivet config and preserve FDB source…
NathanFlurry May 1, 2026
eec6a7c
feat: US-033 - Cleanup cold upload objects after rejected publish
NathanFlurry May 1, 2026
bd1271b
feat: US-034 - Fence repair reclaim activities by lifecycle generation
NathanFlurry May 1, 2026
eeb78a3
feat: US-035 - Clarify forced reclaim semantics
NathanFlurry May 1, 2026
67508e7
feat: US-036 - Use SHA-256 for compaction input fingerprints
NathanFlurry May 1, 2026
5bbfe10
feat: US-037 - Bound reclaim commit-prefix scan by txid ceiling
NathanFlurry May 1, 2026
b21e091
feat: US-038 - Remove async-path parking_lot cache locks from Db
NathanFlurry May 1, 2026
b52a150
feat: US-039 - Add actor_id fields to workflow compaction logs
NathanFlurry May 1, 2026
eb158ef
feat: US-040 - Replace anyhow macro usage in depot hot paths
NathanFlurry May 1, 2026
dc7f02f
feat: US-041 - Use typed NoSuchKey detection for S3 cold tier
NathanFlurry May 1, 2026
9c3b16d
feat: US-042 - Pass real node id into FaultyColdTier metrics
NathanFlurry May 1, 2026
8b200aa
feat: US-043 - Gate workflow compaction test hooks with cfg test
NathanFlurry May 1, 2026
f6bd0bb
feat: US-032 - Split workflow compaction file by workflow
NathanFlurry May 1, 2026
32d45fb
feat: US-033 - Extract workflow manager phases into functions
NathanFlurry May 1, 2026
3a0b2d0
feat: US-034 - Type compaction manager active job lanes
NathanFlurry May 1, 2026
6c37ca0
feat: US-035 - Type planned compaction jobs by lane
NathanFlurry May 1, 2026
980386d
feat: US-036 - Convert compaction manager to effect-driven orchestration
NathanFlurry May 1, 2026
3c91f25
feat: US-037 - Add branch-id accessors to compaction workflow signals
NathanFlurry May 1, 2026
a59203a
feat: US-038 - Move cold-storage config decisions out of workflow hel…
NathanFlurry May 1, 2026
823d052
feat: US-039 - Encapsulate force-compaction workflow state
NathanFlurry May 1, 2026
42b46fc
feat: US-040 - Require companion workflow IDs in manager state
NathanFlurry May 1, 2026
0abe9a8
feat: US-041 - Model manager stop reasons explicitly
NathanFlurry May 1, 2026
ada2a7a
feat: US-042 - Simplify companion workflow runner by kind-specific ha…
NathanFlurry May 1, 2026
80dbe63
feat: US-043 - Split conveyer types into domain submodules
NathanFlurry May 1, 2026
b4ed941
feat: US-044 - Split conveyer read into phase submodules
NathanFlurry May 1, 2026
4dd8626
feat: US-045 - Split conveyer branch into operation submodules
NathanFlurry May 1, 2026
3ca29b8
feat: US-046 - Split conveyer commit into phase submodules
NathanFlurry May 1, 2026
23da74a
feat: US-047 - Split conveyer bookmark into lifecycle submodules
NathanFlurry May 1, 2026
7e429c5
feat: US-048 - Split cold tier implementations into submodules
NathanFlurry May 1, 2026
7bb9b6b
feat: US-061 - Add pinned bookmark transition tests
NathanFlurry May 1, 2026
f58ab90
feat: US-062 - Test crash between cold pin upload and bookmark flip
NathanFlurry May 1, 2026
a219f74
feat: US-063 - Test manager generation bump during active job
NathanFlurry May 1, 2026
c6dedaa
feat: US-064 - Test hot OCC abort after concurrent commit
NathanFlurry May 1, 2026
a68470a
feat: US-065 - Replace compactor dispatch real-clock sleep
NathanFlurry May 1, 2026
f48fe43
feat: US-066 - Replace conveyer commit absence sleep
NathanFlurry May 1, 2026
8a8f8ac
feat: US-067 - Consolidate workflow compaction wait helpers
NathanFlurry May 1, 2026
588eb4c
feat: US-068 - Use error-chain matching in fork test helper
NathanFlurry May 1, 2026
eb948a9
feat: US-069 - Consolidate depot test helper modules
NathanFlurry May 1, 2026
1f12e97
feat: US-070 - Migrate SQLite VFS off legacy compact_default_batch
NathanFlurry May 1, 2026
d7bca26
feat: US-071 - Drop Ups from depot Db and conveyer APIs
NathanFlurry May 1, 2026
f176eae
feat: US-072 - Move compactor metric labels to non-legacy owners
NathanFlurry May 1, 2026
acece65
feat: US-073 - Delete legacy depot compactor modules and tests
NathanFlurry May 1, 2026
d60204f
feat: US-074 - Delete legacy-inline-tests cargo feature
NathanFlurry May 1, 2026
822fa6c
feat: US-075 - Delete legacy database branch record decoding
NathanFlurry May 1, 2026
8b6216e
feat: US-076 - Delete legacy storage scope read fallback
NathanFlurry May 1, 2026
5c5695a
feat: US-077 - Remove legacy materialized txid fallback from commit path
NathanFlurry May 1, 2026
70e6fb3
feat: US-078 - Remove dead_code allow from Db struct
NathanFlurry May 1, 2026
e6e9076
feat: US-079 - Rename bookmarks to restore points
NathanFlurry May 1, 2026
4a88df3
feat: US-080 - Rename Depot-local namespace terms to bucket
NathanFlurry May 1, 2026
267ed0d
feat: US-081 - Add PITR and shard cache policy storage
NathanFlurry May 1, 2026
065f6b4
feat: US-082 - Add PITR interval coverage records
NathanFlurry May 1, 2026
8af8c86
feat: US-083 - Select interval PITR coverage during hot planning
NathanFlurry May 1, 2026
6ea48ae
feat: US-084 - Implement restore target resolution
NathanFlurry May 1, 2026
7527331
feat: US-085 - Implement restore point CRUD
NathanFlurry May 1, 2026
845e372
feat: US-086 - Wire fork and restore to snapshot selectors
NathanFlurry May 1, 2026
01ecede
feat: US-087 - Make reclaim honor interval PITR retention
NathanFlurry May 1, 2026
9045e3f
feat: US-088 - Finalize cold-disabled FDB durability
NathanFlurry May 1, 2026
a7d00e5
feat: US-089 - Implement shard cache eviction policy
NathanFlurry May 1, 2026
d085be2
feat: US-090 - Add background read-through shard cache fill
NathanFlurry May 1, 2026
8bd6258
feat: US-091 - Add shard cache metrics
NathanFlurry May 1, 2026
b672b17
feat: US-092 - Add end-to-end PITR timestamp coverage tests
NathanFlurry May 1, 2026
b397dd7
feat: US-093 - Add end-to-end dual-purpose shard cache tests
NathanFlurry May 1, 2026
0c5beef
feat: US-094 - Clean up docs and static terminology checks
NathanFlurry May 1, 2026
30bb0dd
chore(rivetkit): remove setpreventsleep from tests
NathanFlurry May 1, 2026
e0e94e2
fix(rivetkit): prevent sqlite access after shtudown
NathanFlurry May 1, 2026
b944831
feat: US-001 - S1 — Replace set_last_error with clear_last_error in c…
NathanFlurry May 1, 2026
6f896f3
feat: US-002 - S2 — vfs_delete on main DB resets in-memory state or r…
NathanFlurry May 1, 2026
e4b7aa3
feat: US-003 - S35 — Rename SQLITE_RESTORE_POINT_COUNT_PER_NAMESPACE …
NathanFlurry May 1, 2026
fbb5bc7
feat: US-004 - S33 — Remove dead per-commit read of branch_manifest_c…
NathanFlurry May 1, 2026
2e86145
feat: US-005 - S34 — Delete dead delete_expired_pitr_interval_coverag…
NathanFlurry May 1, 2026
df899c8
feat: US-006 - S36 — Remove dead takeover scan of legacy database-sco…
NathanFlurry May 1, 2026
06f0b1f
feat: US-007 - S5 — Document xSync durability contract on io_sync
NathanFlurry May 1, 2026
6b66f5b
feat: US-008 - S8 — Add regression test for embedded NUL in SQL string
NathanFlurry May 1, 2026
ad919eb
feat: US-009 - S7 — Re-verify whether seed_main_page is dead, then cl…
NathanFlurry May 1, 2026
3e60a42
feat: US-010 - S3 drop write lock around moka cache insert in resolve…
NathanFlurry May 1, 2026
b29edb0
feat: US-011 - S19 — Move test fixtures out of vfs.rs into tests/inli…
NathanFlurry May 1, 2026
62e705b
feat: US-012 - S10 — Restructure depot/workflows/ to one-module-per-w…
NathanFlurry May 1, 2026
5a98762
feat: US-013 - S9 add VFS crash-recovery dirty-buffer test
NathanFlurry May 1, 2026
957ea16
feat: US-014 - S9 — Add VFS test for concurrent reader during commit-…
NathanFlurry May 1, 2026
738f6da
feat: US-015 - S9 — Add VFS test for PITR-restore-then-write
NathanFlurry May 1, 2026
4eb38a6
feat: US-016 - S9 — Add VFS test for fork-and-immediately-reopen
NathanFlurry May 1, 2026
a21de51
feat: US-017 - S26 — Diagnose RocksDB driver-build flake and remove t…
NathanFlurry May 1, 2026
8746f2d
feat: US-018 - S25 — Make VFS registration panic-safe via a Drop guard
NathanFlurry May 1, 2026
8234f0b
feat: US-019 - S13 — Wrap NativeDatabase Drop block_on with a bounded…
NathanFlurry May 1, 2026
9c5748c
feat: US-020 - S18 — Replace shared mpsc::Receiver with multi-consume…
NathanFlurry May 1, 2026
081857f
feat: US-021 - S15 — Make Db::new takeover reconcile non-blocking
NathanFlurry May 1, 2026
5d4594d
feat: US-022 - S22 — Replace timing-magnitude assertion with Promise-…
NathanFlurry May 1, 2026
9cf1cb1
feat: US-023 - S24 — Add connection.ready Promise to TS client
NathanFlurry May 1, 2026
4702a09
feat: US-024 - S21 — Replace startup vi.waitFor with await connection…
NathanFlurry May 1, 2026
304d74d
feat: US-025 - S20 — Replace action-polling sleep observer with non-a…
NathanFlurry May 1, 2026
28ab642
feat: US-026 - S11 — Pre-arm Notified before re-checking outstanding …
NathanFlurry May 1, 2026
3292381
feat: US-027 - S12 — Collapse three-RwLock branch-cache invalidation …
NathanFlurry May 2, 2026
4e483ed
feat: US-028 - S17 — Cold-tier post-fetch revalidation prevents silen…
NathanFlurry May 2, 2026
2b9d5b9
feat: US-029 - S30a — Add VFS/depot test for SQLite sparse-page zero-…
NathanFlurry May 2, 2026
d334ede
feat: US-030 - S29 — Restore-point creation: re-validate target txid …
NathanFlurry May 2, 2026
1141c82
feat: US-031 - S30b — restore_database: fold rollback + undo restore-…
NathanFlurry May 2, 2026
038c55e
feat: US-032 - S32 — Add shard-cache eviction regression test for pin…
NathanFlurry May 2, 2026
ac1ff3e
feat: US-033 - S23a add Gasoline WorkflowCreated bump subject
NathanFlurry May 2, 2026
5e238b4
feat: US-034 - S23b — Use WorkflowCreated subscription in workflow_co…
NathanFlurry May 2, 2026
96d19e7
feat: US-035 - Add Depot inspect API with big metadata JSON and pagin…
NathanFlurry May 2, 2026
71 changes: 71 additions & 0 deletions .agent/notes/depot-pitr-compactor-wf-review.md
@@ -0,0 +1,71 @@
# Review: `04-29-feat_sqlite_pitr_forking` + `04-30-chore_sqlite_comapctor_wf`

## Overview

Two stacked PRs reshaping the SQLite storage backend (renamed from `sqlite-storage` to `depot`):

- **`04-29-feat_sqlite_pitr_forking`** (~35k LoC, 50+ commits) — Per-database branches, fork/restore primitives, bookmarks (ephemeral + pinned, two-phase), namespace forks, S3 cold tier, GC pin recompute, burst-mode quota.
- **`04-30-chore_sqlite_comapctor_wf`** (~13k LoC) — Reimplements compaction as a Gasoline workflow. New per-branch DB manager workflow + hot/cold/reclaimer companions, `CMP/root` manifest, global pin/proof/dirty indexes (partitions `0x70..=0x75`), `lifecycle_generation` field, `CompactionSignaler` injected into `Db`, 5076-line `workflows/compaction.rs`. Per CLAUDE.md, this workflow is the active compaction authority.

---

## High-Priority Correctness Issues

### Conveyer

1. **Pinned bookmarks bypass the two-phase protocol entirely.** `bookmark.rs:207, 413` write `PinStatus::Ready` synchronously. The two-phase contract requires the request tx to write `Pending`, set `SQLITE_CMP_DIRTY` (or signal the manager workflow via the `CompactionSignaler` already injected into `Db`), and have the cold companion + manager flip `Pending → Ready` after the S3 pin layer is uploaded. As written, no S3 pin layer is ever built and `pin_object_key` is permanently `None`.
2. **Empty `bk_pin` recompute permanently locks the key at zero.** `bookmark.rs:685` writes `[0;16]` when no pins remain. Subsequent `MutationType::ByteMin` on the same key sees `min(0, new) = 0`, so the next pinned bookmark on this branch never advances `bk_pin`.
3. **Fork-vs-GC OCC fence missing.** `branch.rs:525 derive_branch_at` reads `bk_pin` only; CLAUDE.md mandates a regular-read of parent `META/manifest.retention_pin_txid` so concurrent GC aborts the fork.
4. **Truncate cleanup off-by-one for fully-above-EOF SHARDs.** `commit.rs:567` and `takeover.rs:142` use `>` instead of `>=`. A whole-shard truncate (e.g. `new_db_size_pages = 64`) leaves shard 1 alive.
5. **PIDX/SHARD deletes during truncate use `clear`, not `COMPARE_AND_CLEAR`.** `commit.rs:272-274`. CLAUDE.md explicitly requires `COMPARE_AND_CLEAR` here; a plain `clear` clobbers a concurrent compactor write.
6. **`load_branch_ancestry` off-by-one against `MAX_FORK_DEPTH`.** Loop is `0..=MAX_FORK_DEPTH` (17 iterations); `derive_branch_at:534` caps with `>=`. End-to-end allows reading 17 levels.
7. **Branch reap leaks parent refcount.** `gc/mod.rs:155-197 sweep_unreferenced_branch_tx` clears the branch keys but never decrements the parent's `branches_refcount_key`, which is set in `derive_branch_at:591-595`.
8. **Cold tier disabled when commit signaler is wired.** `db.rs:156-173 new_with_compaction_signaler` passes `cold_tier: None`; the only caller is the new envoy path (`pegboard-envoy/src/ws_to_tunnel_task.rs:730-756`). `read.rs:495` short-circuits when cold_tier is None — fork descendants and historical reads break under the new path.
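
Finding 2 can be reproduced with a small in-memory model. This is a sketch, not the depot code: `byte_min`, `pin`, and both `recompute_*` functions are hypothetical names, the 16-byte layout is illustrative, and the model assumes that a byte-min mutation on an absent key stores the operand, which is how FoundationDB documents its byte-min atomic operation. Clearing the key is only one possible fix.

```rust
/// Encodes a txid into a 16-byte pin value (illustrative layout only:
/// big-endian txid in the trailing 8 bytes).
fn pin(txid: u64) -> [u8; 16] {
    let mut out = [0u8; 16];
    out[8..].copy_from_slice(&txid.to_be_bytes());
    out
}

/// Simplified model of a byte-wise atomic-min mutation on one key.
/// Assumption: an absent key (`None`) takes the operand as-is.
fn byte_min(existing: Option<[u8; 16]>, operand: [u8; 16]) -> Option<[u8; 16]> {
    match existing {
        None => Some(operand),
        // Arrays compare lexicographically, byte by byte.
        Some(current) => Some(std::cmp::min(current, operand)),
    }
}

/// Behavior described in the finding: write an all-zero value when no pins remain.
fn recompute_with_zero_sentinel(remaining_pins: &[[u8; 16]]) -> Option<[u8; 16]> {
    Some(remaining_pins.iter().copied().min().unwrap_or([0u8; 16]))
}

/// Hypothetical alternative: leave the key absent when no pins remain.
fn recompute_with_clear(remaining_pins: &[[u8; 16]]) -> Option<[u8; 16]> {
    remaining_pins.iter().copied().min()
}
```

With no pins left, `byte_min(recompute_with_zero_sentinel(&[]), pin(42))` stays at all zeros, while `byte_min(recompute_with_clear(&[]), pin(42))` yields `pin(42)`.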

### Compactor workflow

9. **S3 leak when cold publish rejects after upload.** `workflows/compaction.rs:868-895` clears `active_cold_job` without GC'ing the already-uploaded objects. `schedule_stale_cold_output_cleanup` (line 909) only fires for *different* active jobs, not for failed publish of the same job.
10. **Repair reclaim activities skip the lifecycle generation fence.** `compaction.rs:1652, 1666` set `base_lifecycle_generation: 0`; `cleanup_repair_fdb_outputs_tx` (line 2462) and `DeleteOrphanColdObjects` (line 3022) never call `branch_record_is_live_at_generation`. A repair queued before destroy can run during/after destroy and delete S3 objects.
11. **Force-compaction reclaim flag silently dropped.** `plan_reclaim_job` takes `_force: bool` and never reads it (`compaction.rs:3868-3873`). Forced reclaim with no actionable lag falls through to `force_noop_reasons`. Hot/cold honor force; reclaim does not.
12. **Staged-hot install no-op when only DB-pin coverage was staged.** `compaction.rs:1957-1968` requires every PIDX row to be covered by `latest_staged_shards`, but `selected_hot_coverage_txids` only covers DB pins above `hot_watermark_txid`. With a single pin at `head_txid` and no intermediate pins, install rejects with "missing staged hot shard".
13. **`mix_fingerprint` is a hand-rolled non-cryptographic combiner** (`compaction.rs:4058-4066`) used as the OCC fence on `CompactionInputFingerprint`. `sha2::Digest` is already imported (line 11). Collision-resistant hashing matters here because the fingerprint guards "active job identity" through the publish path.
14. **Reclaim commit-prefix scan uses `continue` instead of `break`** (`compaction.rs:3559-3572`). Txids are ascending big-endian; the rest of the scan is wasted on every reclaim batch.
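
The cost described in finding 14 can be illustrated with a toy scan over ordered keys. This is a minimal sketch under assumptions: `commit_rows`, `scan_with_continue`, `scan_with_break`, and the `ceiling` parameter are hypothetical names, and the rows are modeled as a `BTreeSet` of 8-byte big-endian txids rather than real FDB keys.

```rust
use std::collections::BTreeSet;

/// Builds `count` commit rows keyed by big-endian txid (1..=count).
fn commit_rows(count: u64) -> BTreeSet<[u8; 8]> {
    (1..=count).map(u64::to_be_bytes).collect()
}

/// Skips rows above the ceiling but keeps walking the whole range.
/// Returns (selected txids, rows visited).
fn scan_with_continue(rows: &BTreeSet<[u8; 8]>, ceiling: u64) -> (Vec<u64>, usize) {
    let mut selected = Vec::new();
    let mut visited = 0;
    for key in rows {
        visited += 1;
        let txid = u64::from_be_bytes(*key);
        if txid > ceiling {
            continue; // later rows are even larger, so this never matches again
        }
        selected.push(txid);
    }
    (selected, visited)
}

/// Stops at the first row above the ceiling, relying on ascending key order.
fn scan_with_break(rows: &BTreeSet<[u8; 8]>, ceiling: u64) -> (Vec<u64>, usize) {
    let mut selected = Vec::new();
    let mut visited = 0;
    for key in rows {
        visited += 1;
        let txid = u64::from_be_bytes(*key);
        if txid > ceiling {
            break;
        }
        selected.push(txid);
    }
    (selected, visited)
}
```

Both variants select the same txids. Over ten rows with a ceiling of 3, the `continue` variant visits all ten rows and the `break` variant visits four.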

---

## Convention Violations

- **`Db` cache fields use `parking_lot::Mutex`** (`db.rs:107-123, 193-203`). All called from async contexts — no forced-sync requirement. Should be `tokio::sync::Mutex`. The `Mutex<scc::HashMap>` wrapping is doubly wrong — drop the outer mutex.
- **Tracing logs in `workflows/compaction.rs` lack the `actor_id` field** the engine convention requires (e.g. `2503-2511, 3064-3074, 3084-3093`); only `database_branch_id` is included.
- **`anyhow::anyhow!` macro** sprinkled across `commit.rs:677, 724, 738`, `quota.rs:60, 82`, `burst_mode.rs:80`, `takeover.rs:174-232`. Convention is `.context()` / `Error::msg`.
- **`S3ColdTier::get_object` matches NoSuchKey by `err.to_string().contains("NoSuchKey")`** (`cold_tier/mod.rs:311`). Use the typed `is_no_such_key()` SDK API.
- **`FaultyColdTier` records metrics under hardcoded `"unknown"` node_id** (`cold_tier/mod.rs:13, 480`).
- **Test hook `lazy_static!` + `parking_lot::Mutex<Option<…>>`** in `workflows/compaction.rs:51-54` is gated by `cfg(debug_assertions)` rather than `cfg(test)` — shipped in dev builds.
- **Inline `#[cfg(test)] mod tests` in `conveyer/types.rs:1123`** (not feature-gated). Move to `tests/`.
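
The NoSuchKey point can be sketched with a stand-in error type. Everything here is hypothetical: `GetObjectError` below is a local enum for illustration, not the AWS SDK type, and the messages are invented. The sketch only shows why substring matching on a rendered error is weaker than a typed predicate.

```rust
use std::fmt;

/// Hypothetical stand-in for an object-store error (not the SDK type).
#[derive(Debug)]
enum GetObjectError {
    NoSuchKey,
    AccessDenied { key: String },
}

impl GetObjectError {
    /// Typed predicate: true only for the missing-key variant.
    fn is_no_such_key(&self) -> bool {
        matches!(self, GetObjectError::NoSuchKey)
    }
}

impl fmt::Display for GetObjectError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GetObjectError::NoSuchKey => write!(f, "NoSuchKey: the specified key does not exist"),
            GetObjectError::AccessDenied { key } => write!(f, "AccessDenied for key {}", key),
        }
    }
}

/// String-based detection: depends on how the error happens to render.
fn is_missing_by_string(err: &GetObjectError) -> bool {
    err.to_string().contains("NoSuchKey")
}

/// Typed detection: depends only on the error variant.
fn is_missing_typed(err: &GetObjectError) -> bool {
    err.is_no_such_key()
}
```

An `AccessDenied` error for a key named `reports/NoSuchKey.txt` is treated as a missing object by the string check but not by the typed check.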

---

## Test Coverage

**Strengths:** Real UDB via `test_db()`, `FilesystemColdTier` for cold, fault injection via `FaultyColdTier` + custom wrapper tiers. Force-compaction tests correctly wait on durable `force_compaction_results` (`workflow_compaction_skeletons.rs:1502, 1562, 1638, 1712, 1826`). OCC race tests use clean atomic-bool-guarded retry hooks (`fork_database.rs:99`, `fork_namespace.rs:145`).

**Gaps:**
- **No test for the workflow `Pending → Ready` flip on a pinned bookmark.** Given finding #1, this contract isn't asserted anywhere — neither the success path nor the `Pending → Failed` path on cold upload failure.
- **No test for crash between cold pin upload and the manager's bookmark-flip publish.**
- **No test for manager workflow generation bump mid-active-job** — only individual companion-side rejections (`workflow_compaction_skeletons.rs:1960, 2025, 2778, 3007`).
- **No test for hot OCC abort when a concurrent commit lands between hot-stage read and manager install.**
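
The transitions that the first gap refers to can be written down as a small model, which is roughly what such a test would assert. This is a sketch under assumptions: `PinnedBookmark`, `ColdUploadOutcome`, `create_pinned_bookmark`, and `apply_cold_outcome` are hypothetical names, and the real flip happens across the request transaction, the cold companion, and the manager rather than in one function.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum PinStatus {
    Pending,
    Ready,
    Failed,
}

#[derive(Debug, Clone, PartialEq)]
struct PinnedBookmark {
    status: PinStatus,
    pin_object_key: Option<String>,
}

/// Result of the (modeled) cold pin-layer upload.
enum ColdUploadOutcome {
    Uploaded { object_key: String },
    Failed,
}

/// Phase one: the request path only records a Pending pin.
fn create_pinned_bookmark() -> PinnedBookmark {
    PinnedBookmark { status: PinStatus::Pending, pin_object_key: None }
}

/// Phase two: the upload outcome decides the terminal state.
/// Bookmarks that are already terminal are left unchanged.
fn apply_cold_outcome(bookmark: PinnedBookmark, outcome: ColdUploadOutcome) -> PinnedBookmark {
    match (bookmark.status, outcome) {
        (PinStatus::Pending, ColdUploadOutcome::Uploaded { object_key }) => PinnedBookmark {
            status: PinStatus::Ready,
            pin_object_key: Some(object_key),
        },
        (PinStatus::Pending, ColdUploadOutcome::Failed) => PinnedBookmark {
            status: PinStatus::Failed,
            pin_object_key: None,
        },
        _ => bookmark,
    }
}
```

In this model a bookmark reaches `Ready` only together with a populated `pin_object_key`, which is the property finding #1 says the synchronous write never satisfies.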

**Quality concerns:**
- **`conveyer_commit.rs:760`** does a 1ms wall-clock absence check. Replace with explicit yield/notify.
- **17 polling helpers in `workflow_compaction_skeletons.rs:125-605`** each implement their own `loop { sleep 25ms }` waiting for durable workflow state. These are not retry-until-success masks (each polls a real state condition), but they are real-clock and duplicative; a single deterministic helper would save ~250 LoC and CI time.
- **`fork_common/mod.rs:157-165 assert_storage_error`** uses `err.downcast_ref` (top-level only), while `bookmarks.rs:114` and `gc_pin_recompute_under_bookmark_delete_race.rs:16` use `err.chain().any(...)`. Inconsistent — context-wrapped errors silently pass.
- **Helper sprawl**: `fault_common`, `fork_common`, `bookmarks.rs:30`, `takeover.rs:13` all redefine `test_db`, `make_db`, `read_value` instead of sharing a single helper module.
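
The difference between a top-level downcast and a chain walk can be shown with the standard library alone. This is a minimal sketch: `StorageError` and `WithContext` are hypothetical stand-ins, and the chain walk uses `std::error::Error::source` in place of the `err.chain()` call the tests use.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical domain error a test wants to assert on.
#[derive(Debug)]
struct StorageError(&'static str);

impl fmt::Display for StorageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "storage error: {}", self.0)
    }
}

impl Error for StorageError {}

/// Hypothetical wrapper that adds context around an inner error.
#[derive(Debug)]
struct WithContext {
    context: &'static str,
    inner: Box<dyn Error + 'static>,
}

impl fmt::Display for WithContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.context)
    }
}

impl Error for WithContext {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.inner)
    }
}

/// Checks only the outermost error.
fn matches_top_level(err: &(dyn Error + 'static)) -> bool {
    err.downcast_ref::<StorageError>().is_some()
}

/// Walks the `source()` chain and checks every link.
fn matches_anywhere_in_chain(err: &(dyn Error + 'static)) -> bool {
    let mut current: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = current {
        if e.downcast_ref::<StorageError>().is_some() {
            return true;
        }
        current = e.source();
    }
    false
}
```

For a `StorageError` wrapped in `WithContext`, only the chain walk finds the inner error, so helpers that mix the two styles can disagree on the same failure.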

---

## Notable Strengths

- `BookmarkStr` newtype validates the 33-char wire format at construction *and* deserialization (`types.rs:69-118`). `MutationType::ByteMin` correctly applied for branch pin atomic-min everywhere it should be.
- `read.rs` correctly caps ancestor PIDX reads by per-source `versionstamp_cap`, falls through to latest SHARD ≤ cap, and disables the PIDX cache when the read plan has multiple branches (`read.rs:144-147, 174, 285-294`).
- Cold object retire → grace → `DeleteIssued` → S3 delete → cleanup-Deleted sequence in the workflow (`compaction.rs:4715-4759`) faithfully implements the "completed retired record so the key cannot be republished" invariant.
- `resolve_namespace_fork_pins` proof walk (`compaction.rs:3214-3394`) materializes `DB_PIN(NamespaceFork)` records before reclaim and treats missing/ambiguous proof as a retention blocker, matching the constraint.
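
The "latest SHARD ≤ cap" fall-through credited to `read.rs` above is, at its core, an ordered lookup. A minimal sketch, assuming versions are plain `u64` values in a `BTreeMap`; `shard_versions` and `latest_at_or_below` are hypothetical names, and the real read path works on FDB key ranges and versionstamps.

```rust
use std::collections::BTreeMap;

/// Builds a version -> payload map from (version, marker byte) pairs.
fn shard_versions(entries: &[(u64, u8)]) -> BTreeMap<u64, Vec<u8>> {
    entries.iter().map(|&(version, marker)| (version, vec![marker])).collect()
}

/// Returns the newest version at or below `cap`, if any exists.
fn latest_at_or_below(versions: &BTreeMap<u64, Vec<u8>>, cap: u64) -> Option<(u64, &[u8])> {
    versions
        .range(..=cap)
        .next_back() // the range is ascending, so the last entry is the newest
        .map(|(version, bytes)| (*version, bytes.as_slice()))
}
```

A read capped at 25 over versions 10, 20, and 30 resolves to version 20, and a cap below the oldest version resolves to nothing, which is where a reader would fall through to an ancestor or another tier.
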
51 changes: 51 additions & 0 deletions .agent/notes/depot-tier-test-matrix-results.md
@@ -0,0 +1,51 @@
# Depot Tier Test Matrix Results

Run at: 2026-05-01T15:28:56-07:00

Command:

```bash
cargo test -p depot
```

Result: passed.

Summary:

```text
lib unit tests: 28 passed
burst_mode: 1 passed
cold_tier: 4 passed
conveyer_branch: 19 passed
conveyer_commit: 15 passed
conveyer_compaction_payloads: 7 passed
conveyer_constants: 1 passed
conveyer_error: 2 passed
conveyer_keys: 13 passed
conveyer_page_index: 4 passed
conveyer_pitr_interval: 2 passed
conveyer_policy: 4 passed
conveyer_quota: 4 passed
conveyer_read: 19 passed
conveyer_restore_point: 26 passed
debug: 3 passed
fork_bucket: 4 passed
fork_common_helpers: 1 passed
fork_database: 6 passed
gc: 4 passed
gc_pin_recompute_under_restore_point_delete_race: 1 passed
list_databases: 3 passed
restore_points: 5 passed
takeover: 1 passed
workflow_compaction_payloads: 1 passed
workflow_compaction_skeletons: 69 passed
doc tests: 0 passed
```

Exit status: `0`

Notes:

- The first combined run failed to compile because existing Depot source edits had dropped `anyhow::Error` from `conveyer/commit/helpers.rs` and borrowed temporary `tx.informal()` handles across `try_join!` in `burst_mode.rs`.
- Those compile blockers were fixed narrowly, then the command above passed.
- `workflow_compaction_skeletons.rs` now runs its tier-agnostic manager, hot compaction, PITR, and reclaim cases through a local disabled/filesystem workflow matrix. Cold-only and cold-disabled assertions remain single-mode by design.
109 changes: 25 additions & 84 deletions .agent/notes/driver-test-progress.md
@@ -1,91 +1,32 @@
# Driver Test Suite Progress

Started: 2026-04-26T14:05:00-07:00
Started: 2026-05-01
Config: registry (static), client type (http), encoding (bare)
Scope: DB driver tests only

## Fast Tests
## DB Tests

- [x] manager-driver | Manager Driver Tests
- [x] actor-conn | Actor Connection Tests
- [x] actor-conn-state | Actor Connection State Tests
- [x] conn-error-serialization | Connection Error Serialization Tests
- [x] actor-destroy | Actor Destroy Tests
- [x] request-access | Request Access in Lifecycle Hooks
- [x] actor-handle | Actor Handle Tests
- [x] action-features | Action Features
- [x] access-control | access control
- [x] actor-vars | Actor Variables
- [x] actor-metadata | Actor Metadata Tests
- [x] actor-onstatechange | Actor onStateChange Tests
- [ ] actor-db | Actor Database
- [ ] actor-db-raw | Actor Database Raw Tests
- [ ] actor-db-init-order | Actor DB Init Order Tests
- [ ] actor-workflow | Actor Workflow Tests
- [ ] actor-error-handling | Actor Error Handling Tests
- [ ] actor-queue | Actor Queue Tests
- [ ] actor-kv | Actor KV Tests
- [ ] actor-stateless | Actor Stateless Tests
- [ ] raw-http | raw http
- [ ] raw-http-request-properties | raw http request properties
- [ ] raw-websocket | raw websocket
- [ ] actor-inspector | Actor Inspector Tests
- [ ] gateway-query-url | Gateway Query URL Tests
- [ ] actor-db-pragma-migration | Actor Database Pragma Migration
- [ ] actor-state-zod-coercion | Actor State Zod Coercion
- [ ] actor-save-state | Actor Save State Tests
- [ ] actor-conn-status | Connection Status Changes
- [ ] gateway-routing | Gateway Routing
- [ ] lifecycle-hooks | Lifecycle Hooks
- [ ] serverless-handler | Serverless Handler Tests

## Slow Tests

- [ ] actor-state | Actor State Tests
- [ ] actor-schedule | Actor Schedule Tests
- [ ] actor-sleep | Actor Sleep Tests
- [ ] actor-sleep-db | Actor Sleep Database Tests
- [ ] actor-lifecycle | Actor Lifecycle Tests
- [ ] actor-conn-hibernation | Actor Connection Hibernation Tests
- [ ] actor-run | Actor Run Tests
- [ ] hibernatable-websocket-protocol | hibernatable websocket protocol
- [ ] actor-db-stress | Actor Database Stress Tests

## Excluded

- [ ] actor-agent-os | Actor agentOS Tests (skip unless explicitly requested)
- [x] actor-db | Actor Database
- [x] actor-db-raw | Actor Database Raw Tests
- [x] actor-db-pragma-migration | Actor Database Pragma Migration
- [x] actor-sleep-db | Actor Sleep Database Tests
- [x] actor-db-stress | Actor Database Stress Tests
- [x] actor-db-init-order | Actor DB Init Order

## Log

- 2026-04-26T14:06:57-07:00 manager-driver: PASS

- 2026-04-26T14:07:27-07:00 actor-conn: PASS

- 2026-04-26T14:07:37-07:00 actor-conn-state: PASS

- 2026-04-26T14:07:42-07:00 conn-error-serialization: PASS

- 2026-04-26T14:08:14-07:00 actor-destroy: PASS

- 2026-04-26T14:08:19-07:00 request-access: PASS

- 2026-04-26T14:08:31-07:00 actor-handle: PASS

- 2026-04-26T14:08:31-07:00 action-features: PASS

- 2026-04-26T14:08:46-07:00 access-control: PASS

- 2026-04-26T14:08:51-07:00 actor-vars: PASS

- 2026-04-26T14:08:58-07:00 actor-metadata: PASS

- 2026-04-26T14:08:59-07:00 actor-onstatechange: PASS

- 2026-04-26T14:10:59-07:00 actor-db: FAIL (exit 124)

- 2026-04-26T14:12:00-07:00 runner: stale suite-description filters found for action-features, actor-onstatechange, actor-db, gateway-query-url, and likely other renamed suites; switching to per-file bare filter.

- 2026-04-26T14:12:54-07:00 action-features: PASS (bare file filter)

- 2026-04-26T14:12:59-07:00 actor-onstatechange: PASS (bare file filter)

- 2026-04-26T14:17:33-07:00 actor-db: FAIL (exit 1, bare file filter)
- 2026-05-01 12:45:05 PDT actor-db: FAIL - 4 failures in static/bare run. First failing test reproduced standalone: `persists across sleep and wake cycles` returned count 0 instead of 1 after sleep/wake.
- 2026-05-01 13:02:09 PDT actor-db: PASS (13 passed, 26 skipped, 25.4s). Fixed VFS persisted page-1 bootstrap, hot-only sparse page reads, and actor2 serverful reallocate transition ordering.
- 2026-05-01 13:02:30 PDT actor-db-raw: PASS (5 passed, 10 skipped, 4.5s).
- 2026-05-01 13:02:52 PDT actor-db-pragma-migration: PASS (4 passed, 8 skipped, 4.0s).
- 2026-05-01 13:10:52 PDT actor-sleep-db: PASS (14 passed, 58 skipped, 59.1s). Fixed sleep DB fixture hold behavior and made sqlite cleanup terminal for stale actor-context instances.
- 2026-05-01 13:11:48 PDT actor-db-stress: PASS (3 passed, 27.8s).
- 2026-05-01 13:12:41 PDT actor-db-init-order: PASS (6 passed, 12 skipped, 7.4s).
- 2026-05-01 13:13:02 PDT DB TESTS COMPLETE - 6/6 DB file groups passed for static/bare.
- 2026-05-01 14:22:27 PDT DB TESTS RERUN STARTED - static/bare.
- 2026-05-01 14:23:06 PDT actor-db rerun: PASS (13 passed, 26 skipped, 23.7s).
- 2026-05-01 14:23:25 PDT actor-db-raw rerun: PASS (5 passed, 10 skipped, 5.3s).
- 2026-05-01 14:24:37 PDT actor-db-pragma-migration rerun: PASS (4 passed, 8 skipped, 53.4s).
- 2026-05-01 14:25:56 PDT actor-sleep-db rerun: PASS (14 passed, 58 skipped, 64.0s).
- 2026-05-01 14:27:04 PDT actor-db-stress rerun: PASS (3 passed, 28.7s).
- 2026-05-01 14:28:00 PDT actor-db-init-order rerun: PASS (6 passed, 12 skipped, 7.9s).
- 2026-05-01 14:28:04 PDT DB TESTS RERUN COMPLETE - 6/6 DB file groups passed for static/bare.
4 changes: 2 additions & 2 deletions .agent/notes/rivetkit-core-walkthrough.md
@@ -257,7 +257,7 @@ File tags: `0x00` main DB, `0x01` rollback journal, `0x02` WAL, `0x03` SHM. The

**v2 slow path.** Large commits that don't fit the one-shot path encode delta blocks as full LTX v3 frames and stuff them directly under the DELTA chunk keys. There is no `/STAGE` prefix, no fixed one-chunk-per-page mapping. A chunk key may contain a raw 4 KiB page *or* an LTX frame; the v3 decoder handles both.

-**The parity invariant.** The native Rust VFS and the WASM TypeScript VFS (`rivetkit-typescript/packages/sqlite-wasm/src/vfs.ts` + `kv.ts`) must be byte-for-byte identical: same chunk size, same key encoding, same PRAGMA settings, same delete-range strategy for truncate, same journal mode. When you change one, change the other in the same commit. A database written by the Rust VFS must be readable by the WASM VFS.
+**Native-only.** The VFS lives only in `rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs`. There is no runtime WASM/TypeScript VFS — the `@rivetkit/sqlite-wasm` npm package is deprecated and the Rust crate is statically linked into `@rivetkit/rivetkit-napi` via `libsqlite3-sys`. Per-VFS rules (4 KiB chunks, `journal_mode=DELETE`, `locking_mode=EXCLUSIVE`, `auto_vacuum=NONE`) live in this one source.

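The per-VFS rules named above can be written down as a small illustrative sketch. This is not the contents of `vfs.rs`: the names `CHUNK_SIZE`, `PRAGMAS`, `pragma_statements`, and `chunk_range` are hypothetical, and the byte-offset-to-chunk mapping is only the plain arithmetic implied by a fixed 4 KiB chunk size, not a confirmed description of how the VFS lays out keys.

```rust
/// Chunk size stated in the walkthrough: 4 KiB. The constant name is hypothetical.
const CHUNK_SIZE: u64 = 4096;

/// PRAGMA settings listed in the walkthrough. The table layout is an assumption.
const PRAGMAS: [(&str, &str); 3] = [
    ("journal_mode", "DELETE"),
    ("locking_mode", "EXCLUSIVE"),
    ("auto_vacuum", "NONE"),
];

/// Render the PRAGMA table as SQL statements (hypothetical helper).
fn pragma_statements() -> Vec<String> {
    PRAGMAS
        .iter()
        .map(|(name, value)| format!("PRAGMA {name} = {value};"))
        .collect()
}

/// Inclusive range of chunk indices touched by an access of `len` bytes
/// starting at byte `offset`, assuming fixed-size chunks.
/// Returns `None` for a zero-length access or on arithmetic overflow.
fn chunk_range(offset: u64, len: u64) -> Option<(u64, u64)> {
    if len == 0 {
        return None;
    }
    let last_byte = offset.checked_add(len - 1)?;
    Some((offset / CHUNK_SIZE, last_byte / CHUNK_SIZE))
}
```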
---

@@ -393,7 +393,7 @@ Consolidated from the above:
1. KV internal prefixes (`[1]`, `[2]*`, `[5]*`, `[0x08]*`) are reserved. No runtime enforcement. User writes into them corrupt the actor.
2. `enqueue_and_wait` completion waits ignore the actor abort token. Breaking this breaks hibernation.
3. Queue metadata rebuilds by full-scan on decode failure. Slow, safe, never lose messages.
-4. Native SQLite VFS and WASM SQLite VFS must match byte-for-byte. Chunk size, key layout, PRAGMAs, truncate strategy, journal mode.
+4. SQLite VFS is native-only (`rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs`). Chunk size, key layout, PRAGMAs, truncate strategy, journal mode all live here.
5. Raw `onRequest` HTTP bypasses message-size limits. Action and queue routes do not.
6. Static native actor HTTP flows through `RegistryDispatcher::handle_fetch`, not `actor/event.rs`. Sleep-timer fixes need both entry points.
7. WebSocket message/close callbacks run inline under the callback guard, not as dispatch events.