Commit 5db51bb

Phase 6: schema, system catalog, DDL, constraint enforcement; txn write-job generalization
1 parent 7786c75 commit 5db51bb

21 files changed

Lines changed: 3143 additions & 124 deletions

CHANGELOG.md

Lines changed: 41 additions & 0 deletions
@@ -8,6 +8,47 @@ under a category (`Added` / `Changed` / `Fixed` / `Removed` / `Security`).
 
 ## [Unreleased]
 
+### Phase 6 — Schema, catalog & constraints
+
+#### Added
+- `catalog`: strict typed tables over the transaction layer — the schema
+  model (`TableDef`/`ColumnDef`: types, single/composite PK, NOT NULL,
+  UNIQUE, CHECK, DEFAULT values + `now`/`uuid_v7` generators, auto-increment,
+  rowversion, `on_update: now`, `update: free | guarded`) with definition
+  validation, persisted in the in-file system catalog.
+- The system catalog **is** the published-root B+tree (`DECISIONS.md` D14):
+  per table a schema entry, a data-root entry, and an auto-increment sequence
+  entry — one commit covers schema + data atomically and one snapshot pins
+  both consistently.
+- DDL as ordinary write transactions: `create_table`, `drop_table` (frees the
+  whole table tree via deferred reclamation), `add_column` (nullable or
+  constant default; old rows padded lazily on read — D17).
+- Row DML with full constraint enforcement on insert and update; multi-row
+  `insert_many` is atomic (in-batch PK/UNIQUE collisions included); typed
+  per-constraint errors mapped to the `SPEC.md` §9 taxonomy.
+- Provisional `CheckExpr` (comparisons + boolean combinators, SQL 3VL: NULL
+  passes — D15) and provisional scan-based UNIQUE probes until Phase 7's
+  unique indexes (D16).
+- Engine-managed values under the writer: auto-increment (durable sequence,
+  no reuse across reopen), rowversion (1, then +1 per update), `now` /
+  `uuid_v7` defaults and `on_update: now` driven by the injected clock/RNG.
+- Exit-criteria tests: create/reopen/inspect round-trip, one violation test
+  per constraint, guarded persisted/readable, generated values under a
+  manual clock, multi-row atomicity, snapshot consistency across DDL+DML.
+
+#### Changed
+- `txn`: write transactions generalized to a `WriteJob` trait run against a
+  `WriteCtx` (multi-tree edits, typed post-commit outputs); `Db<B>` is now an
+  alias for `JobDb<B, OpsJob>` (API unchanged); the writer classifies job
+  errors by category (`Io`/`Corruption` fatal, others reject the one
+  transaction); new `TxnError::Rejected` carries a higher layer's typed
+  error across the writer thread; `Snapshot` gains `root`/`get_in`/
+  `range_in`/`scan_in` for reading trees under the pinned root.
+- `btree`: `BTree::pages` collects every page of a tree (drop-table
+  reclamation).
+- `types`: `UuidV7Gen` now owns `Arc<dyn Clock>`/`Arc<dyn Rng>` so generator
+  state can span transactions.
+
 ### Phase 5 — Types, values & encoding
 
 #### Added
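
The changelog entry for the provisional `CheckExpr` says that evaluation follows SQL three-valued logic and that NULL passes. The following is a minimal sketch of that rule, not the crate's actual code: the `Tri` and `Expr` types, the column-by-index addressing, and the `eval`/`check_passes` helpers are all hypothetical stand-ins, and rows are reduced to `Option<i64>` cells for brevity.

```rust
/// Three-valued truth: SQL comparisons against NULL yield `Unknown`.
/// (Hypothetical type, used only for this illustration.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tri {
    True,
    False,
    Unknown,
}

/// A simplified stand-in for a CHECK expression tree (assumed shape).
enum Expr {
    /// Column at the given index is less than an integer literal.
    Lt(usize, i64),
    /// Column at the given index is NULL.
    IsNull(usize),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
}

/// Evaluate an expression over a row whose cells are `None` when NULL.
fn eval(e: &Expr, row: &[Option<i64>]) -> Tri {
    match e {
        Expr::Lt(col, lit) => match row[*col] {
            Some(v) => {
                if v < *lit {
                    Tri::True
                } else {
                    Tri::False
                }
            }
            // Comparing NULL with anything is unknown.
            None => Tri::Unknown,
        },
        // IS NULL is never unknown: it answers definitively.
        Expr::IsNull(col) => {
            if row[*col].is_none() {
                Tri::True
            } else {
                Tri::False
            }
        }
        Expr::And(a, b) => match (eval(a, row), eval(b, row)) {
            (Tri::False, _) | (_, Tri::False) => Tri::False,
            (Tri::True, Tri::True) => Tri::True,
            _ => Tri::Unknown,
        },
        Expr::Or(a, b) => match (eval(a, row), eval(b, row)) {
            (Tri::True, _) | (_, Tri::True) => Tri::True,
            (Tri::False, Tri::False) => Tri::False,
            _ => Tri::Unknown,
        },
        Expr::Not(a) => match eval(a, row) {
            Tri::True => Tri::False,
            Tri::False => Tri::True,
            Tri::Unknown => Tri::Unknown,
        },
    }
}

/// A CHECK is violated only when it is definitively false.
fn check_passes(e: &Expr, row: &[Option<i64>]) -> bool {
    eval(e, row) != Tri::False
}
```

Under these assumptions, `CHECK (c0 < 10)` accepts a row where `c0` is NULL, while `c0 < 10 AND c1 < 0` still rejects a row with `c0` NULL and `c1 = 5`, because unknown AND false is false.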

Cargo.lock

Lines changed: 2 additions & 0 deletions
Some generated files are not rendered by default.

DECISIONS.md

Lines changed: 68 additions & 0 deletions
@@ -5,6 +5,74 @@ Per `PLAN.md` §1 rule 6, every resolution of an ambiguity or deviation from
 
 ---
 
+## D17 — Write-path semantics the SPEC leaves open
+
+**Phase:** 6 · **Status:** accepted
+
+`SPEC.md` §4 defines the constraint set but not every edge of the write path.
+Phase 6 fixes these (each is a one-line change later if v2 decides otherwise):
+
+- **PK columns are immutable** — an update touching one is rejected
+  (`PkImmutable`). Change-of-key is delete + insert.
+- **Engine-managed columns reject explicit writes**: `rowversion` and
+  `on_update: now` columns can never be set by the caller.
+- **`rowversion` starts at 1** on insert and bumps by 1 on every update.
+- **`on_update: now` also stamps on insert** (an `updated_at` is never NULL).
+- **Explicit values on auto-increment columns are allowed**; the sequence
+  advances past them (`seq = max(seq, given + 1)`), so generated keys never
+  collide. The sequence is a catalog entry — durable, gaps allowed, no reuse
+  after crash/reopen.
+- **`add column`** requires nullable or a *constant* default; existing rows
+  are padded lazily on read (generators cannot backfill), no rewrite.
+- **UNIQUE ignores NULLs** (multiple NULLs allowed, SQL semantics).
+
+## D16 — UNIQUE enforced by a scan probe until Phase 7
+
+**Phase:** 6 · **Status:** accepted (provisional)
+
+`SPEC.md` §4.1 says UNIQUE is "enforced via a unique index", but `index` is
+Phase 7. Phase 6 enforces the constraint **correctly but provisionally** with
+a full-table scan probe (one scan per write batch). Phase 7's unique indexes
+replace the probe with an index lookup; behavior is unchanged, only cost.
+
+## D15 — Provisional CHECK expressions with SQL three-valued logic
+
+**Phase:** 6 · **Status:** accepted (provisional)
+
+`SPEC.md` §4.1 allows `CHECK(<expr>)` over the row, but the expression
+language (§5.2) arrives with the query layer in Phases 8–9.
+
+**Decision:** Phase 6 ships a minimal `CheckExpr` — column-vs-literal
+comparisons, `and`/`or`/`not`, `is_null`/`is_not_null` — validated at DDL
+(columns exist, literal kinds match) and stored in the catalog. Evaluation
+follows SQL 3VL: a CHECK is violated only when it is definitively **false**;
+NULL/unknown passes. The Phase 8/9 expression engine supersedes the enum
+(same precedent as Phase 3's provisional byte keys).
+
+## D14 — One published root: the catalog tree owns everything
+
+**Phase:** 6 · **Status:** accepted
+
+`ARCHITECTURE.md` §3.5 stores the system catalog "in B+trees referenced from
+the meta page" but leaves the multi-tree commit mechanics open.
+
+**Decision:**
+- The meta page's root **is the catalog B+tree**; every other tree hangs off
+  it. Per table: `("tbl", name)` → schema (changes on DDL), `("root", name)`
+  data-tree root (changes on every write), `("seq", name)` → auto-increment
+  cursor. A write to table T updates T's tree, then T's root entry, producing
+  one new catalog root — so one root install + one fsync pair commits schema
+  and data atomically, and a snapshot pins both consistently.
+- `txn` generalizes to **`WriteJob`/`WriteCtx`**: a job runs on the writer
+  thread, edits any tree under the root, and returns a typed output delivered
+  after durable commit. `Db<B>` is now an alias for `JobDb<B, OpsJob>` — the
+  Phase 4 API and tests are unchanged. Jobs inherit D8's validate-then-apply
+  contract; the writer classifies job errors **by category** (`Io`/
+  `Corruption` fatal, everything else rejects just that transaction, with the
+  root and freed-list restored defensively). Rejections cross the thread as
+  `TxnError::Rejected(Box<dyn CategorizedError>)` and downcast back to the
+  catalog's typed error at the API surface.
+
 ## D13 — No `u64` value type; `rowversion` columns are `i64`
 
 **Phase:** 5 · **Status:** accepted (SPEC inconsistency, raised)
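
D17's rules for engine-managed values are compact enough to sketch. The block below is an illustration under stated assumptions, not the catalog's implementation: `Sequence`, its starting value of 1, and `next_rowversion` are hypothetical names, and the durable `("seq", name)` catalog entry is replaced by a plain in-memory field.

```rust
/// Hypothetical in-memory stand-in for a table's auto-increment cursor.
/// In the commit this state lives in a durable catalog entry.
struct Sequence {
    /// The next value the engine would generate.
    next: i64,
}

impl Sequence {
    /// Assumption for the sketch: a fresh sequence starts at 1.
    fn new() -> Self {
        Sequence { next: 1 }
    }

    /// Generate a key: hand out the cursor, then advance it.
    fn generate(&mut self) -> i64 {
        let value = self.next;
        self.next += 1;
        value
    }

    /// Record an explicit caller-supplied value:
    /// seq = max(seq, given + 1), so later generated keys never collide.
    fn observe(&mut self, given: i64) {
        self.next = self.next.max(given.saturating_add(1));
    }
}

/// rowversion is 1 on insert (`None` = no previous version)
/// and increases by 1 on every update.
fn next_rowversion(current: Option<i64>) -> i64 {
    match current {
        None => 1,
        Some(v) => v + 1,
    }
}
```

An explicit value below the cursor leaves the sequence untouched, which is how gaps arise without any key being reused.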

crates/btree/src/tree.rs

Lines changed: 15 additions & 0 deletions
@@ -168,6 +168,21 @@ impl<'p, B: IoBackend> BTree<'p, B> {
         })
     }
 
+    /// Collect every page id reachable from `root` — the whole tree, in no
+    /// particular order. Used by callers that retire an entire tree (e.g.
+    /// `drop table`) and must hand all of its pages to deferred reclamation.
+    pub fn pages(&self, root: PageId) -> Result<Vec<PageId>> {
+        let mut out = Vec::new();
+        let mut stack = vec![root];
+        while let Some(id) = stack.pop() {
+            out.push(id);
+            if let Node::Internal { children, .. } = self.read_node(id)? {
+                stack.extend(children);
+            }
+        }
+        Ok(out)
+    }
+
     // --- internals -------------------------------------------------------
 
     fn read_node(&self, id: PageId) -> Result<Node> {
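
The traversal in `BTree::pages` can be tried in isolation. This is a self-contained sketch of the same explicit-stack walk over a toy in-memory page store; the `HashMap` store, the simplified `Node` enum, and the `Option` return type are assumptions made for the example and stand in for the real pager, node layout, and `Result` type.

```rust
use std::collections::HashMap;

/// Assumed for the sketch: page ids are plain integers.
type PageId = u64;

/// Simplified node: internal nodes list child pages, leaves have none.
enum Node {
    Internal { children: Vec<PageId> },
    Leaf,
}

/// Collect every page id reachable from `root`, in no particular order.
/// Returns `None` if a referenced page is missing from the store.
fn pages(store: &HashMap<PageId, Node>, root: PageId) -> Option<Vec<PageId>> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(id) = stack.pop() {
        out.push(id);
        // Only internal nodes contribute more pages to visit.
        if let Node::Internal { children } = store.get(&id)? {
            stack.extend(children.iter().copied());
        }
    }
    Some(out)
}
```

Using an explicit stack rather than recursion keeps the walk's depth independent of the call stack; the visit order is unspecified, so callers that compare results should sort first.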

crates/catalog/Cargo.toml

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@ description = "Schema model, system catalog, DDL, constraint enforcement (Phase
 
 [dependencies]
 common.workspace = true
+pager.workspace = true
+btree.workspace = true
 types.workspace = true
 txn.workspace = true
 thiserror.workspace = true