This file provides guidance to Codex (Codex.ai/code) when working with code in this repository.
SQLRite is a from-scratch SQLite-style embedded database written in Rust. It's published on crates.io as sqlrite-engine (imported as use sqlrite::… — the lib target keeps the short name) and ships as: a REPL binary (sqlrite), a Tauri 2 + Svelte 5 desktop app, a Model Context Protocol stdio server (sqlrite-mcp), a C FFI shim (sqlrite-ffi), and language SDKs (Python via PyO3, Node via napi-rs, Go via cgo, WASM via wasm-bindgen). Phases 1–7 are shipped; the current branch phase-8-plan drafts inverted-index + BM25 full-text search and hybrid retrieval.
Cargo.toml is a workspace whose members are: . (the engine, package sqlrite-engine, lib sqlrite), desktop/src-tauri, examples/desktop-journal/src-tauri, sqlrite-ffi, sqlrite-ask, sqlrite-mcp, sdk/python, sdk/nodejs, benchmarks. sdk/wasm and sdk/go are deliberately not workspace members (wasm32 target / cgo separation).
- src/: engine. Public API is Connection/Statement/Rows/Row/Value from src/connection.rs, re-exported via src/lib.rs. Any new SDK should bind only to this surface.
- sqlrite-ask/: pure-Rust LLM adapter (Anthropic/OpenAI/Ollama) for natural-language → SQL. The engine's ask feature provides the thin ConnectionAskExt::ask glue.
- sqlrite-mcp/: MCP stdio server. Seven tools: list_tables, describe_table, query, execute, schema_dump, vector_search, ask. --read-only opens with a shared lock and hides execute.
- sqlrite-ffi/: C ABI cdylib + generated sqlrite.h header. Backs the Go SDK and any C consumer.
- desktop/: Tauri 2 + Svelte 5 generic SQL playground GUI. Embeds the engine directly (no FFI hop).
- examples/desktop-journal/: Tauri 2 + Svelte 5 local-first journaling app (SQLR-41). Showcase for Phase 8 BM25 + ask in a non-AI-native product. Mirrors desktop/'s engine-as-Cargo-dep pattern but uses the modern Connection API.
- benchmarks/: SQLR-4 / SQLR-16 bench harness. Driver trait + SQLRite + SQLite (rusqlite-bundled) drivers + criterion-driven workloads. Excluded from the default CI build/test/clippy/doc commands; run locally with make bench (or make bench-duckdb). See docs/benchmarks-plan.md.
- web/: marketing + docs site (Next.js 15 + Tailwind v4). Independent of the Cargo workspace; lives in-repo for now but is structured to lift into its own repository later. See web/README.md.
Architecture deep-dive: docs/architecture.md. The full doc index is docs/_index.md.
SQL string → src/sql/mod.rs process_command parses with the external sqlparser crate (SQLite dialect) → src/sql/parser/ trims the AST into internal structs (CreateQuery, InsertQuery, SelectQuery) → src/sql/executor.rs runs the statement against the in-memory Database (src/sql/db/database.rs). On any write, auto-save serializes changed pages through src/sql/pager/ — 4 KiB pages, cell-encoded B-trees per table and index, WAL + crash-safe checkpoint, fs2 advisory locks. Vector search (Phase 7d) goes through src/sql/hnsw.rs; KNN uses a bounded-heap top-k in the executor. Transactions snapshot the in-memory state; ROLLBACK restores it. There is no query optimizer beyond the KNN/HNSW shortcut, no joins, no aggregates yet.
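The bounded-heap top-k mentioned for KNN can be pictured with a small, self-contained sketch. This is not the code in src/sql/executor.rs: the Candidate struct, the squared_l2 metric, and the top_k function are hypothetical names chosen for illustration. The idea is to keep a max-heap of at most k candidates and evict the farthest one whenever the heap grows past k, so memory stays O(k) regardless of table size.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical candidate: a row id plus its distance to the query vector.
#[derive(Debug, Clone, Copy)]
struct Candidate {
    row_id: u64,
    distance: f32,
}

impl Ord for Candidate {
    // Order by distance (ties broken by row id) so the max-heap root is
    // always the farthest candidate currently kept.
    fn cmp(&self, other: &Self) -> Ordering {
        self.distance
            .total_cmp(&other.distance)
            .then(self.row_id.cmp(&other.row_id))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

/// Squared Euclidean distance (an assumed metric for this sketch).
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns the k nearest rows, closest first, holding at most k + 1
/// candidates in memory at any time.
fn top_k(query: &[f32], rows: &[(u64, Vec<f32>)], k: usize) -> Vec<Candidate> {
    let mut heap: BinaryHeap<Candidate> = BinaryHeap::with_capacity(k + 1);
    for (row_id, vector) in rows {
        heap.push(Candidate {
            row_id: *row_id,
            distance: squared_l2(query, vector),
        });
        if heap.len() > k {
            heap.pop(); // evict the current farthest candidate
        }
    }
    heap.into_sorted_vec() // ascending by distance
}
```

Each row costs O(log k) heap work, so a full scan is O(N log k) rather than the O(N log N) of sorting every distance.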
CI is the source of truth — the workspace excludes that follow are required because the desktop crate needs a Svelte build first, the PyO3/napi-rs cdylibs can't link standalone test binaries, and the benchmarks/ harness deliberately stays out of CI (criterion is noisy on shared runners; the rusqlite-bundled build is heavy).
# Build / test the Rust workspace (matches CI)
cargo build --workspace --exclude sqlrite-desktop --exclude sqlrite-journal --exclude sqlrite-python --exclude sqlrite-nodejs --exclude sqlrite-benchmarks --all-targets
cargo test --workspace --exclude sqlrite-desktop --exclude sqlrite-journal --exclude sqlrite-python --exclude sqlrite-nodejs --exclude sqlrite-benchmarks
# Single test (the name is a substring filter; add --exact after -- to require the full test path; --nocapture to see println!)
cargo test <test_name> -- --nocapture
# Lint (CI runs all three)
cargo fmt --all -- --check
cargo clippy --workspace --exclude sqlrite-desktop --exclude sqlrite-journal --exclude sqlrite-python --exclude sqlrite-nodejs --exclude sqlrite-benchmarks --all-targets
cargo doc --workspace --exclude sqlrite-desktop --exclude sqlrite-journal --exclude sqlrite-python --exclude sqlrite-nodejs --exclude sqlrite-benchmarks --no-deps
# Run the REPL (default features include cli + ask + file-locks)
cargo run # in-memory
cargo run -- path/to/db.sqlrite # open/create file
cargo run -- --readonly path/to/db.sqlrite # shared-lock open
# Crate-specific
cargo build --release -p sqlrite-ffi # C cdylib + sqlrite.h
cd desktop && npm install && npm run tauri dev # desktop app dev mode
cargo run -p sqlrite-mcp -- /path/to.sqlrite # MCP server (stdio)
cd examples/desktop-journal && npm install && npm run tauri dev # journal example app
# Benchmarks (SQLR-4 / SQLR-16) — local-only, never CI
make bench # SQLRite + SQLite (lean)
make bench-duckdb # adds DuckDB driver (Group B only)
# Release plumbing
scripts/bump-version.sh 0.2.0 # bumps version across 11 manifests
SQLRITE_LLM_API_KEY is required for the .ask REPL command, the engine's ask feature, and the MCP ask tool.
Clippy is not -D warnings yet (intentional; see the top of .github/workflows/ci.yml); deny-by-default lints still fail CI.
- Errors. Single SQLRiteError enum (thiserror, six variants) with a project-wide Result<T> alias. All public APIs return typed errors; no panics. The enum hand-rolls PartialEq because std::io::Error doesn't implement it.
- Storage isn't bincode. Tables and indexes share a cell-encoded B-tree format with a 4 KiB page size; the file header carries a format version (currently v4 after the Phase 7a vector column work). The diff-based pager only writes changed pages. See docs/file-format.md and docs/pager.md.
- B-tree commit strategy. Bottom-up rebuild on every commit (O(N), correct-by-construction). No in-place splits; that is a deferred design decision.
- Feature gates matter. default = ["cli", "ask", "file-locks"]. The REPL [[bin]] has required-features = ["cli", "ask"]. WASM and lean library embeddings build with default-features = false to avoid rustyline / clap / fs2 / sqlrite-ask. Don't pull these into the always-on dependency set.
- Don't reinvent the SQL parser. sqlparser is the tokenizer and AST source; project code only narrows that AST. New SQL features start by mapping the existing sqlparser AST node, not by extending a custom grammar.
- Phase numbering is real. The roadmap is sequenced in docs/roadmap.md; design discussions live at github.com/sqlrite/design and feature work generally tracks an open phase plan in docs/phase-*-plan.md. Treat in-flight phase plans as load-bearing context.
- Concurrency. Engine mutates state through Arc<Mutex<_>> (Tauri-friendly). On-disk concurrency uses fs2 advisory locks: shared for readers, exclusive for the single writer.
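The Errors bullet above says the error enum hand-rolls PartialEq. A minimal std-only sketch of why that is needed and one way to do it follows. DemoError, its two variants, the DemoResult alias, and read_config are hypothetical names for illustration; the real SQLRiteError has six variants and uses thiserror, and comparing I/O errors by ErrorKind alone is an assumed policy, not necessarily the project's.

```rust
use std::fmt;
use std::io;

/// Hypothetical two-variant error standing in for the real six-variant enum.
#[derive(Debug)]
enum DemoError {
    General(String),
    Io(io::Error),
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DemoError::General(msg) => write!(f, "general error: {msg}"),
            DemoError::Io(err) => write!(f, "I/O error: {err}"),
        }
    }
}

impl std::error::Error for DemoError {}

impl From<io::Error> for DemoError {
    fn from(err: io::Error) -> Self {
        DemoError::Io(err)
    }
}

// io::Error has no PartialEq impl, so #[derive(PartialEq)] on DemoError
// would not compile. Assumed policy: two I/O errors are equal when their
// ErrorKind matches, ignoring the message.
impl PartialEq for DemoError {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            (DemoError::General(a), DemoError::General(b)) => a == b,
            (DemoError::Io(a), DemoError::Io(b)) => a.kind() == b.kind(),
            _ => false,
        }
    }
}

/// Hypothetical counterpart of a project-wide result alias.
type DemoResult<T> = std::result::Result<T, DemoError>;

/// Returns a typed error instead of panicking when the file is missing.
fn read_config(path: &str) -> DemoResult<String> {
    Ok(std::fs::read_to_string(path)?)
}
```

With PartialEq in place, tests can compare results directly with assert_eq! instead of matching on variants by hand.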
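The bottom-up rebuild named in the B-tree commit strategy bullet can be sketched in memory as follows. This is an illustration only, not the pager's code: the Node enum, the fanout parameter, and build_bottom_up are hypothetical, and the real format packs cells into 4 KiB pages rather than fixed-size key groups. The sketch packs sorted keys into leaves, then packs each level into parents until one root remains, which touches every key once and is therefore O(N).

```rust
/// Hypothetical in-memory node; the on-disk format stores cells in pages.
#[derive(Debug)]
enum Node {
    Leaf { keys: Vec<i64> },
    Interior { first_keys: Vec<i64>, children: Vec<Node> },
}

impl Node {
    /// Smallest key reachable from this node.
    fn first_key(&self) -> i64 {
        match self {
            Node::Leaf { keys } => keys[0],
            Node::Interior { first_keys, .. } => first_keys[0],
        }
    }

    /// Number of levels from this node down to the leaves, inclusive.
    fn height(&self) -> usize {
        match self {
            Node::Leaf { .. } => 1,
            Node::Interior { children, .. } => 1 + children[0].height(),
        }
    }
}

/// Rebuilds a whole tree from already-sorted keys. No node is ever split in
/// place: every level is produced by packing the level below it.
fn build_bottom_up(sorted_keys: &[i64], fanout: usize) -> Option<Node> {
    assert!(fanout >= 2, "fanout must be at least 2");
    if sorted_keys.is_empty() {
        return None;
    }

    // Level 0: pack keys into leaves of at most `fanout` keys each.
    let mut level: Vec<Node> = sorted_keys
        .chunks(fanout)
        .map(|chunk| Node::Leaf { keys: chunk.to_vec() })
        .collect();

    // Pack each level into parents until a single root is left.
    while level.len() > 1 {
        let mut parents = Vec::new();
        let mut children = level.into_iter().peekable();
        while children.peek().is_some() {
            let group: Vec<Node> = children.by_ref().take(fanout).collect();
            let first_keys = group.iter().map(Node::first_key).collect();
            parents.push(Node::Interior { first_keys, children: group });
        }
        level = parents;
    }
    level.pop()
}
```

Because the tree is regenerated from sorted input each time, its shape depends only on the data and the fanout, which is the sense in which a rebuild is correct by construction; the trade-off is paying O(N) on every commit.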
- Code: /Users/joaoh82/projects/rust_sqlite
- Context (read first): ~/Documents/josh-obsidian-synced/Projects/rust_sqlite/context.md
- Notes (running journal): ~/Documents/josh-obsidian-synced/Projects/rust_sqlite/notes.md
- Project wiki: ~/Documents/josh-obsidian-synced/Projects/rust_sqlite/wiki/
How to use each:
- context.md: stable background (product goals, stakeholders, domain). Read before starting non-trivial work. Update only when underlying facts change.
- notes.md: append-only dated journal. Add entries under ## YYYY-MM-DD headings for decisions, blockers, TODOs, and incidents (anything worth preserving but not stable enough for context.md).
- wiki/: reference sub-docs (e.g. Architecture.md, Local Dev Setup.md, Tech Services.md). Create new files as topics emerge.
When to save:
- New stable fact about the product/domain → update context.md.
- A decision, incident, or working note → append a dated entry to notes.md.
- Reusable reference material (setup steps, credential locations, architecture) → new/updated file in wiki/.
- General wiki: ~/Documents/josh-obsidian-synced/vault/wiki/. Start at _master-index.md, then drill into the relevant topic's _index.md.
- Raw dumps: ~/Documents/josh-obsidian-synced/vault/raw/. Drop unprocessed research here as YYYY-MM-DD-{slug}.md.
Read the general wiki when the question isn't specific to this project. Drop raw research or imported notes into vault/raw/ so it's captured even before it's distilled.