SQLite is a lightweight, embedded SQL database that provides zero-configuration storage for Pangolin's metadata.
SQLite stores all Pangolin catalog metadata in a single file with:
- Zero configuration required
- ACID compliance
- Full SQL support
- Extremely low resource usage
- Perfect for development and embedded deployments
- Zero Configuration: No separate database server needed
- Single File: Entire database in one file
- Lightweight: Minimal memory and CPU usage
- Fast: Excellent performance for read-heavy workloads
- Portable: Copy the file to move the database
- Embedded: Perfect for edge devices and embedded systems
- ACID Compliant: Full transaction support
- Foreign Keys: Referential integrity support
- No Network: No network latency
- Single Writer: Limited concurrent write performance
- No Replication: Manual backup/restore required
- File-Based: Not suitable for distributed systems
- Scalability: Limited to single machine
- No User Management: File-level permissions only
Best For:
- Local development
- Testing and CI/CD
- Embedded applications
- Edge/IoT deployments
- Single-user applications
- Prototyping
Not Ideal For:
- High-concurrency write workloads
- Distributed systems
- Multi-server deployments
- When you need replication
- Large-scale production (use PostgreSQL/MongoDB)
- None! SQLite is embedded in Pangolin
No installation required - SQLite is built into Pangolin.
# File-based database
DATABASE_URL=sqlite:///path/to/pangolin.db
# Relative path
DATABASE_URL=sqlite://./pangolin.db
# In-memory database (testing only, data lost on restart)
DATABASE_URL=sqlite::memory:
# Absolute path
DATABASE_URL=sqlite:////home/user/pangolin/data/pangolin.dbPangolin automatically initializes the SQLite schema on first startup using the bundled sqlite_schema.sql.
The comprehensive schema includes:
Core Infrastructure:
tenants- Multi-tenant isolation recordswarehouses- Storage configurations (S3, Azure, GCP)catalogs- Iceberg catalog definitionsnamespaces- Namespace hierarchies and propertiesassets- Table and view metadata pointersbranches- Git-like branch definitionstags- Immutable commit pointers (snapshots)commits- Detailed operation historymetadata_locations- Physical Iceberg metadata file path tracking
Governance & Security:
users- Root and Tenant user accountsroles- RBAC role definitionsuser_roles- Role assignments to userspermissions- Direct (TBAC/RBAC) permission grantsaccess_requests- Data discovery access workflowsaudit_logs- Comprehensive tamper-evident trailactive_tokens- Session tracking for active tokens
System & Maintenance:
business_metadata- Rich discovery metadata for assetssystem_settings- Tenant-specific configuration overridesfederated_sync_stats- Status tracking for cross-tenant federation
Indexes:
- Pragma-enforced Foreign Keys
- Primary keys on all tables
- Composite indexes for branch isolation (e.g.,
UNIQUE(tenant_id, catalog_name, branch_name, namespace_path, name)) - Composite indexes for common listing operations
# Download schema
curl -O https://raw.githubusercontent.com/AlexMercedCoder/Pangolin/main/pangolin/pangolin_store/sql/sqlite_schema.sql
# Apply schema
sqlite3 pangolin.db < sqlite_schema.sql-- Enable foreign keys (required)
PRAGMA foreign_keys = ON;
-- Performance optimizations
PRAGMA journal_mode = WAL; -- Write-Ahead Logging for better concurrency
PRAGMA synchronous = NORMAL; -- Balance between safety and speed
PRAGMA cache_size = -64000; -- 64MB cache
PRAGMA temp_store = MEMORY; -- Use memory for temp tables
PRAGMA mmap_size = 30000000000; -- 30GB memory-mapped I/O
PRAGMA page_size = 4096; -- 4KB pages (default)-- Database size
SELECT page_count * page_size as size FROM pragma_page_count(), pragma_page_size();
-- Table sizes
SELECT
name,
SUM(pgsize) as size
FROM dbstat
GROUP BY name
ORDER BY size DESC;
-- Index usage
EXPLAIN QUERY PLAN SELECT * FROM catalogs WHERE tenant_id = '...';# Simple file copy (database must be idle)
cp pangolin.db pangolin_backup.db
# Using SQLite backup command (safe while database is in use)
sqlite3 pangolin.db ".backup pangolin_backup.db"
# Dump to SQL
sqlite3 pangolin.db .dump > pangolin_backup.sql
# Restore from SQL
sqlite3 pangolin_new.db < pangolin_backup.sql#!/bin/bash
# backup_sqlite.sh
BACKUP_DIR="/backups/pangolin"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
DB_FILE="/path/to/pangolin.db"
mkdir -p $BACKUP_DIR
sqlite3 $DB_FILE ".backup $BACKUP_DIR/pangolin_$TIMESTAMP.db"
# Keep only last 7 days
find $BACKUP_DIR -name "pangolin_*.db" -mtime +7 -deleteSQLite does not support built-in replication. For high availability:
- Use PostgreSQL or MongoDB for production
- File-level replication: Use tools like Litestream
- Regular backups: Automated backup scripts
# Install Litestream
curl -s https://api.github.com/repos/benbjohnson/litestream/releases/latest \
| grep "browser_download_url.*linux-amd64.tar.gz" \
| cut -d : -f 2,3 \
| tr -d \" \
| wget -qi -
# Configure litestream.yml
replicas:
- url: s3://my-bucket/pangolin.db
# Run Litestream
litestream replicate pangolin.db-
File Permissions: Restrict access to database file
chmod 600 pangolin.db chown pangolin:pangolin pangolin.db
-
Encryption at Rest: Use filesystem encryption (LUKS, dm-crypt)
-
Backup Encryption: Encrypt backup files
sqlite3 pangolin.db .dump | gpg -c > pangolin_backup.sql.gpg
-
Network Security: SQLite has no network access (inherently secure)
# Export from SQLite
sqlite3 pangolin.db .dump > sqlite_dump.sql
# Convert to PostgreSQL (requires manual editing)
# Then import to PostgreSQL
psql -U pangolin -d pangolin -f converted_dump.sqlCustom migration script required.
# Check for locks
lsof | grep pangolin.db
# Force unlock (use with caution)
fuser -k pangolin.db# Check integrity
sqlite3 pangolin.db "PRAGMA integrity_check;"
# Recover from corruption
sqlite3 pangolin.db ".recover" | sqlite3 pangolin_recovered.db-- Analyze database
ANALYZE;
-- Vacuum to reclaim space
VACUUM;
-- Rebuild indexes
REINDEX;✅ All 6 SQLite tests passing:
test_sqlite_tenant_crudtest_sqlite_warehouse_crudtest_sqlite_catalog_crudtest_sqlite_namespace_operationstest_sqlite_asset_operationstest_sqlite_multi_tenant_isolation
# Start Pangolin with SQLite
DATABASE_URL=sqlite://./dev.db cargo run --bin pangolin_api
# Inspect database
sqlite3 dev.db
sqlite> .tables
sqlite> .schema catalogs
sqlite> SELECT * FROM tenants;
# Reset database
rm dev.db
# Restart Pangolin to recreate schema