
Commit 1308aa2

docs: sweep for staleness — mark shipped features, fix version targets
Audit found docs claiming Query/MSSQL/Redact/Graph were planned and integration deep-dives targeting versions one minor behind the renumbered roadmap. Changes:

- COMPETITIVE_ANALYSIS.md: header v1.9.0 -> v1.13.5, add MSSQL as 4th dialect, mark Query (v1.12.0), Graph/Order (v1.11.0), Redact (v1.10.0) shipped, add Enum/Migrate/DBML rows for v1.14-v1.16, fix the dialect count in conversion table (3 -> 4) and the Query/MSSQL rows in the comparison matrix.
- INTEGRATION_OPPORTUNITIES.md: rewrite the DuckDB section to reflect the actually-shipped query engine (was speculative pseudocode), drop the dual wrapper-vs-library implementation analysis (decision was made and shipped), renumber Recommended Integration Roadmap section (v1.16->v1.17 Parquet, +Atlas v1.19, dbt v1.20, GX v1.18), update graph command status to "Implemented in v1.11.0".
- ATLAS_INTEGRATION_DEEP_DIVE.md: v1.18 -> v1.19 (3 references).
- DBT_INTEGRATION_DEEP_DIVE.md: v1.19 -> v1.20 (3 references).
- GREAT_EXPECTATIONS_INTEGRATION_DEEP_DIVE.md: v1.17 -> v1.18 (4 refs).
- ENUM_CONVERSION.md: v1.13.0 -> v1.14.0.
- ADDITIONAL_IDEAS.md: drop Validate from "future ideas" (shipped v1.8.0), update Detect-PII status (partially covered by v1.10.0 redact --generate-config), refresh shared infrastructure table.
1 parent 1ccd6f4 commit 1308aa2

7 files changed

Lines changed: 106 additions & 191 deletions

docs/COMPETITIVE_ANALYSIS.md

Lines changed: 28 additions & 21 deletions
@@ -1,20 +1,21 @@
 # Competitive Analysis
 
-**Last Updated**: 2025-12-26
+**Last Updated**: 2026-05-07
 **Purpose**: Comprehensive competitive landscape and feature opportunity analysis
 
 ## Executive Summary
 
-sql-splitter occupies a **unique position** in the SQL dump processing ecosystem by combining multiple capabilities that currently require separate tools. As of v1.9.0, we offer: **split + merge + analyze + validate + sample (FK-preserving) + shard + convert + diff + redact**.
+sql-splitter occupies a **unique position** in the SQL dump processing ecosystem by combining multiple capabilities that currently require separate tools. As of v1.13.5, we offer: **split + merge + analyze + validate + sample (FK-preserving) + shard + convert + diff + redact + graph + order + query (DuckDB)**.
 
 No existing tool offers this combination in a single, streaming, CLI-first, multi-dialect binary.
 
 **Key differentiators:**
 
 - Works on dump files directly (no database connection required)
 - Streaming architecture handles 10GB+ dumps
-- Multi-dialect support (MySQL, PostgreSQL, SQLite)
+- Multi-dialect support (MySQL, PostgreSQL, SQLite, MSSQL)
 - 600+ MB/s throughput
+- Embedded DuckDB for SQL analytics on dumps without import
 
 ---
 
@@ -35,9 +36,14 @@ No existing tool offers this combination in a single, streaming, CLI-first, mult
 | Dialect conversion | ✅ Implemented | v1.7.0 |
 | Validate (integrity checks) | ✅ Implemented | v1.8.0 |
 | Diff dumps | ✅ Implemented | v1.9.0 |
-| Redaction/anonymization | ✅ Implemented | v1.9.0 |
-| Query/Filter (WHERE-style) | 🟡 Planned ||
-| MSSQL support | 🟡 Planned ||
+| Redaction/anonymization | ✅ Implemented | v1.10.0 |
+| Graph (ERD generation) | ✅ Implemented | v1.11.0 |
+| Order (topological FK ordering) | ✅ Implemented | v1.11.0 |
+| Query (DuckDB SQL analytics) | ✅ Implemented | v1.12.0 |
+| MSSQL support | ✅ Implemented | v1.12.x |
+| Enum type conversion (PG↔MySQL) | 🟡 Planned | v1.14.0 |
+| Migrate (schema migration generation) | 🟡 Planned | v1.15.0 |
+| DBML import/export | 🟡 Planned | v1.16.0 |
 
 ---
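The `order` command marked shipped above topologically sorts tables by FK dependencies so parent tables are emitted before the children that reference them. A stdlib-only sketch of the underlying idea (Kahn's algorithm; the function and table names are illustrative, not the tool's code):

```rust
use std::collections::{BTreeMap, VecDeque};

// Kahn's algorithm over FK edges (child references parent): parents are
// emitted before children so replayed INSERTs satisfy FK constraints.
// Illustrative sketch only, not taken from src/.
fn fk_order(tables: &[&str], fks: &[(&str, &str)]) -> Vec<String> {
    // in-degree = number of parents a table still waits on
    let mut indeg: BTreeMap<&str, usize> = tables.iter().map(|t| (*t, 0)).collect();
    let mut children: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(child, parent) in fks {
        *indeg.get_mut(child).unwrap() += 1;
        children.entry(parent).or_default().push(child);
    }
    let mut queue: VecDeque<&str> =
        tables.iter().copied().filter(|t| indeg[t] == 0).collect();
    let mut order = Vec::new();
    while let Some(t) = queue.pop_front() {
        order.push(t.to_string());
        for &c in children.get(t).into_iter().flatten() {
            let d = indeg.get_mut(c).unwrap();
            *d -= 1;
            if *d == 0 {
                queue.push_back(c);
            }
        }
    }
    order // shorter than tables.len() means an FK cycle remains
}
```

A result shorter than the table count signals an FK cycle, which real dumps typically resolve by deferring or temporarily disabling constraints.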

@@ -110,7 +116,7 @@ No existing tool offers this combination in a single, streaming, CLI-first, mult
 
 | Tool | Language | Stars | MySQL | PostgreSQL | SQLite | Streaming | Notes |
 | ----------------------- | -------- | ----- | ----- | ---------- | ------ | --------- | ----------------------------- |
-| **sql-splitter** | Rust |||||| v1.9.0 |
+| **sql-splitter** | Rust |||||| v1.10.0, ~230 MB/s |
 | **nxs-data-anonymizer** | Go | 271 ||||| Go templates + Sprig |
 | **pynonymizer** | Python | 109 ||||| Faker integration, GDPR focus |
 | **myanon** | C | ~30 ||||| stdin/stdout streaming |
@@ -130,7 +136,7 @@ No existing tool offers this combination in a single, streaming, CLI-first, mult
 
 | Tool | Language | Stars | Dialects | COPY↔INSERT | Streaming |
 | ------------------ | ----------- | ----- | --------- | ----------- | --------- |
-| **sql-splitter** | Rust || 3 (✅) |||
+| **sql-splitter** | Rust || 4 (✅) |||
 | **sqlglot** | Python | 7k+ | 31 |||
 | **pgloader** | Common Lisp | 5k+ | → PG only |||
 | **mysql2postgres** | Ruby | 300 | MySQL→PG | Partial ||
@@ -155,27 +161,27 @@ No existing tool offers this combination in a single, streaming, CLI-first, mult
 
 ### Query/Filter Dumps
 
-| Tool | Language | Stars | Notes |
-| ---------------- | -------- | ----- | ----------------------------------- |
-| **sql-splitter** | Rust || 🟡 Planned: WHERE-style filtering |
-| **DuckDB** | C++ | 34.8k | Query SQL/CSV/JSON/Parquet directly |
-| **sqlglot** | Python | 7k+ | Parse/transpile, not filter |
+| Tool | Language | Stars | Notes |
+| ---------------- | -------- | ----- | -------------------------------------------- |
+| **sql-splitter** | Rust || ✅ Embedded DuckDB (v1.12.0), full SQL |
+| **DuckDB** | C++ | 34.8k | Query SQL/CSV/JSON/Parquet directly |
+| **sqlglot** | Python | 7k+ | Parse/transpile, not filter |
 
-**[DuckDB](https://github.com/duckdb/duckdb)** could solve querying but is overkill for simple dump filtering.
+sql-splitter embeds DuckDB to give full SQL analytics on dumps without an import step (in-memory or disk-backed for >2GB dumps), with persistent caching that delivers a 400× speedup on repeat queries.
 
 ---
 
 ### MSSQL Support
 
 | Tool | MSSQL |
 | ---------------- | ----------------- |
-| **sql-splitter** | 🟡 Planned |
+| **sql-splitter** | ✅ (v1.12.x) |
 | Jailer | ✅ (via JDBC) |
 | pynonymizer ||
 | sqlglot | ✅ (parsing only) |
 | pgloader ||
 
-**Gap**: Major gap in ecosystem for MSSQL dump processing CLI tools.
+sql-splitter is now the only **streaming, file-based, multi-dialect** CLI with SQL Server support — Jailer/pynonymizer require live DB connections.
 
 ---
 
@@ -247,13 +253,13 @@ No existing tool offers this combination in a single, streaming, CLI-first, mult
 | Sample + FK |||||||||
 | Tenant sharding |||| Limited | Limited ||| Via SQL |
 | Redaction || Basic |||||||
-| Query/Filter | 🟡 ||| Limited |||||
+| Query/Filter | ✅ ||| Limited |||||
 | Diff |||| Limited |||| Via SQL |
 | Convert dialects ||| → PG | Limited |||||
 | MySQL |||||||||
 | PostgreSQL |||||||||
 | SQLite |||||||||
-| MSSQL | 🟡 ||||||||
+| MSSQL | ✅ ||||||||
 | Streaming |||||||||
 | CLI-first |||||||||
 | Works on dumps |||||||||
@@ -263,15 +269,16 @@ No existing tool offers this combination in a single, streaming, CLI-first, mult
 
 ## Unique Value Proposition
 
-1. **Unified tool** — Split + merge + sample + shard + convert + diff + redact in one binary
+1. **Unified tool** — Split + merge + sample + shard + convert + diff + redact + graph + order + query in one binary
 2. **Works on dump files** — No database connection required (unlike Jailer, Condenser, mydumper)
 3. **Streaming architecture** — Handle 10GB+ dumps without memory issues
 4. **CLI-first** — DevOps/automation friendly, pipe-compatible
-5. **Multi-dialect** — MySQL, PostgreSQL, SQLite in one tool
+5. **Multi-dialect** — MySQL, PostgreSQL, SQLite, MSSQL in one tool
 6. **FK-aware operations** — Sample and shard preserve referential integrity
 7. **Rust performance** — 600+ MB/s, faster than Python/Java alternatives
 8. **Compression support** — gzip, bz2, xz, zstd auto-detected
 9. **Composable** — Split → Sample → Redact → Convert → Merge pipeline
+10. **Embedded analytics** — DuckDB-powered SQL queries on dumps without import (v1.12.0)
 
 ---
 
@@ -424,7 +431,7 @@ sql-splitter test dump.sql --config schema-tests.yaml
 
 ### Priorities
 
-1. **Complete v2.0** — Current roadmap features
+1. **Complete v1.14–v1.16** — Enum, Migrate, DBML (planned core features)
 2. **Quick wins** — Schema drift (16h), size optimization (12h), cost estimation (8h)
 3. **Differentiation** — Data quality profiling, compliance checks
 4. **Future** — AI integration for schema suggestions, natural language queries

docs/INTEGRATION_OPPORTUNITIES.md

Lines changed: 47 additions & 124 deletions
@@ -1,6 +1,6 @@
 # Integration Opportunities & Tool Synergies
 
-**Date**: 2025-12-24
+**Date**: 2025-12-24 (Updated 2026-05-07: DuckDB query engine shipped in v1.12.0)
 **Purpose**: Identify strategic integrations to extend sql-splitter capabilities
 
 ## Philosophy: Build vs Integrate vs Wrap
@@ -13,7 +13,7 @@
 
 ## 🔥 Tier 1: High-Impact Integrations
 
-### 1. DuckDB Integration ⭐⭐⭐⭐⭐
+### 1. DuckDB Integration ⭐⭐⭐⭐⭐ — ✅ Query Engine SHIPPED (v1.12.0)
 
 **What is DuckDB?**
 
@@ -24,47 +24,30 @@
 
 **Synergy:** sql-splitter prepares data → DuckDB queries it
 
-#### Integration Strategy A: Query Engine
+#### Integration Strategy A: Query Engine — ✅ SHIPPED v1.12.0
 
-```bash
-# Load dump into DuckDB, run analytics
-sql-splitter query dump.sql --engine duckdb \
-  --sql "SELECT user_id, COUNT(*) FROM orders GROUP BY user_id LIMIT 10"
-
-# Behind the scenes:
-# 1. sql-splitter imports dump.sql into temp DuckDB file
-# 2. DuckDB executes query
-# 3. Output results
-```
-
-**Implementation:**
-
-```rust
-pub fn query_with_duckdb(dump: &Path, sql: &str) -> Result<Vec<Row>> {
-    // Create temp DuckDB database
-    let temp_db = tempfile::NamedTempFile::new()?;
-    let conn = Connection::open(temp_db.path())?;
+The query engine shipped in v1.12.0. Actual usage:
 
-    // Import dump (convert INSERT → CREATE + INSERT for DuckDB)
-    import_dump_to_duckdb(&conn, dump)?;
+```bash
+# Single query
+sql-splitter query dump.sql "SELECT user_id, COUNT(*) FROM orders GROUP BY user_id LIMIT 10"
 
-    // Execute query
-    let mut stmt = conn.prepare(sql)?;
-    let rows = stmt.query_map([], |row| {
-        // Map row to our Row type
-    })?;
+# Interactive REPL
+sql-splitter query dump.sql --interactive
 
-    Ok(rows)
-}
+# Export results
+sql-splitter query dump.sql "SELECT * FROM orders" -f json -o results.json
 ```
 
-**Benefits:**
+Implementation lives in `src/cmd/query.rs` and `src/duckdb/`. Features delivered:
 
-- ✅ Full SQL analytics without database setup
-- ✅ Aggregations, JOINs, window functions
-- ✅ 100x faster than naive row filtering
+- In-memory and disk-backed modes (>2GB dumps)
+- Multi-dialect import (MySQL, PostgreSQL, SQLite, MSSQL)
+- 5 output formats (table, json, jsonl, csv, tsv)
+- Persistent SHA256-keyed cache (400× speedup on repeat queries)
+- `--tables` filter, `--memory-limit` config
 
-**Effort:** ~16h (wrap DuckDB, import conversion)
+For the full design rationale and remaining Parquet export work, see [DUCKDB_INTEGRATION_DEEP_DIVE.md](features/DUCKDB_INTEGRATION_DEEP_DIVE.md).
 
 ---
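The persistent cache in the feature list above lets repeat queries skip the import step entirely by keying the imported database on the dump's content. A stdlib-only sketch of content-keyed caching, with `DefaultHasher` standing in for the SHA-256 digest the feature list mentions (function and parameter names are illustrative, not the shipped API):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Key the cached DuckDB import by the dump's path and content so a
// repeat query reuses the already-imported database, while any change
// to the dump produces a different key (a cache miss). The shipped
// cache uses SHA-256; DefaultHasher stands in here to stay stdlib-only.
fn cache_key(dump_path: &str, dump_bytes: &[u8]) -> String {
    let mut h = DefaultHasher::new();
    dump_path.hash(&mut h);
    dump_bytes.hash(&mut h); // content hash ⇒ stale entries never match
    format!("{:016x}", h.finish())
}
```

Content keying is what makes the cited 400× repeat-query speedup safe: the cache can never serve results for an edited dump.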

@@ -321,7 +304,7 @@ sql-splitter docs dump.sql -o docs/
 
 ### 6. Graphviz/Mermaid (Already Planned) ✅
 
-**Status:** Already in roadmap for graph command
+**Status:** ✅ Implemented in v1.11.0 (graph command — HTML, DOT, Mermaid, JSON output)
 
 **Additional integration:** Live preview
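The Mermaid output noted in the status line above boils down to emitting an `erDiagram` block from parsed FK edges. A minimal illustration (hypothetical helper, not the shipped implementation):

```rust
// Emit a Mermaid erDiagram from foreign-key edges (child, parent).
// Illustrative only: the shipped `graph` command derives these edges
// from the dump's CREATE TABLE statements.
fn to_mermaid(edges: &[(&str, &str)]) -> String {
    let mut out = String::from("erDiagram\n");
    for (child, parent) in edges {
        // Mermaid relationship syntax: PARENT ||--o{ CHILD : "label"
        out.push_str(&format!("    {} ||--o{{ {} : \"fk\"\n", parent, child));
    }
    out
}
```

The same edge list can feed the DOT and JSON writers, which is presumably why one `graph` command can offer all four formats.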

@@ -679,71 +662,9 @@ terraform apply # Applies changes
 
 ## Integration Architecture
 
-### Wrapper Pattern (Low Effort, High Value)
-
-```rust
-// Simple wrapper around DuckDB CLI
-pub fn query_with_duckdb(dump: &Path, sql: &str) -> Result<String> {
-    // Convert dump to DuckDB-compatible format
-    let temp_dir = tempdir()?;
-    convert_for_duckdb(dump, &temp_dir)?;
-
-    // Shell out to DuckDB
-    let output = Command::new("duckdb")
-        .arg(temp_dir.path().join("db.duckdb"))
-        .arg("-c")
-        .arg(sql)
-        .output()?;
-
-    Ok(String::from_utf8(output.stdout)?)
-}
-```
-
-**Pros:**
-
-- ✅ Quick to implement
-- ✅ Leverage existing tools
-- ✅ No reimplementation
-
-**Cons:**
-
-- ❌ External dependency required
-- ❌ Less control over behavior
-
----
-
-### Library Integration (Medium Effort, More Control)
-
-```rust
-// Use DuckDB as library (via FFI or Rust bindings)
-use duckdb::{Connection, params};
-
-pub fn query_with_duckdb_lib(dump: &Path, sql: &str) -> Result<Vec<Row>> {
-    let conn = Connection::open_in_memory()?;
-
-    // Import dump directly into DuckDB
-    import_dump(&conn, dump)?;
-
-    // Query
-    let mut stmt = conn.prepare(sql)?;
-    let rows = stmt.query_map(params![], |row| {
-        // ...
-    })?;
-
-    Ok(rows.collect()?)
-}
-```
-
-**Pros:**
-
-- ✅ No external binary required
-- ✅ Better error handling
-- ✅ Embedded in sql-splitter binary
-
-**Cons:**
-
-- ❌ More implementation work
-- ❌ Need to keep bindings updated
+> **Historical context:** Two patterns were considered for the DuckDB integration —
+> a CLI wrapper (shell out to `duckdb` binary) and a library integration (Rust FFI bindings).
+> The library path was chosen and shipped in v1.12.0 (`src/duckdb/`).
 
 ---
 
@@ -777,33 +698,38 @@ pub async fn deploy_to_supabase(
 
 ## Recommended Integration Roadmap
 
-### v1.16 — Query & Analytics
+> Note: this section was renumbered on 2026-05-07 to match the updated master roadmap.
+> The DuckDB query engine shipped in v1.12.0; v1.13.x was used for maintenance releases;
+> core features Enum/Migrate/DBML occupy v1.14–v1.16. Integrations follow at v1.17+.
 
-- **DuckDB integration** (16h) — Query engine for dumps
-- **Parquet export** (12h) — Bridge to modern data stack
+### ✅ v1.12.0 — DuckDB Query Engine (SHIPPED)
 
-### v1.17 — Schema Management
+- DuckDB integration as embedded library (16h, completed)
+- See [DUCKDB_INTEGRATION_DEEP_DIVE.md](features/DUCKDB_INTEGRATION_DEEP_DIVE.md)
 
-- **Atlas HCL export** (20h) — Schema-as-code
-- **Liquibase changelog generation** (24h) — Migration tool integration
+### v1.17 — Parquet Export
+
+- **Parquet export** (12h) — Bridge to modern data stack, extends DuckDB query engine
 
 ### v1.18 — Data Quality
 
 - **Great Expectations integration** (16h) — Bootstrap testing
 
-### v1.19 — Documentation
+### v1.19 — Schema Management
 
-- **Self-contained schema browser** (32h) — Interactive docs
-- **tbls format export** (20h) — Compatibility
+- **Atlas HCL export** (20h) — Schema-as-code
+- **Liquibase changelog generation** (24h) — Migration tool integration
 
-### v2.2 — Platform Integrations
+### v1.20 — dbt Integration
 
 - **dbt project generation** (28h) — Data transformation
-- **GitHub Action** (12h) — CI/CD
-- **Airbyte connector** (24h) — ELT pipelines
 
-### v2.3 — Cloud Deployment
+### Future (v2.x) — Documentation & Cloud
 
+- **Self-contained schema browser** (32h) — Interactive docs
+- **tbls format export** (20h) — Compatibility
+- **GitHub Action** (12h) — CI/CD
+- **Airbyte connector** (24h) — ELT pipelines
 - **Supabase deployment** (20h) — Instant database provisioning
 - **Terraform provider** (32h) — IaC integration

@@ -813,12 +739,9 @@ pub async fn deploy_to_supabase(
 
 **Under 20h effort, huge value:**
 
-1. **DuckDB query engine** (16h)
-   - Instant SQL analytics on dumps
-   - No database setup required
-
-2. **Parquet export** (12h)
+1. **Parquet export** (12h, planned v1.17.0)
    - Bridge SQL → data lakes
+   - Extends already-shipped DuckDB query engine
    - Pandas/Spark/DuckDB compatible
 
 3. **GitHub Action** (12h)
@@ -877,10 +800,10 @@ sql-splitter query dump.sql "SELECT COUNT(*) FROM users"
 
 **Top 5 integrations for maximum impact:**
 
-1. **DuckDB** — Query analytics on dumps (game changer)
-2. **Atlas/Liquibase** — Schema management workflows
-3. **dbt** — Bootstrap data transformation projects
-4. **Great Expectations** — Data quality testing
-5. **GitHub Actions** — CI/CD automation
+1. **DuckDB** — Query analytics on dumps (shipped v1.12.0; Parquet export remaining at v1.17.0)
+2. **Atlas/Liquibase** — Schema management workflows (planned v1.19.0)
+3. **dbt** — Bootstrap data transformation projects (planned v1.20.0)
+4. **Great Expectations** — Data quality testing (planned v1.18.0)
+5. **GitHub Actions** — CI/CD automation (future)
 
 These integrations position sql-splitter as the **Swiss Army knife that plays well with others** rather than trying to replace every tool in the ecosystem.
