feat(dsql): add blog learnings — filter model, DPU, CTE rewrite

Morlej · Morlej · commit f98901d27775 · 2026-06-24T12:27:55.000-05:00
- Add Three-Layer Filter Model (Index Cond / Storage Filter / Query
  Processor Filter) with optimization table to plan-interpretation.md
- Add Fixing Storage Lookups guidance (INCLUDE columns) with example
- Add Cost Number Interpretation (startup ~100 is normal in DSQL)
- Add DPU Interpretation (Read DPU as primary signal, optimization loop)
- Add CTE late materialization as DSQL-specific rewrite pattern
  (defer Storage Lookups past LIMIT)
- Update workflow.md Phase 1: recommend plain EXPLAIN first for
  expensive queries before EXPLAIN ANALYZE VERBOSE
diff --git a/plugins/databases-on-aws/skills/dsql/references/query-plan/plan-interpretation.md b/plugins/databases-on-aws/skills/dsql/references/query-plan/plan-interpretation.md
@@ -13,6 +13,8 @@
 9. [Anomalous Values](#anomalous-values)
 10. [Type Coercion and Index Bypass](#type-coercion-and-index-bypass)
 11. [Projections and Row Width](#projections-and-row-width)
+12. [Cost Number Interpretation](#cost-number-interpretation)
+13. [DPU Interpretation](#dpu-interpretation)
 
 ---
 
@@ -70,6 +72,38 @@ Index Scan using idx on tablename
 
 A child's timing and row counts roll up into its parent's totals — not into a sibling branch.
 
+### Three-Layer Filter Model
+
+Every predicate is evaluated at one of three layers. The layer determines how much data crosses the network between storage and compute — the primary lever for DSQL optimization.
+
+| Level        | Filter Type            | Where it appears in EXPLAIN                                  | Data Movement                                           | How to push predicates here                                           |
+| ------------ | ---------------------- | ------------------------------------------------------------ | ------------------------------------------------------- | --------------------------------------------------------------------- |
+| 1 (best)     | Index Condition        | `Index Cond:` on scan node                                   | Minimized — only matching index entries read            | Equality/range on indexed key columns; most selective column leftmost |
+| 2 (moderate) | Storage Filter         | `Filters:` inside `Storage Scan` or `Storage Lookup` node    | Reduced — applied at storage before transfer            | Add filter columns to index INCLUDE clause                            |
+| 3 (worst)    | Query Processor Filter | `Filter:` above `Storage Scan` (at the scan-type node level) | Maximum — all data transferred before predicate applied | Requires new index, restructured query, or schema change              |
+
+**Optimization goal:** Move predicates from Level 3 → Level 2 → Level 1. Each step reduces network transfer between storage and compute, directly reducing latency and DPU.
+
+### Fixing Storage Lookups (INCLUDE columns)
+
+When a Storage Lookup node appears, the index satisfied the filter but not all projected columns. The fix: add missing columns to the index's INCLUDE clause.
+
+```
+-- Before: Storage Lookup fetches created_at from base table
+Index Scan using idx1 on account
+  -> Storage Scan on idx1
+  -> Storage Lookup on account        ← extra round trip
+       Projections: created_at
+
+-- Fix: CREATE INDEX ASYNC idx2 ON account (customer_id) INCLUDE (balance, status, created_at)
+-- After: Index Only Scan, no Storage Lookup
+Index Only Scan using idx2 on account
+  -> Storage Scan on idx2
+       Projections: customer_id, balance, status, created_at
+```
+
+**Trade-off:** INCLUDE columns are copied into every index entry, increasing index size. Only include columns that your most-queried paths actually need.
+
 ## Calculating Node Duration
 
 DSQL follows the standard PostgreSQL EXPLAIN convention: `actual time` is reported **per iteration**, not cumulative. The node's total wall-clock time is:
@@ -256,3 +290,36 @@ Assess row width overhead:
 - Flag tables with 50+ columns or estimated row width >5,000 bytes
 
 Wide projections increase I/O on Storage Lookups and memory usage in Hash Joins. Impact scales with result set size.
+
+## Cost Number Interpretation
+
+DSQL cost numbers appear much higher than those in equivalent PostgreSQL plans. This is expected: the cost model accounts for distributed round-trips.
+
+**Format:** `startup_cost..total_cost` (e.g., `100.28..208.29`)
+
+- **Startup cost ~100** is normal — reflects fixed overhead of initiating a storage round-trip
+- **Total cost** includes incremental per-row processing, network transfer, and page access
+
+**MUST NOT** compare cost numbers across queries to determine which is "better." Cost units are internal to the optimizer and non-comparable. Use DPU estimates instead.
+
+## DPU Interpretation
+
+`EXPLAIN ANALYZE VERBOSE` appends a `Statement DPU Estimate` block:
+
+```
+Statement DPU Estimate:
+  Compute: 0.01724 DPU
+  Read:    0.01202 DPU
+  Write:   0.00000 DPU
+  Total:   0.02926 DPU
+```
+
+**Read DPU** is the primary optimization signal for read-heavy queries. High Read DPU on a query with selective filters indicates that those filters are applied too late (at Level 3 or 2 when they could be at Level 1), so storage reads more data than the result needs.
+
+**Optimization loop:**
+
+1. Run `EXPLAIN ANALYZE VERBOSE` on the unoptimized query and note Read DPU and Total DPU
+2. Apply fix (add index, add INCLUDE columns, restructure query)
+3. Re-run and compare the DPU delta
+
+**MUST** use DPU as the before/after comparison metric, not cost numbers or execution time (which varies with load).
diff --git a/plugins/databases-on-aws/skills/dsql/references/query-plan/query-rewrites-dsql-specific.md b/plugins/databases-on-aws/skills/dsql/references/query-plan/query-rewrites-dsql-specific.md
@@ -4,7 +4,8 @@ SQL rewrites that address Aurora DSQL-specific behaviors and optimizer constrain
 
 ## Available Rewrites
 
-| Pattern Detected                | Reference File                                                |
-| ------------------------------- | ------------------------------------------------------------- |
-| COUNT(*) timeout on large table | [reltuples-estimate.md](query-rewrites/reltuples-estimate.md) |
-| Join count exceeds DP threshold | [split-large-joins.md](query-rewrites/split-large-joins.md)   |
+| Pattern Detected                                  | Reference File                                                            |
+| ------------------------------------------------- | ------------------------------------------------------------------------- |
+| COUNT(*) timeout on large table                   | [reltuples-estimate.md](query-rewrites/reltuples-estimate.md)             |
+| Join count exceeds DP threshold                   | [split-large-joins.md](query-rewrites/split-large-joins.md)               |
+| Storage Lookup with high loops + LIMIT discarding | [cte-late-materialization.md](query-rewrites/cte-late-materialization.md) |
diff --git a/plugins/databases-on-aws/skills/dsql/references/query-plan/query-rewrites/cte-late-materialization.md b/plugins/databases-on-aws/skills/dsql/references/query-plan/query-rewrites/cte-late-materialization.md
@@ -0,0 +1,36 @@
+# Rewrite: CTE Late Materialization to Defer Storage Lookups (DSQL-Specific)
+
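+When a query combines filtering, ordering, and LIMIT with columns not fully covered by an index, DSQL performs a Storage Lookup for every matching row, including rows that LIMIT then discards. Use a CTE to narrow the result first using only indexed columns, then join back to fetch the remaining columns for only the final rows. The join key must identify each row uniquely (the example below assumes `customer_id` does); otherwise the join can return more rows than the LIMIT.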
+
+**SHOULD apply when:** The query has a LIMIT that returns far fewer rows than the filter matches, and the EXPLAIN plan shows a Storage Lookup with a high loop count relative to the final row count.
+
+**SHOULD skip when:** The filter is already highly selective (matching close to the LIMIT count), or all projected columns are in the index.
+
+```sql
+-- Before: Storage Lookup on every matching row, LIMIT discards most
+SELECT customer_id, balance, status, created_at
+FROM account
+WHERE status = 'active'
+ORDER BY created_at DESC
+LIMIT 10;
+
+-- After: CTE narrows to 10 rows using indexed columns, then fetches remaining
+WITH candidates AS (
+    SELECT customer_id, created_at
+    FROM account
+    WHERE status = 'active'
+    ORDER BY created_at DESC
+    LIMIT 10
+)
+SELECT a.customer_id, a.balance, a.status, a.created_at
+FROM candidates c JOIN account a ON a.customer_id = c.customer_id
+ORDER BY c.created_at DESC; -- re-apply the sort: join output order is not guaranteed
+```
+
+```sql
+-- Not applicable: filter already selective (matches no more rows than the LIMIT)
+SELECT customer_id, balance
+FROM account
+WHERE customer_id = '4b18a761-5870-4d7c-95ce-0a48eca3fceb'::uuid
+LIMIT 10;
+```
diff --git a/plugins/databases-on-aws/skills/dsql/references/query-plan/workflow.md b/plugins/databases-on-aws/skills/dsql/references/query-plan/workflow.md
@@ -75,7 +75,9 @@ SHOULD also load these index files to identify applicable rewrites at Phase 2:
 
 ## Phase 1: Capture the Plan
 
-**ALWAYS** run `readonly_query("EXPLAIN ANALYZE VERBOSE …")` on the user's query verbatim (SELECT form) — **ALWAYS** capture a fresh plan from the cluster, even when the user describes the plan or reports an anomaly. **MAY** leverage `get_schema` or `information_schema` for schema sanity checks.
+For queries the user reports as expensive or slow (execution time >30s, high DPU, or timeout), start with plain `EXPLAIN` (without ANALYZE) to see the optimizer's plan without executing the query. Then run `EXPLAIN ANALYZE VERBOSE` to get actual row counts and DPU.
+
+For all other queries, run `readonly_query("EXPLAIN ANALYZE VERBOSE …")` directly on the user's query verbatim (SELECT form). In both cases, **ALWAYS** capture a fresh plan from the cluster, even when the user describes the plan or reports an anomaly. **MAY** leverage `get_schema` or `information_schema` for schema sanity checks.
 
 When EXPLAIN errors (`relation does not exist`, `column does not exist`), **MUST** report the error verbatim — **MUST NOT** invent DSQL-specific semantics (e.g., case sensitivity, identifier quoting) as the root cause.