Commit e91ae0d

MagicalTux and claude committed
perf(planner): port SQLite's covering-scan width cost model (B9h slice)
A no-`WHERE` bare projection (and `count(*)`) is answered from a covering secondary index only when that index's estimated row is *strictly* narrower than the table's. This is SQLite's `estimateTableWidth` / `estimateIndexWidth` cost model, now ported exactly rather than approximated.

The table width is the sum of each column's size estimate (`+1` when there is no INTEGER PRIMARY KEY); an index's is the sum of its key columns' estimates plus one for the trailing rowid. The per-column estimate is 1 for an integer/real/numeric/untyped column, 5 for an unsized `TEXT`/`BLOB`, and `k/4+1` (capped at 255) for a sized `VARCHAR(k)` (new `col_szest`). Widths are compared in `LogEst` units (new `logest`). Among qualifying indexes the narrowest wins, with ties broken by the most-recently-created.

This fixes both directions of the previous heuristic: the over-use (a covering index no narrower than the table, e.g. `SELECT b,c` over a two-column index on a three-column table, now `SCAN`s the table) and the under-use (`count(*)` over two or more indexes now counts the cheapest instead of falling back to a table scan). `count_covering_index` delegates its choice to the shared model. The covered scan's no-`ORDER BY` row order (index order) matches SQLite in lockstep. The width test applies only to a pure projection: a `GROUP BY` / `DISTINCT` / `ORDER BY` still reads a covering index to supply its ordering regardless of width.

Verified against a from-source build of sqlite3 3.50.4 (`tests/eqp_covering_index_cost.rs`; `tests/count_covering.rs` updated to the cost-correct expectations).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
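The width rule in the message can be checked by hand. The following is a minimal standalone sketch, not the commit's code path: `logest` is copied from the diff, while `table_width`, `index_width`, and `index_is_cheaper` are hypothetical helper names introduced only for illustration, and every column is assumed to have a size estimate of 1 (integer or untyped).

```rust
/// SQLite's `sqlite3LogEst` approximation of 10*log2(x), as ported in this commit.
fn logest(mut x: u64) -> i16 {
    const A: [i16; 8] = [0, 2, 3, 5, 6, 7, 8, 9];
    let mut y: i16 = 40;
    if x < 8 {
        if x < 2 {
            return 0;
        }
        while x < 8 {
            y -= 10;
            x <<= 1;
        }
    } else {
        while x > 255 {
            y += 40;
            x >>= 4;
        }
        while x > 15 {
            y += 10;
            x >>= 1;
        }
    }
    A[(x & 7) as usize] + y - 10
}

/// Hypothetical helper: sum of per-column estimates, +1 without an INTEGER PRIMARY KEY.
fn table_width(col_szests: &[u32], has_ipk: bool) -> u32 {
    col_szests.iter().sum::<u32>() + if has_ipk { 0 } else { 1 }
}

/// Hypothetical helper: sum of key-column estimates, +1 for the trailing rowid.
fn index_width(key_szests: &[u32]) -> u32 {
    key_szests.iter().sum::<u32>() + 1
}

/// Hypothetical helper: strict comparison in LogEst units (width * 4, as in the diff).
fn index_is_cheaper(widx: u32, wtab: u32) -> bool {
    logest(u64::from(widx) * 4) < logest(u64::from(wtab) * 4)
}

fn main() {
    // t(a INTEGER PRIMARY KEY, b, c): three columns, each with estimate 1.
    let wtab = table_width(&[1, 1, 1], true);
    let w_b = index_width(&[1]); // index on (b)
    let w_bc = index_width(&[1, 1]); // index on (b, c)
    println!("index (b) cheaper than table: {}", index_is_cheaper(w_b, wtab));
    println!("index (b,c) cheaper than table: {}", index_is_cheaper(w_bc, wtab));
}
```

Because the comparison happens in `LogEst` units, two nearby raw widths can collapse to the same value: under this `logest`, widths 16 and 17 both map to 60, so an index of width 16 would not count as strictly narrower than a table of width 17.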
1 parent cea2c7d commit e91ae0d

5 files changed

Lines changed: 316 additions & 46 deletions

ROADMAP.md

Lines changed: 16 additions & 9 deletions
@@ -1261,15 +1261,22 @@ With B9a-seek and `FOR IN-OPERATOR` shipped, the only open EQP-fidelity thread i
 - **B9h — cost-model single-table index *choice*.** SQLite prefers, among indexes
   sharing an equality prefix, the one whose walk does the most work (composite `(b,c)`
   over `(b)` for a trailing range / `GROUP BY`/`ORDER BY c`; a *covering* index over a
-  narrower one; the smallest covering index for `count(*)`/`IN`), and decides *whether*
-  a no-WHERE query covers at all (narrow index vs wide-row table scan). graphite picks
-  by longest-equality-prefix only. The covering-scan no-`ORDER BY` row-order parity
-  (investigated 2026-07-04 as B9i — graphite already walks index order, so it is *not* an
-  execution-order bug) and the secondary-index `SEARCH` + `GROUP BY`/`DISTINCT` b-tree
-  (left open by B9d) both ride here. **Deferred by design:** the pinned oracle has no
-  stat4, so its choices depend on row-width / index-width / index-count heuristics
-  graphite can't reproduce without diverging the EQP corpus — same class as B1b/B4;
-  needs a stat4-enabled oracle.
+  narrower one), and decides *whether* a no-WHERE query covers at all.
+  **The no-`WHERE` covering-scan choice is DONE (2026-07-05).** It turned out to be a
+  purely *structural* cost — not stat4-dependent — so it is exactly reproducible against
+  the pinned oracle: graphite now ports SQLite's `estimateTableWidth`/`estimateIndexWidth`
+  (`col_szest` + `logest` free fns) and uses a covering index for a no-`WHERE` scan /
+  `count(*)` only when its estimated row is *strictly narrower* than the table's, picking
+  the narrowest (ties → newest / highest rootpage). This fixed both the over-use
+  (`SELECT b,c` over a 2-col-index-on-a-3-col-table now `SCAN t`) and the under-use
+  (`count(*)` over ≥2 indexes now picks the cheapest instead of bailing), and the covered
+  scan's no-`ORDER BY` row order (index order) matches in lockstep (`covering_scan` +
+  `count_covering_index` delegate to the shared model; `tests/eqp_covering_index_cost.rs`,
+  `tests/count_covering.rs`). **Still open (rides here):** the composite-vs-narrow choice
+  *with* a WHERE equality prefix, covering-scan *with* a WHERE predicate on covered
+  columns (`SELECT b FROM t WHERE c>0` → covering `(b,c)`), and the secondary-index
+  `SEARCH` + `GROUP BY`/`DISTINCT` b-tree (left open by B9d). These are structural too and
+  now unblocked — a stat4 oracle is only needed for genuinely data-driven choices (B4).
 - **B9j — collation-aware index *selection* for a non-default-collation index.**
   `collect_eq_constraints` / `collect_range_constraints` compare an explicit `COLLATE`
   to the *column's* collation. When an index carries a *non-default* collation

src/exec/mod.rs

Lines changed: 140 additions & 23 deletions
@@ -17916,20 +17916,78 @@ impl Connection {
         if self.group_by_is_rowid(sel, meta, label) {
             return None;
         }
-        let mut covering = self.indexes_of(&t.name).ok()?.into_iter().filter(|idx| {
-            idx.partial.is_none()
-                && idx.key_exprs.is_none()
-                && self.query_cols_covered(sel, meta, &idx.cols)
-        });
-        let chosen = covering.next()?;
-        // Ambiguous (two or more covering indexes): keep the plain scan rather
-        // than guess which one sqlite's cost model would pick.
-        if covering.next().is_some() {
-            return None;
-        }
+        let covering: Vec<_> = self
+            .indexes_of(&t.name)
+            .ok()?
+            .into_iter()
+            .filter(|idx| {
+                idx.partial.is_none()
+                    && idx.key_exprs.is_none()
+                    && self.query_cols_covered(sel, meta, &idx.cols)
+            })
+            .collect();
+        // A `GROUP BY` / `DISTINCT` / `ORDER BY` query walks the index to produce
+        // its keys in order (avoiding a full sort — for a partial sort the index
+        // still supplies the leading terms), so SQLite reads from a covering index
+        // there *regardless* of width; only a bare projection is a pure width
+        // choice. (A fully sort-satisfying scan already bailed above via
+        // `order_satisfied_by_scan`.) For these we keep the conservative
+        // single-candidate rule — which index an ambiguous walk picks is the
+        // ordered-scan path's job.
+        if !sel.group_by.is_empty() || sel.distinct || !sel.order_by.is_empty() {
+            if covering.len() != 1 {
+                return None;
+            }
+            let chosen = covering.into_iter().next()?;
+            return Some((chosen.name, chosen.root, chosen.cols));
+        }
+        // Plain no-`WHERE` projection: port SQLite's covering-scan cost choice
+        // (`estimateTableWidth` / `estimateIndexWidth`): the table's estimated row
+        // width is `Σ szEst(col) (+1 if no INTEGER PRIMARY KEY)`; an index's is
+        // `Σ szEst(key col) + 1` (the trailing rowid). A covering index is used only
+        // when its width (in `LogEst` units) is *strictly* less than the table's,
+        // and among the candidates the narrowest wins — ties broken by the
+        // most-recently-created index (highest rootpage; SQLite considers indexes
+        // newest-first and keeps the first of an equal cost). Verified against the
+        // sqlite3 3.50.4 planner.
+        let szests = self.table_col_szests(&t.name).unwrap_or_default();
+        let szest_of = |i: usize| szests.get(i).copied().unwrap_or(1);
+        let mut wtable: u32 = (0..meta.columns.len()).map(szest_of).sum();
+        if meta.ipk.is_none() {
+            wtable += 1;
+        }
+        let sz_tab = logest(u64::from(wtable) * 4);
+        let chosen = covering
+            .into_iter()
+            .map(|idx| {
+                let widx: u32 = idx.cols.iter().map(|&c| szest_of(c)).sum::<u32>() + 1;
+                (logest(u64::from(widx) * 4), idx)
+            })
+            .filter(|(sz_idx, _)| *sz_idx < sz_tab)
+            .min_by(|(sa, ia), (sb, ib)| sa.cmp(sb).then(ib.root.cmp(&ia.root)))?
+            .1;
         Some((chosen.name, chosen.root, chosen.cols))
     }
 
+    /// The per-column [`col_szest`] estimates for a rowid table, aligned with its
+    /// declared column order (which matches `TableMeta::columns` for a rowid
+    /// table). Parses the stored `CREATE TABLE` for the raw declared type of each
+    /// column (an untyped column is `1`, not the `BLOB` fallback other paths use).
+    /// Returns an empty vector when the table can't be resolved, so callers fall
+    /// back to a size of `1` per column.
+    fn table_col_szests(&self, table: &str) -> Option<Vec<u32>> {
+        let obj = self.schema.table(table)?;
+        let Ok(Statement::CreateTable(ct)) = sql::parse_one(obj.sql.as_deref()?) else {
+            return None;
+        };
+        Some(
+            ct.columns
+                .iter()
+                .map(|c| col_szest(c.type_name.as_deref()))
+                .collect(),
+        )
+    }
+
     /// SQLite's min/max optimization: a query whose only aggregate is a single
     /// `min(col)` / `max(col)` (no `GROUP BY`, no `HAVING`, no `WHERE`, no second
     /// aggregate; the call may be wrapped in scalar expressions and may be
/// aggregate; the call may be wrapped in scalar expressions and may be
@@ -18311,18 +18369,13 @@ impl Connection {
         if meta.without_rowid {
             return None;
         }
-        // Exactly one full (non-partial, non-expression) secondary index.
-        let mut chosen: Option<(String, u32)> = None;
-        for idx in self.indexes_of(&t.name).ok()? {
-            if idx.partial.is_some() || idx.key_exprs.is_some() {
-                continue;
-            }
-            if chosen.is_some() {
-                return None; // ambiguous: more than one candidate
-            }
-            chosen = Some((idx.name, idx.root));
-        }
-        chosen
+        // A `count(*)` needs no columns, so every full secondary index "covers" it.
+        // The choice — and whether a covering scan is cheaper than a plain table
+        // scan at all — is the shared cost model in `covering_scan` (which picks the
+        // narrowest index strictly narrower than the table, or `None` so the caller
+        // `SCAN`s the table).
+        let (name, root, _) = self.covering_scan(sel, &meta, &Params::default())?;
+        Some((name, root))
     }
 
     /// Whether a single-table scan already yields rows in the query's `ORDER BY`
@@ -27013,6 +27066,70 @@ fn column_resolves_scoped(
     })
 }
 
+/// SQLite's `sqlite3LogEst` — an integer approximation of `10*log2(x)`, the unit
+/// the query planner costs rows and row-widths in. Ported verbatim so a covering
+/// index's estimated width can be compared exactly the way SQLite does.
+fn logest(mut x: u64) -> i16 {
+    const A: [i16; 8] = [0, 2, 3, 5, 6, 7, 8, 9];
+    let mut y: i16 = 40;
+    if x < 8 {
+        if x < 2 {
+            return 0;
+        }
+        while x < 8 {
+            y -= 10;
+            x <<= 1;
+        }
+    } else {
+        while x > 255 {
+            y += 40;
+            x >>= 4;
+        }
+        while x > 15 {
+            y += 10;
+            x >>= 1;
+        }
+    }
+    A[(x & 7) as usize] + y - 10
+}
+
+/// The estimated per-column size SQLite records (`estimateTableWidth` via
+/// `sqlite3AffinityType`), scaled so an integer/real/numeric or untyped column is
+/// `1`. A `TEXT`/`BLOB`/`CLOB`/`CHAR` with no size is `5`; a sized `VARCHAR(k)` /
+/// `CHAR(k)` / `BLOB(k)` is `k/4 + 1` (capped at 255). Only TEXT/BLOB-affinity
+/// columns carry a size; numeric affinities are always `1`.
+fn col_szest(type_name: Option<&str>) -> u32 {
+    let Some(t) = type_name else { return 1 };
+    if t.trim().is_empty() {
+        return 1;
+    }
+    let up = t.to_ascii_uppercase();
+    // The first unsigned integer literal in `s`, if any.
+    fn first_uint(s: &str) -> Option<u32> {
+        let start = s.find(|c: char| c.is_ascii_digit())?;
+        let end = s[start..]
+            .find(|c: char| !c.is_ascii_digit())
+            .map(|e| start + e)
+            .unwrap_or(s.len());
+        s[start..end].parse().ok()
+    }
+    let v: u32 = match eval::Affinity::from_type(Some(t)) {
+        // A size for a text column sits after the "CHAR" token (`VARCHAR(k)`,
+        // `CHAR(k)`); a bare `TEXT`/`CLOB` carries none → 16 (→ szEst 5).
+        eval::Affinity::Text => up
+            .rfind("CHAR")
+            .and_then(|p| first_uint(&up[p + 4..]))
+            .unwrap_or(16),
+        // A `BLOB(k)` size sits immediately after "BLOB("; a bare `BLOB` → 16.
+        eval::Affinity::Blob => match up.find("BLOB") {
+            Some(p) if up[p + 4..].starts_with('(') => first_uint(&up[p + 4..]).unwrap_or(16),
+            _ => 16,
+        },
+        _ => 0,
+    };
+    (v / 4 + 1).min(255)
+}
+
 fn walk_shallow_columns(e: &Expr, f: &mut impl FnMut(Option<&str>, Option<&str>, &str, bool)) {
     match e {
         Expr::Column {
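To see what `col_szest` yields for common declared types, here is a rough standalone sketch. The commit relies on `eval::Affinity::from_type`, which this diff does not show, so the `Affinity` enum and `affinity_of` classifier below are assumptions that follow SQLite's documented substring rules for column affinity; `szest` mirrors the structure of `col_szest` but is a hypothetical name.

```rust
/// Assumed stand-in for `eval::Affinity` (not shown in the diff).
#[derive(Debug, PartialEq)]
enum Affinity {
    Integer,
    Text,
    Blob,
    Real,
    Numeric,
}

/// Assumed classifier, following SQLite's documented affinity rules in order.
fn affinity_of(decl: &str) -> Affinity {
    let up = decl.to_ascii_uppercase();
    if up.contains("INT") {
        Affinity::Integer
    } else if up.contains("CHAR") || up.contains("CLOB") || up.contains("TEXT") {
        Affinity::Text
    } else if up.contains("BLOB") || up.trim().is_empty() {
        Affinity::Blob
    } else if up.contains("REAL") || up.contains("FLOA") || up.contains("DOUB") {
        Affinity::Real
    } else {
        Affinity::Numeric
    }
}

/// The first unsigned integer literal in `s`, if any.
fn first_uint(s: &str) -> Option<u32> {
    let start = s.find(|c: char| c.is_ascii_digit())?;
    let end = s[start..]
        .find(|c: char| !c.is_ascii_digit())
        .map(|e| start + e)
        .unwrap_or(s.len());
    s[start..end].parse().ok()
}

/// Hypothetical mirror of `col_szest`: declared size / 4 + 1, capped at 255.
fn szest(decl: Option<&str>) -> u32 {
    let Some(t) = decl else { return 1 };
    if t.trim().is_empty() {
        return 1;
    }
    let up = t.to_ascii_uppercase();
    let v: u32 = match affinity_of(t) {
        // Unsized text falls back to 16, which becomes an estimate of 5.
        Affinity::Text => up
            .rfind("CHAR")
            .and_then(|p| first_uint(&up[p + 4..]))
            .unwrap_or(16),
        // Only a size directly after "BLOB(" counts; a bare BLOB is 16.
        Affinity::Blob => match up.find("BLOB") {
            Some(p) if up[p + 4..].starts_with('(') => first_uint(&up[p + 4..]).unwrap_or(16),
            _ => 16,
        },
        // Integer, real, and numeric affinities carry no size.
        _ => 0,
    };
    (v / 4 + 1).min(255)
}

fn main() {
    for decl in [None, Some("INTEGER"), Some("TEXT"), Some("VARCHAR(100)"), Some("BLOB(40)")] {
        println!("{:?} -> {}", decl, szest(decl));
    }
}
```

Under these assumptions a `VARCHAR(100)` column counts as 26 units, so a single wide text column can make a table much wider than an index over its integer columns, which is the case where the covering scan pays off.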

tests/count_covering.rs

Lines changed: 34 additions & 10 deletions
@@ -1,9 +1,12 @@
 //! `SELECT count(*)` answered via a covering secondary index (roadmap B2b).
 //!
-//! When a single rowid table has exactly one full secondary index, sqlite (and
-//! graphitesql) counts that index's entries — `EXPLAIN QUERY PLAN` reports
-//! `SCAN t USING COVERING INDEX <name>`. With zero or multiple such indexes,
-//! graphitesql conservatively keeps the plain `SCAN t` plan (no guessing).
+//! `count(*)` is answered from a covering secondary index when that index's
+//! estimated row is *strictly narrower* than the table's (SQLite's
+//! `estimateTableWidth`/`estimateIndexWidth` cost model). `EXPLAIN QUERY PLAN`
+//! then reports `SCAN t USING COVERING INDEX <name>`, choosing the narrowest
+//! qualifying index (ties → the most-recently-created). An index no narrower than
+//! the table — e.g. the sole non-key column indexed on a two-column table — is
+//! not used, and the plan stays `SCAN t`.
 
 #![cfg(feature = "std")]
 
@@ -33,15 +36,35 @@ fn count(conn: &Connection, sql: &str) -> i64 {
 
 #[test]
 fn one_index_uses_covering_index_in_eqp() {
+    // The index (b + rowid = 2 units) is narrower than the 3-column table, so it
+    // is used to count. (On a two-column table it would tie the table and SCAN.)
+    let mut c = Connection::open_memory().unwrap();
+    c.execute("CREATE TABLE t(a INTEGER PRIMARY KEY, b, c)")
+        .unwrap();
+    c.execute("CREATE INDEX ib ON t(b)").unwrap();
+    c.execute("INSERT INTO t VALUES(1,10,100),(2,20,200)")
+        .unwrap();
+    assert_eq!(
+        detail(&c, "EXPLAIN QUERY PLAN SELECT count(*) FROM t"),
+        ["SCAN t USING COVERING INDEX ib"]
+    );
+}
+
+#[test]
+fn index_no_narrower_than_table_keeps_plain_scan() {
+    // On a two-column table, indexing the sole non-key column gives an index
+    // (b + rowid = 2) exactly as wide as the table (a + b = 2), so SQLite scans
+    // the table rather than the index.
     let mut c = Connection::open_memory().unwrap();
     c.execute("CREATE TABLE t(a INTEGER PRIMARY KEY, b)")
         .unwrap();
     c.execute("CREATE INDEX ib ON t(b)").unwrap();
     c.execute("INSERT INTO t VALUES(1,10),(2,20)").unwrap();
     assert_eq!(
         detail(&c, "EXPLAIN QUERY PLAN SELECT count(*) FROM t"),
-        ["SCAN t USING COVERING INDEX ib"]
+        ["SCAN t"]
     );
+    assert_eq!(count(&c, "SELECT count(*) FROM t"), 2);
 }
 
 #[test]
@@ -69,29 +92,30 @@ fn no_index_keeps_plain_scan() {
 }
 
 #[test]
-fn multiple_indexes_fall_back_to_plain_scan() {
+fn multiple_indexes_pick_cheapest_covering_index() {
     let mut c = Connection::open_memory().unwrap();
     c.execute("CREATE TABLE t(a INTEGER PRIMARY KEY, b, c)")
         .unwrap();
     c.execute("CREATE INDEX ib ON t(b)").unwrap();
     c.execute("CREATE INDEX ic ON t(c)").unwrap();
     c.execute("INSERT INTO t VALUES(1,10,100),(2,20,200)")
         .unwrap();
-    // Ambiguous index choice => keep the plain SCAN (no guessing).
+    // Both indexes are narrower than the table and equally wide, so SQLite counts
+    // the most-recently-created one (ic), matching its cost-model tie-break.
     assert_eq!(
         detail(&c, "EXPLAIN QUERY PLAN SELECT count(*) FROM t"),
-        ["SCAN t"]
+        ["SCAN t USING COVERING INDEX ic"]
     );
     assert_eq!(count(&c, "SELECT count(*) FROM t"), 2);
 }
 
 #[test]
 fn count_correct_after_delete() {
     let mut c = Connection::open_memory().unwrap();
-    c.execute("CREATE TABLE t(a INTEGER PRIMARY KEY, b)")
+    c.execute("CREATE TABLE t(a INTEGER PRIMARY KEY, b, c)")
         .unwrap();
     c.execute("CREATE INDEX ib ON t(b)").unwrap();
-    c.execute("INSERT INTO t VALUES(1,10),(2,20),(3,30),(4,40)")
+    c.execute("INSERT INTO t VALUES(1,10,1),(2,20,2),(3,30,3),(4,40,4)")
         .unwrap();
     assert_eq!(count(&c, "SELECT count(*) FROM t"), 4);
     c.execute("DELETE FROM t WHERE a IN (2,3)").unwrap();

tests/covering_scan.rs

Lines changed: 4 additions & 4 deletions
@@ -124,18 +124,18 @@ fn order_by_rowid_keeps_table_scan() {
 }
 
 #[test]
-fn ambiguous_covering_indexes_keep_plain_scan() {
+fn two_covering_indexes_pick_cheapest() {
     let mut c = Connection::open_memory().unwrap();
     c.execute("CREATE TABLE t(a INTEGER PRIMARY KEY, b, c)")
         .unwrap();
     c.execute("CREATE INDEX ib ON t(b)").unwrap();
     c.execute("CREATE INDEX ic ON t(c)").unwrap();
     c.execute("INSERT INTO t VALUES (1,10,100),(2,20,200)")
         .unwrap();
-    // count(*) is covered by both ib and ic; rather than guess sqlite's pick,
-    // graphite keeps the plain scan.
+    // count(*) is covered by both ib and ic; both are narrower than the table and
+    // equally wide, so SQLite's cost model counts the most-recently-created (ic).
     assert_eq!(
         plan(&c, "EXPLAIN QUERY PLAN SELECT count(*) FROM t"),
-        "SCAN t"
+        "SCAN t USING COVERING INDEX ic"
     );
 }
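The tie-break these tests pin down (narrowest qualifying index, then highest rootpage) can be isolated in a small sketch. `Candidate` and `pick` are hypothetical names, and the rootpage numbers and `LogEst` widths below are illustrative values, not ones read from a real database; only the comparator shape is taken from the diff's `min_by` call.

```rust
/// Hypothetical candidate: an index name, its rootpage, and its width in LogEst units.
#[derive(Debug)]
struct Candidate {
    name: &'static str,
    root: u32,
    sz: i16,
}

/// Keep indexes strictly narrower than the table, then take the narrowest;
/// on equal width the higher rootpage (most recently created) wins.
fn pick(cands: &[Candidate], sz_tab: i16) -> Option<&Candidate> {
    cands
        .iter()
        .filter(|c| c.sz < sz_tab)
        .min_by(|a, b| a.sz.cmp(&b.sz).then(b.root.cmp(&a.root)))
}

fn main() {
    // Two single-column indexes of equal width over a three-column table.
    let cands = [
        Candidate { name: "ib", root: 3, sz: 30 },
        Candidate { name: "ic", root: 4, sz: 30 },
    ];
    match pick(&cands, 36) {
        Some(c) => println!("SCAN t USING COVERING INDEX {}", c.name),
        None => println!("SCAN t"),
    }
}
```

Reversing the rootpage comparison inside `then` is what turns "smallest by width" into "newest among equals" without a second pass; when no candidate passes the strict filter, `pick` returns `None` and the plan stays a plain table scan.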
