Commit d24e5cb

Author: Claude Agent
Merge planner (Task C) cost-model routing
Brings in `dfa/planner.rs` (~430 LOC): `ScanPlanner`, `ScanContext`, `ScanPlan` (with reserved `ShiftOr` variant), `ArchProfile` (CPUID once at construction), and a calibrated `estimated_cost_ns` cost model.

Refactors `FoldedContainsDfa`, `FlatContainsDfa`, `MultiContainsDfa::scan_to_bitbuf` to dispatch via the planner into per-path `run_*` helpers; `ssa_saturated` / `escape_pair_targets` consolidated into the planner module.

Adds `test_planner_matches_legacy_cascade` (12 corpus × needle pairs from `benches/fsst_like.rs`) plus 11 unit tests covering each routing decision row.

New `VORTEX_FSST_PLAN_TRACE=1` env var prints planner inputs + chosen plan + estimated cost.

Conflicts resolved:
- `folded_contains.rs`: kept Fat Teddy's accessor methods (`bucketed_pair_codes_slice`, `single_step_accept_codes_slice`) AND the planner's `scan_plan_name` refactor.
- `tests.rs`: kept Fat Teddy's `MultiNeedleMatcher` test section AND the planner's `test_planner_matches_legacy_cascade` bench-parity regression test.

After all three subagent merges (Shift-Or + Fat Teddy + planner), 210 tests pass with `_test-harness`. `cargo +nightly fmt --all` clean.

Deferred TODOs preserved:
- Cross-bucket FDR for ESCAPE_CODE in Fat Teddy.
- AVX-512 / NEON variants of `fat_teddy_pass_*`.
- Planner integration of the `ShiftOr` plan (reserved slot exists; routing decision is still made in `try_new_with`).

Signed-off-by: Claude <noreply@anthropic.com>
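The cost-model routing described in the message can be pictured with a small sketch. `planner.rs` itself is not shown on this page, so everything below is an assumption made for illustration: the two-variant `Plan` enum, the `Ctx` fields, the cost constants, and `choose_plan` are hypothetical, and only the names `estimated_cost_ns` and `VORTEX_FSST_PLAN_TRACE` are borrowed from the message. The idea shown is simply "estimate a cost for every eligible plan, pick the cheapest, optionally trace the decision".

```rust
use std::env;

/// Hypothetical plan set; the real `ScanPlan` has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Plan {
    EscapeOnly,
    RowLoop,
}

/// Hypothetical planner inputs, loosely modelled on the idea of `ScanContext`.
#[derive(Debug, Clone, Copy)]
struct Ctx {
    rows: usize,
    total_bytes: usize,
    has_escape_only_pattern: bool,
}

/// Illustrative cost estimate in nanoseconds. The constants are invented;
/// the commit's model is calibrated, this one is not.
/// Returns `None` when a plan is not eligible for the given context.
fn estimated_cost_ns(plan: Plan, ctx: &Ctx) -> Option<u64> {
    match plan {
        // One pass over the whole buffer, only possible with a pattern.
        Plan::EscapeOnly if ctx.has_escape_only_pattern => {
            Some(50 + ctx.total_bytes as u64 / 10)
        }
        Plan::EscapeOnly => None,
        // Per-row overhead plus a per-byte cost; always eligible.
        Plan::RowLoop => Some(5 * ctx.rows as u64 + ctx.total_bytes as u64),
    }
}

/// Picks the cheapest eligible plan and optionally traces the decision.
fn choose_plan(ctx: &Ctx) -> (Plan, u64) {
    let chosen = [Plan::RowLoop, Plan::EscapeOnly]
        .iter()
        .copied()
        .filter_map(|plan| estimated_cost_ns(plan, ctx).map(|cost| (plan, cost)))
        .min_by_key(|&(_, cost)| cost)
        .expect("RowLoop is always eligible");

    // Assumed trace behaviour: print inputs, plan, and cost when the
    // variable is set to "1". The real output format is not shown here.
    let trace = env::var("VORTEX_FSST_PLAN_TRACE")
        .map(|v| v == "1")
        .unwrap_or(false);
    if trace {
        eprintln!(
            "plan trace: ctx={:?} plan={:?} estimated_cost_ns={}",
            ctx, chosen.0, chosen.1
        );
    }
    chosen
}
```

With these made-up constants, a context of 1000 rows and 10,000 bytes routes to `Plan::EscapeOnly` when a pattern is available and falls back to `Plan::RowLoop` otherwise. Returning `None` for ineligible plans keeps eligibility and cost in one function, which is one possible design and not necessarily the one the planner uses.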
2 parents 666ad9c + b994a06 commit d24e5cb

6 files changed

Lines changed: 1250 additions & 274 deletions

encodings/fsst/src/dfa/flat_contains.rs

Lines changed: 19 additions & 3 deletions
@@ -36,6 +36,9 @@ use super::build_fused_table;
 use super::build_symbol_transitions;
 use super::kmp_byte_transitions;
 use super::needle_bytes_absent_from_all_symbols;
+use super::planner::ScanContext;
+use super::planner::ScanPlan;
+use super::planner::ScanPlanner;
 use super::scan_to_bitbuf_with;
 use super::skip::SkipStrategy;

@@ -58,6 +61,10 @@ pub(crate) struct FlatContainsDfa {
     /// with a single `memmem` over `all_bytes` rather than running the
     /// sentinel-branching per-code DFA on every row.
     escape_only_pattern: Option<Vec<u8>>,
+    /// Routing engine. The flat DFA only routes between `EscapeOnly`
+    /// and `RowLoop`, but going through the planner keeps the
+    /// dispatch surface uniform across the three contains DFAs.
+    planner: ScanPlanner,
 }

 impl FlatContainsDfa {
@@ -118,6 +125,7 @@ impl FlatContainsDfa {
             skip,
             anchor,
             escape_only_pattern,
+            planner: ScanPlanner::new(),
         })
     }

@@ -176,10 +184,18 @@ impl FlatContainsDfa {
     where
         T: vortex_array::dtype::IntegerPType,
     {
-        if let Some(pattern) = self.escape_only_pattern.as_deref() {
-            return self.scan_via_escape_only_memmem(n, offsets, all_bytes, pattern, negated);
+        let ctx = ScanContext::for_flat_or_multi(n, all_bytes, self.escape_only_pattern.is_some());
+        match self.planner.plan_flat_or_multi(&ctx) {
+            ScanPlan::EscapeOnly => {
+                let pattern = self
+                    .escape_only_pattern
+                    .as_deref()
+                    .vortex_expect("EscapeOnly plan requires escape_only_pattern");
+                self.scan_via_escape_only_memmem(n, offsets, all_bytes, pattern, negated)
+            }
+            // The planner only emits these two for the flat DFA today.
+            _ => scan_to_bitbuf_with(n, offsets, all_bytes, negated, |codes| self.matches(codes)),
         }
-        scan_to_bitbuf_with(n, offsets, all_bytes, negated, |codes| self.matches(codes))
     }

     /// Single-`memmem` prefilter for the escape-only regime. Each hit is
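
The dispatch shape in the last hunk (ask for a plan, then match on it and call a per-path helper) can be tried out in isolation with the sketch below. It is a simplified model and not the crate's code: `FlatScanner`, `run_row_loop`, `run_whole_buffer`, and the two-variant `Plan` are hypothetical names, the rows hold plain bytes where the real code holds FSST codes, and a naive `windows` search stands in for both the per-row DFA and `memmem`. The way hits are attributed to rows here is an assumption of the sketch; it does not reconstruct the truncated doc comment above.

```rust
/// Hypothetical two-variant plan, mirroring the two the flat DFA routes between.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Plan {
    EscapeOnly,
    RowLoop,
}

/// Naive substring test, standing in for the per-row DFA.
fn contains(haystack: &[u8], needle: &[u8]) -> bool {
    needle.is_empty() || haystack.windows(needle.len()).any(|w| w == needle)
}

/// Per-row path: test each row slice independently.
fn run_row_loop(offsets: &[usize], all_bytes: &[u8], needle: &[u8]) -> Vec<bool> {
    offsets
        .windows(2)
        .map(|w| contains(&all_bytes[w[0]..w[1]], needle))
        .collect()
}

/// Whole-buffer path: search once, then attribute each hit to the row
/// that contains it entirely. Hits straddling a row boundary are dropped.
fn run_whole_buffer(offsets: &[usize], all_bytes: &[u8], needle: &[u8]) -> Vec<bool> {
    let rows = offsets.len().saturating_sub(1);
    if needle.is_empty() {
        return vec![true; rows];
    }
    let mut out = vec![false; rows];
    for (start, window) in all_bytes.windows(needle.len()).enumerate() {
        if window != needle {
            continue;
        }
        // Index of the first offset greater than `start`; the row is one before it.
        let idx = offsets.partition_point(|&o| o <= start);
        if idx == 0 {
            continue;
        }
        let row = idx - 1;
        if row < rows && start + needle.len() <= offsets[row + 1] {
            out[row] = true;
        }
    }
    out
}

/// Hypothetical stand-in for the flat DFA's dispatch surface.
struct FlatScanner {
    needle: Vec<u8>,
    whole_buffer_pattern: Option<Vec<u8>>,
}

impl FlatScanner {
    fn scan(&self, plan: Plan, offsets: &[usize], all_bytes: &[u8]) -> Vec<bool> {
        match plan {
            Plan::EscapeOnly => {
                let pattern = self
                    .whole_buffer_pattern
                    .as_deref()
                    .expect("EscapeOnly plan requires a pattern");
                run_whole_buffer(offsets, all_bytes, pattern)
            }
            Plan::RowLoop => run_row_loop(offsets, all_bytes, &self.needle),
        }
    }
}
```

For the buffer `abcabxcab` split into rows `abc`, `abx`, `cab`, both paths report a match for `ca` only in the third row, because the hit at byte 2 crosses the boundary between the first two rows. The sketch matches exhaustively on `Plan`, so adding a variant would become a compile error at this site; the hunk above uses a `_` arm instead, which routes any other plan to the row loop.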
