transform: equality-saturation MIR optimizer (mz_transform::eqsat)#37160
transform: equality-saturation MIR optimizer (mz_transform::eqsat)#37160antiguru wants to merge 38 commits into
Conversation
3c41464 to
58565b3
Compare
antiguru
left a comment
There was a problem hiding this comment.
Critical review: eqsat MIR optimizer
I read the engine end-to-end plus the design-review / status / roadmap / polarity-extractor docs. This is genuinely impressive work and the self-assessment in the status doc is unusually honest. The notes below are the gaps I'd want closed (or explicitly punted with eyes open) before this leaves draft, followed by the bigger-picture framing.
The bigger picture first
By the team's own framing, the live logical pass is "parity, not improvement." Every active rule duplicates a rewrite the pipeline already performs, and the one genuine divergence (the WCOJ→DeltaQuery decision) ships only behind the second, default-off flag (enable_eqsat_physical_optimizer). So in the default-on configuration this PR proposes, the user-visible plan benefit today is ≈0, while it adds a 16.5k-line optimizer surface, real optimize latency, and new soundness obligations. The justification is strangler-fig: grow eqsat until it can delete pipeline passes. That's a sound strategy — but no deletion has landed yet, and the one cutover that had landed (skipping CanonicalizeMfp under the flag) is reverted in this very PR because it was unsound (lib.rs:958 comment). So the current state is "extra work for parity," and the merge decision should be judged on that basis, not on the eventual vision.
That's the lens for the blockers below.
Blockers (internal contradictions that make the PR un-mergeable in either flag state)
1. The flag default contradicts itself across the PR. The PR body says "the flag is temporarily defaulted on so CI exercises the pass… Revert to off before un-drafting," and the status doc lists the same revert under "What is left." But definitions.rs (enable_eqsat_optimizer) now carries the opposite comment: "Defaulted on… the intended steady state… not a pre-merge revert." Pick one. Shipping a brand-new experimental optimizer default-on to every user, for zero plan benefit today, is a big call that needs an explicit owner sign-off — not an ambiguity buried in a comment.
2. The SLT goldens are silently coupled to the default-on flag, and nothing pins it. Ten SLT files are rewritten to eqsat-on output (not-null-propagation.slt +218/-… , locally_optimized_plan.slt 118 lines, joins, chbench, aoc_*, record, subquery, demand, reduction_pushdown), but no SLT file sets enable_eqsat_optimizer (grep across test/sqllogictest/ finds zero references). So those goldens pass only because the global default is true. This makes #1 not just a doc nit but a hard contradiction: if you "revert to off before un-drafting," these tests immediately fail; if you keep it on, you're shipping on. There is no flag state in which the PR as written is both green and consistent with its own stated plan. Resolution options: (a) commit to default-on and fix the PR body/status doc; (b) gate each eqsat-divergent golden behind an explicit per-file SET enable_eqsat_optimizer (or mzcompose override) and revert the default to off; (c) revert the golden rewrites. (b) is the honest one if the intent is really "off at merge."
3. Synthesized empty-constant types are placeholders, and the pass is now live. raise::repr_type_of_arity (raise.rs:391) types every column of a synthesized Empty (from union_cancel / empty_false_filter) as Int64?. Its own doc comment still says "The pass is offline; the surrounding optimizer recomputes column types when this is ever wired live" — but it is wired live now. The boundary arity guard in EqSatTransform (transform.rs:64) catches arity drift but not column-type drift. So when one of those rules collapses a branch whose true column types aren't Int64 (e.g. a text/numeric column), eqsat emits an internally-consistent-but-wrong-typed plan. Because Typecheck recomputes bottom-up, the wrong type is self-consistent and may not be rejected — which means it can silently change a query's result-column type, or feed a type-incompatible arm to a sibling Union. Please carry the real ColumnType through the Empty node (it's available at lower time), and add a regression test: an empty-collapsed branch with a non-Int64 column must round-trip its type and survive the final strict Typecheck.
Soundness surface that is asserted but not enforced
4. Negate-join rules are enabled; their soundness rests entirely on the polarity-aware extractor; the matching Lean obligations are sorry. The status doc says the two unsound negate-join rules "were removed," but they're live in the rule file (relational.rewrite:131 distribute_negate_join, :136 factor_negate_join). The newest workstream (2026-06-21-polarity-aware-extractor.md) re-enabled them, with soundness moved into the extractor ("no Negate-rooted representative directly under a non-linear Reduce/TopK") rather than a rule guard. That's a defensible design — but it makes the extractor's polarity lattice a load-bearing correctness mechanism, and the corresponding Lean theorems (negative-multiplicity / non-linear-reduce) are among the 24 sorrys. So the formal-soundness story for the riskiest rules is currently aspirational.
5. The Lean spec is not gated by CI and can rot. There's no CI job that builds src/transform/lean/, and no check that gen-lean's Generated.lean is in sync with the rule file (unlike the repo's other generated artifacts, which have sync gates). 24/32 theorems are sorry. Without a gate, (a) a rule change can desync Generated.lean unnoticed, and (b) the sorry count can grow silently. Either wire a real gate (build Lean, fail on new sorry, assert gen-lean output is regenerated) or state plainly in the docs that the Lean artifact is advisory, not enforced — right now the prose implies more assurance than CI provides.
6. raise alone can emit un-renderable plans; correctness depends on a bundled cleanup. lib.rs/raise.rs note that raise can produce a single Reduce mixing reduction types that ReducePlan::create_from panics on, and only the bundled logical_fixpoint_02 re-run (raise.rs:306) splits it via ReduceReduction. That's an implicit invariant on a comment. Any future caller of raise that skips the cleanup crashes rendering. Make it an enforced invariant + test ("raise output never contains a mixed-reduction Reduce"), or split the reduction inside raise itself.
Architecture / cost
7. The keystone the design-review prescribed was only half-done. The 2026-06-19 design review's #1 recommendation was to retarget ENode onto real MirRelationExpr/MirScalarExpr and delete the dual Rel/lower/raise IR, calling the dual IR "structurally unsuited to replace production passes" (lossy round-trips, type loss, per-operator bail treadmill). What landed is a partial retarget — scalar payloads are real MirScalarExpr, but the relational dual IR remains. So every problem that recommendation targeted persists: raise still drops constant-singleton identity inputs via join_scalars (raise.rs:170), still synthesizes placeholder types (#3), and still bails per-operator. Please reconcile this explicitly: is direct-on-MIR still the plan, or is the dual IR the accepted steady state? The docs don't close the loop with the design review's verdict.
8. The default-on path runs fixpoint_logical_02 twice, back-to-back, plus a full Demand/ProjectionPushdown and two coalesce_mfp passes. EqSatTransform is pushed right after fixpoint_logical_02() in logical_optimizer (lib.rs:815 then the push), and optimize_logical re-runs the entire fixpoint_logical_02 inside raise (raise.rs:306), plus coalesce_mfp twice and full Demand+ProjectionPushdown (eqsat.rs:139-164). For zero plan benefit today, that's a lot of repeated optimizer work. Combined with MAX_PLAN_SIZE = 200 (which makes the pass a no-op on most non-trivial plans anyway), the net effect at default-on is "added latency on small plans, parity output." Please measure the added optimize latency (the status doc itself cites 27s and ~6.5s episodes that were tamed by caps, not by being fast) and let that inform the default-on decision.
9. The "wins" aren't wins. compare_real reports 3 wins / 0 losses / 17 ties, but the 3 are self-described as "cost-model artifacts (eqsat omits a canonicalizing Project)." An omitted canonicalizing projection is plausibly a less-canonical plan, not an improvement. I'd treat the harness as showing parity, and double-check the omitted Project isn't a latent canonicalization regression rather than a feature.
Repo hygiene / process
10. ~3k lines of agent working docs are committed under docs/superpowers/ (plus .superpowers/sdd/e1-report.md), several already stale relative to the code: the design review predates the retarget; the status doc claims "32 rules / negate rules removed / CanonicalizeMfp cutover done" while the code has 34 rules, negate rules enabled, and the cutover reverted. Decide whether these belong in the product repo at all (vs. doc/developer/design/ per existing convention) and, if kept, mark the historical ones as such so they don't read as current truth.
11. 61 commits / 63 files / 16.5k lines in one PR is effectively un-reviewable by a human as a unit. Strongly consider splitting for landing: (a) flag plumbing + no-op registration, (b) the engine, (c) the Lean spec + its CI gate, (d) the SLT golden changes. Each is independently reviewable and bisectable.
What I'd do to "complete the work"
- Resolve #1/#2 as a unit — decide the merge-time flag state and make the goldens consistent with it (I'd default off and gate the divergent goldens explicitly until the first real pass deletion lands, since default-on buys nothing today and adds risk).
- Fix the empty-constant type loss (#3) and add the type round-trip test; promote the mixed-
Reduceinvariant (#6) from comment to test. - Add the Lean CI gate or downgrade the soundness claims in prose (#5); track the negate-join
sorrys (#4) as explicit release-blockers for the physical flag, not the logical one. - Write the docs reconciliation: dual-IR-vs-direct-on-MIR decision (#7), refresh the stale status doc, and quantify the latency (#8).
- Then the strangler-fig payoff becomes real: land the first actual fixpoint deletion behind the flag with a green SLT gate — that's the milestone that converts this from "parity at a cost" into "a pass worth defaulting on."
Happy to dig into any single thread (the polarity lattice, the cost model's memory axis, or the empty-type path) in more depth.
Generated by Claude Code
Introduce an equality-saturation optimizer that lowers a subset of MirRelationExpr into an e-graph, applies confluent rewrite rules plus congruence closure to a bounded fixpoint, and extracts the cheapest equivalent plan under a two-axis (memory-first, time-second) cost model. The pass registers in the logical and physical optimizers behind per-phase feature flags, equivalence- and type-preserving at the live boundary (adopt_if_type_preserving), with a MAX_PLAN_SIZE guard. It reuses the production CanonicalizeMfp, Demand, and ProjectionPushdown machinery after raise, and seeds index availability and literal-filter lookups from the oracle in the physical phase. Includes: the Rel IR with opaque bail, lower/raise round-trip, the saturation engine with budgets and scope peeling for Let/LetRec, lattice e-class analyses (NonNeg, Keys, Monotonic, Equivalences, ConstantColumns), greedy and ILP extractors, extraction-time CSE, a build-time rule DSL with per-rule codegen, a pluggable join orderer (DpSub default, Dpccp, DphypBushy), a Lean 4 soundness-spec emitter, and the architecture reference doc. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019m58NEHNTg4NQ2YuQmrpzQ
0bb7950 to
806aa7e
Compare
Snapshot EXPLAIN PHYSICAL PLAN for the LDBC BI queries (bi and eager variants) with the equality-saturation passes disabled, so these goldens reflect the production optimizer. The companion commit flips the eqsat flags to `true` and rewrites the expected plans, making the diff between the two commits exactly eqsat's effect on the physical plans. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019m58NEHNTg4NQ2YuQmrpzQ
Flip the eqsat flags to `true` and rewrite the LDBC physical-plan goldens. The diff against the parent commit is exactly the equality-saturation optimizer's effect on the physical plans: join-order changes in the delta/differential implementations and a net reduction in plan size. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019m58NEHNTg4NQ2YuQmrpzQ
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The rule's `reads_indexed_global` guard reads index availability, which only the physical pass populates. In the logical pass availability is empty, so the guard is always false and the rule never fires there; gating it `phase physical` removes wasted logical match attempts and matches the existing pattern of `arrange_idempotent`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The pull-up rule now reads this guard via generated matcher code, so the method is live and the "lands in a later change" comment is stale. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Design and implementation plan for pull_project_out_of_join_first, which restores the LDBC BI Q20 delta joins by exposing the bare index-backed Get. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The ILP selects a finite DAG and cannot represent a cyclic plan. Let-free fragments are acyclic by construction (recursive references stay opaque), so a cycle is an upstream invariant violation; detect it with a DFS back-edge check and fall back to greedy up front rather than recursing 500 frames deep in reconstruction. Defense in depth informed by Tensat (MLSys 2021), which found in-ILP acyclicity constraints too costly and moved cycle handling out of band. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The projection pull-up rule and the CSE worth_binding guard reshaped many plans (projections pulled above joins so reads/arrangements share directly, CTE boundaries regrouped), but only ldbc_bi_physical.slt was regoldened when they landed. Regenerate the remaining 19 affected goldens. Plan-text only; no query result rows change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Two query shapes explained with the eqsat passes off then on. An EXCEPT over two reads of the same table: legacy reads it twice, eqsat binds the read once as a CTE. A doubly-nested EXISTS: legacy duplicates the base read and joins through a CrossJoin with a constant, eqsat shares the read and cancels the cross join into a union. Small fully-indexed joins do not differentiate the optimizers, so the file targets subquery-derived plans. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…on notes Three design notes: a survey of e-graph extraction-under-non-local-cost literature against extract.rs; an integration plan for eqsat knobs (time/size budget, per-entrypoint select-vs-maintained split); and a feasibility study of converting the remaining Opaque enodes, including the MirRelationExpr<Inner> functor with a defaulted child type. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The earlier sweep only ran files that differed from upstream, so it missed goldens that equal upstream because they were never updated for eqsat. CI (at --replica-size=scale=1,workers=2) flagged three such slt files and the explain-analyze.td testdrive golden. The slt changes are plan reshaping; the .td change is a differential-join drive-order flip (%1 » %0 -> %0 » %1) in a builtin index dataflow. No query result rows change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…ntity Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…and cost Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…on versions ENode::LocalGet now carries the iteration version, so two references to the same recursive binding that differ only in version (Prev(j) vs Cur(j)) intern to distinct e-classes. The e-graph interns ENode, not Rel, so the version must ride on ENode for congruence to keep the versions apart. With the lift flag off the version is always None and hashing is byte-identical to before. The Rel->ENode and both ENode->Rel boundaries (greedy extract in egraph.rs and the ILP extractor in extract.rs) carry the version through. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…acement With enable_eqsat_wmr_lift on, a subterm that two or more recursive bindings compute identically over the same versioned leaves is hoisted into one new LetRec binding. The new binding is placed at an order slot that realizes its versions: just after the maximum position among the Cur(j) sources it reads, which is before every consumer. Prev(m) reads impose m at or after the binding, satisfied because consumers sit at or after every Prev target. An empty placement interval signals a real bug, so the pass panics with the constraints rather than silently skipping the hoist. The flag threads from ctx.features.enable_eqsat_wmr_lift through optimize_logical into lowering (version tagging) and CSE. With the flag off no versioned subterm hoists and the output is identical to before, so existing goldens are unchanged. Raise's LetRec arm now seeds each binding's type into scope after raising it, so a CSE-introduced reference to a hoisted binding resolves its type. The order placement guarantees a hoisted binding is raised before its consumers. Validated structurally (binding count grows, a fresh binding is read by two or more siblings) and by a strict Typecheck of the raised plan. End-to-end result equivalence is deferred to a SQL-level differential test (Task 7): the mz-transform unit harness has no differential runtime. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Extend `hoist_shared_letrec_subterms` to recognize loop-invariant shared subterms (those that contain no versioned recursive leaf) and hoist them as ordinary `Rel::Let` bindings wrapping the whole `LetRec`, so they are computed once before the fixpoint loop starts rather than on every iteration. Previously, only subterms with at least one versioned leaf were candidates for the in-loop CSE pass (Task 5). Loop-invariant subterms embedded inside a versioned shared subterm escaped the outer CSE after Task 5 reduced their appearance count to one. Task 6 processes invariant candidates alongside versioned ones (in ascending size order), hoisting each invariant subterm as an outer Let before Task 5 consumes its container. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…ibution) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Replace Clobbered/Redundant verdict labels with NetNeutral. The prior labels derived from intrinsic < final_on, but the intrinsic snapshot is taken before JoinImplementation selection, so every Join carries Unimplemented and the comparison was an artifact of snapshot timing, not a real signal. Verdict is now purely: EqsatWins if final_on < final_off, NetNeutral otherwise. The intrinsic count and eqsat_committed_joins are retained as informational columns showing logical-shape change and whether eqsat committed any join plans (currently false everywhere). Add two divergence-regime fixtures (g1/cse-shared-filter-then-join and g2/phase-order-filter-reduce) targeting cases where eqsat's global search could plausibly produce fewer arrangements than the greedy pipeline. Both measure NetNeutral: production already captures the opportunities via its greedy pass ordering, which is the honest measured finding. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…ortunity Add a per-collection distinct-key diagnostic (multi_key_collections / report_multi_key) over the eqsat-OFF fully-optimized plan, flagging any collection JI arranges by two or more distinct keys. Add five probe fixtures (p1-p4) that try to force such a duplication where a single key would serve both join sites. The diagnostic reports zero multi-key collections everywhere: JI plus CSE already arrange each collection once on its equivalence-forced key and reuse it at every site. Conclusive negative for a JI cross-join arrangement-sharing opportunity. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Brute-force every join order on adversarial multi-way-join fixtures (chain, star, cycle, clique, indexed chain) and compare the minimum differential arrangement count over orders to what production JoinImplementation commits. Validates the per-order model against the real planner on JI's own order and against plan_as_delta_query. Finding: on index-free joins JI's order is arrangement-minimal (strong negative for the order lever); the only order-dependent saving appears when an existing index is reusable under one order but not JI's chosen one (chain4_idx: 5 vs 6). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Add count_scalar_ops and count_distinct_scalar_subexprs functions that measure total MirScalarExpr node count and distinct subexpression count across every scalar-bearing relational operator in a plan. Extend Measurement with scalar_ops_off/on and scalar_distinct_off/on fields populated from the eqsat-off and eqsat-on final plans. Print four new columns (sc_off, sc_on, sd_off, sd_on) in the per-fixture table and add a scalar summary line. Result: passthrough confirmed on all 15 fixtures (eqsat-on == eqsat-off on both scalar metrics), establishing the baseline that eqsat treats scalars as opaque. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Phase 0 of the scalar equality-saturation canonicalizer: a contained scalar e-graph with hash-cons, union-find, congruence closure, and a bounded saturation-loop scaffold, plus faithful lower and raise bridges. No rewrite rules, analyses, cost bonuses, or CanonicalizeMfp wiring yet. The round-trip gate asserts the identity invariant raise(lower(e)) == e over a corpus of hand-built MirScalarExpr, since with zero rules the bridge must reproduce the input exactly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Durable progress ledgers, task briefs/reports, and implementation plans for the equality-saturation optimizer work. These normally live in git-ignored scratch. Snapshotting them onto the branch so the project state survives on the remote. Excludes the regenerable review-package diffs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The Literal(Result<Row, EvalError>, _) payload deviation exists so error literals survive the lower/raise bridge. Phase 0 had no test exercising the Err arm. Add one (review finding). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Adds the e-class analysis framework to ScalarEGraph that Phase 1 rewrite rules will use as side-condition guards. `ClassAnalysis` carries two fields, both conservative upper bounds: - `could_error: bool`: OR over all nodes in the class, mirroring MirScalarExpr::could_error (scalar.rs:1226) at the e-node level. - `literal: Option<(Result<Row,EvalError>, ReprColumnType)>`: Some iff the class contains a Literal node. The `analysis` HashMap<Id, ClassAnalysis> is maintained in ScalarEGraph: - `add`: computes the new class's analysis via `analysis::make` immediately after interning (bottom-up, so all child analyses exist). - `union`: calls `analysis::merge` (OR / or()) to fold rb into ra and removes the rb entry. - `rebuild`: clears and recomputes all analyses from scratch after the congruence fixpoint, using a fixpoint loop to handle unknown topological order among surviving classes. Exposes `pub fn analysis(&self, id: Id) -> &ClassAnalysis` for Phase 1. Ships no rewrite rules. All 12 existing round-trip tests still pass. 10 new analysis tests cover: could_error false/true for each leaf variant, propagation through CallBinary (safe func + safe children, erroring child, erroring func), literal Some/None, and merge behavior (OR survives union, literal survives union with non-literal). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
union now requires both classes to carry an analysis via expect rather than defaulting a missing entry to could_error=false. false is the unsound direction for a conservative analysis: a rule reading it as a guard could drop an operand that can error. A missing entry is a bug, so fail loudly. Also restore the Debug derive on ScalarEGraph and assert the non-empty-class invariant in the rebuild analysis recompute, where its violation would otherwise surface as a later analysis() panic. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Snapshot of SDD progress through Phase 1 task 1a (analyses, reviewed) and the 1b brief (rule infra + constant folding, in flight). Scratch ledger. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
An e-graph / equality-saturation optimizer over a subset of MIR, wired into the logical optimizer behind
enable_eqsat_optimizer(default on) and the physical optimizer behindenable_eqsat_physical_optimizer(default off).It lowers MIR to a relational IR with real
MirScalarExprpayloads, saturates a ported rule engine over an e-graph, extracts the cheapest plan under a memory-primary cost model, and raises back. Unsupported subtrees bail to opaque leaves carried verbatim, so the pass is equivalence preserving on arbitrary input; an arity guard at the transform boundary makes any mismatch a no-op.The intended win is the cyclic-join decision: on a cyclic (e.g. triangle) join the AGM cost model proves worst-case-optimal (delta) evaluation dominates the binary join on memory and time, where
JoinImplementationpicks the dominated binary plan. The physical pass commits that decision by tagging the joinJoinImplementation::DeltaQuery(synthesized by reusing the real delta planner), whichJoinImplementationthen leaves unre-planned. WcoJoin is created only for cyclic joins, and a binary join's cost charges its persistent arrangements rather than its transient N-squared output, so the choice is delta where it wins and binary parity elsewhere.Supporting infrastructure: a polarity-aware extractor that never selects a multiplicity-signed representative as the input of a non-linear
Reduce/TopK(with a graceful no-op when a root is otherwise unextractable), a logical/physicalphaseannotation on rewrite rules, and a constant-column e-class analysis. These are equivalence preserving and scoped to the appropriate pass.A prior cutover that skipped
CanonicalizeMfpunder the flag is reverted, since the e-graph does not process every plan and so cannot subsume it.The logical flag is default-on as the intended steady state (the strangler-fig plan: grow eqsat until it can delete pipeline passes). CI exercises the pass across the SLT corpus with the flag on and surfaces any plan or result diffs; the affected goldens are rewritten to the eqsat-on output. The logical pass is parity, not improvement, today; the genuine plan win is the physical worst-case-optimal-join decision behind the second, default-off flag.
🤖 Generated with Claude Code
https://claude.ai/code/session_016CwwG1wCgqf3v8g6uhf1Nx