Skip to content

transform: equality-saturation MIR optimizer (mz_transform::eqsat)#37160

Draft
antiguru wants to merge 38 commits into
MaterializeInc:mainfrom
antiguru:claude/mir-equality-optimizer-sodbej
Draft

transform: equality-saturation MIR optimizer (mz_transform::eqsat)#37160
antiguru wants to merge 38 commits into
MaterializeInc:mainfrom
antiguru:claude/mir-equality-optimizer-sodbej

Conversation

@antiguru

@antiguru antiguru commented Jun 19, 2026

Copy link
Copy Markdown
Member

An e-graph / equality-saturation optimizer over a subset of MIR, wired into the logical optimizer behind enable_eqsat_optimizer (default on) and the physical optimizer behind enable_eqsat_physical_optimizer (default off).

It lowers MIR to a relational IR with real MirScalarExpr payloads, saturates a ported rule engine over an e-graph, extracts the cheapest plan under a memory-primary cost model, and raises back. Unsupported subtrees bail to opaque leaves carried verbatim, so the pass is equivalence preserving on arbitrary input; an arity guard at the transform boundary makes any mismatch a no-op.

The intended win is the cyclic-join decision: on a cyclic (e.g. triangle) join the AGM cost model proves worst-case-optimal (delta) evaluation dominates the binary join on memory and time, where JoinImplementation picks the dominated binary plan. The physical pass commits that decision by tagging the join JoinImplementation::DeltaQuery (synthesized by reusing the real delta planner), which JoinImplementation then leaves unre-planned. WcoJoin is created only for cyclic joins, and a binary join's cost charges its persistent arrangements rather than its transient N-squared output, so the choice is delta where it wins and binary parity elsewhere.

Supporting infrastructure: a polarity-aware extractor that never selects a multiplicity-signed representative as the input of a non-linear Reduce/TopK (with a graceful no-op when a root is otherwise unextractable), a logical/physical phase annotation on rewrite rules, and a constant-column e-class analysis. These are equivalence preserving and scoped to the appropriate pass.

A prior cutover that skipped CanonicalizeMfp under the flag is reverted, since the e-graph does not process every plan and so cannot subsume it.

The logical flag is default-on as the intended steady state (the strangler-fig plan: grow eqsat until it can delete pipeline passes). CI exercises the pass across the SLT corpus with the flag on and surfaces any plan or result diffs; the affected goldens are rewritten to the eqsat-on output. The logical pass is parity, not improvement, today; the genuine plan win is the physical worst-case-optimal-join decision behind the second, default-off flag.

🤖 Generated with Claude Code
https://claude.ai/code/session_016CwwG1wCgqf3v8g6uhf1Nx

@antiguru antiguru force-pushed the claude/mir-equality-optimizer-sodbej branch from 3c41464 to 58565b3 Compare June 20, 2026 10:56

@antiguru antiguru left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical review: eqsat MIR optimizer

I read the engine end-to-end plus the design-review / status / roadmap / polarity-extractor docs. This is genuinely impressive work and the self-assessment in the status doc is unusually honest. The notes below are the gaps I'd want closed (or explicitly punted with eyes open) before this leaves draft, followed by the bigger-picture framing.

The bigger picture first

By the team's own framing, the live logical pass is "parity, not improvement." Every active rule duplicates a rewrite the pipeline already performs, and the one genuine divergence (the WCOJ→DeltaQuery decision) ships only behind the second, default-off flag (enable_eqsat_physical_optimizer). So in the default-on configuration this PR proposes, the user-visible plan benefit today is ≈0, while it adds a 16.5k-line optimizer surface, real optimize latency, and new soundness obligations. The justification is strangler-fig: grow eqsat until it can delete pipeline passes. That's a sound strategy — but no deletion has landed yet, and the one cutover that had landed (skipping CanonicalizeMfp under the flag) is reverted in this very PR because it was unsound (lib.rs:958 comment). So the current state is "extra work for parity," and the merge decision should be judged on that basis, not on the eventual vision.

That's the lens for the blockers below.

Blockers (internal contradictions that make the PR un-mergeable in either flag state)

1. The flag default contradicts itself across the PR. The PR body says "the flag is temporarily defaulted on so CI exercises the pass… Revert to off before un-drafting," and the status doc lists the same revert under "What is left." But definitions.rs (enable_eqsat_optimizer) now carries the opposite comment: "Defaulted on… the intended steady state… not a pre-merge revert." Pick one. Shipping a brand-new experimental optimizer default-on to every user, for zero plan benefit today, is a big call that needs an explicit owner sign-off — not an ambiguity buried in a comment.

2. The SLT goldens are silently coupled to the default-on flag, and nothing pins it. Ten SLT files are rewritten to eqsat-on output (not-null-propagation.slt +218/-… , locally_optimized_plan.slt 118 lines, joins, chbench, aoc_*, record, subquery, demand, reduction_pushdown), but no SLT file sets enable_eqsat_optimizer (grep across test/sqllogictest/ finds zero references). So those goldens pass only because the global default is true. This makes #1 not just a doc nit but a hard contradiction: if you "revert to off before un-drafting," these tests immediately fail; if you keep it on, you're shipping on. There is no flag state in which the PR as written is both green and consistent with its own stated plan. Resolution options: (a) commit to default-on and fix the PR body/status doc; (b) gate each eqsat-divergent golden behind an explicit per-file SET enable_eqsat_optimizer (or mzcompose override) and revert the default to off; (c) revert the golden rewrites. (b) is the honest one if the intent is really "off at merge."

3. Synthesized empty-constant types are placeholders, and the pass is now live. raise::repr_type_of_arity (raise.rs:391) types every column of a synthesized Empty (from union_cancel / empty_false_filter) as Int64?. Its own doc comment still says "The pass is offline; the surrounding optimizer recomputes column types when this is ever wired live" — but it is wired live now. The boundary arity guard in EqSatTransform (transform.rs:64) catches arity drift but not column-type drift. So when one of those rules collapses a branch whose true column types aren't Int64 (e.g. a text/numeric column), eqsat emits an internally-consistent-but-wrong-typed plan. Because Typecheck recomputes bottom-up, the wrong type is self-consistent and may not be rejected — which means it can silently change a query's result-column type, or feed a type-incompatible arm to a sibling Union. Please carry the real ColumnType through the Empty node (it's available at lower time), and add a regression test: an empty-collapsed branch with a non-Int64 column must round-trip its type and survive the final strict Typecheck.

Soundness surface that is asserted but not enforced

4. Negate-join rules are enabled; their soundness rests entirely on the polarity-aware extractor; the matching Lean obligations are sorry. The status doc says the two unsound negate-join rules "were removed," but they're live in the rule file (relational.rewrite:131 distribute_negate_join, :136 factor_negate_join). The newest workstream (2026-06-21-polarity-aware-extractor.md) re-enabled them, with soundness moved into the extractor ("no Negate-rooted representative directly under a non-linear Reduce/TopK") rather than a rule guard. That's a defensible design — but it makes the extractor's polarity lattice a load-bearing correctness mechanism, and the corresponding Lean theorems (negative-multiplicity / non-linear-reduce) are among the 24 sorrys. So the formal-soundness story for the riskiest rules is currently aspirational.

5. The Lean spec is not gated by CI and can rot. There's no CI job that builds src/transform/lean/, and no check that gen-lean's Generated.lean is in sync with the rule file (unlike the repo's other generated artifacts, which have sync gates). 24/32 theorems are sorry. Without a gate, (a) a rule change can desync Generated.lean unnoticed, and (b) the sorry count can grow silently. Either wire a real gate (build Lean, fail on new sorry, assert gen-lean output is regenerated) or state plainly in the docs that the Lean artifact is advisory, not enforced — right now the prose implies more assurance than CI provides.

6. raise alone can emit un-renderable plans; correctness depends on a bundled cleanup. lib.rs/raise.rs note that raise can produce a single Reduce mixing reduction types that ReducePlan::create_from panics on, and only the bundled logical_fixpoint_02 re-run (raise.rs:306) splits it via ReduceReduction. That's an implicit invariant on a comment. Any future caller of raise that skips the cleanup crashes rendering. Make it an enforced invariant + test ("raise output never contains a mixed-reduction Reduce"), or split the reduction inside raise itself.

Architecture / cost

7. The keystone the design-review prescribed was only half-done. The 2026-06-19 design review's #1 recommendation was to retarget ENode onto real MirRelationExpr/MirScalarExpr and delete the dual Rel/lower/raise IR, calling the dual IR "structurally unsuited to replace production passes" (lossy round-trips, type loss, per-operator bail treadmill). What landed is a partial retarget — scalar payloads are real MirScalarExpr, but the relational dual IR remains. So every problem that recommendation targeted persists: raise still drops constant-singleton identity inputs via join_scalars (raise.rs:170), still synthesizes placeholder types (#3), and still bails per-operator. Please reconcile this explicitly: is direct-on-MIR still the plan, or is the dual IR the accepted steady state? The docs don't close the loop with the design review's verdict.

8. The default-on path runs fixpoint_logical_02 twice, back-to-back, plus a full Demand/ProjectionPushdown and two coalesce_mfp passes. EqSatTransform is pushed right after fixpoint_logical_02() in logical_optimizer (lib.rs:815 then the push), and optimize_logical re-runs the entire fixpoint_logical_02 inside raise (raise.rs:306), plus coalesce_mfp twice and full Demand+ProjectionPushdown (eqsat.rs:139-164). For zero plan benefit today, that's a lot of repeated optimizer work. Combined with MAX_PLAN_SIZE = 200 (which makes the pass a no-op on most non-trivial plans anyway), the net effect at default-on is "added latency on small plans, parity output." Please measure the added optimize latency (the status doc itself cites 27s and ~6.5s episodes that were tamed by caps, not by being fast) and let that inform the default-on decision.

9. The "wins" aren't wins. compare_real reports 3 wins / 0 losses / 17 ties, but the 3 are self-described as "cost-model artifacts (eqsat omits a canonicalizing Project)." An omitted canonicalizing projection is plausibly a less-canonical plan, not an improvement. I'd treat the harness as showing parity, and double-check the omitted Project isn't a latent canonicalization regression rather than a feature.

Repo hygiene / process

10. ~3k lines of agent working docs are committed under docs/superpowers/ (plus .superpowers/sdd/e1-report.md), several already stale relative to the code: the design review predates the retarget; the status doc claims "32 rules / negate rules removed / CanonicalizeMfp cutover done" while the code has 34 rules, negate rules enabled, and the cutover reverted. Decide whether these belong in the product repo at all (vs. doc/developer/design/ per existing convention) and, if kept, mark the historical ones as such so they don't read as current truth.

11. 61 commits / 63 files / 16.5k lines in one PR is effectively un-reviewable by a human as a unit. Strongly consider splitting for landing: (a) flag plumbing + no-op registration, (b) the engine, (c) the Lean spec + its CI gate, (d) the SLT golden changes. Each is independently reviewable and bisectable.

What I'd do to "complete the work"

  1. Resolve #1/#2 as a unit — decide the merge-time flag state and make the goldens consistent with it (I'd default off and gate the divergent goldens explicitly until the first real pass deletion lands, since default-on buys nothing today and adds risk).
  2. Fix the empty-constant type loss (#3) and add the type round-trip test; promote the mixed-Reduce invariant (#6) from comment to test.
  3. Add the Lean CI gate or downgrade the soundness claims in prose (#5); track the negate-join sorrys (#4) as explicit release-blockers for the physical flag, not the logical one.
  4. Write the docs reconciliation: dual-IR-vs-direct-on-MIR decision (#7), refresh the stale status doc, and quantify the latency (#8).
  5. Then the strangler-fig payoff becomes real: land the first actual fixpoint deletion behind the flag with a green SLT gate — that's the milestone that converts this from "parity at a cost" into "a pass worth defaulting on."

Happy to dig into any single thread (the polarity lattice, the cost model's memory axis, or the empty-type path) in more depth.


Generated by Claude Code

Introduce an equality-saturation optimizer that lowers a subset of
MirRelationExpr into an e-graph, applies confluent rewrite rules plus
congruence closure to a bounded fixpoint, and extracts the cheapest
equivalent plan under a two-axis (memory-first, time-second) cost model.

The pass registers in the logical and physical optimizers behind per-phase
feature flags, equivalence- and type-preserving at the live boundary
(adopt_if_type_preserving), with a MAX_PLAN_SIZE guard. It reuses the
production CanonicalizeMfp, Demand, and ProjectionPushdown machinery after
raise, and seeds index availability and literal-filter lookups from the
oracle in the physical phase.

Includes: the Rel IR with opaque bail, lower/raise round-trip, the saturation
engine with budgets and scope peeling for Let/LetRec, lattice e-class analyses
(NonNeg, Keys, Monotonic, Equivalences, ConstantColumns), greedy and ILP
extractors, extraction-time CSE, a build-time rule DSL with per-rule codegen,
a pluggable join orderer (DpSub default, Dpccp, DphypBushy), a Lean 4
soundness-spec emitter, and the architecture reference doc.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019m58NEHNTg4NQ2YuQmrpzQ
@antiguru antiguru force-pushed the claude/mir-equality-optimizer-sodbej branch from 0bb7950 to 806aa7e Compare June 24, 2026 09:57
antiguru and others added 26 commits June 24, 2026 14:02
Snapshot EXPLAIN PHYSICAL PLAN for the LDBC BI queries (bi and eager variants)
with the equality-saturation passes disabled, so these goldens reflect the
production optimizer. The companion commit flips the eqsat flags to `true` and
rewrites the expected plans, making the diff between the two commits exactly
eqsat's effect on the physical plans.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019m58NEHNTg4NQ2YuQmrpzQ
Flip the eqsat flags to `true` and rewrite the LDBC physical-plan goldens. The
diff against the parent commit is exactly the equality-saturation optimizer's
effect on the physical plans: join-order changes in the delta/differential
implementations and a net reduction in plan size.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019m58NEHNTg4NQ2YuQmrpzQ
The rule's `reads_indexed_global` guard reads index availability, which
only the physical pass populates. In the logical pass availability is
empty, so the guard is always false and the rule never fires there;
gating it `phase physical` removes wasted logical match attempts and
matches the existing pattern of `arrange_idempotent`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The pull-up rule now reads this guard via generated matcher code, so the
method is live and the "lands in a later change" comment is stale.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Design and implementation plan for pull_project_out_of_join_first, which
restores the LDBC BI Q20 delta joins by exposing the bare index-backed Get.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The ILP selects a finite DAG and cannot represent a cyclic plan. Let-free
fragments are acyclic by construction (recursive references stay opaque), so a
cycle is an upstream invariant violation; detect it with a DFS back-edge check
and fall back to greedy up front rather than recursing 500 frames deep in
reconstruction. Defense in depth informed by Tensat (MLSys 2021), which found
in-ILP acyclicity constraints too costly and moved cycle handling out of band.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The projection pull-up rule and the CSE worth_binding guard reshaped many
plans (projections pulled above joins so reads/arrangements share directly,
CTE boundaries regrouped), but only ldbc_bi_physical.slt was regoldened when
they landed. Regenerate the remaining 19 affected goldens. Plan-text only; no
query result rows change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Two query shapes explained with the eqsat passes off then on. An EXCEPT over
two reads of the same table: legacy reads it twice, eqsat binds the read once
as a CTE. A doubly-nested EXISTS: legacy duplicates the base read and joins
through a CrossJoin with a constant, eqsat shares the read and cancels the
cross join into a union. Small fully-indexed joins do not differentiate the
optimizers, so the file targets subquery-derived plans.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…on notes

Three design notes: a survey of e-graph extraction-under-non-local-cost
literature against extract.rs; an integration plan for eqsat knobs (time/size
budget, per-entrypoint select-vs-maintained split); and a feasibility study of
converting the remaining Opaque enodes, including the MirRelationExpr<Inner>
functor with a defaulted child type.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The earlier sweep only ran files that differed from upstream, so it missed
goldens that equal upstream because they were never updated for eqsat. CI
(at --replica-size=scale=1,workers=2) flagged three such slt files and the
explain-analyze.td testdrive golden. The slt changes are plan reshaping; the
.td change is a differential-join drive-order flip (%1 » %0 -> %0 » %1) in a
builtin index dataflow. No query result rows change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…and cost

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…on versions

ENode::LocalGet now carries the iteration version, so two references to the
same recursive binding that differ only in version (Prev(j) vs Cur(j)) intern
to distinct e-classes. The e-graph interns ENode, not Rel, so the version must
ride on ENode for congruence to keep the versions apart. With the lift flag off
the version is always None and hashing is byte-identical to before.

The Rel->ENode and both ENode->Rel boundaries (greedy extract in egraph.rs and
the ILP extractor in extract.rs) carry the version through.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…acement

With enable_eqsat_wmr_lift on, a subterm that two or more recursive bindings
compute identically over the same versioned leaves is hoisted into one new
LetRec binding. The new binding is placed at an order slot that realizes its
versions: just after the maximum position among the Cur(j) sources it reads,
which is before every consumer. Prev(m) reads impose m at or after the binding,
satisfied because consumers sit at or after every Prev target. An empty
placement interval signals a real bug, so the pass panics with the constraints
rather than silently skipping the hoist.

The flag threads from ctx.features.enable_eqsat_wmr_lift through optimize_logical
into lowering (version tagging) and CSE. With the flag off no versioned subterm
hoists and the output is identical to before, so existing goldens are unchanged.

Raise's LetRec arm now seeds each binding's type into scope after raising it, so
a CSE-introduced reference to a hoisted binding resolves its type. The order
placement guarantees a hoisted binding is raised before its consumers.

Validated structurally (binding count grows, a fresh binding is read by two or
more siblings) and by a strict Typecheck of the raised plan. End-to-end result
equivalence is deferred to a SQL-level differential test (Task 7): the
mz-transform unit harness has no differential runtime.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Extend `hoist_shared_letrec_subterms` to recognize loop-invariant shared
subterms (those that contain no versioned recursive leaf) and hoist them as
ordinary `Rel::Let` bindings wrapping the whole `LetRec`, so they are computed
once before the fixpoint loop starts rather than on every iteration.

Previously, only subterms with at least one versioned leaf were candidates for
the in-loop CSE pass (Task 5). Loop-invariant subterms embedded inside a
versioned shared subterm escaped the outer CSE after Task 5 reduced their
appearance count to one. Task 6 processes invariant candidates alongside
versioned ones (in ascending size order), hoisting each invariant subterm as an
outer Let before Task 5 consumes its container.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…ibution)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
antiguru and others added 11 commits June 26, 2026 22:12
Replace Clobbered/Redundant verdict labels with NetNeutral. The prior labels
derived from intrinsic < final_on, but the intrinsic snapshot is taken before
JoinImplementation selection, so every Join carries Unimplemented and the
comparison was an artifact of snapshot timing, not a real signal.

Verdict is now purely: EqsatWins if final_on < final_off, NetNeutral
otherwise. The intrinsic count and eqsat_committed_joins are retained as
informational columns showing logical-shape change and whether eqsat committed
any join plans (currently false everywhere).

Add two divergence-regime fixtures (g1/cse-shared-filter-then-join and
g2/phase-order-filter-reduce) targeting cases where eqsat's global search
could plausibly produce fewer arrangements than the greedy pipeline. Both
measure NetNeutral: production already captures the opportunities via its
greedy pass ordering, which is the honest measured finding.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
…ortunity

Add a per-collection distinct-key diagnostic (multi_key_collections /
report_multi_key) over the eqsat-OFF fully-optimized plan, flagging any
collection JI arranges by two or more distinct keys. Add five probe
fixtures (p1-p4) that try to force such a duplication where a single key
would serve both join sites. The diagnostic reports zero multi-key
collections everywhere: JI plus CSE already arrange each collection once
on its equivalence-forced key and reuse it at every site. Conclusive
negative for a JI cross-join arrangement-sharing opportunity.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Brute-force every join order on adversarial multi-way-join fixtures
(chain, star, cycle, clique, indexed chain) and compare the minimum
differential arrangement count over orders to what production
JoinImplementation commits. Validates the per-order model against the
real planner on JI's own order and against plan_as_delta_query.

Finding: on index-free joins JI's order is arrangement-minimal (strong
negative for the order lever); the only order-dependent saving appears
when an existing index is reusable under one order but not JI's chosen
one (chain4_idx: 5 vs 6).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Add count_scalar_ops and count_distinct_scalar_subexprs functions that
measure total MirScalarExpr node count and distinct subexpression count
across every scalar-bearing relational operator in a plan. Extend
Measurement with scalar_ops_off/on and scalar_distinct_off/on fields
populated from the eqsat-off and eqsat-on final plans. Print four new
columns (sc_off, sc_on, sd_off, sd_on) in the per-fixture table and add
a scalar summary line. Result: passthrough confirmed on all 15 fixtures
(eqsat-on == eqsat-off on both scalar metrics), establishing the baseline
that eqsat treats scalars as opaque.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Phase 0 of the scalar equality-saturation canonicalizer: a contained
scalar e-graph with hash-cons, union-find, congruence closure, and a
bounded saturation-loop scaffold, plus faithful lower and raise bridges.
No rewrite rules, analyses, cost bonuses, or CanonicalizeMfp wiring yet.

The round-trip gate asserts the identity invariant raise(lower(e)) == e
over a corpus of hand-built MirScalarExpr, since with zero rules the
bridge must reproduce the input exactly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Durable progress ledgers, task briefs/reports, and implementation plans for
the equality-saturation optimizer work. These normally live in git-ignored
scratch. Snapshotting them onto the branch so the project state survives on
the remote. Excludes the regenerable review-package diffs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
The Literal(Result<Row, EvalError>, _) payload deviation exists so error
literals survive the lower/raise bridge. Phase 0 had no test exercising the
Err arm. Add one (review finding).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Adds the e-class analysis framework to ScalarEGraph that Phase 1 rewrite
rules will use as side-condition guards.

`ClassAnalysis` carries two fields, both conservative upper bounds:
- `could_error: bool`: OR over all nodes in the class, mirroring
  MirScalarExpr::could_error (scalar.rs:1226) at the e-node level.
- `literal: Option<(Result<Row,EvalError>, ReprColumnType)>`: Some iff
  the class contains a Literal node.

The `analysis` HashMap<Id, ClassAnalysis> is maintained in ScalarEGraph:
- `add`: computes the new class's analysis via `analysis::make` immediately
  after interning (bottom-up, so all child analyses exist).
- `union`: calls `analysis::merge` (OR / or()) to fold rb into ra and
  removes the rb entry.
- `rebuild`: clears and recomputes all analyses from scratch after the
  congruence fixpoint, using a fixpoint loop to handle unknown topological
  order among surviving classes.

Exposes `pub fn analysis(&self, id: Id) -> &ClassAnalysis` for Phase 1.

Ships no rewrite rules. All 12 existing round-trip tests still pass.
10 new analysis tests cover: could_error false/true for each leaf variant,
propagation through CallBinary (safe func + safe children, erroring child,
erroring func), literal Some/None, and merge behavior (OR survives union,
literal survives union with non-literal).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
union now requires both classes to carry an analysis via expect rather than
defaulting a missing entry to could_error=false. false is the unsound
direction for a conservative analysis: a rule reading it as a guard could drop
an operand that can error. A missing entry is a bug, so fail loudly.

Also restore the Debug derive on ScalarEGraph and assert the non-empty-class
invariant in the rebuild analysis recompute, where its violation would
otherwise surface as a later analysis() panic.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Snapshot of SDD progress through Phase 1 task 1a (analyses, reviewed) and the
1b brief (rule infra + constant folding, in flight). Scratch ledger.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lyxbqj9pctLPT2nSTy8DLu
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant