Skip to content

expose SpjNormalForm::predicate() + Predicate Display impl#120

Merged
xudong963 merged 2 commits into
datafusion-contrib:mainfrom
zhuqi-lucas:qizhu/expose-spj-predicate
Jun 25, 2026
Merged

expose SpjNormalForm::predicate() + Predicate Display impl#120
xudong963 merged 2 commits into
datafusion-contrib:mainfrom
zhuqi-lucas:qizhu/expose-spj-predicate

Conversation

@zhuqi-lucas

Copy link
Copy Markdown
Contributor

Summary

SpjNormalForm already exposes public getters for output_schema, output_exprs, and referenced_tables, but predicate stays private. Downstream consumers that want to surface the SPJ normal form via catalog metadata or for debugging view-matching engagement currently have to walk the original LogicalPlan and re-collect Filter / TableScan.filters themselves, which loses the normalization the analyzer applied.

This PR closes that gap:

  • Promotes Predicate and ColumnEquivalenceClass from private to pub so they can be referenced from public APIs.
  • Adds pub fn predicate(&self) -> &Predicate on SpjNormalForm.
  • Adds a Display impl on Predicate that AND-joins the three internal buckets the subsumption tests already use:
    • pairwise equalities derived from each multi-column equivalence class (singletons emit nothing),
    • narrowed range intervals per equivalence class (classes whose interval is still the default unbounded interval are skipped, so only meaningful constraints surface),
    • residual filter expressions, rendered via their Expr Display and sorted for deterministic output.

The internal fields stay private, so this does not lock in any particular representation -- only the read-only view is published.

Motivation

Downstream of this crate (in atlas), we are exposing two catalog tables meta.mv_normal_form / meta.mv_candidate_map so engineers can inspect why view matching does or does not engage on a given table (e.g., why is tickers_by_market_inactive_v1 only a partial-coverage MV? Answer: it carries active = false). The current workaround walks the un-normalized LogicalPlan, which works but mirrors the analyzer's job. Routing through the normalized predicate keeps the catalog output consistent with what the matcher actually consumes.

Tests

Existing 21 tests pass. Adds one new test test_predicate_getter_and_display that exercises the new getter + Display on a predicate populating all three buckets (equality class, range, residual).

`SpjNormalForm` already exposes public getters for `output_schema`,
`output_exprs`, and `referenced_tables`, but the `predicate` field is
private. Downstream consumers that want to surface the SPJ normal form
(e.g. atlas catalog tables exposing `mv_normal_form` for debugging why
view matching does or does not engage) currently have to walk the
original `LogicalPlan` and re-collect Filter / TableScan filters as a
string, which loses the normalization the analyzer applied.

This change:

* Promotes `Predicate` and `ColumnEquivalenceClass` from private to
  `pub` so they can be referenced from public APIs.
* Adds `pub fn predicate(&self) -> &Predicate` on `SpjNormalForm`.
* Adds a `Display` impl on `Predicate` that AND-joins:
  - pairwise equalities derived from each multi-column equivalence
    class (singletons emit nothing),
  - narrowed range intervals per equivalence class (classes whose
    interval is still the default unbounded interval are skipped, so
    only meaningful constraints surface),
  - residual filter expressions, sorted for deterministic output.

The internal fields stay private, so this does not lock in any
particular representation; only the read-only view is published.

A unit test exercises the new getter + Display on a predicate that
populates all three buckets (eq class, range, residual).
Copilot AI review requested due to automatic review settings June 23, 2026 06:04

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR exposes the normalized filter predicate from SpjNormalForm so downstream consumers can surface/debug the analyzer’s SPJ-normalized filters, and adds a human-readable Display rendering for that predicate.

Changes:

  • Adds pub fn predicate(&self) -> &Predicate to SpjNormalForm.
  • Promotes Predicate and ColumnEquivalenceClass to pub and implements Display for Predicate.
  • Adds a new unit test covering the getter + Display.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/rewrite/normal_form.rs
Comment thread src/rewrite/normal_form.rs Outdated
Comment thread src/rewrite/normal_form.rs
- Reword the `Predicate` doc comment so it no longer implies field
  accessors that don't exist; describe the internal buckets instead.
- Make `Display for Predicate` render `FALSE` when any equivalence
  class has an empty range (`None` in `ranges_by_equivalence_class`
  after a prior `Interval::intersect` collapsed). Previously the
  unsatisfiable predicate looked identical to "no filter", which is
  misleading for catalog surfacing.
- Revert `pub struct ColumnEquivalenceClass` to `struct`; the type is
  not referenced from any public signature, so keeping it private avoids
  needlessly expanding the crate's public surface.
- Add `debug_assert!` calls on the in-loop `continue` paths in Display
  that fire only on broken internal invariants (length mismatches,
  empty classes, missing schema entries). The intentional unbounded
  skip stays a silent continue.
- Strengthen `test_predicate_getter_and_display` so it actually asserts
  the range bucket (the `a >= 5` narrowing) renders, not just the
  equality + residual buckets.
@xudong963 xudong963 merged commit 7aaabea into datafusion-contrib:main Jun 25, 2026
8 checks passed
@github-actions github-actions Bot mentioned this pull request Mar 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants