Commit fe8dbfa
Support distinct-from predicates in Parquet pruning (#22084)
## Which issue does this PR close?
- Closes #.
## Rationale for this change
Parquet statistics pruning did not rewrite `IS DISTINCT FROM` or `IS NOT
DISTINCT FROM`, so row groups that could be proven irrelevant from
min/max and null-count statistics were still kept.
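To make the null handling concrete, here is a minimal sketch of the row-level semantics the two operators have in SQL, modeling `NULL` as `None`. The function names are hypothetical and are not taken from this PR or from DataFusion.

```rust
/// SQL `a IS DISTINCT FROM b`: NULL-safe inequality.
/// Two NULLs are not distinct; NULL versus any value is distinct.
/// Rust's `Option` equality already behaves this way.
fn is_distinct_from(a: Option<i64>, b: Option<i64>) -> bool {
    a != b
}

/// SQL `a IS NOT DISTINCT FROM b`: NULL-safe equality.
fn is_not_distinct_from(a: Option<i64>, b: Option<i64>) -> bool {
    a == b
}

/// Ordinary SQL `a <> b` for contrast: the result is unknown (`None`)
/// as soon as either side is NULL.
fn sql_not_eq(a: Option<i64>, b: Option<i64>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x != y),
        _ => None,
    }
}
```

Because the distinct-from operators always return a definite true or false, a pruning rewrite for them has to consult null counts as well as min/max values.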
## What changes are included in this PR?
- Adds null-aware pruning rewrites for `IS DISTINCT FROM` and `IS NOT
DISTINCT FROM`.
- Treats distinct-from operators as symmetric when normalizing
scalar-left predicates.
- Refactors shared min/max and null-count pruning expression builders.
- Adds unit tests for pruning predicate evaluation and Parquet row-group
regression coverage.
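The idea behind a null-aware rewrite can be sketched as follows. This is an illustration under stated assumptions, not the code added by this PR: `ColumnStats`, `DistinctOp`, and `may_match` are hypothetical names, the column is assumed to be an integer compared against a literal, and a statistic of `None` means "unknown".

```rust
/// Hypothetical per-row-group statistics for one column.
/// `None` means the statistic is unknown.
#[derive(Debug, Clone, Copy)]
struct ColumnStats {
    min: Option<i64>,
    max: Option<i64>,
    null_count: Option<u64>,
    row_count: Option<u64>,
}

#[derive(Debug, Clone, Copy)]
enum DistinctOp {
    IsDistinctFrom,
    IsNotDistinctFrom,
}

impl ColumnStats {
    /// True only when statistics prove every row is NULL.
    fn all_null(&self) -> bool {
        matches!((self.null_count, self.row_count), (Some(n), Some(r)) if n == r)
    }

    /// True only when statistics prove no row is NULL.
    fn no_nulls(&self) -> bool {
        self.null_count == Some(0)
    }
}

/// Returns `true` if the row group might contain a matching row and must
/// be kept. Returns `false` only when statistics prove no row can match.
/// `literal` is the constant side of the predicate; `None` is SQL NULL.
fn may_match(stats: &ColumnStats, op: DistinctOp, literal: Option<i64>) -> bool {
    match (op, literal) {
        // `col IS DISTINCT FROM NULL` matches non-null rows.
        (DistinctOp::IsDistinctFrom, None) => !stats.all_null(),
        // `col IS NOT DISTINCT FROM NULL` matches null rows.
        (DistinctOp::IsNotDistinctFrom, None) => !stats.no_nulls(),
        // `col IS DISTINCT FROM v` matches NULLs and values other than v,
        // so it is prunable only if every row is non-null and equal to v.
        (DistinctOp::IsDistinctFrom, Some(v)) => {
            let all_equal_v = stats.min == Some(v) && stats.max == Some(v);
            !(stats.no_nulls() && all_equal_v)
        }
        // `col IS NOT DISTINCT FROM v` matches only non-null rows equal to v,
        // so it is prunable if all rows are NULL or v lies outside [min, max].
        (DistinctOp::IsNotDistinctFrom, Some(v)) => {
            if stats.all_null() {
                return false;
            }
            let below_min = stats.min.map_or(false, |min| v < min);
            let above_max = stats.max.map_or(false, |max| v > max);
            !(below_min || above_max)
        }
    }
}
```

Since both operators are symmetric, a predicate written with the scalar on the left (for example `5 IS DISTINCT FROM col`) can be evaluated in this sketch by passing the same operator with the column statistics and the literal swapped into place. Unknown statistics always result in the row group being kept.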
## Are these changes tested?
## Are there any user-facing changes?
No API changes. Queries using `IS DISTINCT FROM` and `IS NOT DISTINCT
FROM` can now benefit from Parquet statistics pruning.

1 parent 04fbade commit fe8dbfa
3 files changed
Lines changed: 318 additions & 52 deletions
File tree
- datafusion
- core/tests/parquet
- expr-common/src
- pruning/src