perf: use sargable (Search ARGument ABLE) range predicates for datetime search filters #809
Replace date_trunc() (which wraps columns in date_format() on MySQL) with range-based comparisons on the raw column. This allows the database to use indexes on datetime columns instead of performing full table scans.
Pull request overview
This PR optimizes datetime search filter performance by replacing non-sargable date_trunc() function calls with index-friendly range predicates. The old implementation wrapped datetime columns in date_format() (MySQL) or strftime() (SQLite) which prevented database index usage, causing full table scans. The new implementation uses direct range comparisons (col >= start AND col < end) that allow the database to utilize indexes on datetime columns, resulting in significant performance improvements (e.g., 0.649s vs 12.806s in the example provided).
Changes:
- Removed date_trunc() wrapper and import from base.py
- Added three new helper functions to compute datetime period boundaries and build sargable range expressions
- Modified apply_search_filters to use range predicates for all datetime search operations (eq, neq, gt, lt, in, not in) across all precision levels (YEAR, MONTH, DAY, HOUR, MINUTE, SECOND)
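To illustrate the boundary computation described in the list above, here is a minimal sketch of how a half-open interval [start, end) could be derived for each precision level. The names `period_bounds` and this `TimeResolution` enum are hypothetical stand-ins, not the helpers added in this PR, and the sketch assumes naive (timezone-free) datetimes.

```python
from datetime import datetime, timedelta
from enum import Enum


class TimeResolution(str, Enum):
    # Hypothetical stand-in for the enum used in the PR.
    YEAR = "YEAR"
    MONTH = "MONTH"
    DAY = "DAY"
    HOUR = "HOUR"
    MINUTE = "MINUTE"
    SECOND = "SECOND"


def period_bounds(value: datetime, resolution: TimeResolution) -> tuple[datetime, datetime]:
    """Return the half-open interval [start, end) of the period containing value."""
    if resolution is TimeResolution.YEAR:
        return datetime(value.year, 1, 1), datetime(value.year + 1, 1, 1)
    if resolution is TimeResolution.MONTH:
        start = datetime(value.year, value.month, 1)
        # Roll over to January of the next year after December.
        if value.month == 12:
            return start, datetime(value.year + 1, 1, 1)
        return start, datetime(value.year, value.month + 1, 1)
    if resolution is TimeResolution.DAY:
        start = value.replace(hour=0, minute=0, second=0, microsecond=0)
        return start, start + timedelta(days=1)
    if resolution is TimeResolution.HOUR:
        start = value.replace(minute=0, second=0, microsecond=0)
        return start, start + timedelta(hours=1)
    if resolution is TimeResolution.MINUTE:
        start = value.replace(second=0, microsecond=0)
        return start, start + timedelta(minutes=1)
    # SECOND precision: drop sub-second digits only.
    start = value.replace(microsecond=0)
    return start, start + timedelta(seconds=1)
```

Using an exclusive upper bound avoids having to guess the last representable instant of a period (for example 23:59:59.999999), which depends on the column's stored precision.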
Replace raw string comparisons with ScalarSearchOperator, VectorSearchOperator, and a new TimeResolution StrEnum for type safety in apply_search_filters and the datetime range helpers.
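As a rough sketch of why enum-typed operators help here, the snippet below maps each scalar operator to a predicate on the raw column, given a period [start, end). The enum members, their string values, the `range_predicate` name, and the chosen semantics (gt meaning "at or after the end of the period", lt meaning "before its start") are assumptions made for illustration, not the PR's actual code.

```python
from datetime import datetime
from enum import Enum


class ScalarSearchOperator(str, Enum):
    # Hypothetical members; the real enum's names and values may differ.
    EQUAL = "eq"
    NOT_EQUAL = "neq"
    GREATER_THAN = "gt"
    LESS_THAN = "lt"


def range_predicate(column: str, op: ScalarSearchOperator, start: datetime, end: datetime):
    """Return (sql, params) comparing the raw column against the period [start, end).

    `column` must be a trusted identifier; only the values are parameterized.
    """
    if op is ScalarSearchOperator.EQUAL:
        return f"{column} >= ? AND {column} < ?", [start, end]
    if op is ScalarSearchOperator.NOT_EQUAL:
        return f"{column} < ? OR {column} >= ?", [start, end]
    if op is ScalarSearchOperator.GREATER_THAN:
        return f"{column} >= ?", [end]
    if op is ScalarSearchOperator.LESS_THAN:
        return f"{column} < ?", [start]
    raise ValueError(f"unsupported operator: {op!r}")
```

Under these assumptions, the vector operators (in, not in) could be built by joining the eq predicates with OR, or the neq predicates with AND, one per listed value.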
aldbr left a comment
Oops I don't know what happened but it looks like some of my comments were removed 😅
Remove the date_trunc function from functions.py as it has no remaining callers. Replace elif with if in range builder functions since each branch returns early.
…me search filters (DIRACGrid#809) * perf: use sargable range predicates for datetime search filters Replace date_trunc() (which wraps columns in date_format() on MySQL) with range-based comparisons on the raw column. This allows the database to use indexes on datetime columns instead of performing full table scans.
- apply_search_filters previously used date_trunc() to wrap datetime columns in date_format() (MySQL) / strftime() (SQLite), which prevents the database from using indexes on those columns
- The filters now use sargable range predicates (col >= start AND col < end instead of date_format(col, '%Y-%m-%d') = '2025-08-25')

Closes #642
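The effect described in this PR can be observed on a small scale with SQLite's EXPLAIN QUERY PLAN. The sketch below uses a made-up `jobs` table with an indexed `submitted` column (both names are assumptions for the demo, not the project's schema) and compares the plan for a function-wrapped filter with the plan for a range filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, submitted TEXT)")
conn.execute("CREATE INDEX idx_jobs_submitted ON jobs (submitted)")


def plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)


# Non-sargable: the column is hidden inside a function call.
wrapped = plan(
    "SELECT id FROM jobs WHERE strftime('%Y-%m-%d', submitted) = ?",
    ("2025-08-25",),
)

# Sargable: the raw column is compared against precomputed bounds.
ranged = plan(
    "SELECT id FROM jobs WHERE submitted >= ? AND submitted < ?",
    ("2025-08-25 00:00:00", "2025-08-26 00:00:00"),
)

print(wrapped)  # expected to report a SCAN over the table or index
print(ranged)   # expected to report a SEARCH using idx_jobs_submitted
```

The exact plan text varies between SQLite versions, so the distinction to look for is SCAN (every row visited) versus SEARCH (index range lookup).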