benchmarks/queries/clickbench
@@ -241,8 +241,24 @@ These queries test the performance of the `FIRST_VALUE` aggregation function wit
 | Q12 | `WatchID` | `Int64` | `OS` | `Int16` | 91 |


+### Q13: Filter-only URL prefix match

+**Question**: "Which counters have the most page views with URLs that look like HTTP URLs?"

+**Important Query Properties**: Filter-only string prefix match. The `URL`
+column is used only by the pushed-down filter and is not projected or
+aggregated. This makes the query useful for measuring optimizations that can
+skip RowFilter evaluation when Parquet row group statistics prove that all rows
+in a row group satisfy the prefix predicate.
+
+```sql
+SELECT "CounterID", COUNT(*) AS page_views
+FROM hits
+WHERE "URL" LIKE 'http%'
+GROUP BY "CounterID"
+ORDER BY page_views DESC
+LIMIT 10;
+```

 ## Data Notes

+-- Must set for ClickBench hits_partitioned dataset. See https://github.com/apache/datafusion/issues/16591
+-- set datafusion.execution.parquet.binary_as_string = true
+
+SELECT "CounterID", COUNT(*) AS page_views FROM hits WHERE "URL" LIKE 'http%' GROUP BY "CounterID" ORDER BY page_views DESC LIMIT 10;
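
The Q13 description says that row group statistics can prove every row satisfies the prefix predicate. A minimal sketch of that check, in Python for illustration only: the `ColumnStats` type and `all_rows_match_prefix` function are hypothetical names, not DataFusion or Parquet APIs, and the sketch assumes the prefix is a literal string with no `%` or `_` wildcards. The reasoning is that strings sharing a prefix form a contiguous range in lexicographic order, so if both the minimum and the maximum start with the prefix, every value between them does too. Rows with a NULL `URL` never satisfy `LIKE`, so the null count must be known to be zero.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnStats:
    """Hypothetical stand-in for one row group's statistics on a string column."""
    min_value: Optional[str]   # None when the statistic is missing
    max_value: Optional[str]   # None when the statistic is missing
    null_count: Optional[int]  # None when the statistic is missing


def all_rows_match_prefix(stats: ColumnStats, prefix: str) -> bool:
    """Return True only when the statistics prove every row matches LIKE 'prefix%'.

    A False result means "not proven", so the caller must still evaluate
    the filter row by row.
    """
    # Missing bounds prove nothing.
    if stats.min_value is None or stats.max_value is None:
        return False
    # NULL values never satisfy LIKE, so any (or an unknown number of) NULLs
    # rules out the shortcut.
    if stats.null_count is None or stats.null_count > 0:
        return False
    # Every value v satisfies min <= v <= max. Strings with a common prefix are
    # contiguous in lexicographic order, so both bounds carrying the prefix
    # implies every value carries it.
    return stats.min_value.startswith(prefix) and stats.max_value.startswith(prefix)


if __name__ == "__main__":
    proven = ColumnStats("http://a.example/", "https://z.example/", 0)
    unproven = ColumnStats("ftp://a.example/", "https://z.example/", 0)
    print(all_rows_match_prefix(proven, "http"))
    print(all_rows_match_prefix(unproven, "http"))
```

When the check succeeds for a row group, the filter on `URL` is redundant there, which is the case this benchmark query is meant to exercise because `URL` is not needed for projection or aggregation.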