Commit 6f333fc
committed
Force hash/merge join on no-BY rewrite to fix N-call OS round-trip blowup
A perf A/B on a local 20k-doc index uncovered a real problem in the
no-BY rewrite that was hidden by tests with bounded result sets.
Before this fix, the no-BY case emitted:
EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
leftScan (returns N rows)
rightScan (returns 1 row — the COUNT() scalar)
Calcite's NestedLoopJoin contract calls Enumerable.enumerator() on the
right side once per left tuple. Each enumerator open on a
CalciteEnumerableIndexScan triggers a fresh OpenSearch _search request.
For a 10k-row left side that means 10k OpenSearch calls. On a remote
cluster (1-10ms RTT per call), the head-less query would take tens of
seconds.
Measured on the local node, 20k docs, single shard, no head:
before: 10,004 OS calls per PPL query, ~1055ms wall
after: 4 OS calls per PPL query, ~174ms wall
That's ~6x faster wall and ~2500x fewer OS round-trips, with no
correctness change (results identical).
Fix: in the no-BY branch of rewriteWindowAsAggregateJoin, project a
literal-0 key column onto both sides (left: append after orig cols;
right: append after agg outputs) and join on equality. The equi-join
condition makes the planner pick EnumerableHashJoin, which drains the
single-row right side once into a hash table and probes per left row
in O(1).
Pushdown still fires on the right side — verified via EXPLAIN that the
right scan still carries `AGGREGATION->...COUNT()` and `size:0` in
PushDownContext; the literal-0 projection is a top-level wrapper that
doesn't disrupt the Aggregate→Scan operand chain
AggregateIndexScanRule.AGGREGATE_SCAN matches.
The BY case is unchanged — it already has an equi-join condition (or
IS NOT DISTINCT FROM for bucketNullable=true) which Calcite handles
correctly via EnumerableMergeJoin.
Test expectation updates:
- CalcitePPLEventstatsTest.testEventstatsCount / testEventstatsAvg
- CalcitePPLEventstatsEarliestLatestTest no-BY variants (4 tests)
- explain_eventstats_dc.json (no-BY, pushdown)
- explain_eventstats_earliest_latest_no_group.json (no-BY, both modes)
Outer LogicalProject now appears in the no-BY case because we must
strip the literal-0 key columns from the join output — it's no longer
a no-op passthrough that Calcite folds.
Signed-off-by: Jialiang Liang <jiallian@amazon.com>1 parent bb8d9b2 commit 6f333fc
6 files changed
Lines changed: 116 additions & 51 deletions
File tree
- core/src/main/java/org/opensearch/sql/calcite
- integ-test/src/test/resources/expectedOutput
- calcite_no_pushdown
- calcite
- ppl/src/test/java/org/opensearch/sql/ppl/calcite
Lines changed: 44 additions & 12 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2165 | 2165 | | |
2166 | 2166 | | |
2167 | 2167 | | |
2168 | | - | |
2169 | | - | |
2170 | | - | |
2171 | | - | |
2172 | | - | |
2173 | | - | |
| 2168 | + | |
| 2169 | + | |
| 2170 | + | |
| 2171 | + | |
| 2172 | + | |
| 2173 | + | |
| 2174 | + | |
| 2175 | + | |
2174 | 2176 | | |
| 2177 | + | |
| 2178 | + | |
| 2179 | + | |
| 2180 | + | |
| 2181 | + | |
| 2182 | + | |
| 2183 | + | |
| 2184 | + | |
| 2185 | + | |
| 2186 | + | |
| 2187 | + | |
| 2188 | + | |
| 2189 | + | |
| 2190 | + | |
| 2191 | + | |
| 2192 | + | |
| 2193 | + | |
| 2194 | + | |
| 2195 | + | |
| 2196 | + | |
| 2197 | + | |
| 2198 | + | |
2175 | 2199 | | |
2176 | 2200 | | |
2177 | 2201 | | |
2178 | | - | |
| 2202 | + | |
| 2203 | + | |
| 2204 | + | |
| 2205 | + | |
| 2206 | + | |
2179 | 2207 | | |
2180 | 2208 | | |
2181 | 2209 | | |
| |||
2199 | 2227 | | |
2200 | 2228 | | |
2201 | 2229 | | |
2202 | | - | |
2203 | | - | |
2204 | | - | |
2205 | | - | |
| 2230 | + | |
| 2231 | + | |
| 2232 | + | |
2206 | 2233 | | |
2207 | 2234 | | |
2208 | 2235 | | |
2209 | 2236 | | |
2210 | 2237 | | |
2211 | 2238 | | |
2212 | 2239 | | |
2213 | | - | |
| 2240 | + | |
| 2241 | + | |
| 2242 | + | |
| 2243 | + | |
2214 | 2244 | | |
2215 | 2245 | | |
2216 | 2246 | | |
| |||
2219 | 2249 | | |
2220 | 2250 | | |
2221 | 2251 | | |
| 2252 | + | |
| 2253 | + | |
2222 | 2254 | | |
2223 | 2255 | | |
2224 | 2256 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | | - | |
4 | | - | |
| 3 | + | |
| 4 | + | |
5 | 5 | | |
6 | 6 | | |
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | | - | |
4 | | - | |
| 3 | + | |
| 4 | + | |
5 | 5 | | |
6 | 6 | | |
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | | - | |
4 | | - | |
| 3 | + | |
| 4 | + | |
5 | 5 | | |
6 | 6 | | |
0 commit comments