Commit f111fca
authored
[PPL][analytics-engine-compatibility] Lower isempty/isblank to CHAR_LENGTH = 0 instead of multiset IS_EMPTY (#5439)
* [PPL] Lower isempty/isblank to CHAR_LENGTH = 0 instead of multiset IS_EMPTY
PPLFuncImpTable was emitting SqlStdOperatorTable.IS_EMPTY against string
operands as part of its isempty(x) / isblank(x) lowering. That operator
is the SQL:2003 multiset emptiness predicate — its OperandTypeChecker is
COLLECTION_OR_MAP and its enumerable-runtime binding calls
java.util.Collection.isEmpty() reflectively. Passing a string slipped
through only because RexBuilder.makeCall bypasses the operand checker
and Calcite's Linq4j codegen emits a bare `target.isEmpty()` that happens
to bind to String.isEmpty() at Janino compile time. Outside of the
script-pushdown path, the abuse breaks: any backend that does not run
RexNodes through Calcite's enumerable codegen — most notably Substrait
emitters such as the analytics-engine route — has no isthmus mapping
for SqlStdOperatorTable.IS_EMPTY and rejects the plan with
"unresolved function reference IS EMPTY".
Replace the lowering with the predicate that actually matches the doc:
isempty(x) -> OR(IS_NULL(x), CHAR_LENGTH(x) = 0)
isblank(x) -> OR(IS_NULL(x), CHAR_LENGTH(TRIM(BOTH ' ' FROM x)) = 0)
CHAR_LENGTH is a standard string operator with explicit type semantics,
backed by SqlFunctions.charLength(String) in Calcite's enumerable
runtime, and it round-trips through Substrait's default extension
catalog. Same observable behavior on existing PPL paths; correct
behavior on Substrait-based paths.
Also remove PredicateAnalyzer.containIsEmptyFunction. It existed solely
to abort DSL pushdown when it saw the OR(IS_NULL, IS_EMPTY) shape
because the SHOULD branches would NPE on null operands evaluating
.isEmpty() through the script. The new shape produces
OR(IS_NULL, CHAR_LENGTH = 0); the IS_NULL arm pushes down to
must_not.exists, the CHAR_LENGTH arm to a script that returns null
(not NPE) for null fields, and the SHOULD union answers correctly.
Test plan:
- CalcitePPLConditionBuiltinFunctionIT.testIsEmpty / testIsBlank pass.
- CalciteExplainIT.testExplainIsEmpty / testExplainIsBlank /
testExplainIsEmptyOrOthers pass with regenerated golden plans for
both pushdown-enabled and no-pushdown variants.
- PredicateAnalyzerTest passes, including the unrelated
equals_scriptPushDown_Struct test that still uses
SqlStdOperatorTable.IS_EMPTY for its legitimate map-emptiness
semantics.
Signed-off-by: Songkan Tang <songkant@amazon.com>
* [PPL] Strengthen isempty pushdown test to assert mixed-shape bool.should
The previous test (isNullOr_ScriptPushDown) only asserted that the builder
string contained the opensearch_compounded_script lang marker. With the
isempty -> OR(IS_NULL, CHAR_LENGTH=0) lowering, the IS_NULL arm now pushes
down as a native bool.must_not.exists clause and only the CHAR_LENGTH=0
arm remains a script — so the substring assertion still passed without
verifying the actual structure. Renamed and rewritten to assert:
- Top-level builder is a BoolQueryBuilder with exactly two should arms
and no must / top-level must_not clauses.
- Arm 0 is a BoolQueryBuilder whose mustNot wraps an ExistsQueryBuilder
(the IS_NULL lowering).
- Arm 1 is a ScriptQueryBuilder using the opensearch_compounded_script
lang (the CHAR_LENGTH=0 lowering).
Also documents inline why the prior containIsEmptyFunction NPE-prevention
detector is no longer needed: CHAR_LENGTH(null) follows Calcite's STRICT
null policy and returns null instead of NPE, so DSL's non-short-circuiting
should evaluation is safe.
Signed-off-by: Songkan Tang <songkant@amazon.com>
* Apply spotless line-wrap fix
Signed-off-by: Songkan Tang <songkant@amazon.com>
---------
Signed-off-by: Songkan Tang <songkant@amazon.com>1 parent c5ae56f commit f111fca
9 files changed
Lines changed: 89 additions & 41 deletions
File tree
- core/src/main/java/org/opensearch/sql/expression/function
- integ-test/src/test/resources/expectedOutput
- calcite_no_pushdown
- calcite
- opensearch/src
- main/java/org/opensearch/sql/opensearch/request
- test/java/org/opensearch/sql/opensearch/request
Lines changed: 25 additions & 6 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1278 | 1278 | | |
1279 | 1279 | | |
1280 | 1280 | | |
| 1281 | + | |
| 1282 | + | |
| 1283 | + | |
| 1284 | + | |
| 1285 | + | |
| 1286 | + | |
| 1287 | + | |
| 1288 | + | |
| 1289 | + | |
| 1290 | + | |
| 1291 | + | |
| 1292 | + | |
| 1293 | + | |
1281 | 1294 | | |
1282 | 1295 | | |
1283 | 1296 | | |
1284 | 1297 | | |
1285 | 1298 | | |
1286 | 1299 | | |
1287 | 1300 | | |
1288 | | - | |
| 1301 | + | |
| 1302 | + | |
| 1303 | + | |
| 1304 | + | |
1289 | 1305 | | |
1290 | 1306 | | |
1291 | 1307 | | |
| |||
1295 | 1311 | | |
1296 | 1312 | | |
1297 | 1313 | | |
1298 | | - | |
| 1314 | + | |
1299 | 1315 | | |
1300 | | - | |
1301 | | - | |
1302 | | - | |
1303 | | - | |
| 1316 | + | |
| 1317 | + | |
| 1318 | + | |
| 1319 | + | |
| 1320 | + | |
| 1321 | + | |
| 1322 | + | |
1304 | 1323 | | |
1305 | 1324 | | |
1306 | 1325 | | |
| |||
Lines changed: 3 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
6 | 6 | | |
7 | 7 | | |
8 | | - | |
| 8 | + | |
| 9 | + | |
Lines changed: 3 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
6 | 6 | | |
7 | 7 | | |
8 | | - | |
| 8 | + | |
| 9 | + | |
Lines changed: 3 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
6 | 6 | | |
7 | 7 | | |
8 | | - | |
| 8 | + | |
| 9 | + | |
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
| 9 | + | |
10 | 10 | | |
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
| 9 | + | |
10 | 10 | | |
0 commit comments