Skip to content

Commit eeff7d2

Browse files
committed
Catch up PR with current upstream/main and feature/mustang-ppl-integration
Brings the catch-up branch up to current upstream/main (4 commits since this PR was opened) and current feature/mustang-ppl-integration (9 commits since this PR was opened), so the PR is mergeable into feature/mustang-ppl-integration without conflicts. Squashed (rather than two real merge commits) for the same DCO reason the original commit was squashed: upstream commits authored by many contributors with inconsistent or missing Signed-off-by trailers would otherwise be brought into this PR's history. Newer main commits absorbed (4): - opensearch-project#5419 (LENGTH/REGEXP_REPLACE/DATE_TRUNC unified function spec) - opensearch-project#5408 (datetime type normalization) - opensearch-project#5414 (Gradle wrapper bump + @ignore exclusion) - opensearch-project#5399 (FGAC-scoped SQL cursor continuation) Newer feature commits absorbed (9): - opensearch-project#5403 (analytics-engine optional dependency — major rewiring) - opensearch-project#5407 (Carry CalciteEvalCommandIT through helper-managed index path) - opensearch-project#5413 (Default plugins.calcite.enabled=true on unified path) - opensearch-project#5415, opensearch-project#5416, opensearch-project#5417, opensearch-project#5409, opensearch-project#5400, opensearch-project#5406 (smaller carryovers + bumps) Conflict resolutions (10 from main side, 3 from feature side): api/spec/* (LanguageSpec, UnifiedFunctionSpec, UnifiedPplSpec, UnifiedSqlSpec): took main. Main is a strict superset — adds postAnalysisRules and preCompilationRules extension points, the new FunctionSpecBuilder DSL, SCALAR category for length/regexp_replace/ date_trunc, the DatetimeExtension on PPL spec, and the CoreExtension wiring on SQL spec. PR's RELEVANCE category is preserved unchanged. api/UnifiedQueryPlanner.java, api/compiler/UnifiedQueryCompiler.java: took main. Both adopt the new postAnalysisRules / preCompilationRules hooks introduced in opensearch-project#5408 / opensearch-project#5419. core/executor/QueryService.java: composed both sides — kept HEAD's CalciteClassLoaderHelper.withCalciteClassLoader wrapper around main's StageErrorHandler stage tracking. Same pattern as the original PR resolution; both improvements are orthogonal. legacy/plugin/RestSqlAction.java: took HEAD. The 3-way merge produced a duplicated handleException/getRawErrorCode block; HEAD already contained both the delegateToV2Engine refactor and the ErrorReport unwrap from main, so HEAD is the correct superset. integ-test/build.gradle: took feature. Both sides added the same @ignore exclusion block; feature has alphabetical ordering and a more detailed comment explaining the Gradle 9.4.1 TestEventReporterAsListener cast bug. integ-test/.../CalciteEvalCommandIT.java: composed both sides. Took feature's helper-managed test_eval provisioning (createIndexByRestClient + isIndexExist guard, from opensearch-project#5407) so analytics-engine compatibility runs get a parquet-backed index. Added back PR HEAD's test_eval_agent setup (needed by the dotted-path eval tests for opensearch-project#5351) wrapped in its own isIndexExist guard for the same parquet-aware idempotency. plugin/.../TransportPPLQueryAction.java: took feature. PR opensearch-project#5403 made analytics-engine an optional dependency by moving QueryPlanExecutor from a required constructor parameter to an @Inject(optional=true) setter. Feature's design supersedes our prior wiring. plugin/.../SQLPlugin.java: took feature. The same opensearch-project#5403 simplification removed loadExtensions/EngineExtensionsHolder/executionEngineExtensions plumbing (no longer needed once analytics-engine is optionally bound). Feature retains the createSqlAnalyticsRouter method this PR introduced. plugin/.../config/EngineExtensionsHolder.java: deleted. Unreferenced after taking feature's SQLPlugin/TransportPPLQueryAction; not present on feature branch. Build: :api, :core, :opensearch-sql-plugin, :legacy compileJava + :integ-test compileTestJava all pass; unit tests pass; spotlessCheck clean. Signed-off-by: Kai Huang <ahkcs@amazon.com>
1 parent af831d3 commit eeff7d2

32 files changed

Lines changed: 1654 additions & 277 deletions

api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@
55

66
package org.opensearch.sql.api;
77

8+
import static org.opensearch.sql.common.setting.Settings.Key.CALCITE_ENGINE_ENABLED;
89
import static org.opensearch.sql.common.setting.Settings.Key.PPL_JOIN_SUBSEARCH_MAXOUT;
910
import static org.opensearch.sql.common.setting.Settings.Key.PPL_SUBSEARCH_MAXOUT;
1011
import static org.opensearch.sql.common.setting.Settings.Key.QUERY_SIZE_LIMIT;
@@ -119,13 +120,21 @@ public static class Builder {
119120
/**
120121
* Setting values with defaults from SysLimit.DEFAULT. Only includes planning-required settings
121122
* to avoid coupling with OpenSearchSettings.
123+
*
124+
* <p>{@link Settings.Key#CALCITE_ENGINE_ENABLED} defaults to {@code true} here because the
125+
* unified query path is by definition Calcite-based — every query reaching this context flows
126+
* through Calcite's planner, never the v2 engine. The PPL {@link
127+
* org.opensearch.sql.api.parser.PPLQueryParser} reuses the v2 {@code AstBuilder}, which gates
128+
* Calcite-only commands (e.g. {@code visitTableCommand}) on this setting; without the default,
129+
* those commands fail at parse time even when the cluster setting is true.
122130
*/
123131
private final Map<Settings.Key, Object> settings =
124132
new HashMap<Settings.Key, Object>(
125133
Map.of(
126134
QUERY_SIZE_LIMIT, SysLimit.DEFAULT.querySizeLimit(),
127135
PPL_SUBSEARCH_MAXOUT, SysLimit.DEFAULT.subsearchLimit(),
128-
PPL_JOIN_SUBSEARCH_MAXOUT, SysLimit.DEFAULT.joinSubsearchLimit()));
136+
PPL_JOIN_SUBSEARCH_MAXOUT, SysLimit.DEFAULT.joinSubsearchLimit(),
137+
CALCITE_ENGINE_ENABLED, true));
129138

130139
/**
131140
* Sets the query language frontend to be used.

api/src/main/java/org/opensearch/sql/api/UnifiedQueryPlanner.java

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -60,7 +60,15 @@ public UnifiedQueryPlanner(UnifiedQueryContext context) {
6060
*/
6161
public RelNode plan(String query) {
6262
try {
63-
return context.measure(ANALYZE, () -> strategy.plan(query));
63+
return context.measure(
64+
ANALYZE,
65+
() -> {
66+
RelNode plan = strategy.plan(query);
67+
for (var shuttle : context.getLangSpec().postAnalysisRules()) {
68+
plan = plan.accept(shuttle);
69+
}
70+
return plan;
71+
});
6472
} catch (SyntaxCheckException | UnsupportedOperationException e) {
6573
throw e;
6674
} catch (Exception e) {

api/src/main/java/org/opensearch/sql/api/compiler/UnifiedQueryCompiler.java

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,11 @@ public PreparedStatement compile(@NonNull RelNode plan) {
5555
}
5656

5757
private PreparedStatement doCompile(RelNode plan) throws Exception {
58+
// Apply pre-compilation rules (e.g., late-binding function impl)
59+
for (var rule : context.getLangSpec().preCompilationRules()) {
60+
plan = plan.accept(rule);
61+
}
62+
5863
// Apply shuttle to convert LogicalTableScan to BindableTableScan
5964
final RelHomogeneousShuttle shuttle =
6065
new RelHomogeneousShuttle() {
Lines changed: 186 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,186 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
6+
package org.opensearch.sql.api.spec;
7+
8+
import java.util.List;
9+
import java.util.Objects;
10+
import java.util.function.BiFunction;
11+
import javax.annotation.Nullable;
12+
import lombok.RequiredArgsConstructor;
13+
import org.apache.calcite.rel.type.RelDataType;
14+
import org.apache.calcite.rel.type.RelDataTypeFactory;
15+
import org.apache.calcite.rex.RexBuilder;
16+
import org.apache.calcite.rex.RexCall;
17+
import org.apache.calcite.rex.RexNode;
18+
import org.apache.calcite.sql.SqlCallBinding;
19+
import org.apache.calcite.sql.SqlFunction;
20+
import org.apache.calcite.sql.SqlFunctionCategory;
21+
import org.apache.calcite.sql.SqlIdentifier;
22+
import org.apache.calcite.sql.SqlKind;
23+
import org.apache.calcite.sql.SqlOperandCountRange;
24+
import org.apache.calcite.sql.SqlOperator;
25+
import org.apache.calcite.sql.parser.SqlParserPos;
26+
import org.apache.calcite.sql.type.InferTypes;
27+
import org.apache.calcite.sql.type.OperandTypes;
28+
import org.apache.calcite.sql.type.SqlOperandCountRanges;
29+
import org.apache.calcite.sql.type.SqlOperandMetadata;
30+
import org.apache.calcite.sql.type.SqlReturnTypeInference;
31+
import org.apache.calcite.sql.type.SqlTypeFamily;
32+
import org.apache.calcite.sql.validate.SqlUserDefinedFunction;
33+
34+
/** Fluent DSL for building {@link UnifiedFunctionSpec} instances. */
35+
@RequiredArgsConstructor
36+
class FunctionSpecBuilder {
37+
/** Function name to register. */
38+
private final String name;
39+
40+
/**
41+
* Wraps an existing Calcite operator, preserving its native type system and RexImpTable
42+
* implementation for in-memory execution.
43+
*
44+
* @param op the Calcite operator to delegate to
45+
* @return a builder that produces the spec on {@code build()}
46+
*/
47+
DelegateFunctionBuilder delegateTo(SqlOperator op) {
48+
return new DelegateFunctionBuilder(name, op);
49+
}
50+
51+
/**
52+
* Builds a pushdown-only UDF with relaxed type checking. The resulting function has no local
53+
* implementation and delegates execution to the data source via pushdown.
54+
*
55+
* @param paramNames required parameter names for signature display
56+
* @return a builder that produces the spec on {@code build()}
57+
*/
58+
CatalogFunctionBuilder vararg(String... paramNames) {
59+
return new CatalogFunctionBuilder(name, List.of(paramNames));
60+
}
61+
62+
/**
63+
* Builds a typed SqlFunction with strict operand type checking. Optionally accepts a late-binding
64+
* {@code impl} that rewrites the function into executable Calcite expressions at compilation
65+
* time.
66+
*
67+
* @param families operand type families for validation
68+
* @return a builder that produces the spec on {@code build()}
69+
*/
70+
DefaultFunctionBuilder operands(SqlTypeFamily... families) {
71+
return new DefaultFunctionBuilder(name, families);
72+
}
73+
74+
@RequiredArgsConstructor
75+
static class DefaultFunctionBuilder {
76+
private final String name;
77+
private final SqlTypeFamily[] operandFamilies;
78+
private SqlReturnTypeInference returnType;
79+
private SqlFunctionCategory category = SqlFunctionCategory.USER_DEFINED_FUNCTION;
80+
private @Nullable BiFunction<RexBuilder, RexCall, RexNode> impl;
81+
82+
DefaultFunctionBuilder returns(SqlReturnTypeInference type) {
83+
this.returnType = type;
84+
return this;
85+
}
86+
87+
DefaultFunctionBuilder category(SqlFunctionCategory cat) {
88+
this.category = cat;
89+
return this;
90+
}
91+
92+
/**
93+
* Defines how this function executes by rewriting to existing Calcite operators. Applied only
94+
* at compilation time (late binding) — the logical plan preserves the original function call.
95+
*
96+
* @param impl rewrite function that converts this call into executable RexNodes
97+
* @return this builder
98+
*/
99+
DefaultFunctionBuilder impl(BiFunction<RexBuilder, RexCall, RexNode> impl) {
100+
this.impl = impl;
101+
return this;
102+
}
103+
104+
UnifiedFunctionSpec build() {
105+
Objects.requireNonNull(returnType, "returns() is required");
106+
SqlFunction op =
107+
new SqlFunction(
108+
name.toUpperCase(),
109+
SqlKind.OTHER_FUNCTION,
110+
returnType,
111+
null,
112+
OperandTypes.family(operandFamilies),
113+
category);
114+
return new UnifiedFunctionSpec(name.toLowerCase(), op, impl);
115+
}
116+
}
117+
118+
@RequiredArgsConstructor
119+
static class DelegateFunctionBuilder {
120+
private final String name;
121+
private final SqlOperator operator;
122+
123+
UnifiedFunctionSpec build() {
124+
return new UnifiedFunctionSpec(name.toLowerCase(), operator, null);
125+
}
126+
}
127+
128+
@RequiredArgsConstructor
129+
static class CatalogFunctionBuilder {
130+
private final String name;
131+
private final List<String> paramNames;
132+
private SqlReturnTypeInference returnType;
133+
134+
CatalogFunctionBuilder returnType(SqlReturnTypeInference type) {
135+
this.returnType = type;
136+
return this;
137+
}
138+
139+
UnifiedFunctionSpec build() {
140+
Objects.requireNonNull(returnType, "returnType is required");
141+
return new UnifiedFunctionSpec(
142+
name,
143+
new SqlUserDefinedFunction(
144+
new SqlIdentifier(name, SqlParserPos.ZERO),
145+
SqlKind.OTHER_FUNCTION,
146+
returnType,
147+
InferTypes.ANY_NULLABLE,
148+
new VariadicOperandMetadata(paramNames),
149+
List::of), // Pushdown-only: no local implementation
150+
null);
151+
}
152+
}
153+
154+
/**
155+
* Custom operand metadata that bypasses Calcite's built-in type checking. Calcite's {@code
156+
* FamilyOperandTypeChecker} rejects variadic calls (CALCITE-5366), so this implementation accepts
157+
* any operand types and delegates validation to pushdown.
158+
*/
159+
record VariadicOperandMetadata(List<String> paramNames) implements SqlOperandMetadata {
160+
161+
@Override
162+
public List<String> paramNames() {
163+
return paramNames;
164+
}
165+
166+
@Override
167+
public List<RelDataType> paramTypes(RelDataTypeFactory tf) {
168+
return List.of();
169+
}
170+
171+
@Override
172+
public boolean checkOperandTypes(SqlCallBinding binding, boolean throwOnFailure) {
173+
return true;
174+
}
175+
176+
@Override
177+
public SqlOperandCountRange getOperandCountRange() {
178+
return SqlOperandCountRanges.from(paramNames.size());
179+
}
180+
181+
@Override
182+
public String getAllowedSignatures(SqlOperator op, String opName) {
183+
return opName + "(" + String.join(", ", paramNames) + "[, option=value ...])";
184+
}
185+
}
186+
}

api/src/main/java/org/opensearch/sql/api/spec/LanguageSpec.java

Lines changed: 42 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@
77

88
import java.util.ArrayList;
99
import java.util.List;
10+
import org.apache.calcite.rel.RelShuttle;
1011
import org.apache.calcite.sql.SqlNode;
1112
import org.apache.calcite.sql.SqlOperatorTable;
1213
import org.apache.calcite.sql.fun.SqlStdOperatorTable;
@@ -17,8 +18,8 @@
1718

1819
/**
1920
* Language specification defining the dialect the engine accepts. Provides parser configuration,
20-
* validator configuration, and composable {@link LanguageExtension}s that contribute operators and
21-
* post-parse rewrite rules.
21+
* validator configuration, and composable {@link LanguageExtension}s that contribute operators,
22+
* post-parse rewrite rules, and post-analysis rewrite rules.
2223
*
2324
* <p>Implementations define a complete language surface — for example, {@link UnifiedSqlSpec}
2425
* provides ANSI and extended SQL modes. A future PPL spec would implement this same interface once
@@ -27,8 +28,9 @@
2728
public interface LanguageSpec {
2829

2930
/**
30-
* A composable language extension that contributes operators and post-parse rewrite rules. All
31-
* methods have defaults so extensions only override what they need.
31+
* A composable language extension that contributes operators, post-parse rewrite rules, and
32+
* post-analysis rewrite rules. All methods have defaults so extensions only override what they
33+
* need.
3234
*/
3335
interface LanguageExtension {
3436

@@ -47,6 +49,23 @@ default SqlOperatorTable operators() {
4749
default List<SqlVisitor<SqlNode>> postParseRules() {
4850
return List.of();
4951
}
52+
53+
/**
54+
* RelNode rewrite rules applied after analysis and before execution. Each rule transforms the
55+
* logical plan tree. Rules within a single extension are applied in list order.
56+
*/
57+
default List<RelShuttle> postAnalysisRules() {
58+
return List.of();
59+
}
60+
61+
/**
62+
* Pre-compilation rules applied only before in-memory execution. Each rule transforms the
63+
* logical plan (e.g., binding function implementations). Not applied when the plan is consumed
64+
* by external engines.
65+
*/
66+
default List<RelShuttle> preCompilationRules() {
67+
return List.of();
68+
}
5069
}
5170

5271
/**
@@ -62,9 +81,9 @@ default List<SqlVisitor<SqlNode>> postParseRules() {
6281
SqlValidator.Config validatorConfig();
6382

6483
/**
65-
* Language extensions registered with this spec. Each extension contributes operators and
66-
* post-parse rewrite rules that are composed by {@link #operatorTable()} and {@link
67-
* #postParseRules()}.
84+
* Language extensions registered with this spec. Each extension contributes operators, post-parse
85+
* rewrite rules, and post-analysis rewrite rules composed by {@link #operatorTable()}, {@link
86+
* #postParseRules()}, and {@link #postAnalysisRules()}.
6887
*/
6988
List<LanguageExtension> extensions();
7089

@@ -86,4 +105,20 @@ default SqlOperatorTable operatorTable() {
86105
default List<SqlVisitor<SqlNode>> postParseRules() {
87106
return extensions().stream().flatMap(ext -> ext.postParseRules().stream()).toList();
88107
}
108+
109+
/**
110+
* All post-analysis RelNode rewrite rules from registered extensions, flattened in registration
111+
* order. Applied to the logical plan after analysis and before execution.
112+
*/
113+
default List<RelShuttle> postAnalysisRules() {
114+
return extensions().stream().flatMap(ext -> ext.postAnalysisRules().stream()).toList();
115+
}
116+
117+
/**
118+
* All pre-compilation rules from registered extensions, flattened in registration order. Applied
119+
* only before in-memory execution.
120+
*/
121+
default List<RelShuttle> preCompilationRules() {
122+
return extensions().stream().flatMap(ext -> ext.preCompilationRules().stream()).toList();
123+
}
89124
}

0 commit comments

Comments
 (0)