Commit d538b74

Add Calcite native SQL planning in UnifiedQueryPlanner (opensearch-project#5257)
* feat(api): Add Calcite native SQL planning path in UnifiedQueryPlanner

  Add SQL support to the unified query API using Calcite's native parser pipeline (SqlParser → SqlValidator → SqlToRelConverter → RelNode), bypassing the ANTLR parser used by PPL.

  Changes:
  - UnifiedQueryPlanner: use PlanningStrategy to dispatch CalciteNativeStrategy vs CustomVisitorStrategy
  - CalciteNativeStrategy: Calcite Planner with try-with-resources for ANSI SQL
  - CustomVisitorStrategy: ANTLR-based path for PPL (and future SQL V2)
  - UnifiedQueryContext: SqlParser.Config with Casing.UNCHANGED to preserve lowercase OpenSearch index names

  Signed-off-by: Chen Dai <daichen@amazon.com>

* test(api): Add SQL planner tests and refactor test base for multi-language support

  - Refactor UnifiedQueryTestBase with queryType() hook for subclass override
  - Add UnifiedSqlQueryPlannerTest covering SELECT, WHERE, GROUP BY, JOIN, ORDER BY, subquery, case sensitivity, namespaces, and error handling
  - Update UnifiedQueryContextTest to verify SQL context creation

  Signed-off-by: Chen Dai <daichen@amazon.com>

* perf(benchmarks): Add SQL queries to UnifiedQueryBenchmark

  Add language (PPL/SQL) and queryPattern param dimensions for side-by-side comparison of equivalent queries across both languages. Remove separate UnifiedSqlQueryBenchmark in favor of unified class.

  Signed-off-by: Chen Dai <daichen@amazon.com>

* docs(api): Update README to reflect SQL support in UnifiedQueryPlanner

  Signed-off-by: Chen Dai <daichen@amazon.com>

* fix(api): Normalize trailing whitespace in assertPlan comparison

  RelOptUtil.toString() appends a trailing newline after the last plan node, which doesn't match Java text block expectations. Also add \r\n normalization for Windows CI compatibility, consistent with the existing pattern in core module tests.

  Signed-off-by: Chen Dai <daichen@amazon.com>

---------

Signed-off-by: Chen Dai <daichen@amazon.com>
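
The whitespace fix described in the last bullet can be sketched in plain Java. This is only an illustration under assumptions: `PlanText.normalize` is a hypothetical helper, not the commit's actual `assertPlan` code, and the plan strings are made-up stand-ins for `RelOptUtil.toString()` output and a text-block expectation.

```java
// Hypothetical helper illustrating the normalization described in the commit message.
// It is NOT the project's assertPlan implementation.
final class PlanText {
  private PlanText() {}

  /** Converts Windows line endings to "\n" and drops trailing whitespace. */
  static String normalize(String plan) {
    return plan.replace("\r\n", "\n").stripTrailing();
  }

  public static void main(String[] args) {
    // Stand-in for plan text produced on Windows: CRLF line endings plus a trailing newline.
    String actual =
        "LogicalProject(name=[$0])\r\n"
            + "  LogicalTableScan(table=[[catalog, employees]])\r\n";

    // Stand-in for an expected plan with no newline after the last node.
    String expected =
        "LogicalProject(name=[$0])\n"
            + "  LogicalTableScan(table=[[catalog, employees]])";

    // Both sides are normalized before comparison, so the line-ending and
    // trailing-newline differences no longer matter.
    System.out.println(normalize(actual).equals(normalize(expected)));
  }
}
```

Normalizing both sides keeps the comparison symmetric, so it does not matter which side carries the extra newline or the `\r\n` endings.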
1 parent fc3d795 commit d538b74

2 files changed

Lines changed: 8 additions & 76 deletions

api/README.md

Lines changed: 1 addition & 69 deletions
@@ -8,8 +8,7 @@ This module provides components organized into two main areas aligned with the [
 
 ### Unified Language Specification
 
-- **`UnifiedQueryParser`**: Parses PPL (Piped Processing Language) or SQL queries and returns the native parse result (`UnresolvedPlan` for PPL, `SqlNode` for Calcite SQL).
-- **`UnifiedQueryPlanner`**: Accepts PPL or SQL queries and returns Calcite `RelNode` logical plans as intermediate representation.
+- **`UnifiedQueryPlanner`**: Accepts PPL (Piped Processing Language) or SQL queries and returns Calcite `RelNode` logical plans as intermediate representation.
 - **`UnifiedQueryTranspiler`**: Converts Calcite logical plans (`RelNode`) into SQL strings for various target databases using different SQL dialects.
 
 ### Unified Execution Runtime
@@ -43,20 +42,6 @@ UnifiedQueryContext context = UnifiedQueryContext.builder()
     .build();
 ```
 
-### UnifiedQueryParser
-
-Use `UnifiedQueryParser` to parse queries into their native parse tree. The parser is owned by `UnifiedQueryContext` and returns the native parse result for each language.
-
-```java
-// PPL parsing
-UnresolvedPlan ast = (UnresolvedPlan) context.getParser().parse("source = logs | where status = 200");
-
-// SQL parsing (with QueryType.SQL context)
-SqlNode sqlNode = (SqlNode) sqlContext.getParser().parse("SELECT * FROM logs WHERE status = 200");
-```
-
-Callers can then use each language's native visitor infrastructure (`AbstractNodeVisitor` for PPL, `SqlBasicVisitor` for Calcite SQL) on the typed result for further analysis.
-
 ### UnifiedQueryPlanner
 
 Use `UnifiedQueryPlanner` to parse and analyze PPL or SQL queries into Calcite logical plans. The planner accepts a `UnifiedQueryContext` and can be reused for multiple queries.
@@ -194,59 +179,6 @@ try (UnifiedQueryContext context = UnifiedQueryContext.builder()
 }
 ```
 
-## Profiling
-
-The unified query API supports the same [profiling capability](../docs/user/ppl/interfaces/endpoint.md#profile-experimental) as the PPL REST endpoint. When enabled, each unified query component automatically collects per-phase timing metrics. For code outside unified query components (e.g., `PreparedStatement.executeQuery()` or response formatting), `context.measure()` records custom phases into the same profile.
-
-```java
-try (UnifiedQueryContext context = UnifiedQueryContext.builder()
-    .language(QueryType.PPL)
-    .catalog("catalog", schema)
-    .defaultNamespace("catalog")
-    .profiling(true)
-    .build()) {
-
-  // Auto-profiled: ANALYZE
-  RelNode plan = new UnifiedQueryPlanner(context).plan(query);
-
-  // Auto-profiled: OPTIMIZE
-  PreparedStatement stmt = new UnifiedQueryCompiler(context).compile(plan);
-
-  // User-profiled via measure()
-  ResultSet rs = context.measure(MetricName.EXECUTE, stmt::executeQuery);
-  String json = context.measure(MetricName.FORMAT, () -> formatter.format(result));
-
-  // Retrieve profile snapshot
-  QueryProfile profile = context.getProfile();
-}
-```
-
-The returned `QueryProfile` follows the same JSON structure as the REST API:
-
-```json
-{
-  "summary": {
-    "total_time_ms": 33.34
-  },
-  "phases": {
-    "analyze": { "time_ms": 8.68 },
-    "optimize": { "time_ms": 18.2 },
-    "execute": { "time_ms": 4.87 },
-    "format": { "time_ms": 0.05 }
-  },
-  "plan": {
-    "node": "EnumerableCalc",
-    "time_ms": 4.82,
-    "rows": 2,
-    "children": [
-      { "node": "CalciteEnumerableIndexScan", "time_ms": 4.12, "rows": 2 }
-    ]
-  }
-}
-```
-
-When profiling is disabled (the default), all components execute with zero overhead.
-
 ## Development & Testing
 
 A set of unit tests is provided to validate planner behavior.

api/src/testFixtures/java/org/opensearch/sql/api/UnifiedQueryTestBase.java

Lines changed: 7 additions & 7 deletions
@@ -59,7 +59,7 @@ protected Map<String, Table> getTableMap() {
           }
         };
 
-    context = contextBuilder().build();
+    context = buildContext(queryType());
     planner = new UnifiedQueryPlanner(context);
   }
 

@@ -70,12 +70,12 @@ protected QueryType queryType() {
     return QueryType.PPL;
   }
 
-  /**
-   * Creates a pre-configured context builder with test schema. Subclasses can override to customize
-   * context configuration (e.g., enable profiling).
-   */
-  protected UnifiedQueryContext.Builder contextBuilder() {
-    return UnifiedQueryContext.builder().language(queryType()).catalog(DEFAULT_CATALOG, testSchema);
+  /** Builds a UnifiedQueryContext with the test schema for the given query type. */
+  protected UnifiedQueryContext buildContext(QueryType queryType) {
+    return UnifiedQueryContext.builder()
+        .language(queryType)
+        .catalog(DEFAULT_CATALOG, testSchema)
+        .build();
   }
 
   @After
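
The `queryType()` hook in this test base can be illustrated with a small stand-alone sketch. Everything here is a stand-in: `QueryType`, `FakeContext`, `TestBaseSketch`, and the two subclasses are hypothetical types written so the example compiles on its own, and they only mirror the shape of `UnifiedQueryTestBase`, where `setUp()` calls `buildContext(queryType())`.

```java
// Demo entry point; all types in this file are illustrative stand-ins,
// not the project's QueryType, UnifiedQueryContext, or UnifiedQueryTestBase.
final class HookDemo {
  public static void main(String[] args) {
    TestBaseSketch ppl = new PplTestSketch();
    TestBaseSketch sql = new SqlTestSketch();
    ppl.setUp();
    sql.setUp();
    System.out.println(ppl.context.language + " " + sql.context.language);
  }
}

enum QueryType { PPL, SQL }

// Minimal stand-in for a query context that only remembers its language.
final class FakeContext {
  final QueryType language;

  FakeContext(QueryType language) {
    this.language = language;
  }
}

abstract class TestBaseSketch {
  FakeContext context;

  // Mirrors setUp() in the diff: the context is built from the hook's value.
  void setUp() {
    context = buildContext(queryType());
  }

  // Default language; subclasses override this hook to switch languages.
  protected QueryType queryType() {
    return QueryType.PPL;
  }

  protected FakeContext buildContext(QueryType queryType) {
    return new FakeContext(queryType);
  }
}

// Inherits the PPL default without overriding anything.
final class PplTestSketch extends TestBaseSketch {}

// Overrides only the hook; setup logic is inherited unchanged.
final class SqlTestSketch extends TestBaseSketch {
  @Override
  protected QueryType queryType() {
    return QueryType.SQL;
  }
}
```

With this shape a SQL test class changes one method, and the base class keeps control of how the context is assembled.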
