Skip to content

Commit d4daa34

Browse files
authored
Add unified query transpiler API (opensearch-project#4871)
* Add basic transpiler impl Signed-off-by: Chen Dai <daichen@amazon.com> * Add builder Signed-off-by: Chen Dai <daichen@amazon.com> * Use lombok builder Signed-off-by: Chen Dai <daichen@amazon.com> * Modify unified query planner UT to extend new test base class Signed-off-by: Chen Dai <daichen@amazon.com> * Update doc with API design caveat Signed-off-by: Chen Dai <daichen@amazon.com> * Move opensearch spark sql dialect out of test folder Signed-off-by: Chen Dai <daichen@amazon.com> * Update doc and test assertion message Signed-off-by: Chen Dai <daichen@amazon.com> * Fix line separator and license header Signed-off-by: Chen Dai <daichen@amazon.com> --------- Signed-off-by: Chen Dai <daichen@amazon.com>
1 parent 9930665 commit d4daa34

8 files changed

Lines changed: 238 additions & 39 deletions

File tree

api/README.md

Lines changed: 55 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,10 +4,21 @@ This module provides a high-level integration layer for the Calcite-based query
44

55
## Overview
66

7-
The `UnifiedQueryPlanner` serves as the primary entry point for external consumers. It accepts PPL (Piped Processing Language) queries and returns Calcite `RelNode` logical plans as intermediate representation.
7+
This module provides two primary components:
8+
9+
- **`UnifiedQueryPlanner`**: Accepts PPL (Piped Processing Language) queries and returns Calcite `RelNode` logical plans as intermediate representation.
10+
- **`UnifiedQueryTranspiler`**: Converts Calcite logical plans (`RelNode`) into SQL strings for various target databases using different SQL dialects.
11+
12+
Together, these components enable a complete workflow: parse PPL queries into logical plans, then transpile those plans into target database SQL.
13+
14+
### Experimental API Design
15+
16+
**This API is currently experimental.** The design intentionally exposes Calcite abstractions (`Schema` for catalogs, `RelNode` as IR, `SqlDialect` for dialects) rather than creating custom wrapper interfaces. This is to avoid overdesign by leveraging the flexible Calcite interface in the short term. If a more abstracted API becomes necessary in the future, breaking changes may be introduced with the new abstraction layer.
817

918
## Usage
1019

20+
### UnifiedQueryPlanner
21+
1122
Use the declarative, fluent builder API to initialize the `UnifiedQueryPlanner`.
1223

1324
```java
@@ -21,6 +32,49 @@ UnifiedQueryPlanner planner = UnifiedQueryPlanner.builder()
2132
RelNode plan = planner.plan("source = opensearch.test");
2233
```
2334

35+
### UnifiedQueryTranspiler
36+
37+
Use `UnifiedQueryTranspiler` to convert Calcite logical plans into SQL strings for target databases. The transpiler supports various SQL dialects through Calcite's `SqlDialect` interface.
38+
39+
```java
40+
UnifiedQueryTranspiler transpiler = UnifiedQueryTranspiler.builder()
41+
.dialect(SparkSqlDialect.DEFAULT)
42+
.build();
43+
44+
String sql = transpiler.toSql(plan);
45+
```
46+
47+
### Complete Workflow Example
48+
49+
Combining both components to transpile PPL queries into target database SQL:
50+
51+
```java
52+
// Step 1: Initialize planner
53+
UnifiedQueryPlanner planner = UnifiedQueryPlanner.builder()
54+
.language(QueryType.PPL)
55+
.catalog("catalog", schema)
56+
.defaultNamespace("catalog")
57+
.build();
58+
59+
// Step 2: Parse PPL query into logical plan
60+
RelNode plan = planner.plan("source = employees | where age > 30");
61+
62+
// Step 3: Initialize transpiler with target dialect
63+
UnifiedQueryTranspiler transpiler = UnifiedQueryTranspiler.builder()
64+
.dialect(SparkSqlDialect.DEFAULT)
65+
.build();
66+
67+
// Step 4: Transpile to target SQL
68+
String sparkSql = transpiler.toSql(plan);
69+
// Result: SELECT * FROM `catalog`.`employees` WHERE `age` > 30
70+
```
71+
72+
Supported SQL dialects include:
73+
- `SparkSqlDialect.DEFAULT` - Apache Spark SQL
74+
- `PostgresqlSqlDialect.DEFAULT` - PostgreSQL
75+
- `MysqlSqlDialect.DEFAULT` - MySQL
76+
- And other Calcite-supported dialects
77+
2478
## Development & Testing
2579

2680
A set of unit tests is provided to validate planner behavior.

api/build.gradle

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@
55

66
plugins {
77
id 'java-library'
8+
id "io.freefair.lombok"
89
id 'jacoco'
910
id 'com.diffplug.spotless'
1011
}
@@ -25,6 +26,10 @@ spotless {
2526
exclude '**/build/**', '**/build-*/**', 'src/main/gen/**'
2627
}
2728
importOrder()
29+
licenseHeader("/*\n" +
30+
" * Copyright OpenSearch Contributors\n" +
31+
" * SPDX-License-Identifier: Apache-2.0\n" +
32+
" */\n\n")
2833
removeUnusedImports()
2934
trimTrailingWhitespace()
3035
endWithNewline()

api/src/main/java/org/opensearch/sql/api/EmptyDataSourceService.java

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,8 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
16
package org.opensearch.sql.api;
27

38
import java.util.Map;
Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
6+
package org.opensearch.sql.api.transpiler;
7+
8+
import lombok.Builder;
9+
import org.apache.calcite.rel.RelNode;
10+
import org.apache.calcite.rel.rel2sql.RelToSqlConverter;
11+
import org.apache.calcite.sql.SqlDialect;
12+
import org.apache.calcite.sql.SqlNode;
13+
14+
/**
15+
* Transpiles Calcite logical plans ({@link RelNode}) into SQL strings for various target databases.
16+
* Uses Calcite's {@link RelToSqlConverter} to perform the conversion, respecting the specified SQL
17+
* dialect.
18+
*/
19+
@Builder
20+
public class UnifiedQueryTranspiler {
21+
22+
/** Target SQL dialect */
23+
private final SqlDialect dialect;
24+
25+
/**
26+
* Converts a Calcite logical plan to a SQL string using the configured target dialect.
27+
*
28+
* @param plan the logical plan to convert (must not be null)
29+
* @return the generated SQL string
30+
*/
31+
public String toSql(RelNode plan) {
32+
try {
33+
RelToSqlConverter converter = new RelToSqlConverter(dialect);
34+
SqlNode sqlNode = converter.visitRoot(plan).asStatement();
35+
return sqlNode.toSqlString(dialect).getSql();
36+
} catch (Exception e) {
37+
throw new IllegalStateException("Failed to transpile logical plan to SQL", e);
38+
}
39+
}
40+
}

api/src/test/java/org/opensearch/sql/api/UnifiedQueryPlannerTest.java

Lines changed: 14 additions & 38 deletions
Original file line numberDiff line numberDiff line change
@@ -8,41 +8,15 @@
88
import static org.junit.Assert.assertNotNull;
99
import static org.junit.Assert.assertThrows;
1010

11-
import java.util.List;
1211
import java.util.Map;
1312
import org.apache.calcite.rel.RelNode;
14-
import org.apache.calcite.rel.type.RelDataType;
15-
import org.apache.calcite.rel.type.RelDataTypeFactory;
1613
import org.apache.calcite.schema.Schema;
17-
import org.apache.calcite.schema.Table;
1814
import org.apache.calcite.schema.impl.AbstractSchema;
19-
import org.apache.calcite.schema.impl.AbstractTable;
20-
import org.apache.calcite.sql.type.SqlTypeName;
2115
import org.junit.Test;
2216
import org.opensearch.sql.common.antlr.SyntaxCheckException;
2317
import org.opensearch.sql.executor.QueryType;
2418

25-
public class UnifiedQueryPlannerTest {
26-
27-
/** Test schema consists of a test table with id and name columns */
28-
private final AbstractSchema testSchema =
29-
new AbstractSchema() {
30-
@Override
31-
protected Map<String, Table> getTableMap() {
32-
return Map.of(
33-
"index",
34-
new AbstractTable() {
35-
@Override
36-
public RelDataType getRowType(RelDataTypeFactory typeFactory) {
37-
return typeFactory.createStructType(
38-
List.of(
39-
typeFactory.createSqlType(SqlTypeName.INTEGER),
40-
typeFactory.createSqlType(SqlTypeName.VARCHAR)),
41-
List.of("id", "name"));
42-
}
43-
});
44-
}
45-
};
19+
public class UnifiedQueryPlannerTest extends UnifiedQueryTestBase {
4620

4721
/** Test catalog consists of test schema above */
4822
private final AbstractSchema testDeepSchema =
@@ -61,7 +35,7 @@ public void testPPLQueryPlanning() {
6135
.catalog("opensearch", testSchema)
6236
.build();
6337

64-
RelNode plan = planner.plan("source = opensearch.index | eval f = abs(id)");
38+
RelNode plan = planner.plan("source = opensearch.employees | eval f = abs(id)");
6539
assertNotNull("Plan should be created", plan);
6640
}
6741

@@ -74,8 +48,8 @@ public void testPPLQueryPlanningWithDefaultNamespace() {
7448
.defaultNamespace("opensearch")
7549
.build();
7650

77-
assertNotNull("Plan should be created", planner.plan("source = opensearch.index"));
78-
assertNotNull("Plan should be created", planner.plan("source = index"));
51+
assertNotNull("Plan should be created", planner.plan("source = opensearch.employees"));
52+
assertNotNull("Plan should be created", planner.plan("source = employees"));
7953
}
8054

8155
@Test
@@ -87,12 +61,12 @@ public void testPPLQueryPlanningWithDefaultNamespaceMultiLevel() {
8761
.defaultNamespace("catalog.opensearch")
8862
.build();
8963

90-
assertNotNull("Plan should be created", planner.plan("source = catalog.opensearch.index"));
91-
assertNotNull("Plan should be created", planner.plan("source = index"));
64+
assertNotNull("Plan should be created", planner.plan("source = catalog.opensearch.employees"));
65+
assertNotNull("Plan should be created", planner.plan("source = employees"));
9266

9367
// This is valid in SparkSQL, but Calcite requires "catalog" as the default root schema to
9468
// resolve it
95-
assertThrows(IllegalStateException.class, () -> planner.plan("source = opensearch.index"));
69+
assertThrows(IllegalStateException.class, () -> planner.plan("source = opensearch.employees"));
9670
}
9771

9872
@Test
@@ -105,7 +79,8 @@ public void testPPLQueryPlanningWithMultipleCatalogs() {
10579
.build();
10680

10781
RelNode plan =
108-
planner.plan("source = catalog1.index | lookup catalog2.index id | eval f = abs(id)");
82+
planner.plan(
83+
"source = catalog1.employees | lookup catalog2.employees id | eval f = abs(id)");
10984
assertNotNull("Plan should be created with multiple catalogs", plan);
11085
}
11186

@@ -119,7 +94,8 @@ public void testPPLQueryPlanningWithMultipleCatalogsAndDefaultNamespace() {
11994
.defaultNamespace("catalog2")
12095
.build();
12196

122-
RelNode plan = planner.plan("source = catalog1.index | lookup index id | eval f = abs(id)");
97+
RelNode plan =
98+
planner.plan("source = catalog1.employees | lookup employees id | eval f = abs(id)");
12399
assertNotNull("Plan should be created with multiple catalogs", plan);
124100
}
125101

@@ -132,7 +108,7 @@ public void testPPLQueryPlanningWithMetadataCaching() {
132108
.cacheMetadata(true)
133109
.build();
134110

135-
RelNode plan = planner.plan("source = opensearch.index");
111+
RelNode plan = planner.plan("source = opensearch.employees");
136112
assertNotNull("Plan should be created", plan);
137113
}
138114

@@ -166,7 +142,7 @@ public void testUnsupportedStatementType() {
166142
.catalog("opensearch", testSchema)
167143
.build();
168144

169-
planner.plan("explain source = index"); // explain statement
145+
planner.plan("explain source = employees"); // explain statement
170146
}
171147

172148
@Test(expected = SyntaxCheckException.class)
@@ -177,6 +153,6 @@ public void testPlanPropagatingSyntaxCheckException() {
177153
.catalog("opensearch", testSchema)
178154
.build();
179155

180-
planner.plan("source = index | eval"); // Trigger syntax error from parser
156+
planner.plan("source = employees | eval"); // Trigger syntax error from parser
181157
}
182158
}
Lines changed: 60 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,60 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
6+
package org.opensearch.sql.api;
7+
8+
import java.util.List;
9+
import java.util.Map;
10+
import org.apache.calcite.rel.type.RelDataType;
11+
import org.apache.calcite.rel.type.RelDataTypeFactory;
12+
import org.apache.calcite.schema.Table;
13+
import org.apache.calcite.schema.impl.AbstractSchema;
14+
import org.apache.calcite.schema.impl.AbstractTable;
15+
import org.apache.calcite.sql.type.SqlTypeName;
16+
import org.junit.Before;
17+
import org.opensearch.sql.executor.QueryType;
18+
19+
/** Base class for unified query tests providing common test schema and utilities. */
20+
public abstract class UnifiedQueryTestBase {
21+
22+
/** Test schema containing sample tables for testing */
23+
protected AbstractSchema testSchema;
24+
25+
/** Unified query planner configured with test schema */
26+
protected UnifiedQueryPlanner planner;
27+
28+
@Before
29+
public void setUp() {
30+
testSchema =
31+
new AbstractSchema() {
32+
@Override
33+
protected Map<String, Table> getTableMap() {
34+
return Map.of("employees", createEmployeesTable());
35+
}
36+
};
37+
38+
planner =
39+
UnifiedQueryPlanner.builder()
40+
.language(QueryType.PPL)
41+
.catalog("catalog", testSchema)
42+
.defaultNamespace("catalog")
43+
.build();
44+
}
45+
46+
protected Table createEmployeesTable() {
47+
return new AbstractTable() {
48+
@Override
49+
public RelDataType getRowType(RelDataTypeFactory typeFactory) {
50+
return typeFactory.createStructType(
51+
List.of(
52+
typeFactory.createSqlType(SqlTypeName.INTEGER),
53+
typeFactory.createSqlType(SqlTypeName.VARCHAR),
54+
typeFactory.createSqlType(SqlTypeName.INTEGER),
55+
typeFactory.createSqlType(SqlTypeName.VARCHAR)),
56+
List.of("id", "name", "age", "department"));
57+
}
58+
};
59+
}
60+
}
Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,59 @@
1+
/*
2+
* Copyright OpenSearch Contributors
3+
* SPDX-License-Identifier: Apache-2.0
4+
*/
5+
6+
package org.opensearch.sql.api.transpiler;
7+
8+
import static org.junit.Assert.assertEquals;
9+
10+
import org.apache.calcite.rel.RelNode;
11+
import org.apache.calcite.sql.dialect.SparkSqlDialect;
12+
import org.junit.Before;
13+
import org.junit.Test;
14+
import org.opensearch.sql.api.UnifiedQueryTestBase;
15+
import org.opensearch.sql.ppl.calcite.OpenSearchSparkSqlDialect;
16+
17+
public class UnifiedQueryTranspilerTest extends UnifiedQueryTestBase {
18+
19+
private UnifiedQueryTranspiler transpiler;
20+
21+
@Before
22+
public void setUp() {
23+
super.setUp();
24+
transpiler = UnifiedQueryTranspiler.builder().dialect(SparkSqlDialect.DEFAULT).build();
25+
}
26+
27+
@Test
28+
public void testToSql() {
29+
String pplQuery = "source = employees";
30+
RelNode plan = planner.plan(pplQuery);
31+
32+
String actualSql = transpiler.toSql(plan);
33+
String expectedSql = normalize("SELECT *\nFROM `catalog`.`employees`");
34+
assertEquals(
35+
"Transpiled SQL using SparkSqlDialect should match expected SQL", expectedSql, actualSql);
36+
}
37+
38+
@Test
39+
public void testToSqlWithCustomDialect() {
40+
String pplQuery = "source = employees | where name = 123";
41+
RelNode plan = planner.plan(pplQuery);
42+
43+
UnifiedQueryTranspiler customTranspiler =
44+
UnifiedQueryTranspiler.builder().dialect(OpenSearchSparkSqlDialect.DEFAULT).build();
45+
String actualSql = customTranspiler.toSql(plan);
46+
String expectedSql =
47+
normalize(
48+
"SELECT *\nFROM `catalog`.`employees`\nWHERE TRY_CAST(`name` AS DOUBLE) = 1.230E2");
49+
assertEquals(
50+
"Transpiled query using OpenSearchSparkSqlDialect should translate SAFE_CAST to TRY_CAST",
51+
expectedSql,
52+
actualSql);
53+
}
54+
55+
/** Normalizes line endings to platform-specific format for cross-platform test compatibility. */
56+
private String normalize(String sql) {
57+
return sql.replace("\n", System.lineSeparator());
58+
}
59+
}

ppl/src/test/java/org/opensearch/sql/ppl/calcite/OpenSearchSparkSqlDialect.java renamed to ppl/src/main/java/org/opensearch/sql/ppl/calcite/OpenSearchSparkSqlDialect.java

File renamed without changes.

0 commit comments

Comments
 (0)