Commit 669b461

Merge remote-tracking branch 'origin/main' into cast-datetime
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
2 parents 8d1d530 + 39eefc8 commit 669b461

30 files changed

Lines changed: 662 additions & 46 deletions

docs/dev/intro-v3-engine.md

Lines changed: 4 additions & 17 deletions
Original file line number | Diff line number | Diff line change
@@ -1,4 +1,4 @@
1-
# PPL Engine V3 (for 3.0.0-beta)
1+
# PPL Engine V3 (for 3.0.0)
22

33
---
44
## 1. Motivations
@@ -24,7 +24,7 @@ Find more details in [V3 Architecture](./intro-v3-architecture.md).
2424
---
2525
## 2. What's New
2626

27-
In the initial release of the V3 engine (3.0.0-beta), the main new features focus on enhancing the PPL language while maintaining maximum compatibility with V2 behavior.
27+
In the initial release of the V3 engine (3.0.0), the main new features focus on enhancing the PPL language while maintaining maximum compatibility with V2 behavior.
2828

2929
* **[Join](../user/ppl/cmd/join.rst) Command**
3030
* **[Lookup](../user/ppl/cmd/lookup.rst) Command**
@@ -35,7 +35,7 @@ In the initial release of the V3 engine (3.0.0-beta), the main new features focu
3535

3636
### 3.1 Breaking Changes
3737

38-
Because of implementation changed internally, following behaviors are changed from 3.0.0-beta. (Behaviors in V3 is correct)
38+
Because the internal implementation has changed, the following behaviors change as of 3.0.0. (The V3 behaviors are correct.)
3939

4040
| Item | V2 | V3 |
4141
|:------------------------------------------------:|:---------:|:--------------------:|
@@ -51,7 +51,7 @@ Because of implementation changed internally, following behaviors are changed fr
5151

5252
### 3.2 Fallback Mechanism
5353

54-
As v3 engine is experimental in 3.0.0-beta, not all PPL commands could work under this new engine. Those unsupported queries will be forwarded to V2 engine by fallback mechanism. To avoid impact on your side, normally you won't see any difference in a query response. If you want to check if and why your query falls back to be handled by V2 engine, please check OpenSearch log for "Fallback to V2 query engine since ...".
54+
Because the V3 engine is experimental in 3.0.0, not all PPL commands work under it. Unsupported queries are forwarded to the V2 engine by the fallback mechanism, so normally you won't see any difference in the query response. To check whether and why a query fell back to the V2 engine, search the OpenSearch log for "Fallback to V2 query engine since ...".
5555

5656
### 3.3 Limitations
5757

@@ -66,24 +66,12 @@ For the following functionalities in V3 engine, the query will be forwarded to t
6666

6767
#### Unsupported functionalities
6868
- All SQL queries
69-
- `trendline`
70-
- `show datasource`
71-
- `explain`
72-
- `describe`
73-
- `top` and `rare`
74-
- `fillnull`
75-
- `patterns`
7669
- `dedup` with `consecutive=true`
7770
- Search relevant commands
7871
- AD
7972
- ML
8073
- Kmeans
8174
- Commands with `fetch_size` parameter
82-
- query with metadata fields, `_id`, `_doc`, etc.
83-
- Json relevant functions
84-
- cast to json
85-
- json
86-
- json_valid
8775
- Search relevant functions
8876
- match
8977
- match_phrase
@@ -105,6 +93,5 @@ If you're interested in the new query engine, please find more details in [V3 Ar
10593
The following items are on our roadmap with high priority:
10694
- Resolve the [V3 limitation](#33-limitations).
10795
- Advancing pushdown optimization and benchmarking
108-
- Backport to 2.19.x
10996
- Unified the PPL syntax between [PPL-on-OpenSearch](https://github.com/opensearch-project/sql/blob/main/ppl/src/main/antlr/OpenSearchPPLParser.g4) and [PPL-on-Spark](https://github.com/opensearch-project/opensearch-spark/blob/main/ppl-spark-integration/src/main/antlr4/OpenSearchPPLParser.g4)
11097
- Support more DSL aggregation

docs/user/limitations/limitations.rst

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -130,3 +130,31 @@ The response in JSON format is::
130130
},
131131
"status": 400
132132
}
133+
134+
Limitations on Calcite Engine
135+
=============================
136+
137+
Since 3.0.0, Apache Calcite has been available as an experimental query engine. See `Introduction to the V3 engine <../../../dev/intro-v3-engine.md>`_ for details.
138+
For the following functionalities, the query will be forwarded to the V2 query engine.
139+
140+
* All SQL queries
141+
142+
* ``dedup`` with ``consecutive=true``
143+
144+
* Search relevant commands
145+
146+
* AD
147+
* ML
148+
* Kmeans
149+
150+
* Commands with ``fetch_size`` parameter
151+
152+
* Search relevant functions
153+
154+
* match
155+
* match_phrase
156+
* match_bool_prefix
157+
* match_phrase_prefix
158+
* simple_query_string
159+
* query_string
160+
* multi_match

docs/user/ppl/limitations/limitations.rst

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -84,3 +84,31 @@ plugins.query.field_type_tolerance setting is enabled, the SQL/PPL plugin will h
8484
scalar data types, allowing basic queries (e.g., source = tbl | where condition). However, using multi-value
8585
fields in expressions or functions will result in exceptions. If this setting is disabled or absent, only the
8686
first element of an array is returned, preserving the default behavior.
87+
88+
Unsupported Functionalities in Calcite Engine
89+
=============================================
90+
91+
Since 3.0.0, Apache Calcite has been available as an experimental query engine. See `Introduction to the V3 engine <../../../dev/intro-v3-engine.md>`_ for details.
92+
For the following functionalities, the query will be forwarded to the V2 query engine.
93+
94+
* All SQL queries
95+
96+
* ``dedup`` with ``consecutive=true``
97+
98+
* Search relevant commands
99+
100+
* AD
101+
* ML
102+
* Kmeans
103+
104+
* Commands with ``fetch_size`` parameter
105+
106+
* Search relevant functions
107+
108+
* match
109+
* match_phrase
110+
* match_bool_prefix
111+
* match_phrase_prefix
112+
* simple_query_string
113+
* query_string
114+
* multi_match

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLSortIT.java

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -217,4 +217,22 @@ public void testSortWithNullValue() throws IOException {
217217
rows("Elinor", null),
218218
rows("Virginia", null));
219219
}
220+
221+
@Test
222+
public void testSortDate() throws IOException {
223+
JSONObject result =
224+
executeQuery(
225+
String.format(
226+
"source=%s | sort birthdate | fields firstname, birthdate", TEST_INDEX_BANK));
227+
verifySchema(result, schema("firstname", "string"), schema("birthdate", "timestamp"));
228+
verifyDataRowsInOrder(
229+
result,
230+
rows("Amber JOHnny", "2017-10-23 00:00:00"),
231+
rows("Hattie", "2017-11-20 00:00:00"),
232+
rows("Nanette", "2018-06-23 00:00:00"),
233+
rows("Elinor", "2018-06-27 00:00:00"),
234+
rows("Dillard", "2018-08-11 00:00:00"),
235+
rows("Virginia", "2018-08-19 00:00:00"),
236+
rows("Dale", "2018-11-13 23:33:20"));
237+
}
220238
}

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteSortCommandIT.java

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,13 @@
55

66
package org.opensearch.sql.calcite.remote;
77

8+
import static org.opensearch.sql.legacy.TestsConstants.TEST_INDEX_BANK;
9+
import static org.opensearch.sql.util.MatcherUtils.rows;
10+
import static org.opensearch.sql.util.MatcherUtils.verifyOrder;
11+
12+
import java.io.IOException;
13+
import org.json.JSONObject;
14+
import org.junit.Test;
815
import org.opensearch.sql.ppl.SortCommandIT;
916

1017
public class CalciteSortCommandIT extends SortCommandIT {
@@ -14,4 +21,12 @@ public void init() throws Exception {
1421
enableCalcite();
1522
disallowCalciteFallback();
1623
}
24+
25+
// TODO: Move this test to SortCommandIT once head-then-sort is fixed in v2.
26+
@Test
27+
public void testHeadThenSort() throws IOException {
28+
JSONObject result =
29+
executeQuery(String.format("source=%s | head 2 | sort age | fields age", TEST_INDEX_BANK));
30+
verifyOrder(result, rows(32), rows(36));
31+
}
1732
}
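
The two sort tests in this commit (`testSortThenHead` and `testHeadThenSort`) differ only in the order of `sort` and `head`, yet expect different rows. The following is a minimal, standalone sketch of why the order matters. It uses plain Java streams rather than the PPL engine, and the class name, method names, and the age list are hypothetical values chosen for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only: not part of the OpenSearch SQL plugin.
final class SortHeadOrderSketch {

  // Equivalent in spirit to "sort age | head n": order everything, then keep the first n.
  static List<Integer> sortThenHead(List<Integer> ages, int n) {
    return ages.stream()
        .sorted(Comparator.naturalOrder())
        .limit(n)
        .collect(Collectors.toList());
  }

  // Equivalent in spirit to "head n | sort age": keep the first n rows, then order only those.
  static List<Integer> headThenSort(List<Integer> ages, int n) {
    return ages.stream()
        .limit(n)
        .sorted(Comparator.naturalOrder())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Made-up ages in an assumed storage order.
    List<Integer> ages = Arrays.asList(32, 36, 28, 33, 36, 39, 34);
    System.out.println(sortThenHead(ages, 2)); // prints [28, 32]
    System.out.println(headThenSort(ages, 2)); // prints [32, 36]
  }
}
```

Because a DSL request applies its sort before its size limit, only the first form can be pushed down as a single request with both a sort and a limit; the second form needs the sort to run after the limited rows come back.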

integ-test/src/test/java/org/opensearch/sql/calcite/standalone/CalcitePPLDateTimeBuiltinFunctionIT.java

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -439,12 +439,14 @@ public void testWeekAndWeekOfYearWithFilter() {
439439
JSONObject actual =
440440
executeQuery(
441441
String.format(
442+
Locale.ROOT,
442443
"source=%s | fields strict_date_optional_time"
443444
+ "| where YEAR(strict_date_optional_time) < 2000"
444445
+ "| where WEEK(DATE(strict_date_optional_time)) = %d"
445446
+ "| stats COUNT() AS CNT "
446447
+ "| head 1 ",
447-
TEST_INDEX_DATE_FORMATS, week19840412));
448+
TEST_INDEX_DATE_FORMATS,
449+
week19840412));
448450

449451
verifySchema(actual, schema("CNT", "bigint"));
450452

@@ -1039,8 +1041,8 @@ public void testDateFormatAndDatetimeAndFromDays() throws IOException {
10391041
"2017-11-02",
10401042
expectedDatetimeAtPlus8,
10411043
expectedDatetimeAtUTC,
1042-
String.format("%d 1984 %d", week19840412, week19840412),
1043-
String.format("%d %d 1984", week19840412Mode1, week19840412Mode1),
1044+
String.format(Locale.ROOT, "%d 1984 %d", week19840412, week19840412),
1045+
String.format(Locale.ROOT, "%d %d 1984", week19840412Mode1, week19840412Mode1),
10441046
"09:07:42.000123"));
10451047
}
10461048
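
The hunks above add `Locale.ROOT` to each `String.format` call. The diff does not state the reason; one plausible motivation is that `%d` is locale-sensitive in Java, so the formatted digits can depend on the JVM's default locale. The sketch below is a standalone illustration under that assumption. The class and method names, the week value `15`, and the Thai-digit locale tag are all hypothetical choices.

```java
import java.util.Locale;

// Hypothetical illustration only: not part of the OpenSearch SQL test suite.
final class LocaleFormatSketch {

  // Builds a "<week> 1984 <week>" string with an explicit locale,
  // mirroring the shape of the format string used in the updated test.
  static String weekYearWeek(Locale locale, int week) {
    return String.format(locale, "%d 1984 %d", week, week);
  }

  public static void main(String[] args) {
    // Locale.ROOT is locale-neutral, so %d is rendered with ASCII digits.
    System.out.println(weekYearWeek(Locale.ROOT, 15));

    // A locale that requests Thai digits may render the %d arguments differently,
    // while the literal "1984" in the format string is left untouched.
    Locale thaiDigits = Locale.forLanguageTag("th-TH-u-nu-thai");
    System.out.println(weekYearWeek(thaiDigits, 15));
  }
}
```

Passing the locale explicitly keeps the expected strings stable regardless of the machine the tests run on.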

integ-test/src/test/java/org/opensearch/sql/ppl/ExplainIT.java

Lines changed: 112 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -74,7 +74,6 @@ public void testFilterAndAggPushDownExplain() throws IOException {
7474

7575
@Test
7676
public void testSortPushDownExplain() throws IOException {
77-
// TODO fix after https://github.com/opensearch-project/sql/issues/3380
7877
String expected =
7978
isCalciteEnabled()
8079
? loadFromFile("expectedOutput/calcite/explain_sort_push.json")
@@ -89,6 +88,117 @@ public void testSortPushDownExplain() throws IOException {
8988
+ "| fields age"));
9089
}
9190

91+
@Test
92+
public void testSortWithAggregationExplain() throws IOException {
93+
// Sorts whose by fields are aggregators should not be pushed down
94+
String expected =
95+
isCalciteEnabled()
96+
? loadFromFile("expectedOutput/calcite/explain_sort_agg_push.json")
97+
: loadFromFile("expectedOutput/ppl/explain_sort_agg_push.json");
98+
99+
assertJsonEqualsIgnoreId(
100+
expected,
101+
explainQueryToString(
102+
"source=opensearch-sql_test_index_account"
103+
+ "| stats avg(age) AS avg_age by state, city "
104+
+ "| sort avg_age"));
105+
106+
// sorts whose by fields are not aggregators can be pushed down.
107+
// This test is covered in testExplain
108+
}
109+
110+
@Test
111+
public void testMultiSortPushDownExplain() throws IOException {
112+
// TODO: Fix the expected output in expectedOutput/ppl/explain_multi_sort_push.json (v2)
113+
// balance and gender should take precedence over account_number and firstname
114+
String expected =
115+
isCalciteEnabled()
116+
? loadFromFile("expectedOutput/calcite/explain_multi_sort_push.json")
117+
: loadFromFile("expectedOutput/ppl/explain_multi_sort_push.json");
118+
119+
assertJsonEqualsIgnoreId(
120+
expected,
121+
explainQueryToString(
122+
"source=opensearch-sql_test_index_account "
123+
+ "| sort account_number, firstname, address, balance "
124+
+ "| sort - balance, - gender, address "
125+
+ "| fields account_number, firstname, address, balance, gender"));
126+
}
127+
128+
@Test
129+
public void testSortThenAggregatePushDownExplain() throws IOException {
130+
// TODO: Remove pushed-down sort in DSL in expectedOutput/ppl/explain_sort_then_agg_push.json
131+
// existing collations should be eliminated when pushing down aggregations (v2)
132+
String expected =
133+
isCalciteEnabled()
134+
? loadFromFile("expectedOutput/calcite/explain_sort_then_agg_push.json")
135+
: loadFromFile("expectedOutput/ppl/explain_sort_then_agg_push.json");
136+
137+
assertJsonEqualsIgnoreId(
138+
expected,
139+
explainQueryToString(
140+
"source=opensearch-sql_test_index_account"
141+
+ "| sort balance, age "
142+
+ "| stats avg(balance) by state"));
143+
}
144+
145+
@Test
146+
public void testSortWithRenameExplain() throws IOException {
147+
String expected =
148+
isCalciteEnabled()
149+
? loadFromFile("expectedOutput/calcite/explain_sort_rename_push.json")
150+
: loadFromFile("expectedOutput/ppl/explain_sort_rename_push.json");
151+
152+
assertJsonEqualsIgnoreId(
153+
expected,
154+
explainQueryToString(
155+
"source=opensearch-sql_test_index_account "
156+
+ "| rename firstname as name "
157+
+ "| eval alias = name "
158+
+ "| sort alias "
159+
+ "| fields alias"));
160+
}
161+
162+
/**
163+
* Push down SORT and LIMIT. Sort should be pushed down since DSL processes sort before limit
164+
* when they coexist.
165+
*/
166+
@Test
167+
public void testSortThenLimitExplain() throws IOException {
168+
String expected =
169+
isCalciteEnabled()
170+
? loadFromFile("expectedOutput/calcite/explain_sort_then_limit_push.json")
171+
: loadFromFile("expectedOutput/ppl/explain_sort_then_limit_push.json");
172+
assertJsonEqualsIgnoreId(
173+
expected,
174+
explainQueryToString(
175+
"source=opensearch-sql_test_index_account"
176+
+ "| sort age "
177+
+ "| head 5 "
178+
+ "| fields age"));
179+
}
180+
181+
/**
182+
* Push down LIMIT only. Sort should NOT be pushed down: DSL always processes sort before limit
183+
* when they coexist, which would reverse the limit-then-sort order of the query.
184+
*/
185+
@Test
186+
public void testLimitThenSortExplain() throws IOException {
187+
// TODO: Fix the expected output in expectedOutput/ppl/explain_limit_then_sort_push.json (v2)
188+
// limit-then-sort should not be pushed down.
189+
String expected =
190+
isCalciteEnabled()
191+
? loadFromFile("expectedOutput/calcite/explain_limit_then_sort_push.json")
192+
: loadFromFile("expectedOutput/ppl/explain_limit_then_sort_push.json");
193+
assertJsonEqualsIgnoreId(
194+
expected,
195+
explainQueryToString(
196+
"source=opensearch-sql_test_index_account"
197+
+ "| head 5 "
198+
+ "| sort age "
199+
+ "| fields age"));
200+
}
201+
92202
@Test
93203
public void testLimitPushDownExplain() throws IOException {
94204
String expected =
@@ -240,6 +350,7 @@ public void testTrendlineWithSortPushDownExplain() throws IOException {
240350
? loadFromFile("expectedOutput/calcite/explain_trendline_sort_push.json")
241351
: loadFromFile("expectedOutput/ppl/explain_trendline_sort_push.json");
242352

353+
// Sort will not be pushed down because there's a head before it.
243354
assertJsonEqualsIgnoreId(
244355
expected,
245356
explainQueryToString(

integ-test/src/test/java/org/opensearch/sql/ppl/SortCommandIT.java

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -154,4 +154,11 @@ public void testSortMultipleFields() throws IOException {
154154
String.format("source=%s | sort dog_name, age | fields dog_name, age", TEST_INDEX_DOG));
155155
verifyOrder(result, rows("rex", 2), rows("snoopy", 4));
156156
}
157+
158+
@Test
159+
public void testSortThenHead() throws IOException {
160+
JSONObject result =
161+
executeQuery(String.format("source=%s | sort age | head 2 | fields age", TEST_INDEX_BANK));
162+
verifyOrder(result, rows(28), rows(32));
163+
}
157164
}
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
{
2+
"calcite": {
3+
"logical": "LogicalSort(sort0=[$0], dir0=[ASC])\n LogicalProject(age=[$8])\n LogicalSort(sort0=[$8], dir0=[ASC])\n LogicalSort(fetch=[5])\n CalciteLogicalIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]])\n",
4+
"physical": "EnumerableSort(sort0=[$0], dir0=[ASC])\n CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]], PushDownContext=[[LIMIT->5, PROJECT->[age]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":5,\"timeout\":\"1m\",\"_source\":{\"includes\":[\"age\"],\"excludes\":[]}}, requestedTotalSize=5, pageSize=null, startFrom=0)])\n"
5+
}
6+
}
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
{
2+
"calcite": {
3+
"logical": "LogicalSort(sort0=[$3], sort1=[$4], sort2=[$2], dir0=[DESC], dir1=[DESC], dir2=[ASC])\n LogicalProject(account_number=[$0], firstname=[$1], address=[$2], balance=[$3], gender=[$4])\n LogicalSort(sort0=[$3], sort1=[$4], sort2=[$2], dir0=[DESC], dir1=[DESC], dir2=[ASC])\n LogicalSort(sort0=[$0], sort1=[$1], sort2=[$2], sort3=[$3], dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC])\n CalciteLogicalIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]])\n",
4+
"physical": "CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]], PushDownContext=[[PROJECT->[account_number, firstname, address, balance, gender], SORT->[{\n \"balance\" : {\n \"order\" : \"desc\",\n \"missing\" : \"_first\"\n }\n}, {\n \"gender\" : {\n \"order\" : \"desc\",\n \"missing\" : \"_first\"\n }\n}, {\n \"address\" : {\n \"order\" : \"asc\",\n \"missing\" : \"_last\"\n }\n}]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"timeout\":\"1m\",\"_source\":{\"includes\":[\"account_number\",\"firstname\",\"address\",\"balance\",\"gender\"],\"excludes\":[]},\"sort\":[{\"balance\":{\"order\":\"desc\",\"missing\":\"_first\"}},{\"gender\":{\"order\":\"desc\",\"missing\":\"_first\"}},{\"address\":{\"order\":\"asc\",\"missing\":\"_last\"}}]}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n"
5+
}
6+
}
