Commit 4c20af1

Merge remote-tracking branch 'origin/main' into issues/4136
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
2 parents e3f904b + c54acc4 commit 4c20af1

181 files changed

Lines changed: 11619 additions & 725 deletions

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ Resolves #[Issue number to be closed when this PR is merged]
 - [ ] New functionality has been documented.
 - [ ] New functionality has javadoc added.
 - [ ] New functionality has a user manual doc added.
-- [ ] New PPL command [checklist](https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst#new-ppl-command-checklist) all confirmed.
+- [ ] New PPL command [checklist](https://github.com/opensearch-project/sql/blob/main/docs/dev/ppl-commands.md) all confirmed.
 - [ ] API changes companion pull request [created](https://github.com/opensearch-project/opensearch-api-specification/blob/main/DEVELOPER_GUIDE.md).
 - [ ] Commits are signed per the DCO using `--signoff` or `-s`.
 - [ ] Public documentation issue/PR [created](https://github.com/opensearch-project/documentation-website/issues/new/choose).

.github/maven-publish-utils.sh

Lines changed: 14 additions & 0 deletions
@@ -4,6 +4,10 @@

 set -e

+# Flag to disable commit mapping functionality
+# Set to "true" to disable commit mapping operations
+DISABLE_COMMIT_MAPPING="${DISABLE_COMMIT_MAPPING:-true}"
+
 # Function to execute curl commands with retry and error handling
 execute_curl_with_retry() {
   local url="$1"
@@ -111,6 +115,11 @@ update_version_metadata() {
   local commit_id="$3"
   local snapshot_repo_url="${4:-$SNAPSHOT_REPO_URL}"

+  if [ "$DISABLE_COMMIT_MAPPING" = "true" ]; then
+    echo "Skipping version metadata update (commit mapping disabled)"
+    return 0
+  fi
+
   echo "Updating version metadata for ${artifact_id} version ${version} with commit ID ${commit_id}"

   TEMP_DIR=$(mktemp -d)
@@ -204,6 +213,11 @@ update_commit_mapping() {
   local commit_map_filename="${5:-$COMMIT_MAP_FILENAME}"
   local snapshot_repo_url="${6:-$SNAPSHOT_REPO_URL}"

+  if [ "$DISABLE_COMMIT_MAPPING" = "true" ]; then
+    echo "Skipping commit-version mapping update (commit mapping disabled)"
+    return 0
+  fi
+
   echo "Updating commit-version mapping for ${artifact_id}"

   # Create temp directory for work

.github/workflows/link-checker.yml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ jobs:
         id: lychee
         uses: lycheeverse/lychee-action@master
         with:
-          args: --accept=200,403,429,999 "./**/*.html" "./**/*.md" "./**/*.txt" --exclude "https://aws.oss.sonatype.*|https://central.sonatype.*|http://localhost.*|https://localhost|https://odfe-node1:9200/|https://community.tableau.com/docs/DOC-17978|.*family.zzz|opensearch*|.*@amazon.com|.*email.com|.*@github.com|http://timestamp.verisign.com/scripts/timstamp.dll"
+          args: --accept=200,403,429,999 "./**/*.html" "./**/*.md" "./**/*.txt" --exclude "https://aws.oss.sonatype.*|https://ci.opensearch.*|https://central.sonatype.*|http://localhost.*|https://localhost|https://odfe-node1:9200/|https://community.tableau.com/docs/DOC-17978|.*family.zzz|opensearch*|.*@amazon.com|.*email.com|.*@github.com|http://timestamp.verisign.com/scripts/timstamp.dll"
         env:
           GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
       - name: Fail if there were link errors

DEVELOPER_GUIDE.rst

Lines changed: 40 additions & 76 deletions
@@ -17,15 +17,15 @@ Prerequisites
 JDK
 ---

-OpenSearch builds using Java 11 at a minimum and supports JDK 11, 14 and 17. This means you must have a JDK of supported version installed with the environment variable `JAVA_HOME` referencing the path to Java home for your JDK installation::
+OpenSearch SQL plugin requires Java 21 for development and runtime. This means you must have JDK 21 installed with the environment variable `JAVA_HOME` referencing the path to Java home for your JDK installation::

 $ echo $JAVA_HOME
-/Library/Java/JavaVirtualMachines/adoptopenjdk-11.jdk/Contents/Home
+/Library/Java/JavaVirtualMachines/jdk-21.jdk/Contents/Home

 $ java -version
-openjdk version "11.0.1" 2018-10-16
-OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
-OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)
+openjdk version "21.0.8" 2024-07-16 LTS
+OpenJDK Runtime Environment (build 21.0.8+13-LTS)
+OpenJDK 64-Bit Server VM (build 21.0.8+13-LTS, mixed mode, sharing)

 Here are the official instructions on how to set ``JAVA_HOME`` for different platforms: https://docs.oracle.com/cd/E19182-01/820-7851/inst_cli_jdk_javahome_t/.

@@ -78,12 +78,12 @@ You can develop the plugin in your favorite IDEs such as Eclipse and IntelliJ ID
 Java Language Level
 -------------------

-Although later version of JDK is required to build the plugin, the Java language level needs to be Java 8 for compatibility. Only in this case your plugin works with OpenSearch running against JDK 8. Otherwise it will raise runtime exception when executing new API from new JDK. In case your IDE doesnt set it right, you may want to double check your project setting after import.
+The plugin requires Java 21 for both development and runtime. Make sure your IDE is configured to use Java 21 as the project SDK and language level. In case your IDE doesn't set it right, you may want to double check your project setting after import.

 Remote Debugging
 ----------------

-Firstly you need to add the following configuration to the JVM used by your IDE. For Intellij IDEA, it should be added to ``<OpenSearch installation>/config/jvm.options`` file. After configuring this, an agent in JVM will listen on the port when your OpenSearch bootstraps and wait for IDE debugger to connect. So you should be able to debug by setting up a Remote Run/Debug Configuration::
+Firstly you need to add the following configuration to the JVM used by your IDE. For Intellij IDEA, it should be added to ``<OpenSearch installation>/config/jvm.options`` file. After configuring this, an agent in JVM will listen on the port when your OpenSearch bootstraps and wait for IDE debugger to connect. So you should be able to debug by setting up a "Remote Run/Debug Configuration"::

 -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005

@@ -94,26 +94,20 @@ running.

 ./gradlew opensearch-sql:run -DdebugJVM

-To connect to the cluster with the debugger in an IDE, youll need to
+To connect to the cluster with the debugger in an IDE, you'll need to
 connect to that port. For IntelliJ, see `attaching to a remote process <https://www.jetbrains.com/help/idea/attach-to-process.html#attach-to-remote>`_.

 License Header
 --------------

-Because our code is licensed under Apache 2, you need to add the following license header to all new source code files. To automate this whenever creating new file, you can follow instructions for your IDE::
-
-/*
-* Licensed under the Apache License, Version 2.0 (the "License").
-* You may not use this file except in compliance with the License.
-* A copy of the License is located at
-*
-* http://www.apache.org/licenses/LICENSE-2.0
-*
-* or in the "license" file accompanying this file. This file is distributed
-* on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
-* express or implied. See the License for the specific language governing
-* permissions and limitations under the License.
-*/
+Because our code is licensed under Apache 2, you need to add the following license header to all new source code files. To automate this whenever creating new file, you can follow instructions for your IDE.
+
+.. code:: java
+
+/*
+* Copyright OpenSearch Contributors
+* SPDX-License-Identifier: Apache-2.0
+*/

 For example, `here are the instructions for adding copyright profiles in IntelliJ IDEA <https://www.jetbrains.com/help/idea/copyright.html>`__.

@@ -138,10 +132,10 @@ The plugin codebase is in standard layout of Gradle project::
 ├── build.gradle
 ├── config
 ├── docs
-   ├── attributions.md
-   ├── category.json
-   ├── dev
-   └── user
+├── attributions.md
+├── category.json
+├── dev
+└── user
 ├── gradle.properties
 ├── gradlew
 ├── gradlew.bat
@@ -217,63 +211,21 @@ Java files are formatted using `Spotless <https://github.com/diffplug/spotless>`
 * - Javadoc format can be maintained by wrapping javadoc with `<pre></pre>` HTML tags
 * - Strings can be formatted on multiple lines with a `+` with the correct indentation for the string.

-New PPL Command Checklist
-=========================
+Development Guidelines
+----------------------

-If you are working on contributing a new PPL command, please read this guide and review all items in the checklist are done before code review. You also can leverage this checklist to guide how to add new PPL command.
+For detailed development documentation, please refer to the `development documentation <docs/dev/index.md>`_. For specific guidance on implementing PPL components, see the following resources:

-Prerequisite
-------------
-
-| ✅ Open an RFC issue before starting to code:
-- Describe the purpose of the new command
-- Include at least syntax definition, usage and examples
-- Implementation options are welcome if you have multiple ways to implement it
-| ✅ Obtain PM review approval for the RFC:
-- If PM unavailable, consult repository maintainers as alternative
-- An offline meeting might be required to discuss the syntax and usage
-
-Coding & Tests
---------------
-
-| ✅ Lexer/Parser Updates:
-- Add new keywords to OpenSearchPPLLexer.g4
-- Add grammar rules to OpenSearchPPLParser.g4
-- Update ``commandName`` and ``keywordsCanBeId``
-| ✅ AST Implementation:
-- Add new tree nodes under package ``org.opensearch.sql.ast.tree``
-- Prefer reusing ``Argument`` for command arguments **over** creating new expression nodes under ``org.opensearch.sql.ast.expression``
-| ✅ Visitor Pattern:
-- Add ``visit*`` in ``AbstractNodeVisitor``
-- Overriding ``visit*`` in ``Analyzer``, ``CalciteRelNodeVisitor`` and ``PPLQueryDataAnonymizer``
-| ✅ Unit Tests:
-- Extend ``CalcitePPLAbstractTest``
-- Keep test queries minimal
-- Include ``verifyLogical()`` and ``verifyPPLToSparkSQL()``
-| ✅ Integration tests (pushdown):
-- Extend ``PPLIntegTestCase``
-- Use complex real-world queries
-- Include ``verifySchema()`` and ``verifyDataRows()``
-| ✅ Integration tests (Non-pushdown):
-- Add test class to ``CalciteNoPushdownIT``
-| ✅ Explain tests:
-- Add tests to ``ExplainIT`` or ``CalciteExplainIT``
-| ✅ Unsupported in v2 test:
-- Add a test in ``NewAddedCommandsIT``
-| ✅ Anonymizer tests:
-- Add a test in ``PPLQueryDataAnonymizerTest``
-| ✅ Cross-cluster Tests (optional, nice to have):
-- Add a test in ``CrossClusterSearchIT``
-| ✅ User doc:
-- Add a xxx.rst under ``docs/user/ppl/cmd`` and link the new doc to ``docs/user/ppl/index.rst``
+- `PPL Commands <docs/dev/ppl-commands.md>`_: Guidelines for adding new commands to PPL
+- `PPL Functions <docs/dev/ppl-functions.md>`_: Instructions for implementing and integrating custom functions

 Building and Running Tests
 ==========================

 Gradle Build
 ------------

-Most of the time you just need to run ./gradlew build which will make sure you pass all checks and testing. While youre developing, you may want to run specific Gradle task only. In this case, you can run ./gradlew with task name which only triggers the task along with those it depends on. Here is a list for common tasks:
+Most of the time you just need to run ./gradlew build which will make sure you pass all checks and testing. While you're developing, you may want to run specific Gradle task only. In this case, you can run ./gradlew with task name which only triggers the task along with those it depends on. Here is a list for common tasks:

 .. list-table::
 :widths: 30 50
@@ -294,7 +246,7 @@ Most of the time you just need to run ./gradlew build which will make sure you p
 * - ./gradlew :integ-test:yamlRestTest
 - Run rest integration test.
 * - ./gradlew :doctest:doctest
-- Run doctests
+- Run doctests in docs folder. You can use ``-Pdocs=file1,file2`` to run specific file(s). See more info in `Documentation <#documentation>`_ section.
 * - ./gradlew build
 - Build plugin by run all tasks above (this takes time).
 * - ./gradlew pitest
@@ -304,7 +256,7 @@ Most of the time you just need to run ./gradlew build which will make sure you p
 * - ./gradlew spotlessApply
 - Automatically apply spotless code style changes.

-For integration test, you can use ``-Dtests.class`` UT full path to run a task individually. For example ``./gradlew :integ-test:integTest -Dtests.class="*QueryIT"``.
+For integration test, you can use ``-Dtests.class`` "UT full path" to run a task individually. For example ``./gradlew :integ-test:integTest -Dtests.class="*QueryIT"``.

 To run the task above for specific module, you can do ``./gradlew :<module_name>:task``. For example, only build core module by ``./gradlew :core:build``.

@@ -466,6 +418,18 @@ Doctest

 Python doctest library makes our document executable which keeps it up-to-date to source code. The doc generator aforementioned served as scaffolding and generated many docs in short time. Now the examples inside is changed to doctest gradually. For more details please read `testing-doctest <./docs/dev/testing-doctest.md>`_.

+.. code-block:: bash
+# Test all docs
+./gradlew :doctest:doctest
+
+# Test single file using main doctest task
+./gradlew :doctest:doctest -Pdocs=search
+
+# Test multiple files at once
+./gradlew :doctest:doctest -Pdocs=search,fields,basics
+
+# With verbose output
+./gradlew :doctest:doctest -Pdocs=stats -Pverbose=true

 Backports
 >>>>>>>>>
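
The updated guide text in the DEVELOPER_GUIDE.rst diff above states that the plugin requires Java 21 for development and runtime. As a rough illustration only, the sketch below compares the running JVM's feature version with that number. The class and method names are hypothetical and not part of this commit, and treating only an exact match as acceptable is an assumption read from the guide's wording, not something the diff confirms.

```java
public class JdkVersionCheck {
  // Feature release named in the updated DEVELOPER_GUIDE.rst text.
  static final int REQUIRED_FEATURE_VERSION = 21;

  // Hypothetical helper: true when the given feature version equals the required one.
  static boolean matchesRequired(int featureVersion) {
    return featureVersion == REQUIRED_FEATURE_VERSION;
  }

  public static void main(String[] args) {
    // Runtime.version().feature() is available since Java 10.
    int running = Runtime.version().feature();
    System.out.println(
        "Running on Java " + running
            + ", required " + REQUIRED_FEATURE_VERSION
            + ", match: " + matchesRequired(running));
  }
}
```

Running this with the JDK that `JAVA_HOME` points to gives a quick confirmation that the shell and the IDE agree on the Java version.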

build.gradle

Lines changed: 6 additions & 7 deletions
@@ -66,9 +66,9 @@ buildscript {

     repositories {
         mavenLocal()
-        maven { url "https://central.sonatype.com/repository/maven-snapshots/" }
-        maven { url "https://aws.oss.sonatype.org/content/repositories/snapshots" }
         mavenCentral()
+        maven { url "https://central.sonatype.com/repository/maven-snapshots/" }
+        maven { url "https://ci.opensearch.org/ci/dbc/snapshots/" }
     }

     dependencies {
@@ -92,10 +92,10 @@ apply plugin: 'opensearch.java-agent'
 // Repository on root level is for dependencies that project code depends on. And this block must be placed after plugins{}
 repositories {
     mavenLocal()
-    maven { url "https://central.sonatype.com/repository/maven-snapshots/" }
-    maven { url "https://aws.oss.sonatype.org/content/repositories/snapshots" }
     mavenCentral() // For Elastic Libs that you can use to get started coding until open OpenSearch libs are available
+    maven { url "https://central.sonatype.com/repository/maven-snapshots/" }
     maven { url 'https://jitpack.io' }
+    maven { url "https://ci.opensearch.org/ci/dbc/snapshots/" }
 }

 spotless {
@@ -157,11 +157,10 @@ allprojects {
 subprojects {
     repositories {
         mavenLocal()
-        maven { url "https://central.sonatype.com/repository/maven-snapshots/" }
-        maven { url "https://aws.oss.sonatype.org/content/repositories/snapshots" }
         mavenCentral()
-        maven { url "https://ci.opensearch.org/ci/dbc/snapshots/lucene/" }
+        maven { url "https://central.sonatype.com/repository/maven-snapshots/" }
         maven { url 'https://jitpack.io' }
+        maven { url "https://ci.opensearch.org/ci/dbc/snapshots/" }
     }
 }


common/src/main/java/org/opensearch/sql/common/patterns/BrainLogParser.java

Lines changed: 5 additions & 0 deletions
@@ -39,6 +39,11 @@ public class BrainLogParser {
             "(\\d{4}-\\d{2}-\\d{2})[T"
                 + " ]?(\\d{2}:\\d{2}:\\d{2})(\\.\\d{3})?(Z|([+-]\\d{2}:?\\d{2}))?"),
         "<*DATETIME*>");
+    // UUID
+    DEFAULT_FILTER_PATTERN_VARIABLE_MAP.put(
+        Pattern.compile(
+            "\\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\\b"),
+        "<*UUID*>");
     // Hex Decimal, letters followed by digits, float numbers
     DEFAULT_FILTER_PATTERN_VARIABLE_MAP.put(
         Pattern.compile(

common/src/test/java/org/opensearch/sql/common/patterns/BrainLogParserTest.java

Lines changed: 16 additions & 0 deletions
@@ -104,6 +104,22 @@ public void testPreprocess() {
     assertEquals(expectedResult, result);
   }

+  @Test
+  public void testPreprocessWithUUID() {
+    String logMessage = "127.0.0.1 - 1234 something, user_id:c78ac970-f0c3-4954-8cf8-352a8458d01c";
+    String logId = "log1";
+    List<String> expectedResult =
+        Arrays.asList("<*IP*>", "-", "<*>", "something", "user_id:<*UUID*>", "log1");
+    List<String> result = parser.preprocess(logMessage, logId);
+    assertEquals(expectedResult, result);
+    // Test with different delimiter
+    logMessage = "127.0.0.1=1234 something, user_id:c78ac970-f0c3-4954-8cf8-352a8458d01c";
+    logId = "log2";
+    expectedResult = Arrays.asList("<*IP*>=<*>", "something", "user_id:<*UUID*>", "log2");
+    result = parser.preprocess(logMessage, logId);
+    assertEquals(expectedResult, result);
+  }
+
   @Test
   public void testPreprocessWithIllegalInput() {
     String logMessage = "127.0.0.1 - 1234 something";
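
The UUID rule added to `BrainLogParser` can be tried in isolation. The sketch below reuses the regular expression from the diff verbatim and applies it with plain `java.util.regex`; the class `UuidMaskSketch` and the helper `maskUuids` are hypothetical names for illustration, and the sketch does not reproduce the parser's tokenization or its other filter patterns.

```java
import java.util.regex.Pattern;

public class UuidMaskSketch {
  // Same expression as the pattern added to BrainLogParser in this commit.
  private static final Pattern UUID_PATTERN =
      Pattern.compile(
          "\\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\\b");

  // Hypothetical helper: replaces every UUID-shaped substring with the <*UUID*> token.
  static String maskUuids(String message) {
    // The replacement contains no '$' or '\', so it is used literally by replaceAll.
    return UUID_PATTERN.matcher(message).replaceAll("<*UUID*>");
  }

  public static void main(String[] args) {
    System.out.println(maskUuids("user_id:c78ac970-f0c3-4954-8cf8-352a8458d01c"));
  }
}
```

The `\b` anchors keep the rule from firing inside a longer run of word characters, which is why the `user_id:` prefix survives in the test's expected tokens.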

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java

Lines changed: 7 additions & 0 deletions
@@ -59,6 +59,7 @@
 import org.opensearch.sql.ast.tree.AD;
 import org.opensearch.sql.ast.tree.Aggregation;
 import org.opensearch.sql.ast.tree.AppendCol;
+import org.opensearch.sql.ast.tree.Bin;
 import org.opensearch.sql.ast.tree.CloseCursor;
 import org.opensearch.sql.ast.tree.Dedupe;
 import org.opensearch.sql.ast.tree.Eval;
@@ -669,6 +670,12 @@ public LogicalPlan visitML(ML node, AnalysisContext context) {
     return new LogicalML(child, node.getArguments());
   }

+  @Override
+  public LogicalPlan visitBin(Bin node, AnalysisContext context) {
+    throw new UnsupportedOperationException(
+        "Bin command is supported only when " + CALCITE_ENGINE_ENABLED.getKeyValue() + "=true");
+  }
+
   @Override
   public LogicalPlan visitExpand(Expand expand, AnalysisContext context) {
     throw new UnsupportedOperationException(

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java

Lines changed: 10 additions & 0 deletions
@@ -47,6 +47,7 @@
 import org.opensearch.sql.ast.tree.AD;
 import org.opensearch.sql.ast.tree.Aggregation;
 import org.opensearch.sql.ast.tree.AppendCol;
+import org.opensearch.sql.ast.tree.Bin;
 import org.opensearch.sql.ast.tree.CloseCursor;
 import org.opensearch.sql.ast.tree.Dedupe;
 import org.opensearch.sql.ast.tree.Eval;
@@ -70,6 +71,7 @@
 import org.opensearch.sql.ast.tree.RelationSubquery;
 import org.opensearch.sql.ast.tree.Rename;
 import org.opensearch.sql.ast.tree.Reverse;
+import org.opensearch.sql.ast.tree.SPath;
 import org.opensearch.sql.ast.tree.Sort;
 import org.opensearch.sql.ast.tree.SubqueryAlias;
 import org.opensearch.sql.ast.tree.TableFunction;
@@ -213,6 +215,10 @@ public T visitBetween(Between node, C context) {
     return visitChildren(node, context);
   }

+  public T visitBin(Bin node, C context) {
+    return visitChildren(node, context);
+  }
+
   public T visitArgument(Argument node, C context) {
     return visitChildren(node, context);
   }
@@ -237,6 +243,10 @@ public T visitParse(Parse node, C context) {
     return visitChildren(node, context);
   }

+  public T visitSpath(SPath node, C context) {
+    return visitChildren(node, context);
+  }
+
   public T visitLet(Let node, C context) {
     return visitChildren(node, context);
   }
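
The `Analyzer` and `AbstractNodeVisitor` changes follow one pattern: the base visitor gains a `visit*` method that defaults to `visitChildren`, and a visitor that cannot handle the node overrides it to throw. The sketch below shows that pattern with simplified stand-in types. `Node`, `NodeVisitor`, `RejectingAnalyzer` and the setting key are hypothetical; the real classes carry more state and read the key from `CALCITE_ENGINE_ENABLED.getKeyValue()`.

```java
import java.util.List;

public class VisitorSketch {
  // Hypothetical stand-in for an AST node; not the real org.opensearch.sql.ast classes.
  abstract static class Node {
    List<Node> children() {
      return List.of();
    }

    abstract <T, C> T accept(NodeVisitor<T, C> visitor, C context);
  }

  // Stand-in for the new Bin tree node: dispatches to the matching visit method.
  static class Bin extends Node {
    @Override
    <T, C> T accept(NodeVisitor<T, C> visitor, C context) {
      return visitor.visitBin(this, context);
    }
  }

  // Base visitor: each visit* method defaults to walking the node's children.
  static class NodeVisitor<T, C> {
    T visitChildren(Node node, C context) {
      T result = null;
      for (Node child : node.children()) {
        result = child.accept(this, context);
      }
      return result;
    }

    T visitBin(Bin node, C context) {
      return visitChildren(node, context);
    }
  }

  // A visitor that cannot handle Bin overrides the default and rejects the node.
  static class RejectingAnalyzer extends NodeVisitor<String, Void> {
    // Placeholder key for illustration only; the commit obtains the real key at runtime.
    static final String SETTING_KEY = "example.setting.key";

    @Override
    String visitBin(Bin node, Void context) {
      throw new UnsupportedOperationException(
          "Bin command is supported only when " + SETTING_KEY + "=true");
    }
  }

  public static void main(String[] args) {
    Bin bin = new Bin();
    // Default behaviour: Bin has no children here, so the walk yields null.
    String defaultResult = bin.accept(new NodeVisitor<String, Void>(), null);
    System.out.println(defaultResult);
    try {
      bin.accept(new RejectingAnalyzer(), null);
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Keeping the default in the base class means existing visitors keep compiling when a node type is added, while engines that lack support opt out explicitly, as `Analyzer.visitBin` does in this commit.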
