Commit 09e69f2

qianheng-aws and claude committed
Add Calcite engine and integration test guidance to CLAUDE.md
Add a Calcite Engine section explaining dual-engine architecture, pushdown settings, and NoPushdown suite patterns. Add integration test base class hierarchy reference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 513e1b2 commit 09e69f2

1 file changed

Lines changed: 159 additions & 0 deletions

CLAUDE.md

@@ -0,0 +1,159 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

OpenSearch SQL plugin — enables SQL and PPL (Piped Processing Language) queries against OpenSearch. This is a multi-module Gradle project (Java 21) that functions as an OpenSearch plugin.

## Build Commands

```bash
# Full build (compiles, tests, checks)
./gradlew build

# Fast build (skip integration tests)
./gradlew build -x integTest

# Build specific module
./gradlew :core:build
./gradlew :sql:build
./gradlew :ppl:build

# Run unit tests only
./gradlew test

# Run a single unit test class
./gradlew :core:test --tests "org.opensearch.sql.analysis.AnalyzerTest"

# Run integration tests
./gradlew :integ-test:integTest

# Run a single integration test
./gradlew :integ-test:integTest -Dtests.class="*QueryIT"

# Skip Prometheus if unavailable
./gradlew :integ-test:integTest -DignorePrometheus

# Code formatting
./gradlew spotlessCheck    # Check
./gradlew spotlessApply    # Auto-fix

# Regenerate ANTLR parsers from grammar files
./gradlew generateGrammarSource

# Run plugin locally with OpenSearch
./gradlew :opensearch-sql-plugin:run
./gradlew :opensearch-sql-plugin:run -DdebugJVM    # With remote debug on port 5005

# Run doctests
./gradlew :doctest:doctest
./gradlew :doctest:doctest -Pdocs=search    # Single file
```

## Code Style

- **Google Java Format** enforced via Spotless (2-space indent, 100 char line limit)
- **Lombok** is used throughout — `@Getter`, `@Builder`, `@RequiredArgsConstructor`, etc.
- **License header** required on all Java files (Apache 2.0). Missing headers fail the build.
- Pre-commit hooks run `spotlessApply` automatically

## Architecture

### Query Pipeline

```
User Query (SQL/PPL)
  → Parsing (ANTLR) — produces parse tree
  → AST Construction (AstBuilder visitor) — produces UnresolvedPlan
  → Semantic Analysis (Analyzer) — resolves symbols/types → LogicalPlan
  → Planning (Planner + LogicalPlanOptimizer) — produces PhysicalPlan
  → Execution (ExecutionEngine) — streams ExprValue results
  → Response Formatting (ResponseFormatter — JSON/CSV/JDBC)
```
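
The stages above can be read as a chain of transformations, each consuming the output of the previous one. The following Java sketch illustrates only that shape. Every type and method in it (`ParseTree`, `UnresolvedPlanSketch`, `ToyPipeline`, and so on) is a hypothetical stand-in rather than one of the plugin's classes, and the stage bodies are toy placeholders.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-ins for the artifacts handed from stage to stage.
record ParseTree(String text) {}
record UnresolvedPlanSketch(String relation) {}
record LogicalPlanSketch(String relation) {}
record PhysicalPlanSketch(List<String> rows) {}

final class ToyPipeline {
  // Parsing: the real pipeline uses ANTLR; here the query is only trimmed.
  static ParseTree parse(String query) {
    return new ParseTree(query.trim());
  }

  // AST construction: toy rule that treats the last token as the relation name.
  static UnresolvedPlanSketch buildAst(ParseTree tree) {
    String[] tokens = tree.text().split("\\s+");
    return new UnresolvedPlanSketch(tokens[tokens.length - 1]);
  }

  // Semantic analysis: a real analyzer resolves symbols and types.
  static LogicalPlanSketch analyze(UnresolvedPlanSketch plan) {
    return new LogicalPlanSketch(plan.relation().toLowerCase());
  }

  // Planning: pick a physical operator; here a fixed two-row scan.
  static PhysicalPlanSketch plan(LogicalPlanSketch plan) {
    return new PhysicalPlanSketch(List.of(plan.relation() + "#1", plan.relation() + "#2"));
  }

  // Execution plus formatting: join the rows into a CSV-like string.
  static String executeAndFormat(PhysicalPlanSketch plan) {
    return String.join(",", plan.rows());
  }

  // The whole pipeline expressed as one composed function.
  static final Function<String, String> PIPELINE =
      ((Function<String, ParseTree>) ToyPipeline::parse)
          .andThen(ToyPipeline::buildAst)
          .andThen(ToyPipeline::analyze)
          .andThen(ToyPipeline::plan)
          .andThen(ToyPipeline::executeAndFormat);

  public static void main(String[] args) {
    System.out.println(PIPELINE.apply("source = Accounts")); // accounts#1,accounts#2
  }
}
```

Because each stage is a separate function over its own input and output type, a stage can be exercised in isolation, which mirrors how the unit tests target the analyzer or planner without running a cluster.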

### Module Dependency Graph

```
plugin (OpenSearch plugin entry point, Guice DI wiring)
├── sql — SQL parsing (ANTLR → AST via SQLSyntaxParser/AstBuilder)
├── ppl — PPL parsing (ANTLR → AST via PPLSyntaxParser/AstBuilder)
├── core — Central module: Analyzer, Planner, ExecutionEngine interfaces,
│          AST/LogicalPlan/PhysicalPlan node types, expression system, type system
├── opensearch — OpenSearch storage engine, execution engine, client
├── protocol — Response formatters (JSON, CSV, JDBC, YAML)
├── common — Shared settings and utilities
├── legacy — V1 SQL engine (backward compatibility fallback)
├── datasources — Multi-datasource support (Glue, Security Lake, Prometheus)
├── async-query / async-query-core — Spark-based async query execution
├── direct-query / direct-query-core — Direct external datasource queries
└── language-grammar — Centralized ANTLR .g4 grammar files
```

`core` has no dependency on other modules. `sql` and `ppl` depend on `core` and `language-grammar`. `opensearch` implements `core` interfaces.

### Key Source Locations

| Area | Key Files |
|------|-----------|
| Plugin entry | `plugin/.../SQLPlugin.java`, `plugin/.../OpenSearchPluginModule.java` |
| SQL parsing | `sql/.../sql/parser/AstBuilder.java`, `sql/.../SQLService.java` |
| PPL parsing | `ppl/.../ppl/parser/AstBuilder.java`, `ppl/.../PPLService.java` |
| ANTLR grammars | `language-grammar/src/main/antlr4/` (OpenSearchSQLParser.g4, OpenSearchPPLParser.g4) |
| Analysis | `core/.../analysis/Analyzer.java`, `core/.../analysis/ExpressionAnalyzer.java` |
| Planning | `core/.../planner/Planner.java`, `core/.../planner/logical/LogicalPlan.java` |
| Execution | `core/.../executor/ExecutionEngine.java`, `opensearch/.../OpenSearchExecutionEngine.java` |
| Storage | `opensearch/.../storage/OpenSearchStorageEngine.java` |
| Query orchestration | `core/.../executor/QueryService.java`, `core/.../executor/QueryPlanFactory.java` |

### Core Abstractions

- **`Node<T>`** — Base AST node with visitor pattern support
- **`UnresolvedPlan`** / **`LogicalPlan`** / **`PhysicalPlan`** — Query plan hierarchy (unresolved → logical → physical)
- **`Expression`** — Resolved expression with `valueOf()` and `type()`
- **`ExprValue`** — Runtime value types (ExprIntegerValue, ExprStringValue, etc.)
- **`ExprType`** — Type system (DATE, TIMESTAMP, DOUBLE, STRUCT, etc.)
- **`StorageEngine`** / **`Table`** — Pluggable storage abstraction
- **`ExecutionEngine`** — Executes physical plans, returns QueryResponse

### Design Patterns

- **Visitor pattern** used pervasively: `AbstractNodeVisitor`, `LogicalPlanNodeVisitor`, `PhysicalPlanNodeVisitor`, `ExpressionNodeVisitor`
- **PhysicalPlan** implements `Iterator<ExprValue>` for streaming execution
- **Guice** dependency injection in `OpenSearchPluginModule`
- Storage engines implement `Table.optimize()` and `Table.implement()` for push-down optimization
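
Two of the patterns listed above, visitors over plan nodes and iterator-style operators, are easier to recognize in the codebase once seen in miniature. The sketch below is illustrative only: `SketchPlanVisitor`, `ScanNode`, `FilterNode`, `ExplainVisitor`, and `FilterOperator` are hypothetical names chosen for this example, and the real visitor and plan classes have many more node types and different signatures.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// All names below are hypothetical stand-ins, not the plugin's real classes.

// Visitor with a result type R and a context type C.
interface SketchPlanVisitor<R, C> {
  R visitScan(ScanNode node, C context);
  R visitFilter(FilterNode node, C context);
}

interface SketchPlanNode {
  <R, C> R accept(SketchPlanVisitor<R, C> visitor, C context);
}

record ScanNode(String index) implements SketchPlanNode {
  @Override
  public <R, C> R accept(SketchPlanVisitor<R, C> visitor, C context) {
    return visitor.visitScan(this, context);
  }
}

record FilterNode(SketchPlanNode child, String condition) implements SketchPlanNode {
  @Override
  public <R, C> R accept(SketchPlanVisitor<R, C> visitor, C context) {
    return visitor.visitFilter(this, context);
  }
}

// One concrete visitor: renders the tree as a single-line string.
final class ExplainVisitor implements SketchPlanVisitor<String, Void> {
  @Override
  public String visitScan(ScanNode node, Void context) {
    return "Scan(" + node.index() + ")";
  }

  @Override
  public String visitFilter(FilterNode node, Void context) {
    return "Filter(" + node.condition() + ", " + node.child().accept(this, context) + ")";
  }
}

// Iterator-style operator: pulls rows from its input one at a time.
// Assumes rows are never null, since null marks "no lookahead buffered".
final class FilterOperator<T> implements Iterator<T> {
  private final Iterator<T> input;
  private final Predicate<T> condition;
  private T lookahead;

  FilterOperator(Iterator<T> input, Predicate<T> condition) {
    this.input = input;
    this.condition = condition;
  }

  @Override
  public boolean hasNext() {
    // Advance the input until a matching row is buffered or input is exhausted.
    while (lookahead == null && input.hasNext()) {
      T candidate = input.next();
      if (condition.test(candidate)) {
        lookahead = candidate;
      }
    }
    return lookahead != null;
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    T row = lookahead;
    lookahead = null;
    return row;
  }
}

final class PatternsDemo {
  public static void main(String[] args) {
    SketchPlanNode plan = new FilterNode(new ScanNode("accounts"), "age > 30");
    String explained = plan.accept(new ExplainVisitor(), null);
    System.out.println(explained); // Filter(age > 30, Scan(accounts))

    Iterator<Integer> rows =
        new FilterOperator<>(List.of(25, 31, 47).iterator(), age -> age > 30);
    while (rows.hasNext()) {
      System.out.println(rows.next()); // 31, then 47
    }
  }
}
```

The visitor keeps each traversal (explain, analysis, anonymization) in its own class instead of spreading it across node types, and the iterator form lets a consumer stop early without the operator having materialized every row.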

## Adding New PPL Commands

Follow the checklist in `docs/dev/ppl-commands.md`:
1. Update lexer/parser grammars (OpenSearchPPLLexer.g4, OpenSearchPPLParser.g4)
2. Add AST node under `org.opensearch.sql.ast.tree`
3. Add `visit*` method in `AbstractNodeVisitor`, override in `Analyzer`, `CalciteRelNodeVisitor`, `PPLQueryDataAnonymizer`
4. Unit tests extending `CalcitePPLAbstractTest` (include `verifyLogical()` and `verifyPPLToSparkSQL()`)
5. Integration tests extending `PPLIntegTestCase`
6. Add user docs under `docs/user/ppl/cmd/`

## Adding New PPL Functions

Follow `docs/dev/ppl-functions.md`. Three approaches:
1. Reuse existing Calcite operators from `SqlStdOperatorTable`/`SqlLibraryOperators`
2. Adapt static Java methods via `UserDefinedFunctionUtils.adapt*ToUDF`
3. Implement `ImplementorUDF` interface from scratch, register in `PPLBuiltinOperators`

## Calcite Engine

The project has two execution engines: the legacy **v2 engine** and the newer **Calcite engine** (Apache Calcite-based). Calcite is toggled via the `plugins.calcite.enabled` setting (default: off in production, toggled per-test in integration tests).

- In integration tests, call `enableCalcite()` in `init()` to activate the Calcite path
- Some features (e.g., graphLookup) require pushdown optimization — use `enabledOnlyWhenPushdownIsEnabled()` to skip tests in the `CalciteNoPushdownIT` suite
- `CalciteNoPushdownIT` is a JUnit `@Suite` that re-runs Calcite test classes with pushdown disabled; add new test classes to its `@Suite.SuiteClasses` list
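
The interaction between the per-test toggle and the no-pushdown suite can be pictured with a small simulation. This file does not show how the real helpers are implemented, so the sketch assumes a skip-by-exception mechanism similar to JUnit assumptions. `SketchIntegTestBase`, `SketchGraphLookupIT`, `SketchSuiteRunner`, and `TestSkipped` are hypothetical stand-ins; only the method names `enableCalcite()` and `enabledOnlyWhenPushdownIsEnabled()` come from the guidance above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins that mimic the pattern; not the plugin's test classes.
final class TestSkipped extends RuntimeException {
  TestSkipped(String reason) {
    super(reason);
  }
}

abstract class SketchIntegTestBase {
  // Stand-in for the pushdown setting that a no-pushdown suite would turn off.
  static boolean pushdownEnabled = true;

  protected boolean calciteEnabled = false;

  // Stand-in for the helper that routes queries to the Calcite engine.
  protected void enableCalcite() {
    calciteEnabled = true;
  }

  // Stand-in for the guard: skip the test when pushdown is disabled (assumed behavior).
  protected void enabledOnlyWhenPushdownIsEnabled() {
    if (!pushdownEnabled) {
      throw new TestSkipped("requires pushdown");
    }
  }

  abstract void init();

  abstract void testBody();
}

final class SketchGraphLookupIT extends SketchIntegTestBase {
  @Override
  void init() {
    enableCalcite(); // activate the Calcite path for every test in this class
  }

  @Override
  void testBody() {
    enabledOnlyWhenPushdownIsEnabled(); // feature is assumed to need pushdown
    if (!calciteEnabled) {
      throw new IllegalStateException("Calcite path not active");
    }
  }
}

final class SketchSuiteRunner {
  // Runs each test once with the given pushdown value and records the outcome.
  static List<String> run(boolean pushdown, List<SketchIntegTestBase> tests) {
    SketchIntegTestBase.pushdownEnabled = pushdown;
    List<String> outcomes = new ArrayList<>();
    for (SketchIntegTestBase test : tests) {
      String name = test.getClass().getSimpleName();
      try {
        test.init();
        test.testBody();
        outcomes.add(name + ": passed");
      } catch (TestSkipped skipped) {
        outcomes.add(name + ": skipped (" + skipped.getMessage() + ")");
      }
    }
    return outcomes;
  }

  public static void main(String[] args) {
    List<SketchIntegTestBase> tests = List.of(new SketchGraphLookupIT());
    System.out.println(run(true, tests));  // [SketchGraphLookupIT: passed]
    System.out.println(run(false, tests)); // [SketchGraphLookupIT: skipped (requires pushdown)]
  }
}
```

The point of the pattern is that the same test class runs in both configurations, and a test that cannot work without pushdown opts out explicitly instead of failing in the no-pushdown run.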

## Integration Tests

Located in `integ-test/src/test/java/`. Organized by area: `sql/`, `ppl/`, `calcite/`, `legacy/`, `jdbc/`, `datasource/`, `asyncquery/`, `security/`. Uses OpenSearch test framework (in-memory cluster per test class). YAML REST tests in `integ-test/src/yamlRestTest/resources/rest-api-spec/test/`.

Key base classes:
- `PPLIntegTestCase` — base for PPL integration tests (v2 engine)
- `CalcitePPLIT` — base for Calcite PPL integration tests (calls `enableCalcite()`)
- `CalcitePPLAbstractTest` — base for Calcite PPL unit tests (`verifyLogical()`, `verifyPPLToSparkSQL()`)
- `CalciteExplainIT` — explain plan tests using YAML expected output files in `integ-test/src/test/resources/expectedOutput/calcite/`
