Add AnalyticsExecutionEngine and query routing for Parquet-backed indices by ahkcs · Pull Request #5256 · opensearch-project/sql

ahkcs · 2026-03-23T21:54:59Z

Summary

Implements the execution engine adapter and query routing infrastructure for Project Mustang's unified query pipeline (#5247).

QueryPlanExecutor — @FunctionalInterface contract for analytics engine plan execution. Local equivalent of upstream org.opensearch.analytics.exec.QueryPlanExecutor; will be swapped when analytics-framework JAR is published.
AnalyticsExecutionEngine — Implements ExecutionEngine. Bridges QueryPlanExecutor output (Iterable<Object[]>) to the SQL plugin's QueryResponse pipeline with type mapping via OpenSearchTypeFactory and row conversion via ExprValueUtils.
RestUnifiedQueryAction — Orchestrates the full analytics query path: index detection, schema building, PPL planning via UnifiedQueryPlanner, execution via AnalyticsExecutionEngine, response formatting via JdbcResponseFormatter. Runs on sql-worker thread pool.
RestPPLQueryAction — Added routing branch: queries targeting parquet_ prefixed indices are routed to the analytics path; all other queries go through the existing Lucene path unchanged.
StubQueryPlanExecutor — Returns canned data for parquet_logs and parquet_metrics stub tables, enabling full pipeline testing without the analytics engine.

Test plan

12 unit tests for AnalyticsExecutionEngine (type mapping, row conversion, query size limit, null handling, error propagation, explain, PhysicalPlan rejection)
10 unit tests for RestUnifiedQueryAction (index name extraction, routing detection for various PPL patterns)
Integration tests (Step 5 — requires test cluster with stub plugin)
Existing PPL integration test suite regression check

github-actions · 2026-03-23T21:56:21Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 829ca34.

Path	Line	Severity	Description
plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java	1	high	A stub executor that returns hardcoded fake data (canned rows with IP addresses and fixed timestamps) is wired directly into the production plugin (SQLPlugin.java). Any PPL query targeting a 'parquet_*' index will silently return fabricated results instead of real data, completely bypassing the actual query engine. This data falsification is registered unconditionally in the production handler, not gated behind a test or feature flag.
plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java	168	high	RestPPLQueryAction is registered twice in the REST handler list: once with the now-removed no-arg constructor (which no longer exists per RestPPLQueryAction.java diff, making this a compile-time landmine or crash) and once with the new parameterized constructor. The duplicate registration is suspicious—if it compiles, it could double-handle PPL requests; if it does not, it acts as a sabotage of the build/deploy pipeline. Either outcome is consistent with deliberate interference.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java	104	medium	The isUnifiedQueryPath routing bypass silently diverts all PPL queries whose source index begins with 'parquet_' away from the normal OpenSearch execution path (including its authorization and access-control checks) into the new analytics pipeline backed by the stub executor. Any user who knows the naming convention can route their query to a path with weaker or absent access controls.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java	50	medium	The original no-arg RestPPLQueryAction() constructor is replaced by a constructor requiring ClusterService and NodeClient, but SQLPlugin.java still calls new RestPPLQueryAction() (the removed constructor). If the final field 'unifiedQueryHandler' is uninitialized via a surviving default constructor path, calling the unified routing branch will produce a NullPointerException that disrupts the entire PPL query service—a potential availability impact.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java	261	low	The reportError method inlines raw exception class names and messages directly into HTTP 500 responses as unescaped JSON strings. While not intentionally malicious, this can be exploited to leak internal class names, stack context, and configuration details to unauthenticated callers, aiding further reconnaissance.

The table above displays the top 10 most important findings.

Total: 5 | Critical: 0 | High: 2 | Medium: 2 | Low: 1

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-23T22:07:34Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 253da49.

Path	Line	Severity	Description
plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java	1	medium	A stub executor returning hardcoded rows is present in production code (not a test class). Any query targeting 'parquet_logs' or 'parquet_metrics' will silently return fabricated data including hardcoded IP addresses and fake status codes, bypassing real data access. While labeled as a development stub with TODOs, shipping this in a production plugin path could mislead users about real data.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java	87	medium	Queries targeting any 'parquet_*' index are silently rerouted to the analytics engine execution path via isUnifiedQueryPath(). If the analytics engine enforces different (or no) authorization/security checks compared to the standard OpenSearch query path, this prefix-based routing could act as an authorization bypass for attacker-controlled index names starting with 'parquet_'.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java	52	low	The ClusterService parameter is accepted in the constructor but never stored or used — only NodeClient is kept. Accepting an unused privileged object (ClusterService grants broad cluster-state access) without purpose is an anomaly worth investigating; it may indicate an incomplete or removed feature rather than malicious intent.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java	261	low	reportError() inlines the raw exception message directly into the JSON response body with only double-quote escaping. Depending on what the underlying exception surfaces (e.g., file paths, internal query plans, stack frames from wrapped exceptions), this can leak internal implementation details to unauthenticated callers.

The table above displays the top 10 most important findings.

Total: 4 | Critical: 0 | High: 0 | Medium: 2 | Low: 2

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-23T22:08:34Z

PR Reviewer Guide 🔍

(Review updated until commit `3b1ecb8`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 Security concerns Sensitive information exposure: The `reportError` method in `RestUnifiedQueryAction` reflects raw exception messages (including stack-trace-level detail from `e.getMessage()`) directly into HTTP responses. Depending on the exception type, this can leak internal implementation details, file paths, or configuration values to API callers. Exception messages should be sanitized or mapped to user-facing messages before being sent in responses.
📝 TODO sections plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [126]: Replace stub schema with EngineContext.getSchema() when analytics engine is ready
🔀 Multiple PR themes Sub-PR theme: Core analytics execution engine and QueryPlanExecutor interface Relevant files: core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java core/src/main/java/org/opensearch/sql/executor/analytics/QueryPlanExecutor.java core/src/test/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngineTest.java Sub-PR theme: PPL query routing and REST handler for analytics engine path Relevant files: plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java plugin/src/test/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryActionTest.java integ-test/src/test/java/org/opensearch/sql/ppl/AnalyticsPPLIT.java
⚡ Recommended focus areas for review XSS / Injection in Error Response The `reportError` method manually constructs a JSON error response by string-concatenating `e.getClass().getSimpleName()` and `e.getMessage()`. While `getMessage()` has its double-quotes escaped, other JSON-special characters (newlines, backslashes, control characters) are not escaped. A crafted exception message could break the JSON structure or inject content into the response. Use a proper JSON serializer (e.g., Jackson `ObjectMapper`) to build the error body. private static void reportError(RestChannel channel, Exception e) { channel.sendResponse( new BytesRestResponse( org.opensearch.core.rest.RestStatus.INTERNAL_SERVER_ERROR, "application/json; charset=UTF-8", "{\"error\":{\"type\":\"" + e.getClass().getSimpleName() + "\",\"reason\":\"" + (e.getMessage() != null ? e.getMessage().replace("\"", "\\\"") : "Unknown error") + "\"},\"status\":500}")); } Stub in Production Code `StubQueryPlanExecutor` is wired directly into `RestPPLQueryAction` (production plugin bootstrap) via `new StubQueryPlanExecutor()`. This means every deployed instance will use hardcoded canned data for any `parquet_` index, silently returning fake rows instead of an error or real data. This should not be merged to a production branch without a guard (e.g., a feature flag or ensuring the stub is only used in test scope). if (planStr.contains("parquet_metrics")) { Fragile Table Detection* `extractTableName` calls `RelOptUtil.toString(plan)` and does a string `contains` check to identify the table. This is fragile: any plan whose string representation happens to contain the substring "parquet_logs" (e.g., a filter literal, a column alias, or a comment) will incorrectly match. For a stub this is low-severity, but it will cause silent wrong-data bugs during pipeline testing. private String extractTableName(RelNode plan) { // Use RelOptUtil.toString to get the full plan tree including child nodes String planStr = RelOptUtil.toString(plan); if (planStr.contains("parquet_logs")) { return "parquet_logs"; } if (planStr.contains("parquet_metrics")) { return "parquet_metrics"; } return null; } Null context passed to executor `planExecutor.execute(plan, null)` always passes `null` as the execution context. The `QueryPlanExecutor` interface documents the parameter as "opaque execution context", implying it may carry security credentials, transaction state, or resource limits. Passing `null` unconditionally may cause a NullPointerException in the real executor when it is wired in, and it discards any context available in `CalcitePlanContext`. The `CalcitePlanContext` (or a wrapper) should be passed instead. Iterable<Object[]> rows = planExecutor.execute(plan, null);

github-actions · 2026-03-23T22:09:20Z

PR Code Suggestions ✨

Latest suggestions up to 3b1ecb8

Explore these optional code suggestions:

Category	Suggestion	Impact
Security	Fix incomplete JSON escaping in error responses The error message is manually escaped only for double quotes, but other special characters (backslashes, newlines, control characters) can break the JSON structure or cause injection issues. Use a proper JSON serialization approach instead of manual string concatenation for the error response body. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [263-273] private static void reportError(RestChannel channel, Exception e) { + String reason = e.getMessage() != null ? e.getMessage() : "Unknown error"; + // Escape all JSON special characters + String escapedReason = reason + .replace("\\", "\\\\") + .replace("\"", "\\\"") + .replace("\n", "\\n") + .replace("\r", "\\r") + .replace("\t", "\\t"); channel.sendResponse( new BytesRestResponse( org.opensearch.core.rest.RestStatus.INTERNAL_SERVER_ERROR, "application/json; charset=UTF-8", "{\"error\":{\"type\":\"" + e.getClass().getSimpleName() + "\",\"reason\":\"" - + (e.getMessage() != null ? e.getMessage().replace("\"", "\\\"") : "Unknown error") + + escapedReason + "\"},\"status\":500}")); } Suggestion importance[1-10]: 6 __ Why: The manual JSON escaping in `reportError` only handles double quotes, leaving backslashes, newlines, and other control characters unescaped, which could break the JSON structure. The improved code adds more escape sequences, though a proper JSON library would be the ideal solution.	Low
General	Handle thread pool scheduling failures gracefully If the `SQL_WORKER_THREAD_POOL_NAME` thread pool does not exist or is not registered, the `schedule` call will throw an exception that is not caught, leaving the REST channel without a response and potentially hanging the client. The scheduling call should be wrapped in a try-catch to ensure the channel always receives a response. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [112-119] public void execute(String query, QueryType queryType, RestChannel channel, boolean isExplain) { - client - .threadPool() - .schedule( - () -> doExecute(query, queryType, channel, isExplain), - new TimeValue(0), - SQL_WORKER_THREAD_POOL_NAME); + try { + client + .threadPool() + .schedule( + () -> doExecute(query, queryType, channel, isExplain), + new TimeValue(0), + SQL_WORKER_THREAD_POOL_NAME); + } catch (Exception e) { + recordFailureMetric(e); + reportError(channel, e); + } } Suggestion importance[1-10]: 5 __ Why: If the thread pool scheduling fails, the REST channel would hang without a response. Wrapping the `schedule` call in a try-catch ensures the client always receives a response, improving robustness.	Low
General	Pass execution context instead of null to plan executor The `context` parameter from `CalcitePlanContext` is passed as `null` to `planExecutor.execute()`. If the `QueryPlanExecutor` implementation requires a non-null context for proper execution (e.g., for security, timeout, or resource management), this will silently produce incorrect behavior or failures. The execution context should be derived from `CalcitePlanContext` and passed through. core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [72] -Iterable<Object[]> rows = planExecutor.execute(plan, null); +Iterable<Object[]> rows = planExecutor.execute(plan, context); Suggestion importance[1-10]: 3 __ Why: While passing `null` as context is intentional per the `QueryPlanExecutor` interface design (context is "opaque to avoid server dependency"), passing the `CalcitePlanContext` object directly would violate the interface contract since it's typed as `Object`. The suggestion may be premature given the current stub design.	Low
Possible issue	Fix substring matching order to avoid false table name matches The `parquet_metrics` check will never be reached because `parquet_metrics` also contains the substring `parquet_` and the string `parquet_logs` does not match `parquet_metrics`, but the order matters if a plan string could contain both. More critically, a plan involving `parquet_metrics` would never match `parquet_logs`, so the logic is correct in isolation — however, a plan string containing `parquet_logs_metrics` or similar would incorrectly match `parquet_logs`. The check should use exact word/table-name matching rather than substring containment to avoid false positives. plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java [49-59] private String extractTableName(RelNode plan) { - // Use RelOptUtil.toString to get the full plan tree including child nodes String planStr = RelOptUtil.toString(plan); + if (planStr.contains("parquet_metrics")) { + return "parquet_metrics"; + } if (planStr.contains("parquet_logs")) { return "parquet_logs"; - } - if (planStr.contains("parquet_metrics")) { - return "parquet_metrics"; } return null; } Suggestion importance[1-10]: 4 __ Why: The suggestion correctly identifies that checking `parquet_metrics` before `parquet_logs` avoids potential false matches (e.g., a table named `parquet_logs_metrics`). However, since this is a stub implementation meant to be replaced, the impact is limited.	Low

Previous suggestions

Suggestions up to commit 094a096

Category	Suggestion	Impact
Security	Fix incomplete JSON escaping in error responses The error message is manually escaped only for double quotes, but other special characters (newlines, backslashes, control characters) in the exception message can break the JSON structure or cause injection issues. Use a proper JSON serialization library (e.g., Jackson's `ObjectMapper` or `JsonStringEncoder`) to safely encode the error message string. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [263-273] private static void reportError(RestChannel channel, Exception e) { + String rawMessage = e.getMessage() != null ? e.getMessage() : "Unknown error"; + String safeMessage = rawMessage + .replace("\\", "\\\\") + .replace("\"", "\\\"") + .replace("\n", "\\n") + .replace("\r", "\\r") + .replace("\t", "\\t"); channel.sendResponse( new BytesRestResponse( org.opensearch.core.rest.RestStatus.INTERNAL_SERVER_ERROR, "application/json; charset=UTF-8", "{\"error\":{\"type\":\"" + e.getClass().getSimpleName() + "\",\"reason\":\"" - + (e.getMessage() != null ? e.getMessage().replace("\"", "\\\"") : "Unknown error") + + safeMessage + "\"},\"status\":500}")); } Suggestion importance[1-10]: 6 __ Why: The manual JSON escaping only handles double quotes but misses other special characters like newlines, backslashes, and control characters that could break the JSON structure. This is a valid security/correctness concern, though using a proper JSON library would be the ideal fix rather than the improved manual escaping shown.	Low
Possible issue	Handle thread pool scheduling failures gracefully The `schedule` call returns a `Cancellable` that is silently discarded. If the thread pool is shut down or the task is rejected, the error will be swallowed and the REST channel will never receive a response, leaving the client hanging. The scheduled task's failure should be handled, or the channel should be closed on rejection. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [112-119] public void execute(String query, QueryType queryType, RestChannel channel, boolean isExplain) { - client - .threadPool() - .schedule( - () -> doExecute(query, queryType, channel, isExplain), - new TimeValue(0), - SQL_WORKER_THREAD_POOL_NAME); + try { + client + .threadPool() + .schedule( + () -> doExecute(query, queryType, channel, isExplain), + new TimeValue(0), + SQL_WORKER_THREAD_POOL_NAME); + } catch (Exception e) { + recordFailureMetric(e); + reportError(channel, e); + } } Suggestion importance[1-10]: 5 __ Why: The discarded `Cancellable` return value from `schedule` could leave the REST channel hanging if the thread pool rejects the task. The suggestion to wrap in a try-catch is a reasonable improvement, though the actual risk depends on the thread pool's rejection policy.	Low
Possible issue	Fix substring matching ambiguity in table name extraction The check for `"parquet_logs"` will also match `"parquet_logs_archive"` or any other table name that contains `"parquet_logs"` as a substring, and since `parquet_logs` is checked before `parquet_metrics`, a table named `parquet_logs_metrics` would incorrectly match `parquet_logs`. Use more precise matching (e.g., exact word boundary or checking for the full table name with surrounding non-alphanumeric characters). plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java [49-59] private String extractTableName(RelNode plan) { - // Use RelOptUtil.toString to get the full plan tree including child nodes String planStr = RelOptUtil.toString(plan); - if (planStr.contains("parquet_logs")) { + if (planStr.matches("(?s).\\bparquet_metrics\\b.")) { + return "parquet_metrics"; + } + if (planStr.matches("(?s).\\bparquet_logs\\b.")) { return "parquet_logs"; - } - if (planStr.contains("parquet_metrics")) { - return "parquet_metrics"; } return null; } Suggestion importance[1-10]: 4 __ Why: The `contains` check for `"parquet_logs"` would incorrectly match table names like `"parquet_logs_archive"`, and the ordering matters since `parquet_logs` is checked before `parquet_metrics`. However, this is a stub implementation meant to be replaced, reducing the practical impact.	Low
General	Ensure streaming result resources are properly closed The `rows` iterable from `planExecutor.execute` may represent a lazy/streaming result set (e.g., backed by a cursor or network stream). Calling `plan.getRowType().getFieldList()` after `execute` is safe, but if the iterable holds resources (like a JDBC ResultSet or file handle), there is no guarantee it will be closed after iteration in `convertRows`. Consider wrapping the iterable in a try-with-resources if it implements `Closeable`, or documenting this constraint in `QueryPlanExecutor`. core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [72-75] Iterable<Object[]> rows = planExecutor.execute(plan, null); List<RelDataTypeField> fields = plan.getRowType().getFieldList(); -List<ExprValue> results = convertRows(rows, fields, querySizeLimit); +List<ExprValue> results; +if (rows instanceof AutoCloseable) { + try (AutoCloseable closeable = (AutoCloseable) rows) { + results = convertRows(rows, fields, querySizeLimit); + } +} else { + results = convertRows(rows, fields, querySizeLimit); +} Suggestion importance[1-10]: 4 __ Why: If `planExecutor.execute` returns a resource-backed iterable (e.g., a cursor), it may not be closed after iteration, causing resource leaks. The suggested `AutoCloseable` check is a reasonable defensive approach, though the `QueryPlanExecutor` interface could also be updated to return a `Closeable` iterable.	Low

Suggestions up to commit 253da49

Category	Suggestion	Impact
Possible issue	Fix incomplete JSON escaping in error responses The manual JSON construction in `reportError` is fragile and incomplete — it only escapes double quotes but not other special characters (backslash, newlines, control characters) that can appear in exception messages, potentially producing malformed JSON. Use a proper JSON serialization approach or at minimum apply full JSON string escaping to the message. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [263-273] private static void reportError(RestChannel channel, Exception e) { + String message = e.getMessage() != null ? e.getMessage() : "Unknown error"; + // Escape all JSON special characters + String escapedMessage = message + .replace("\\", "\\\\") + .replace("\"", "\\\"") + .replace("\n", "\\n") + .replace("\r", "\\r") + .replace("\t", "\\t"); channel.sendResponse( new BytesRestResponse( org.opensearch.core.rest.RestStatus.INTERNAL_SERVER_ERROR, "application/json; charset=UTF-8", "{\"error\":{\"type\":\"" + e.getClass().getSimpleName() + "\",\"reason\":\"" - + (e.getMessage() != null ? e.getMessage().replace("\"", "\\\"") : "Unknown error") + + escapedMessage + "\"},\"status\":500}")); } Suggestion importance[1-10]: 6 __ Why: The manual JSON construction only escapes double quotes but not backslashes, newlines, or other control characters that could appear in exception messages, producing malformed JSON. This is a valid correctness concern, though the code is explicitly temporary/stub infrastructure.	Low
General	Pass execution context instead of null to plan executor Passing `null` as the execution context to `planExecutor.execute` is hardcoded and will prevent proper context propagation when the real analytics engine is integrated. The `CalcitePlanContext` is already available in scope and should be passed (or a derived context object) to allow the executor to access query settings, credentials, or other runtime information. core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [72] -Iterable<Object[]> rows = planExecutor.execute(plan, null); +Iterable<Object[]> rows = planExecutor.execute(plan, context); Suggestion importance[1-10]: 5 __ Why: Passing `null` as the execution context is intentional for the current stub/development phase, but passing the available `context` object would be better for future integration with the real analytics engine. The `QueryPlanExecutor` interface accepts `Object context`, so passing `context` directly would work but may not be semantically correct since `CalcitePlanContext` is not the same as the analytics engine's execution context.	Low
	Fix substring collision in table name extraction Using `plan.toString()` to extract the table name is unreliable because the string representation of a `RelNode` is implementation-dependent and not guaranteed to contain the raw table name. Additionally, `parquet_metrics` check will never be reached since `parquet_logs` check comes first and `parquet_metrics` also contains `parquet_logs` as a substring — wait, actually it doesn't, but `parquet_logs` check would match any string containing that substring. The order is correct here, but the approach is still fragile. Consider checking `parquet_metrics` before `parquet_logs` to avoid potential future substring collision issues. plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java [48-58] private String extractTableName(RelNode plan) { // Walk the plan to find the table name from the string representation String planStr = plan.toString(); + // Check more specific names first to avoid substring collisions + if (planStr.contains("parquet_metrics")) { + return "parquet_metrics"; + } if (planStr.contains("parquet_logs")) { return "parquet_logs"; - } - if (planStr.contains("parquet_metrics")) { - return "parquet_metrics"; } return null; } Suggestion importance[1-10]: 3 __ Why: The suggestion correctly identifies that checking `parquet_metrics` before `parquet_logs` avoids potential future substring collision issues, but since `parquet_metrics` does not contain `parquet_logs` as a substring, there is no actual bug in the current code. This is a minor defensive improvement on stub/temporary code.	Low
	Prevent bare prefix from matching unified query path The `SOURCE_PATTERN` regex allows wildcard characters () in index names (e.g., `parquet_`), but `tableName.startsWith("parquet_")` would correctly match `parquet_*`. However, a query like `source = parquet_` (with nothing after the underscore) would also match, which may not be intended. Consider adding a minimum length check or requiring at least one character after the prefix. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [76-88] public static boolean isUnifiedQueryPath(String query) { if (query == null) { return false; } String indexName = extractIndexName(query); if (indexName == null) { return false; } // Handle qualified names like "catalog.parquet_logs" — check the last segment int lastDot = indexName.lastIndexOf('.'); String tableName = lastDot >= 0 ? indexName.substring(lastDot + 1) : indexName; - return tableName.startsWith("parquet_"); + // Require at least one character after "parquet_" prefix + return tableName.startsWith("parquet_") && tableName.length() > "parquet_".length(); } Suggestion importance[1-10]: 2 __ Why: The edge case of `source = parquet_` (bare prefix with nothing after the underscore) is an extremely unlikely real-world scenario, and the suggestion adds a minor defensive check with minimal practical impact on the routing logic.	Low

github-actions · 2026-03-23T23:26:51Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 094a096.

Path	Line	Severity	Description
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java	52	medium	StubQueryPlanExecutor is wired into production code (not test code), meaning all queries to 'parquet_*' indices bypass the real execution engine and return hardcoded data regardless of the actual query. While documented as temporary, placing a stub in src/main/java and instantiating it unconditionally in the production constructor is anomalous and could be used to return attacker-controlled or misleading results.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java	95	medium	Queries matching 'parquet_*' index names are silently rerouted to RestUnifiedQueryAction, bypassing the normal nodeClient.execute() path (which goes through the OpenSearch transport layer with its associated authentication and authorization checks). This could allow a user who names an index with the 'parquet_' prefix to bypass security controls enforced in the standard PPL execution pipeline.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java	49	low	The ClusterService parameter is accepted in the constructor but never stored or used. This dead parameter may indicate incomplete implementation or an intentional omission of cluster-level security checks that would normally use ClusterService.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java	261	low	Exception messages are directly interpolated into the JSON error response body without sanitization (only quotes are escaped). This can leak internal implementation details, stack information, or sensitive data (e.g., file paths, query internals) to clients.

The table above displays the top 10 most important findings.

Total: 4 | Critical: 0 | High: 0 | Medium: 2 | Low: 2

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-23T23:27:47Z

Persistent review updated to latest commit 094a096

…ices (opensearch-project#5247) Implements the execution engine adapter and query routing infrastructure for Project Mustang's unified query pipeline (Option B). PPL queries targeting parquet_ prefixed indices are routed through UnifiedQueryPlanner and AnalyticsExecutionEngine instead of the existing Lucene path. Key changes: - QueryPlanExecutor interface: contract for analytics engine execution - AnalyticsExecutionEngine: converts QueryPlanExecutor results to QueryResponse - RestUnifiedQueryAction: orchestrates schema building, planning, execution - RestPPLQueryAction: routing branch for parquet_ indices - StubQueryPlanExecutor: canned data for development/testing Signed-off-by: Kai Huang <kaihuang@amazon.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>

ahkcs · 2026-03-23T23:29:43Z

Retargeting to feature branch feature/mustang-ppl-integration

github-actions · 2026-03-23T23:30:17Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 3b1ecb8.

Path	Line	Severity	Description
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java	262	low	Error response is constructed via string concatenation using e.getMessage() with only quote-escaping. Other JSON metacharacters are not escaped, allowing a crafted exception message to produce malformed or injected JSON in the error response body. This appears to be a coding shortcut rather than malicious intent, but warrants a fix using a proper JSON serializer.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 0 | Low: 1

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-23T23:31:16Z

Persistent review updated to latest commit 3b1ecb8

ahkcs added PPL Piped processing language enhancement New feature or request labels Mar 23, 2026

ahkcs force-pushed the feature/mustang-analytics-execution-engine branch from 829ca34 to 253da49 Compare March 23, 2026 22:06

ahkcs force-pushed the feature/mustang-analytics-execution-engine branch from 253da49 to 094a096 Compare March 23, 2026 23:25

ahkcs force-pushed the feature/mustang-analytics-execution-engine branch from 094a096 to 3b1ecb8 Compare March 23, 2026 23:29

ahkcs closed this Mar 23, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add AnalyticsExecutionEngine and query routing for Parquet-backed indices#5256

Add AnalyticsExecutionEngine and query routing for Parquet-backed indices#5256
ahkcs wants to merge 1 commit into
opensearch-project:mainfrom
ahkcs:feature/mustang-analytics-execution-engine

ahkcs commented Mar 23, 2026

Uh oh!

github-actions Bot commented Mar 23, 2026

Uh oh!

github-actions Bot commented Mar 23, 2026

Uh oh!

github-actions Bot commented Mar 23, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Mar 23, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Mar 23, 2026

Uh oh!

github-actions Bot commented Mar 23, 2026

Uh oh!

ahkcs commented Mar 23, 2026

Uh oh!

github-actions Bot commented Mar 23, 2026

Uh oh!

github-actions Bot commented Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

ahkcs commented Mar 23, 2026

Summary

Test plan

Uh oh!

github-actions Bot commented Mar 23, 2026

PR Code Analyzer ❗

Uh oh!

github-actions Bot commented Mar 23, 2026

PR Code Analyzer ❗

Uh oh!

github-actions Bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit 3b1ecb8)

Uh oh!

github-actions Bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions Bot commented Mar 23, 2026

PR Code Analyzer ❗

Uh oh!

github-actions Bot commented Mar 23, 2026

Uh oh!

ahkcs commented Mar 23, 2026

Uh oh!

github-actions Bot commented Mar 23, 2026

PR Code Analyzer ❗

Uh oh!

github-actions Bot commented Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented Mar 23, 2026 •

edited

Loading

(Review updated until commit `3b1ecb8`)

github-actions Bot commented Mar 23, 2026 •

edited

Loading