Skip to content

Support SQL query profiling with the analytics engine#5475

Merged
dai-chen merged 2 commits into
opensearch-project:mainfrom
dai-chen:sql-profile-support
May 28, 2026
Merged

Support SQL query profiling with the analytics engine#5475
dai-chen merged 2 commits into
opensearch-project:mainfrom
dai-chen:sql-profile-support

Conversation

@dai-chen

@dai-chen dai-chen commented May 27, 2026

Copy link
Copy Markdown
Collaborator

Description

Adds a profile boolean field to the /_plugins/_sql request body to capture per-stage query timings, mirroring PPL's existing profile pattern. Also propagates the profile context across the Analytics Engine's async listener thread so profile output is visible in the response.

  • Enabled when "profile": true is set on a SQL query that routes to the analytics engine.
  • Ignored for V2 SQL: queries handled by the V2 SQL engine silently drop the flag.

Related Issues

Part of #5246.

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Add a `profile` boolean field to SQLQueryRequest and thread it through
the SQL REST handler to the analytics router. Mirrors PPL's existing
profile pattern: parsed from the JSON body's `profile` key, gated by
the same rules (only honored for non-explain JDBC requests on the
analytics engine path).

Replaces the hardcoded `false` in SQLPlugin#createSqlAnalyticsRouter so
analytics-engine queries now honor the request's profile flag. The V2
SQL engine path is unchanged; profile is silently ignored when the
request does not route to the analytics engine.

Refs: opensearch-project#5317
Signed-off-by: Chen Dai <daichen@amazon.com>
@dai-chen dai-chen self-assigned this May 27, 2026
@dai-chen dai-chen added enhancement New feature or request SQL labels May 27, 2026
@github-actions

github-actions Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

PR Reviewer Guide 🔍

(Review updated until commit b968d7f)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Resource Leak

The UnifiedQueryContext is no longer closed in a try-with-resources block. If analyticsEngine.execute() throws an exception before the listener is invoked, or if the listener is never called due to an internal error, context.close() will not run. This can leak resources such as open connections or allocated memory.

UnifiedQueryContext context = buildContext(queryType, profiling);
ActionListener<TransportPPLQueryResponse> closingListener =
    wrapWithContextClose(context, listener);
try {
  UnifiedQueryPlanner planner = new UnifiedQueryPlanner(context);
  RelNode plan = planner.plan(query);
  CalcitePlanContext planContext = context.getPlanContext();
  plan = addQuerySizeLimit(plan, planContext);
  analyticsEngine.execute(
      plan, planContext, createQueryListener(queryType, closingListener));
} catch (Exception e) {
  closingListener.onFailure(e);
Possible Issue

If buildContext() throws an exception, closingListener is never initialized, causing a compilation error or runtime failure when closingListener.onFailure(e) is called. The catch block at line 131 references closingListener before it is assigned.

UnifiedQueryContext context = buildContext(queryType, profiling);
ActionListener<TransportPPLQueryResponse> closingListener =
    wrapWithContextClose(context, listener);
try {
  UnifiedQueryPlanner planner = new UnifiedQueryPlanner(context);
  RelNode plan = planner.plan(query);
  CalcitePlanContext planContext = context.getPlanContext();
  plan = addQuerySizeLimit(plan, planContext);
  analyticsEngine.execute(
      plan, planContext, createQueryListener(queryType, closingListener));
} catch (Exception e) {
  closingListener.onFailure(e);

@github-actions

Copy link
Copy Markdown
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Add null check for format field

The profile check may fail when format is null, causing a NullPointerException. Add
a null check for the format field before calling equalsIgnoreCase() to prevent
runtime errors.

sql/src/main/java/org/opensearch/sql/sql/domain/SQLQueryRequest.java [123-128]

 public boolean isProfileEnabled() {
   return jsonContent != null
       && jsonContent.optBoolean(QUERY_FIELD_PROFILE, false)
       && !isExplainRequest()
+      && format != null
       && Format.JDBC.getFormatName().equalsIgnoreCase(format);
 }
Suggestion importance[1-10]: 7

__

Why: Valid concern about potential NullPointerException when format is null. The suggestion correctly identifies that equalsIgnoreCase() would throw NPE if called on a null format. Adding the null check improves robustness, though the actual impact depends on whether format can be null in practice based on the codebase context.

Medium

@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 9cb31bb

QueryPlanExecutor (analytics-framework 3.7+) fires its listener on a
worker thread separate from the one that activated QueryProfiling.
Without explicit handoff, the listener-side formatter reads an empty
profile context, and the synchronous UnifiedQueryContext.close() via
try-with-resources fires before async work finishes — closing the
Calcite connection while it is still in use and clearing profile state
before the listener can read it.

- AnalyticsExecutionEngine.execute: capture the active ProfileContext
  and restore it via QueryProfiling.withCurrentContext inside the
  response callback so metric reads/writes see the same context.
- RestUnifiedQueryAction.execute: defer UnifiedQueryContext.close to
  listener completion via ActionListener.runAfter, wrapped in a
  named wrapWithContextClose helper.
- Add regression test simulating async listener delivery.

Refs: opensearch-project#5317
Signed-off-by: Chen Dai <daichen@amazon.com>
@dai-chen dai-chen force-pushed the sql-profile-support branch from 9cb31bb to b968d7f Compare May 28, 2026 20:15
@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit b968d7f

``_explain``) and only with the default ``format=jdbc``.

Example
-------

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this example an integration test? I am wondering if we can have an integration test for profiling(assert each metric to be greater than 0)?

@dai-chen dai-chen May 28, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this manually against OS cluster with SQL + AE. I'm also thinking can we add some dedicated IT for AE. Let me check after this.

@dai-chen dai-chen merged commit 81edddd into opensearch-project:main May 28, 2026
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request SQL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants