Skip to content

Resolve SQL unified query gaps for SELECT clauses and window functions#5450

Merged
dai-chen merged 2 commits into
opensearch-project:mainfrom
dai-chen:fix/calcite-visitor-select-path
May 19, 2026
Merged

Resolve SQL unified query gaps for SELECT clauses and window functions#5450
dai-chen merged 2 commits into
opensearch-project:mainfrom
dai-chen:fix/calcite-visitor-select-path

Conversation

@dai-chen

@dai-chen dai-chen commented May 18, 2026

Copy link
Copy Markdown
Collaborator

Description

Resolves Category B remaining gaps from the SQL V2 Calcite gap analysis:

  • SELECT Statements

    • LIMIT/OFFSET — queries with LIMIT n OFFSET m now apply correctly (B8/B19)
    • GROUP BY — aggregation no longer throws NPE when BUCKET_NULLABLE argument is absent (B1/B16)
    • Aggregate with aliasSELECT dept, COUNT(*) AS cnt ... GROUP BY dept correctly references computed aggregate fields in projection
    • Derived tablesSELECT ... FROM (SELECT ...) AS t subqueries in FROM clause
    • SELECT without FROMSELECT 1 (dual-table / VALUES) no longer throws exception
  • Functions

    • Window functions with ORDER BYSUM(x) OVER (PARTITION BY y ORDER BY z) now translates order keys
    • ROW_NUMBER — ranking window function is recognized and planned (B9, partial)
    • COUNT(DISTINCT x) OVER (...) — aggregate-based window functions with DISTINCT (B2)
    • ISNULL(field) — function is now recognized as alias for IS_NULL (B14)

Note: Here are the remaining gaps after the recent series of PRs that closed most of the original SQL V2 Calcite gap analysis tracked in #5248 (comment).

Related Issues

Part of #5248

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dai-chen dai-chen self-assigned this May 18, 2026
@github-actions

github-actions Bot commented May 18, 2026

Copy link
Copy Markdown
Contributor

PR Reviewer Guide 🔍

(Review updated until commit 3211f0b)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Possible Issue

The alias handling logic checks if alias.getName() is in currentFields to decide whether to reference an already-computed aggregate field. However, if a non-aggregate field happens to share the same name as the alias, this condition may incorrectly trigger, causing the code to reference the wrong field instead of analyzing the alias expression. This can occur when a column name coincidentally matches the alias name used for an aggregate.

if (alias.getDelegated() instanceof AggregateFunction
    && alias.getName() != null
    && currentFields.contains(alias.getName())) {
  String displayName =
      Strings.isNullOrEmpty(alias.getAlias()) ? alias.getName() : alias.getAlias();
  expandedFields.add(
      context.relBuilder.alias(context.relBuilder.field(alias.getName()), displayName));
} else {
  expandedFields.add(rexVisitor.analyze(alias, context));
}
Possible Issue

The code defaults bucketNullable to true when the argument is absent, but then uses !bucketNullable to set bits in nonNullGroupMask. If the original PPL code relied on a missing argument meaning "nullable buckets allowed" (true), then !bucketNullable would be false and no bits are set, which is correct. However, if the original code expected a missing argument to mean "non-nullable" (false), the new default of true inverts the behavior. The comment does not clarify the original intent, so this change may silently alter aggregation semantics for SQL queries.

// SQL aggregations don't carry the PPL-only BUCKET_NULLABLE argument; default to true.
boolean bucketNullable =
    (Boolean) statsArgs.getOrDefault(Argument.BUCKET_NULLABLE, Literal.TRUE).getValue();
int nGroup = node.getGroupExprList().size() + (Objects.nonNull(node.getSpan()) ? 1 : 0);
BitSet nonNullGroupMask = new BitSet(nGroup);
if (!bucketNullable) {

@github-actions

github-actions Bot commented May 18, 2026

Copy link
Copy Markdown
Contributor

PR Code Suggestions ✨

Latest suggestions up to 3211f0b

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent NoSuchElementException on empty list

Calling nodes.getFirst() when nodes is not null but empty will throw
NoSuchElementException. Add a check to ensure nodes is not empty before accessing
its first element to prevent runtime exceptions.

core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java [616-638]

 List<RexNode> nodes =
     PPLFuncImpTable.INSTANCE.validateAggFunctionSignature(
         functionName, field, args, context.rexBuilder);
-return nodes != null
+return nodes != null && !nodes.isEmpty()
     ? PlanUtils.makeOver(
         context,
         functionName,
         isDistinct,
         nodes.getFirst(),
         nodes.size() <= 1 ? Collections.emptyList() : nodes.subList(1, nodes.size()),
         partitions,
         orderKeys,
         node.getWindowFrame())
     : PlanUtils.makeOver(
         context,
         functionName,
         isDistinct,
         field,
         args,
         partitions,
         orderKeys,
         node.getWindowFrame());
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies that calling nodes.getFirst() on an empty list will throw NoSuchElementException. Adding a check for !nodes.isEmpty() before accessing the first element prevents a potential runtime exception and improves code safety.

Medium
General
Prevent ClassCastException on type mismatch

The cast to Boolean assumes getValue() returns a Boolean, but if the default
Literal.TRUE.getValue() returns a different type or null, this will throw a
ClassCastException. Verify the return type or add explicit type checking before
casting.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [1648-1649]

-boolean bucketNullable =
-    (Boolean) statsArgs.getOrDefault(Argument.BUCKET_NULLABLE, Literal.TRUE).getValue();
+Object bucketNullableValue = statsArgs.getOrDefault(Argument.BUCKET_NULLABLE, Literal.TRUE).getValue();
+boolean bucketNullable = bucketNullableValue instanceof Boolean ? (Boolean) bucketNullableValue : true;
Suggestion importance[1-10]: 7

__

Why: The suggestion addresses a potential ClassCastException if getValue() returns a non-Boolean type. The improved code adds defensive type checking before casting, which is a good practice for robustness, especially when dealing with default values from argument maps.

Medium
Add null-safety for field lookup

The condition checks if alias.getName() is in currentFields, but if alias.getName()
returns null, currentFields.contains(null) may cause unexpected behavior. Add a
null-safety check before calling contains() to prevent potential issues when
alias.getName() is null.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [546-559]

 case Alias alias -> {
   // SQL aggregate aliases (e.g., COUNT(*) AS cnt): reference the already-computed field
   // and rebind under the user's alias, since re-analyzing the alias returns null.
+  String aliasName = alias.getName();
   if (alias.getDelegated() instanceof AggregateFunction
-      && alias.getName() != null
-      && currentFields.contains(alias.getName())) {
+      && aliasName != null
+      && currentFields.contains(aliasName)) {
     String displayName =
-        Strings.isNullOrEmpty(alias.getAlias()) ? alias.getName() : alias.getAlias();
+        Strings.isNullOrEmpty(alias.getAlias()) ? aliasName : alias.getAlias();
     expandedFields.add(
-        context.relBuilder.alias(context.relBuilder.field(alias.getName()), displayName));
+        context.relBuilder.alias(context.relBuilder.field(aliasName), displayName));
   } else {
     expandedFields.add(rexVisitor.analyze(alias, context));
   }
 }
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that alias.getName() could be null and that calling currentFields.contains(null) might cause issues. However, the code already checks alias.getName() != null before calling contains(), so the issue is partially mitigated. The refactoring to store aliasName in a variable improves readability and ensures the null check is respected throughout the block.

Low

Previous suggestions

Suggestions up to commit 85838eb
CategorySuggestion                                                                                                                                    Impact
General
Safe cast for Boolean value

The cast to Boolean may fail if getValue() returns a non-Boolean type. Validate the
type returned by getValue() before casting, or handle potential ClassCastException
to prevent runtime failures when processing SQL aggregations without the
BUCKET_NULLABLE argument.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [1649-1650]

-boolean bucketNullable =
-    (Boolean) statsArgs.getOrDefault(Argument.BUCKET_NULLABLE, Literal.TRUE).getValue();
+Object value = statsArgs.getOrDefault(Argument.BUCKET_NULLABLE, Literal.TRUE).getValue();
+boolean bucketNullable = value instanceof Boolean ? (Boolean) value : true;
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a potential ClassCastException risk when casting the result of getValue() to Boolean. Adding type validation before casting improves robustness, though the impact is moderate since Literal.TRUE.getValue() should return a Boolean.

Medium
Null check for argument list

Check if aggFunc.getArgList() returns null before calling addAll(). If getArgList()
returns null, this will throw a NullPointerException. Add a null check or use a safe
default like Collections.emptyList() to prevent runtime failures.

core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java [575-583]

 if (node.getFunction() instanceof AggregateFunction aggFunc) {
   funcName = aggFunc.getFuncName();
   isDistinct = Boolean.TRUE.equals(aggFunc.getDistinct());
   List<UnresolvedExpression> argExprs = new ArrayList<>();
   if (aggFunc.getField() != null) {
     argExprs.add(aggFunc.getField());
   }
-  argExprs.addAll(aggFunc.getArgList());
+  List<UnresolvedExpression> argList = aggFunc.getArgList();
+  if (argList != null) {
+    argExprs.addAll(argList);
+  }
   arguments = argExprs.stream().map(arg -> analyze(arg, context)).toList();
 }
Suggestion importance[1-10]: 6

__

Why: The suggestion identifies a potential NullPointerException if aggFunc.getArgList() returns null. While this is a valid defensive programming practice, the actual risk depends on whether getArgList() can return null in practice, which isn't clear from the context.

Low
Suggestions up to commit 27041da
CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent null pointer on argument list

The code calls aggFunc.getArgList() without null-checking it first. If getArgList()
returns null, addAll() will throw a NullPointerException. Add a null check before
calling addAll() to prevent potential runtime failures.

core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java [568-584]

 List<RexNode> arguments;
 final boolean isDistinct;
 if (node.getFunction() instanceof AggregateFunction aggFunc) {
   funcName = aggFunc.getFuncName();
   isDistinct = Boolean.TRUE.equals(aggFunc.getDistinct());
   List<UnresolvedExpression> argExprs = new java.util.ArrayList<>();
   if (aggFunc.getField() != null) {
     argExprs.add(aggFunc.getField());
   }
-  argExprs.addAll(aggFunc.getArgList());
+  if (aggFunc.getArgList() != null) {
+    argExprs.addAll(aggFunc.getArgList());
+  }
   arguments = argExprs.stream().map(arg -> analyze(arg, context)).toList();
 }
Suggestion importance[1-10]: 7

__

Why: This is a valid defensive programming suggestion that prevents a potential NullPointerException if aggFunc.getArgList() returns null. While the likelihood depends on the implementation of getArgList(), adding the null check improves code robustness.

Medium
General
Handle null alias explicitly

The condition checks if alias.getName() is not null but doesn't verify if
alias.getAlias() is null before using it in the ternary expression. If
alias.getAlias() returns null, Strings.isNullOrEmpty() will handle it, but the
fallback to alias.getName() might not be the intended behavior. Consider explicitly
handling the null case for alias.getAlias() to ensure correct display name
assignment.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [547-559]

 case Alias alias -> {
   // For aggregate aliases, reference the already-computed field by its function name
   // (e.g., "COUNT(*)") and rebind under the user's alias. Re-analyzing would return null.
   if (alias.getDelegated() instanceof AggregateFunction
       && alias.getName() != null
       && currentFields.contains(alias.getName())) {
     String displayName =
-        Strings.isNullOrEmpty(alias.getAlias()) ? alias.getName() : alias.getAlias();
+        (alias.getAlias() != null && !alias.getAlias().isEmpty()) 
+            ? alias.getAlias() 
+            : alias.getName();
     expandedFields.add(
         context.relBuilder.alias(context.relBuilder.field(alias.getName()), displayName));
   } else {
     expandedFields.add(rexVisitor.analyze(alias, context));
   }
 }
Suggestion importance[1-10]: 3

__

Why: The suggestion addresses a minor code style improvement. The existing code using Strings.isNullOrEmpty() already handles null and empty cases correctly, making the suggested explicit null check redundant. The improvement is marginal and doesn't fix a bug.

Low
Suggestions up to commit 365855e
CategorySuggestion                                                                                                                                    Impact
General
Filter null results from analysis

The analyze method may return null for certain expressions, which would result in
null entries in the arguments list. This could cause NullPointerException in
downstream processing. Filter out or handle null results explicitly to prevent
runtime errors.

core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java [570-579]

 if (node.getFunction() instanceof AggregateFunction aggFunc) {
   funcName = aggFunc.getFuncName();
   isDistinct = Boolean.TRUE.equals(aggFunc.getDistinct());
   List<UnresolvedExpression> argExprs = new java.util.ArrayList<>();
   if (aggFunc.getField() != null) {
     argExprs.add(aggFunc.getField());
   }
   argExprs.addAll(aggFunc.getArgList());
-  arguments = argExprs.stream().map(arg -> analyze(arg, context)).toList();
+  arguments = argExprs.stream()
+      .map(arg -> analyze(arg, context))
+      .filter(Objects::nonNull)
+      .toList();
 }
Suggestion importance[1-10]: 6

__

Why: The suggestion addresses a potential NullPointerException by filtering out null results from the analyze method. This is a valid concern as null entries could cause issues in downstream processing. However, the impact depends on whether the analyze method actually returns null in practice, which is not evident from the PR diff alone.

Low
Validate aggregate alias field reference

The logic assumes alias.getName() returns the function name for aggregate aliases,
but this may not always be correct. Verify that alias.getName() consistently returns
the expected aggregate function name (e.g., "COUNT(*)") before using it to reference
fields. Consider adding validation or fallback logic to handle cases where the name
doesn't match.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [546-559]

 case Alias alias -> {
   // For aggregate aliases, reference the already-computed field by its function name
   // (e.g., "COUNT(*)") and rebind under the user's alias. Re-analyzing would return null.
   if (alias.getDelegated() instanceof AggregateFunction
       && alias.getName() != null
       && currentFields.contains(alias.getName())) {
     String displayName =
         Strings.isNullOrEmpty(alias.getAlias()) ? alias.getName() : alias.getAlias();
     expandedFields.add(
         context.relBuilder.alias(context.relBuilder.field(alias.getName()), displayName));
   } else {
-    expandedFields.add(rexVisitor.analyze(alias, context));
+    RexNode analyzed = rexVisitor.analyze(alias, context);
+    if (analyzed != null) {
+      expandedFields.add(analyzed);
+    }
   }
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion adds a null check for the analyzed result, which is a reasonable defensive programming practice. However, the original code already handles the case where alias.getName() is null or not in currentFields, making this additional check less critical. The improvement is marginal.

Low
Prevent potential index access error

The condition rows.get(0).isEmpty() may throw IndexOutOfBoundsException if rows is
empty. Although the rows.isEmpty() check precedes it, the logic is fragile. Consider
restructuring to avoid potential index access issues or add explicit size
validation.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [4158-4164]

 List<List<Literal>> rows = values.getValues();
-if (rows == null || rows.isEmpty() || (rows.size() == 1 && rows.get(0).isEmpty())) {
+if (rows == null || rows.isEmpty() || (rows.size() == 1 && rows.get(0) != null && rows.get(0).isEmpty())) {
   // Single empty row (dual table) = SELECT without FROM. Provide a 0-column relation
   // so subsequent Project can evaluate constant expressions.
   context.relBuilder.values(context.relBuilder.getTypeFactory().builder().build());
   return context.relBuilder.peek();
 }
Suggestion importance[1-10]: 3

__

Why: The suggestion adds a null check for rows.get(0), but the existing condition rows.isEmpty() already prevents accessing an empty list. The additional check rows.get(0) != null is redundant since rows.get(0) would only be accessed when rows.size() == 1, which means the list is not empty. The improvement is minimal.

Low
Suggestions up to commit 563a054
CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent NPE when aggregate argument list is null

If aggFunc.getArgList() returns null, calling addAll() will throw a
NullPointerException. Ensure getArgList() returns a non-null collection before
adding to argExprs, or use a null-safe collection operation.

core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java [570-584]

 if (node.getFunction() instanceof AggregateFunction aggFunc) {
   funcName = aggFunc.getFuncName();
   isDistinct = Boolean.TRUE.equals(aggFunc.getDistinct());
   List<UnresolvedExpression> argExprs = new java.util.ArrayList<>();
   if (aggFunc.getField() != null) {
     argExprs.add(aggFunc.getField());
   }
-  argExprs.addAll(aggFunc.getArgList());
+  if (aggFunc.getArgList() != null) {
+    argExprs.addAll(aggFunc.getArgList());
+  }
   arguments = argExprs.stream().map(arg -> analyze(arg, context)).toList();
 }
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a potential NullPointerException if aggFunc.getArgList() returns null. Adding a null check before addAll() would prevent runtime errors and make the code more defensive, which is important for handling aggregate functions.

Medium
General
Verify null-safety for aggregate field name

The logic assumes alias.getName() returns a non-null value when checking
currentFields.contains(aggFieldName). If alias.getName() returns null, the
subsequent null-check is redundant and may cause unexpected behavior. Verify that
aggFieldName is non-null before using it in currentFields.contains().

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [547-563]

+case Alias alias -> {
+  String aliasName =
+      Strings.isNullOrEmpty(alias.getAlias()) ? alias.getName() : alias.getAlias();
+  // When an aggregate was already computed (its output name matches a current field),
+  // reference the existing field instead of re-analyzing (which would return null).
+  if (alias.getDelegated() instanceof AggregateFunction) {
+    String aggFieldName = alias.getName();
+    if (currentFields.contains(aliasName)) {
+      expandedFields.add(context.relBuilder.field(aliasName));
+    } else if (aggFieldName != null && currentFields.contains(aggFieldName)) {
+      expandedFields.add(
+          context.relBuilder.alias(context.relBuilder.field(aggFieldName), aliasName));
+    } else {
+      expandedFields.add(rexVisitor.analyze(alias, context));
+    }
+  } else {
+    expandedFields.add(rexVisitor.analyze(alias, context));
+  }
+}
 
-
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that aggFieldName is already checked for null before use in currentFields.contains(aggFieldName), making the logic safe. However, the suggestion asks to verify this null-safety, which is a valid concern for code review. The existing code already handles this correctly with the null check on line 555.

Medium
Add type safety for bucket nullable cast

The cast to Boolean assumes getValue() returns a Boolean type. If
Literal.TRUE.getValue() returns a different type or null, this will throw a
ClassCastException. Add explicit type validation or use a safer default value
retrieval pattern.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [1652-1653]

-boolean bucketNullable =
-    (Boolean) statsArgs.getOrDefault(Argument.BUCKET_NULLABLE, Literal.TRUE).getValue();
+Object bucketNullableValue = statsArgs.getOrDefault(Argument.BUCKET_NULLABLE, Literal.TRUE).getValue();
+boolean bucketNullable = bucketNullableValue instanceof Boolean ? (Boolean) bucketNullableValue : true;
Suggestion importance[1-10]: 7

__

Why: The suggestion identifies a potential ClassCastException risk when casting getValue() to Boolean. While the code uses getOrDefault with Literal.TRUE as default, adding explicit type checking would make the code more robust against unexpected types.

Medium
Suggestions up to commit b1c5970
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix alias application order

When aliasName equals aggFieldName and both exist in currentFields, the first
condition will match and reference the field directly without applying the alias.
This could cause the alias to be lost in the output, leading to incorrect column
names in query results.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [551-560]

 if (alias.getDelegated() instanceof AggregateFunction) {
   String aggFieldName = alias.getName();
-  if (currentFields.contains(aliasName)) {
-    expandedFields.add(context.relBuilder.field(aliasName));
-  } else if (aggFieldName != null && currentFields.contains(aggFieldName)) {
+  if (aggFieldName != null && currentFields.contains(aggFieldName)) {
     expandedFields.add(
         context.relBuilder.alias(context.relBuilder.field(aggFieldName), aliasName));
+  } else if (currentFields.contains(aliasName)) {
+    expandedFields.add(context.relBuilder.field(aliasName));
   } else {
     expandedFields.add(rexVisitor.analyze(alias, context));
   }
 }
Suggestion importance[1-10]: 8

__

Why: The suggestion identifies a critical logic flaw where checking currentFields.contains(aliasName) before aggFieldName could cause the alias to be lost when aliasName equals aggFieldName. Reordering the conditions ensures the alias is properly applied to the aggregate field.

Medium
Add duplicate field name check

The addedFields.add() check is missing for the Alias case, which could allow
duplicate field names in the expanded fields list. This inconsistency with the
AllFields case may lead to incorrect query results or downstream processing errors.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [546-564]

 case Alias alias -> {
   String aliasName =
       Strings.isNullOrEmpty(alias.getAlias()) ? alias.getName() : alias.getAlias();
+  if (!addedFields.add(aliasName)) {
+    continue;
+  }
   // When an aggregate was already computed (its output name matches a current field),
   // reference the existing field instead of re-analyzing (which would return null).
   if (alias.getDelegated() instanceof AggregateFunction) {
     String aggFieldName = alias.getName();
     if (currentFields.contains(aliasName)) {
       expandedFields.add(context.relBuilder.field(aliasName));
     } else if (aggFieldName != null && currentFields.contains(aggFieldName)) {
       expandedFields.add(
           context.relBuilder.alias(context.relBuilder.field(aggFieldName), aliasName));
     } else {
       expandedFields.add(rexVisitor.analyze(alias, context));
     }
   } else {
     expandedFields.add(rexVisitor.analyze(alias, context));
   }
 }
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that the Alias case lacks the duplicate field check present in the AllFields case. Adding addedFields.add() check would maintain consistency and prevent potential duplicate field names in the expanded fields list.

Medium

@dai-chen dai-chen force-pushed the fix/calcite-visitor-select-path branch from 7d2d33b to b1c5970 Compare May 18, 2026 18:42
@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit b1c5970

@dai-chen dai-chen force-pushed the fix/calcite-visitor-select-path branch from b1c5970 to 563a054 Compare May 18, 2026 20:06
@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 563a054

@dai-chen dai-chen added enhancement New feature or request SQL labels May 18, 2026
@dai-chen dai-chen force-pushed the fix/calcite-visitor-select-path branch from 563a054 to 365855e Compare May 18, 2026 21:03
@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 365855e

@dai-chen dai-chen force-pushed the fix/calcite-visitor-select-path branch from 365855e to 27041da Compare May 18, 2026 22:09
@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 27041da

@dai-chen dai-chen force-pushed the fix/calcite-visitor-select-path branch from 27041da to 85838eb Compare May 18, 2026 23:04
@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 85838eb

dai-chen added 2 commits May 18, 2026 16:18
…alues dual-table in visitor

Add visitLimit to support LIMIT/OFFSET clauses. Handle Alias nodes in
project list by referencing already-computed aggregate fields instead of
re-analyzing. Add visitRelationSubquery for derived tables in FROM
clause. Fix visitValues to treat a single empty row as a dual-table
(SELECT without FROM). Make bucketNullable lookup null-safe with
getOrDefault in visitAggregation.

Add integration tests covering LIMIT OFFSET, aggregate with alias,
GROUP BY without bucket nullable, SELECT with alias, derived table
subquery, and SELECT without FROM clause.

Signed-off-by: Chen Dai <daichen@amazon.com>
Add AggregateFunction handling in visitWindowFunction to support
aggregate-based window expressions with DISTINCT and ORDER BY keys.
Add translateOrderKeys utility for window ORDER BY translation.
Register row_number in WINDOW_FUNC_MAPPING and skip aggregate
signature validation for it (it has no field/args). Pass distinct
flag through makeOver call chain.

RANK and DENSE_RANK are deferred to a follow-up alongside the open
PPL eventstats/streamstats issue (opensearch-project#5168) which involves the same
function registration and a separate ORDER BY semantics question.

Register ISNULL as alias for IS_NULL in PPLFuncImpTable.

Add integration tests for window functions with ORDER BY,
ROW_NUMBER, COUNT DISTINCT OVER, and ISNULL.

Signed-off-by: Chen Dai <daichen@amazon.com>
@dai-chen dai-chen force-pushed the fix/calcite-visitor-select-path branch from 85838eb to 3211f0b Compare May 18, 2026 23:18
@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 3211f0b

@dai-chen dai-chen changed the title Fix CalciteRelNodeVisitor for Limit, Alias, Subquery, and Values Resolve SQL unified query gaps for SELECT clauses and window functions May 18, 2026
@dai-chen dai-chen marked this pull request as ready for review May 18, 2026 23:25
@dai-chen dai-chen merged commit 5851bdb into opensearch-project:main May 19, 2026
42 of 45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request SQL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants