Skip to content

Support PPL timewrap command #5241

Open
ahkcs wants to merge 5 commits into
opensearch-project:mainfrom
ahkcs:feature/ppl-timewrap-command
Open

Support PPL timewrap command #5241
ahkcs wants to merge 5 commits into
opensearch-project:mainfrom
ahkcs:feature/ppl-timewrap-command

Conversation

@ahkcs

@ahkcs ahkcs commented Mar 18, 2026

Copy link
Copy Markdown
Collaborator

Description

Adds the timewrap command to PPL, which reshapes timechart output by wrapping each time period into separate columns for side-by-side comparison (day-over-day, week-over-week, etc.). Behavior verified against Splunk.

Syntax: ... | timechart <agg> | timewrap <span> [align=end|now]

Key features

  • Column naming matches Splunk: <agg>_<N><unit>_before, <agg>_latest_<unit>, <agg>_<N><unit>_after
  • Column order: oldest first (leftmost)
  • Dynamic column count: only populated period columns appear (no trailing nulls)
  • align=end (default): uses WHERE upper bound as reference for deterministic column names
  • align=now: uses current time as reference
  • Supports all timescales: sec, min, hr, day, week, month, quarter, year

Implementation

  • Grammar: timewrapCommand rule reusing existing spanLiteral
  • AST: Timewrap node with unit, value, align
  • Calcite: PIVOT-based transformation with UNIX_TIMESTAMP/FROM_UNIXTIME for epoch conversion
  • Execution engine: post-processing to strip all-null columns, rename to absolute offsets using __base_offset__, and handle latest_<unit> naming
  • WHERE upper bound extraction: walks AST to find @timestamp <= filter for align=end reference

Known limitations

  • BY clause not yet supported
  • Month/quarter/year use approximate fixed-length conversions (30d, 91d, 365d)
  • Max 20 period columns
  • Timestamps show data dates, not reference date (Splunk rewrites to reference date)

Testing

  • 26 integration tests covering all timescales, aggregation types, WHERE filters, align modes, null fill, and edge cases
  • 5 doctest examples in user documentation
  • All existing timechart and transpose tests pass (backward compatible)

Issues Resolved

Resolves #5237

Check List

  • New functionality includes testing
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

@ahkcs ahkcs changed the title Add PPL timewrap command for time-period comparison Support PPL timewrap command Mar 18, 2026
@ahkcs ahkcs added PPL Piped processing language feature labels Mar 18, 2026
@opensearch-project opensearch-project deleted a comment from github-actions Bot Mar 18, 2026
@opensearch-project opensearch-project deleted a comment from github-actions Bot Mar 18, 2026
@opensearch-project opensearch-project deleted a comment from github-actions Bot Mar 18, 2026
@opensearch-project opensearch-project deleted a comment from github-actions Bot Mar 18, 2026
@opensearch-project opensearch-project deleted a comment from github-actions Bot Mar 18, 2026
@github-actions

github-actions Bot commented Mar 18, 2026

Copy link
Copy Markdown
Contributor

PR Reviewer Guide 🔍

(Review updated until commit 4ba32b8)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Possible Issue

In extractTimestampUpperBound, the method walks the AST to find the deepest Filter node with a timestamp upper bound. However, if multiple Filter nodes exist at the same depth (e.g., sibling filters in a union or parallel branches), only the first child path is traversed. This could miss the actual time picker filter if it's not on the first branch, leading to incorrect column naming when align=end is used.

public static Long extractTimestampUpperBound(Timewrap node) {
  Node current = node;
  Long lastBound = null;
  while (current != null && !current.getChild().isEmpty()) {
    current = current.getChild().get(0);
    if (current instanceof Filter filter) {
      Long bound = findUpperBound(filter.getCondition());
      if (bound != null) {
        lastBound = bound;
      }
    }
  }
  return lastBound;
}
Possible Issue

The pivoting logic assumes __period__ and __base_offset__ columns exist and are non-null in at least the first row. If all rows have null values for these columns (e.g., due to upstream filtering or aggregation edge cases), boVal.longValue() at line 359 will throw a NullPointerException. The code checks isNull() but only after dereferencing boVal, which could be null itself if the column is missing.

// Read __base_offset__ (constant across all rows)
long baseOffset = 0;
ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__");
if (boVal != null && !boVal.isNull()) {
  baseOffset = boVal.longValue();
}
Possible Issue

The workaround at lines 345-352 re-scans columns if periodIdx or baseOffsetIdx are not found before value columns. This suggests the initial scan logic is unreliable. If the workaround also fails (e.g., columns are genuinely missing), the code proceeds with -1 indices, causing ArrayIndexOutOfBoundsException when accessing columns.get(periodIdx) or similar operations later.

if (periodIdx < 0 || baseOffsetIdx < 0) {
  valueIdxs.clear();
  for (int i = 0; i < columns.size(); i++) {
    String name = columns.get(i).getName();
    if ("__period__".equals(name)) periodIdx = i;
    else if ("__base_offset__".equals(name)) baseOffsetIdx = i;
    else if (i > 0) valueIdxs.add(i);
  }
}
Possible Issue

spanToSeconds uses fixed approximations for variable-length units (30 days for month, 91 for quarter, 365 for year). While documented as approximate, this can cause incorrect period alignment in leap years or months with different day counts. For example, a 1-month span in February (28/29 days) vs. March (31 days) will be treated identically as 30 days, misaligning data when wrapping across these months.

public static long spanToSeconds(SpanUnit unit, int value) {
  return switch (unit.getName()) {
    case "s" -> value;
    case "m" -> value * 60L;
    case "h" -> value * 3_600L;
    case "d" -> value * 86_400L;
    case "w" -> value * 7L * 86_400L;
    case "M" -> value * 30L * 86_400L;
    case "q" -> value * 91L * 86_400L;
    case "y" -> value * 365L * 86_400L;
    default ->
        throw new SemanticCheckException("Unsupported time unit in timewrap: " + unit.getName());
  };
}

@github-actions

github-actions Bot commented Mar 18, 2026

Copy link
Copy Markdown
Contributor

PR Code Suggestions ✨

Latest suggestions up to 61d68f3

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Add null check for period value

The code assumes period is always present and non-null when accessing
longValue(). If the column is missing or null, this will throw a
NullPointerException. Add null checks before calling longValue() to handle malformed
input gracefully.

core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapPivot.java [134-137]

 for (ExprValue row : values) {
   Map<String, ExprValue> tuple = row.tupleValue();
   String tsKey = tuple.get(tsColName).toString();
-  long period = tuple.get("__period__").longValue();
+  ExprValue periodVal = tuple.get("__period__");
+  if (periodVal == null || periodVal.isNull()) {
+    continue;
+  }
+  long period = periodVal.longValue();
Suggestion importance[1-10]: 7

__

Why: Valid concern about potential NullPointerException when accessing __period__. The code at line 136 calls longValue() without checking if the value is null, which could cause runtime errors with malformed data. Adding null checks improves robustness.

Medium
Prevent variable-length unit conversion

The method returns approximate values for variable-length units (M, q, y) but these
are used in fixed-length calculations in visitTimewrap. This creates inconsistency
since variable-length units should use calendar arithmetic. Either document that
this method is only for approximation or prevent its use with variable-length units.

core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [39-52]

 public static long spanToSeconds(SpanUnit unit, int value) {
+  if (isVariableLengthUnit(unit)) {
+    throw new SemanticCheckException("Cannot convert variable-length unit to fixed seconds: " + unit.getName());
+  }
   return switch (unit.getName()) {
     case "s" -> value;
     case "m" -> value * 60L;
     case "h" -> value * 3_600L;
     case "d" -> value * 86_400L;
     case "w" -> value * 7L * 86_400L;
-    case "M" -> value * 30L * 86_400L;
-    case "q" -> value * 91L * 86_400L;
-    case "y" -> value * 365L * 86_400L;
     default ->
         throw new SemanticCheckException("Unsupported time unit in timewrap: " + unit.getName());
   };
 }
Suggestion importance[1-10]: 3

__

Why: The suggestion correctly identifies that spanToSeconds returns approximate values for variable-length units. However, the method is only called in the fixed-length branch of visitTimewrap (line 3898), so the inconsistency doesn't actually occur in practice. Adding the check would improve code clarity but isn't critical.

Low
General
Enforce maximum period limit

The code collects all distinct periods without enforcing the documented maximum of
20 periods. If data contains more than 20 periods, all will be processed and
included in output, contradicting the stated limitation. Enforce the limit by
truncating the periods list after collection.

core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapPivot.java [99-106]

 Set<Long> periodSet = new TreeSet<>(Collections.reverseOrder());
 for (ExprValue row : values) {
   ExprValue pv = row.tupleValue().get("__period__");
   if (pv != null && !pv.isNull()) {
     periodSet.add(pv.longValue());
   }
 }
 List<Long> periods = new ArrayList<>(periodSet);
+if (periods.size() > TimewrapUtils.MAX_PERIODS) {
+  periods = periods.subList(0, TimewrapUtils.MAX_PERIODS);
+}
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that MAX_PERIODS is defined but not enforced in the pivot logic. While the documentation mentions a 20-period limit, the code doesn't actually truncate the periods list. Enforcing this limit would align implementation with documentation and prevent excessive column generation.

Low

Previous suggestions

Suggestions up to commit 4ba32b8
CategorySuggestion                                                                                                                                    Impact
Possible issue
Validate required columns exist before pivoting

The double-scan workaround for finding column indices suggests fragile logic that
depends on column ordering. If period or base_offset are not found after two
passes, the code proceeds with -1 indices, which will cause
ArrayIndexOutOfBoundsException when accessing columns. Add validation after the
second scan to fail fast with a clear error message.

opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [345-353]

-if (CalcitePlanContext.stripNullColumns.get()) {
-  try {
-    String unitInfo = CalcitePlanContext.timewrapUnitName.get();
-    if (unitInfo != null && !values.isEmpty()) {
-      // Find column indices
-      int tsIdx = 0;
-      int periodIdx = -1;
-      int baseOffsetIdx = -1;
-      ...
-      // Workaround: if indices weren't found before value columns, scan again
-      if (periodIdx < 0 || baseOffsetIdx < 0) {
-        valueIdxs.clear();
-        for (int i = 0; i < columns.size(); i++) {
-          String name = columns.get(i).getName();
-          if ("__period__".equals(name)) periodIdx = i;
-          else if ("__base_offset__".equals(name)) baseOffsetIdx = i;
-          else if (i > 0) valueIdxs.add(i);
-        }
-      }
+if (periodIdx < 0 || baseOffsetIdx < 0) {
+  valueIdxs.clear();
+  for (int i = 0; i < columns.size(); i++) {
+    String name = columns.get(i).getName();
+    if ("__period__".equals(name)) periodIdx = i;
+    else if ("__base_offset__".equals(name)) baseOffsetIdx = i;
+    else if (i > 0) valueIdxs.add(i);
+  }
+  if (periodIdx < 0 || baseOffsetIdx < 0) {
+    throw new IllegalStateException(
+      "Timewrap pivot failed: required columns __period__ or __base_offset__ not found");
+  }
+}
Suggestion importance[1-10]: 8

__

Why: The double-scan workaround is fragile and proceeds with -1 indices if columns aren't found, which will cause ArrayIndexOutOfBoundsException. Adding validation after the second scan would prevent runtime crashes with a clear error message.

Medium
Enforce MAX_PERIODS limit validation

The MAX_PERIODS constant is defined but never enforced in the code. Without
validation, queries with more than 20 periods will succeed but may cause performance
issues or unexpected behavior. Add a check in visitTimewrap to throw a
SemanticCheckException when the period count exceeds this limit.

core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [31]

-public static final int MAX_PERIODS = 20;
+// In CalciteRelNodeVisitor.visitTimewrap, after computing periods:
+if (periods.size() > TimewrapUtils.MAX_PERIODS) {
+  throw new SemanticCheckException(
+    "Timewrap period count (" + periods.size() + ") exceeds maximum allowed (" 
+    + TimewrapUtils.MAX_PERIODS + ")");
+}
Suggestion importance[1-10]: 7

__

Why: The MAX_PERIODS constant is defined but never enforced. Without validation, queries exceeding this limit may cause performance issues. However, the suggestion points to code in CalciteRelNodeVisitor.visitTimewrap which is not in TimewrapUtils.java, making the relevant_file incorrect.

Medium
General
Log warning on timestamp parse failure

The timestamp parsing silently returns null on failure, which causes align=end to
fall back to now without warning. This can produce incorrect period alignment when
the WHERE clause timestamp format is unexpected. Log a warning when parsing fails so
users can diagnose alignment issues.

core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [333-349]

 private static Long parseTimestampLiteral(UnresolvedExpression expr) {
   if (expr instanceof Literal lit && lit.getValue() instanceof String s) {
     try {
       java.time.LocalDateTime ldt =
           java.time.LocalDateTime.parse(
               s, java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
       return ldt.toEpochSecond(java.time.ZoneOffset.UTC);
     } catch (Exception e) {
       try {
         return java.time.Instant.parse(s).getEpochSecond();
-      } catch (Exception ignored) {
+      } catch (Exception parseError) {
+        LOG.warn("Failed to parse timestamp literal '{}' for timewrap align=end, falling back to now", s);
         return null;
       }
     }
   }
   return null;
 }
Suggestion importance[1-10]: 6

__

Why: Silent null return on parse failure causes align=end to fall back to now without warning, potentially producing incorrect period alignment. Logging would help users diagnose issues, though this is a minor improvement.

Low
Add defensive empty check before getFirst

Accessing values.getFirst() without checking if values is empty can throw
NoSuchElementException. Although the outer if (!values.isEmpty()) guard exists,
defensive programming suggests verifying before calling getFirst() to prevent future
refactoring errors.

opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [356-360]

-ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__");
-if (boVal != null && !boVal.isNull()) {
-  baseOffset = boVal.longValue();
+if (!values.isEmpty()) {
+  ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__");
+  if (boVal != null && !boVal.isNull()) {
+    baseOffset = boVal.longValue();
+  }
 }
Suggestion importance[1-10]: 3

__

Why: The outer if (!values.isEmpty()) guard at line 329 already protects against empty values, making this additional check redundant. The suggestion adds defensive programming but offers minimal practical benefit.

Low
Suggestions up to commit 219a704
CategorySuggestion                                                                                                                                    Impact
Possible issue
Ensure ThreadLocal cleanup always executes

The ThreadLocal cleanup only happens inside the if
(CalcitePlanContext.stripNullColumns.get()) block. If stripNullColumns is true but
values.isEmpty() causes the pivot logic to be skipped, the cleanup still runs.
However, if stripNullColumns is true and an exception is thrown before the try block
is entered (e.g., during column index scanning), the cleanup still runs via finally.
But if stripNullColumns is never set to true for a timewrap query (e.g., due to a
bug in the visitor), the ThreadLocal values for timewrapUnitName, timewrapSeries,
and timewrapTimeFormat set in visitTimewrap will never be cleaned up. Move the
cleanup to an unconditional finally block outside the if check.

opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [329-445]

-if (CalcitePlanContext.stripNullColumns.get()) {
-  try {
-    ...
-  } finally {
-    CalcitePlanContext.stripNullColumns.set(false);
-    CalcitePlanContext.timewrapUnitName.set(null);
-    CalcitePlanContext.timewrapSeries.set(null);
-    CalcitePlanContext.timewrapTimeFormat.set(null);
+try {
+  if (CalcitePlanContext.stripNullColumns.get()) {
+    String unitInfo = CalcitePlanContext.timewrapUnitName.get();
+    if (unitInfo != null && !values.isEmpty()) {
+      // ... pivot logic ...
+    }
   }
+} finally {
+  CalcitePlanContext.stripNullColumns.set(false);
+  CalcitePlanContext.timewrapUnitName.set(null);
+  CalcitePlanContext.timewrapSeries.set(null);
+  CalcitePlanContext.timewrapTimeFormat.set(null);
 }
Suggestion importance[1-10]: 7

__

Why: The cleanup of timewrapUnitName, timewrapSeries, and timewrapTimeFormat is only inside the if (CalcitePlanContext.stripNullColumns.get()) block. If stripNullColumns is never set to true (e.g., due to a bug in visitTimewrap), these ThreadLocal values will leak across requests. Moving cleanup to an unconditional outer finally block is a valid correctness improvement.

Medium
Prevent stale ThreadLocal state leaks

These ThreadLocal variables are never initialized with a default value, meaning
get() returns null unless explicitly set. If a query runs on a thread that
previously processed a timewrap query and the finally block in buildResultSet was
skipped (e.g., due to an exception before stripNullColumns was set to true), stale
values could leak into subsequent queries. Initialize them with
ThreadLocal.withInitial(() -> null) or ensure cleanup happens unconditionally.

core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java [51-57]

-public static final ThreadLocal<String> timewrapUnitName = new ThreadLocal<>();
-public static final ThreadLocal<String> timewrapSeries = new ThreadLocal<>();
-public static final ThreadLocal<String> timewrapTimeFormat = new ThreadLocal<>();
+public static final ThreadLocal<String> timewrapUnitName = ThreadLocal.withInitial(() -> null);
+public static final ThreadLocal<String> timewrapSeries = ThreadLocal.withInitial(() -> null);
+public static final ThreadLocal<String> timewrapTimeFormat = ThreadLocal.withInitial(() -> null);
Suggestion importance[1-10]: 3

__

Why: ThreadLocal.withInitial(() -> null) is functionally equivalent to new ThreadLocal<>() since both return null by default. The real concern about stale values is valid but is better addressed by fixing the cleanup scope (suggestion 2), not by changing initialization.

Low
General
Warn users about unimplemented series mode

The series=exact mode silently falls back to series=short naming (_s) without any
warning or error. Users specifying series=exact with a time_format will receive
unexpected column names with no indication that their parameter was ignored. This
should either throw a SemanticCheckException indicating the feature is not yet
supported, or log a warning so users are aware of the fallback behavior.

opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [482-495]

 case "exact" -> {
-  // series=exact: prefix_<formatted_date>
-  String timeFormat = CalcitePlanContext.timewrapTimeFormat.get();
-  if (timeFormat == null) timeFormat = "%Y-%m-%d";
-  // Compute period start timestamp: reference - absolutePeriod * spanSeconds
-  // absolutePeriod is in span units, need to convert to seconds
-  long spanSec =
-      TimewrapUtils.spanToSeconds(
-              org.opensearch.sql.ast.expression.SpanUnit.of(singular.toUpperCase()), 1)
-          * spanValue;
-  // Not enough info to compute exact date here — fall back to short naming
-  // TODO: pass reference epoch + span for exact date computation
-  yield prefix + "_s" + absolutePeriod;
+  // series=exact is not yet fully implemented; fall back to short naming with a warning
+  log.warn("timewrap series=exact is not yet supported; falling back to series=short naming");
+  yield "s" + absolutePeriod;
 }
Suggestion importance[1-10]: 5

__

Why: The series=exact mode silently falls back to series=short naming without any user-visible indication, which is a usability issue. Adding a warning log or throwing an exception would make the unimplemented behavior explicit, though the improved_code also changes the yield value from prefix + "_s" + absolutePeriod to "s" + absolutePeriod (dropping the prefix), which appears to be a bug in the suggestion itself.

Low
Robustly handle missing base offset value

baseOffset is read only from the first row, but base_offset is computed per-row
in the Calcite plan (it's a window function result). While it should be constant
across all rows for a given query, relying on the first row without validation could
silently produce wrong column names if the value varies. Additionally, if the first
row has a null base_offset, baseOffset defaults to 0, which may produce
incorrect period naming without any warning.

opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [356-360]

 long baseOffset = 0;
-ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__");
-if (boVal != null && !boVal.isNull()) {
-  baseOffset = boVal.longValue();
+boolean baseOffsetFound = false;
+for (ExprValue row : values) {
+  ExprValue boVal = row.tupleValue().get("__base_offset__");
+  if (boVal != null && !boVal.isNull()) {
+    baseOffset = boVal.longValue();
+    baseOffsetFound = true;
+    break;
+  }
+}
+if (!baseOffsetFound) {
+  log.warn("timewrap: __base_offset__ not found or null in result rows, defaulting to 0");
 }
Suggestion importance[1-10]: 3

__

Why: The suggestion to scan all rows for __base_offset__ is a minor robustness improvement, but since __base_offset__ is a window function result that should be constant across all rows, reading from the first row is acceptable. The improved code is functionally nearly identical for the expected case.

Low
Suggestions up to commit e0a03f0
CategorySuggestion                                                                                                                                    Impact
Possible issue
Missing bounds check on input field list

If fieldNames is empty (e.g., the upstream timechart produced no non-metadata
fields), calling fieldNames.get(0) will throw an IndexOutOfBoundsException.
Similarly, if valueFieldNames is empty, the subsequent pivot logic will produce
incorrect results. A guard check should be added to validate the input schema before
proceeding.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3110-3113]

 List<String> fieldNames =
     b.peek().getRowType().getFieldNames().stream().filter(f -> !isMetadataField(f)).toList();
+if (fieldNames.size() < 2) {
+  throw new SemanticCheckException(
+      "timewrap requires a timechart with at least one aggregation field, but got: " + fieldNames);
+}
 String tsFieldName = fieldNames.get(0);
 List<String> valueFieldNames = fieldNames.subList(1, fieldNames.size());
Suggestion importance[1-10]: 6

__

Why: This is a valid defensive programming suggestion that prevents an IndexOutOfBoundsException when timewrap is used without a preceding timechart or with an empty result. The improved code provides a clear error message and correctly validates the input schema.

Low
Base offset read only from potentially null first row

The baseOffset is read only from the first row (values.getFirst()), but
base_offset is supposed to be constant across all rows. If the first row happens
to have a null base_offset (e.g., due to a null timestamp in the first row),
baseOffset defaults to 0, which will produce incorrect absolute column names for all
columns. The code should scan for the first non-null base_offset value across
all rows.

opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [349-355]

 long baseOffset = 0;
-if (!values.isEmpty()) {
-  ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__");
+for (ExprValue row : values) {
+  ExprValue boVal = row.tupleValue().get("__base_offset__");
   if (boVal != null && !boVal.isNull()) {
     baseOffset = boVal.longValue();
+    break;
   }
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies that reading __base_offset__ only from the first row could fail if that row has a null timestamp. Scanning for the first non-null value is more robust, though in practice __base_offset__ should be constant across all rows as it's a window aggregate.

Low
General
Initialize ThreadLocal with default value for safety

Using ThreadLocal variables in a shared execution context can cause state leakage
between queries if the finally block in the execution engine is not always reached
(e.g., if an exception occurs before buildResultSet is called). The timewrapUnitName
ThreadLocal is not initialized with a default value, which means get() returns null
and requires null checks everywhere. Consider initializing it with
ThreadLocal.withInitial(() -> null) and ensuring cleanup happens in a more robust
location, such as a try-finally wrapping the entire query execution, not just inside
buildResultSet.

core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java [45-51]

 public static final ThreadLocal<Boolean> stripNullColumns = ThreadLocal.withInitial(() -> false);
 
-public static final ThreadLocal<String> timewrapUnitName = new ThreadLocal<>();
+public static final ThreadLocal<String> timewrapUnitName = ThreadLocal.withInitial(() -> null);
Suggestion importance[1-10]: 3

__

Why: The suggestion to use ThreadLocal.withInitial(() -> null) is functionally equivalent to new ThreadLocal<>() since both return null by default. The real concern about state leakage is valid but the fix doesn't address the cleanup robustness issue mentioned in the suggestion content.

Low
Hardcoded timestamp field names limit flexibility

The isTimestampField method hardcodes only @timestamp and timestamp as valid
timestamp field names. This means extractTimestampUpperBound will silently return
null for any other timestamp field name, causing align=end to fall back to now
without warning. The timestamp field name should be derived from the actual query
context (e.g., the first field of the timechart output) rather than being hardcoded.

core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [312-318]

 private static boolean isTimestampField(UnresolvedExpression expr) {
   if (expr instanceof Field field) {
     String name = field.getField().toString();
-    return "@timestamp".equals(name) || "timestamp".equals(name);
+    // Match common timestamp field names; ideally this should be derived from query context
+    return "@timestamp".equals(name) || "timestamp".equals(name) || name.endsWith("timestamp");
   }
   return false;
 }
Suggestion importance[1-10]: 3

__

Why: The concern about hardcoded field names is valid, but the proposed fix of adding name.endsWith("timestamp") is a partial workaround rather than a proper solution. The suggestion acknowledges the real fix should derive the field name from query context, making the improved code only marginally better.

Low
Suggestions up to commit 6276cba
CategorySuggestion                                                                                                                                    Impact
Possible issue
ThreadLocal state may leak between queries on error

These ThreadLocal variables are static fields on a shared class and are only reset
in the finally block of buildResultSet. If an exception occurs before buildResultSet
is reached, or if the query is cancelled, the ThreadLocals will retain stale values
and corrupt subsequent queries on the same thread. Consider using a try/finally
pattern that resets these values immediately after the plan is built, or reset them
at the start of each query execution.

core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java [45-51]

 public static final ThreadLocal<Boolean> stripNullColumns = ThreadLocal.withInitial(() -> false);
 
 public static final ThreadLocal<String> timewrapUnitName = new ThreadLocal<>();
 
+public static void resetTimewrapState() {
+  stripNullColumns.set(false);
+  timewrapUnitName.set(null);
+}
+
Suggestion importance[1-10]: 6

__

Why: The stripNullColumns and timewrapUnitName ThreadLocals are only reset in the finally block of buildResultSet, but if an exception occurs before that point, stale values could corrupt subsequent queries on the same thread. The suggestion is valid but the improved_code only adds a helper method without actually changing the reset pattern, making it a partial fix.

Low
Missing guard against empty field list causes potential crash

The code assumes the first non-metadata field is always the timestamp field. If the
upstream timechart output has a different column order, or if fieldNames is empty
after filtering, this will either produce incorrect results or throw an
IndexOutOfBoundsException. Add a guard for the empty case and consider validating
that the first field is actually a timestamp type.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3110-3113]

 List<String> fieldNames =
         b.peek().getRowType().getFieldNames().stream().filter(f -> !isMetadataField(f)).toList();
+    if (fieldNames.isEmpty()) {
+      throw new SemanticCheckException("timewrap: no fields found in timechart output");
+    }
     String tsFieldName = fieldNames.get(0);
     List<String> valueFieldNames = fieldNames.subList(1, fieldNames.size());
Suggestion importance[1-10]: 5

__

Why: If fieldNames is empty after filtering metadata fields, fieldNames.get(0) will throw an IndexOutOfBoundsException. Adding an explicit guard with a meaningful error message improves robustness and debuggability.

Low
General
Upper bound extraction stops too early in AST walk

The method only checks the first Filter node encountered in the AST walk and returns
immediately. If the first child is not a Filter but a subsequent child is (e.g.,
there is a Sort or Project between Timewrap and the Filter), the upper bound will
not be found and align=end will silently fall back to now. The loop should continue
walking past non-Filter nodes rather than returning only on the first Filter found.

core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [282-291]

 public static Long extractTimestampUpperBound(Timewrap node) {
     Node current = node;
     while (current != null && !current.getChild().isEmpty()) {
       current = current.getChild().get(0);
       if (current instanceof Filter filter) {
-        return findUpperBound(filter.getCondition());
+        Long bound = findUpperBound(filter.getCondition());
+        if (bound != null) return bound;
       }
     }
     return null;
   }
Suggestion importance[1-10]: 5

__

Why: The current code returns null immediately after the first Filter node even if findUpperBound returns null, potentially missing a valid upper bound in a deeper Filter. The improved code continues walking past non-matching filters, which is more robust for align=end behavior.

Low
Silent fallback produces incorrect column names when offset missing

baseOffset defaults to 0 if base_offset is missing or null. A baseOffset of 0
will cause all period columns to be named as if the reference is the same as the
latest data point, silently producing wrong column names (e.g., latest_day instead
of 5days_before) without any error. This should at minimum log a warning, or throw
if the column is expected but missing.

opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [349-355]

 long baseOffset = 0;
         if (!values.isEmpty()) {
           ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__");
           if (boVal != null && !boVal.isNull()) {
             baseOffset = boVal.longValue();
+          } else {
+            // __base_offset__ missing: column naming will be relative, not absolute
+            log.warn("timewrap: __base_offset__ not found in result; column names may be incorrect");
           }
         }
Suggestion importance[1-10]: 4

__

Why: A silent default of 0 for baseOffset when __base_offset__ is missing will produce incorrect column names without any indication of the problem. Adding a warning log is a reasonable improvement, though the improved_code only adds a log statement rather than a stronger error signal.

Low
Suggestions up to commit e0a03f0
CategorySuggestion                                                                                                                                    Impact
Possible issue
Add validation to prevent index out of bounds

If fieldNames is empty (e.g., the upstream timechart produced no non-metadata
fields), fieldNames.get(0) will throw an IndexOutOfBoundsException. Similarly, if
there are no value fields, valueFieldNames will be empty and subsequent operations
like valueFieldNames.get(0) will fail. Add a validation check to provide a clear
error message.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3110-3113]

 List<String> fieldNames =
     b.peek().getRowType().getFieldNames().stream().filter(f -> !isMetadataField(f)).toList();
+if (fieldNames.size() < 2) {
+  throw new SemanticCheckException(
+      "timewrap requires a timechart with at least one aggregation field");
+}
 String tsFieldName = fieldNames.get(0);
 List<String> valueFieldNames = fieldNames.subList(1, fieldNames.size());
Suggestion importance[1-10]: 6

__

Why: This is a valid defensive check — if timewrap is used without a preceding timechart or with an empty result, fieldNames.get(0) would throw an IndexOutOfBoundsException. The improved code provides a clear error message instead, improving debuggability.

Low
Ensure ThreadLocal state is always cleaned up

Using static ThreadLocal variables in a shared context can cause state leakage
between concurrent queries if the finally block in the execution engine is not
always reached (e.g., due to exceptions before buildResultSet). Consider resetting
these thread-locals in a more robust lifecycle hook, or ensure they are always
cleaned up. Additionally, timewrapUnitName has no initial value, which means get()
returns null and requires null checks everywhere it is used.

core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java [45-51]

 public static final ThreadLocal<Boolean> stripNullColumns = ThreadLocal.withInitial(() -> false);
-public static final ThreadLocal<String> timewrapUnitName = new ThreadLocal<>();
+public static final ThreadLocal<String> timewrapUnitName = ThreadLocal.withInitial(() -> null);
Suggestion importance[1-10]: 4

__

Why: The suggestion about timewrapUnitName having no initial value is valid but minor — ThreadLocal.withInitial(() -> null) is functionally equivalent to new ThreadLocal<>() since both return null by default. The concern about state leakage is real but the improved_code only changes new ThreadLocal<>() to ThreadLocal.withInitial(() -> null), which doesn't actually address the cleanup concern.

Low
General
Fix fragile column name parsing with ambiguous separators

The unitInfo format expected by renameTimewrapColumn is
"spanValue|singular|plural|_before" (4 parts split by |), but the value being set
here is "spanValue|singular|plural|_before" where unitBaseName returns
"value|singular|plural" and "|_before" is appended — this produces
"value|singular|plural|_before" which is correct. However, the suffix _before in
nameSuffix is parsed as parts[3] which would be "_before", but the column names end
with "_before" (with underscore). The column name format is prefix_N_before, so
nameSuffix should be "_before" and the check name.endsWith(nameSuffix) would match.
Verify that the separator between the period number and _before in the column name
is consistent with what renameTimewrapColumn expects — currently columns are named
like sum(requests)_3_before but the rename logic strips before and then splits on
the last
to get the period number, which would incorrectly split sum(requests)_3
at the last underscore giving prefix=sum(requests) and period=3. This works only if
the value field name contains no trailing underscore-number pattern, but could
silently misparse for field names like my_field_3.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3104-3105]

+// Use a distinct separator that won't appear in field names, e.g., "__before__"
 CalcitePlanContext.timewrapUnitName.set(
-    TimewrapUtils.unitBaseName(node.getUnit(), node.getValue()) + "|_before");
+    TimewrapUtils.unitBaseName(node.getUnit(), node.getValue()) + "|__before__");
Suggestion importance[1-10]: 5

__

Why: The suggestion raises a valid concern about ambiguous parsing when field names contain underscore-number patterns, but the improved_code only changes the separator string without addressing the actual parsing logic in renameTimewrapColumn. The fix is incomplete and would require coordinated changes in the execution engine as well.

Low
Remove hardcoded timestamp field name assumption

The isTimestampField method hardcodes only @timestamp and timestamp as valid
timestamp field names. This means extractTimestampUpperBound will silently return
null for any other timestamp field name (e.g., event_time, ts), causing timewrap to
fall back to now even when align=end is specified. The timestamp field name should
be derived from the actual schema rather than hardcoded.

core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [312-318]

+// Consider passing the actual timestamp field name from the Timewrap node or its child schema
+// instead of hardcoding. As a minimal fix, make the check case-insensitive and more general:
 private static boolean isTimestampField(UnresolvedExpression expr) {
   if (expr instanceof Field field) {
-    String name = field.getField().toString();
-    return "@timestamp".equals(name) || "timestamp".equals(name);
+    String name = field.getField().toString().toLowerCase();
+    return name.contains("timestamp") || name.equals("ts") || name.equals("time");
   }
   return false;
 }
Suggestion importance[1-10]: 4

__

Why: The concern about hardcoded field names is valid, but the proposed fix of using contains("timestamp") is overly broad and could match unintended fields. The suggestion acknowledges the proper fix (deriving from schema) but doesn't implement it, making the improved_code a potentially worse solution.

Low

@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 6276cba

@ahkcs ahkcs force-pushed the feature/ppl-timewrap-command branch from 6276cba to e0a03f0 Compare March 18, 2026 20:30
@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit e0a03f0

@github-actions

github-actions Bot commented Mar 19, 2026

Copy link
Copy Markdown
Contributor

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 4ba32b8.

PathLineSeverityDescription
opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java477lowrenameTimewrapColumn is defined but never called anywhere in the diff — dead code that references SpanUnit.of(singular.toUpperCase()) with a value derived from user-controlled query input. Not directly exploitable through the parser constraints, but the unused method adds unnecessary attack surface.
opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java333lowThreadLocal variables (stripNullColumns, timewrapUnitName, timewrapSeries, timewrapTimeFormat) are set during query planning in visitTimewrap but only cleaned up inside buildResultSet's try/finally. If buildResultSet is never reached on an error path, ThreadLocals persist for subsequent requests on the same pooled thread, potentially leaking query configuration state across unrelated user queries.
opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java516lowThe series=exact branch silently falls back to series=short behavior with no warning or error (TODO comment acknowledges this). Users requesting exact date-formatted column names receive short numeric names instead, which is a silent behavioral discrepancy but not exploitable.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 0 | Medium: 0 | Low: 3


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 219a704

ahkcs added 4 commits June 25, 2026 10:00
Implement the timewrap command that reshapes timechart output by wrapping
each time period into a separate data series, enabling day-over-day,
week-over-week, and other recurring interval comparisons.

Signed-off-by: Jialiang Li <jialiang.li@hey.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Add timewrap_test to SHOW TABLES expected output (25 tables).

Signed-off-by: Jialiang Li <jialiang.li@hey.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
…etic

- Extract all timewrap helper methods to TimewrapUtils.java in calcite/utils/
- Add precise EXTRACT-based period computation for month/quarter/year
- Add cumDaysBeforeMonth with leap year CASE expression for precise quarter offset
- Month/quarter/year period assignment now uses calendar arithmetic instead of
  approximate fixed-length conversions

Signed-off-by: Jialiang Li <jialiang.li@hey.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
- Remove Calcite PIVOT from timewrap: no more MAX_PERIODS limit or crash risk
- Add post-processing pivot in execution engine: dynamically builds columns
  from actual data using HashMap grouping
- Add series parameter: series=relative (default), series=short (s0, s1)
- Add series=exact grammar support (falls back to short at runtime)
- Add time_format grammar support for series=exact
- Benchmark: 1000 period columns in ~50ms (Calcite PIVOT crashed at 1000)
- 32 IT tests, all using verifySchema + verifyDataRows

Signed-off-by: Jialiang Li <jialiang.li@hey.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@ahkcs ahkcs force-pushed the feature/ppl-timewrap-command branch from 219a704 to 4ba32b8 Compare June 25, 2026 17:03
@github-actions

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 4ba32b8

@ahkcs ahkcs removed this from PPL 2026 Roadmap Jun 25, 2026
@ahkcs ahkcs marked this pull request as ready for review June 25, 2026 17:43
The timewrap pivot (turning the unpivoted [display_ts, value, __base_offset__,
__period__] rows into Splunk-style period columns) was done as post-processing
in OpenSearchExecutionEngine.buildResultSet, gated on CalcitePlanContext
thread-locals. The analytics-engine route executes the RelNode via
AnalyticsExecutionEngine and never reaches that code, so timewrap queries came
back unpivoted (CalciteTimewrapCommandIT 0/32 on the analytics route).

Extract the pivot into a shared core helper (TimewrapPivot) and call it from
both execution engines so they produce identical output. AnalyticsExecutionEngine
captures the timewrap signals at execute() entry via TimewrapSignals because the
result callback runs on a different worker thread than the planning thread that
set the thread-locals.

CalciteTimewrapCommandIT: 32/32 on both the Calcite/v2 and analytics-engine routes.
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature PPL Piped processing language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RFC] PPL timewrap command

1 participant