Support PPL timewrap command by ahkcs · Pull Request #5241 · opensearch-project/sql

ahkcs · 2026-03-18T00:10:51Z

Description

Adds the timewrap command to PPL, which reshapes timechart output by wrapping each time period into separate columns for side-by-side comparison (day-over-day, week-over-week, etc.). Behavior verified against Splunk.

Syntax: ... | timechart <agg> | timewrap <span> [align=end|now]

Key features

Column naming matches Splunk: <agg>_<N><unit>_before, <agg>_latest_<unit>, <agg>_<N><unit>_after
Column order: oldest first (leftmost)
Dynamic column count: only populated period columns appear (no trailing nulls)
align=end (default): uses WHERE upper bound as reference for deterministic column names
align=now: uses current time as reference
Supports all timescales: sec, min, hr, day, week, month, quarter, year

Implementation

Grammar: timewrapCommand rule reusing existing spanLiteral
AST: Timewrap node with unit, value, align
Calcite: PIVOT-based transformation with UNIX_TIMESTAMP/FROM_UNIXTIME for epoch conversion
Execution engine: post-processing to strip all-null columns, rename to absolute offsets using __base_offset__, and handle latest_<unit> naming
WHERE upper bound extraction: walks AST to find @timestamp <= filter for align=end reference

Known limitations

BY clause not yet supported
Month/quarter/year use approximate fixed-length conversions (30d, 91d, 365d)
Max 20 period columns
Timestamps show data dates, not reference date (Splunk rewrites to reference date)

Testing

26 integration tests covering all timescales, aggregation types, WHERE filters, align modes, null fill, and edge cases
5 doctest examples in user documentation
All existing timechart and transpose tests pass (backward compatible)

Issues Resolved

Resolves #5237

Check List

New functionality includes testing
New functionality has been documented
Commits are signed per the DCO using --signoff

github-actions · 2026-03-18T18:54:48Z

PR Reviewer Guide 🔍

(Review updated until commit `4ba32b8`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review Possible Issue In `extractTimestampUpperBound`, the method walks the AST to find the deepest Filter node with a timestamp upper bound. However, if multiple Filter nodes exist at the same depth (e.g., sibling filters in a union or parallel branches), only the first child path is traversed. This could miss the actual time picker filter if it's not on the first branch, leading to incorrect column naming when `align=end` is used. public static Long extractTimestampUpperBound(Timewrap node) { Node current = node; Long lastBound = null; while (current != null && !current.getChild().isEmpty()) { current = current.getChild().get(0); if (current instanceof Filter filter) { Long bound = findUpperBound(filter.getCondition()); if (bound != null) { lastBound = bound; } } } return lastBound; } Possible Issue The pivoting logic assumes `__period__` and `__base_offset__` columns exist and are non-null in at least the first row. If all rows have null values for these columns (e.g., due to upstream filtering or aggregation edge cases), `boVal.longValue()` at line 359 will throw a NullPointerException. The code checks `isNull()` but only after dereferencing `boVal`, which could be null itself if the column is missing. // Read __base_offset__ (constant across all rows) long baseOffset = 0; ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__"); if (boVal != null && !boVal.isNull()) { baseOffset = boVal.longValue(); } Possible Issue The workaround at lines 345-352 re-scans columns if `periodIdx` or `baseOffsetIdx` are not found before value columns. This suggests the initial scan logic is unreliable. If the workaround also fails (e.g., columns are genuinely missing), the code proceeds with `-1` indices, causing ArrayIndexOutOfBoundsException when accessing `columns.get(periodIdx)` or similar operations later. if (periodIdx < 0 \|\| baseOffsetIdx < 0) { valueIdxs.clear(); for (int i = 0; i < columns.size(); i++) { String name = columns.get(i).getName(); if ("__period__".equals(name)) periodIdx = i; else if ("__base_offset__".equals(name)) baseOffsetIdx = i; else if (i > 0) valueIdxs.add(i); } } Possible Issue `spanToSeconds` uses fixed approximations for variable-length units (30 days for month, 91 for quarter, 365 for year). While documented as approximate, this can cause incorrect period alignment in leap years or months with different day counts. For example, a 1-month span in February (28/29 days) vs. March (31 days) will be treated identically as 30 days, misaligning data when wrapping across these months. public static long spanToSeconds(SpanUnit unit, int value) { return switch (unit.getName()) { case "s" -> value; case "m" -> value * 60L; case "h" -> value * 3_600L; case "d" -> value * 86_400L; case "w" -> value * 7L * 86_400L; case "M" -> value * 30L * 86_400L; case "q" -> value * 91L * 86_400L; case "y" -> value * 365L * 86_400L; default -> throw new SemanticCheckException("Unsupported time unit in timewrap: " + unit.getName()); }; }

github-actions · 2026-03-18T18:55:35Z

PR Code Suggestions ✨

Latest suggestions up to 61d68f3

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Add null check for period value The code assumes `period` is always present and non-null when accessing `longValue()`. If the column is missing or null, this will throw a NullPointerException. Add null checks before calling `longValue()` to handle malformed input gracefully. core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapPivot.java [134-137] for (ExprValue row : values) { Map<String, ExprValue> tuple = row.tupleValue(); String tsKey = tuple.get(tsColName).toString(); - long period = tuple.get("__period__").longValue(); + ExprValue periodVal = tuple.get("__period__"); + if (periodVal == null \|\| periodVal.isNull()) { + continue; + } + long period = periodVal.longValue(); Suggestion importance[1-10]: 7 __ Why: Valid concern about potential `NullPointerException` when accessing `__period__`. The code at line 136 calls `longValue()` without checking if the value is null, which could cause runtime errors with malformed data. Adding null checks improves robustness.	Medium
Possible issue	Prevent variable-length unit conversion The method returns approximate values for variable-length units (M, q, y) but these are used in fixed-length calculations in `visitTimewrap`. This creates inconsistency since variable-length units should use calendar arithmetic. Either document that this method is only for approximation or prevent its use with variable-length units. core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [39-52] public static long spanToSeconds(SpanUnit unit, int value) { + if (isVariableLengthUnit(unit)) { + throw new SemanticCheckException("Cannot convert variable-length unit to fixed seconds: " + unit.getName()); + } return switch (unit.getName()) { case "s" -> value; case "m" -> value * 60L; case "h" -> value * 3_600L; case "d" -> value * 86_400L; case "w" -> value * 7L * 86_400L; - case "M" -> value * 30L * 86_400L; - case "q" -> value * 91L * 86_400L; - case "y" -> value * 365L * 86_400L; default -> throw new SemanticCheckException("Unsupported time unit in timewrap: " + unit.getName()); }; } Suggestion importance[1-10]: 3 __ Why: The suggestion correctly identifies that `spanToSeconds` returns approximate values for variable-length units. However, the method is only called in the fixed-length branch of `visitTimewrap` (line 3898), so the inconsistency doesn't actually occur in practice. Adding the check would improve code clarity but isn't critical.	Low
General	Enforce maximum period limit The code collects all distinct periods without enforcing the documented maximum of 20 periods. If data contains more than 20 periods, all will be processed and included in output, contradicting the stated limitation. Enforce the limit by truncating the periods list after collection. core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapPivot.java [99-106] Set<Long> periodSet = new TreeSet<>(Collections.reverseOrder()); for (ExprValue row : values) { ExprValue pv = row.tupleValue().get("__period__"); if (pv != null && !pv.isNull()) { periodSet.add(pv.longValue()); } } List<Long> periods = new ArrayList<>(periodSet); +if (periods.size() > TimewrapUtils.MAX_PERIODS) { + periods = periods.subList(0, TimewrapUtils.MAX_PERIODS); +} Suggestion importance[1-10]: 6 __ Why: The suggestion correctly identifies that `MAX_PERIODS` is defined but not enforced in the pivot logic. While the documentation mentions a 20-period limit, the code doesn't actually truncate the periods list. Enforcing this limit would align implementation with documentation and prevent excessive column generation.	Low

Previous suggestions

Suggestions up to commit 4ba32b8

Category	Suggestion	Impact
Possible issue	Validate required columns exist before pivoting The double-scan workaround for finding column indices suggests fragile logic that depends on column ordering. If `period` or `base_offset` are not found after two passes, the code proceeds with `-1` indices, which will cause `ArrayIndexOutOfBoundsException` when accessing columns. Add validation after the second scan to fail fast with a clear error message. opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [345-353] -if (CalcitePlanContext.stripNullColumns.get()) { - try { - String unitInfo = CalcitePlanContext.timewrapUnitName.get(); - if (unitInfo != null && !values.isEmpty()) { - // Find column indices - int tsIdx = 0; - int periodIdx = -1; - int baseOffsetIdx = -1; - ... - // Workaround: if indices weren't found before value columns, scan again - if (periodIdx < 0 \|\| baseOffsetIdx < 0) { - valueIdxs.clear(); - for (int i = 0; i < columns.size(); i++) { - String name = columns.get(i).getName(); - if ("__period__".equals(name)) periodIdx = i; - else if ("__base_offset__".equals(name)) baseOffsetIdx = i; - else if (i > 0) valueIdxs.add(i); - } - } +if (periodIdx < 0 \|\| baseOffsetIdx < 0) { + valueIdxs.clear(); + for (int i = 0; i < columns.size(); i++) { + String name = columns.get(i).getName(); + if ("__period__".equals(name)) periodIdx = i; + else if ("__base_offset__".equals(name)) baseOffsetIdx = i; + else if (i > 0) valueIdxs.add(i); + } + if (periodIdx < 0 \|\| baseOffsetIdx < 0) { + throw new IllegalStateException( + "Timewrap pivot failed: required columns __period__ or __base_offset__ not found"); + } +} Suggestion importance[1-10]: 8 __ Why: The double-scan workaround is fragile and proceeds with `-1` indices if columns aren't found, which will cause `ArrayIndexOutOfBoundsException`. Adding validation after the second scan would prevent runtime crashes with a clear error message.	Medium
Possible issue	Enforce MAX_PERIODS limit validation The `MAX_PERIODS` constant is defined but never enforced in the code. Without validation, queries with more than 20 periods will succeed but may cause performance issues or unexpected behavior. Add a check in `visitTimewrap` to throw a `SemanticCheckException` when the period count exceeds this limit. core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [31] -public static final int MAX_PERIODS = 20; +// In CalciteRelNodeVisitor.visitTimewrap, after computing periods: +if (periods.size() > TimewrapUtils.MAX_PERIODS) { + throw new SemanticCheckException( + "Timewrap period count (" + periods.size() + ") exceeds maximum allowed (" + + TimewrapUtils.MAX_PERIODS + ")"); +} Suggestion importance[1-10]: 7 __ Why: The `MAX_PERIODS` constant is defined but never enforced. Without validation, queries exceeding this limit may cause performance issues. However, the suggestion points to code in `CalciteRelNodeVisitor.visitTimewrap` which is not in `TimewrapUtils.java`, making the `relevant_file` incorrect.	Medium
General	Log warning on timestamp parse failure The timestamp parsing silently returns `null` on failure, which causes `align=end` to fall back to `now` without warning. This can produce incorrect period alignment when the WHERE clause timestamp format is unexpected. Log a warning when parsing fails so users can diagnose alignment issues. core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [333-349] private static Long parseTimestampLiteral(UnresolvedExpression expr) { if (expr instanceof Literal lit && lit.getValue() instanceof String s) { try { java.time.LocalDateTime ldt = java.time.LocalDateTime.parse( s, java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")); return ldt.toEpochSecond(java.time.ZoneOffset.UTC); } catch (Exception e) { try { return java.time.Instant.parse(s).getEpochSecond(); - } catch (Exception ignored) { + } catch (Exception parseError) { + LOG.warn("Failed to parse timestamp literal '{}' for timewrap align=end, falling back to now", s); return null; } } } return null; } Suggestion importance[1-10]: 6 __ Why: Silent `null` return on parse failure causes `align=end` to fall back to `now` without warning, potentially producing incorrect period alignment. Logging would help users diagnose issues, though this is a minor improvement.	Low
General	Add defensive empty check before getFirst Accessing `values.getFirst()` without checking if `values` is empty can throw `NoSuchElementException`. Although the outer `if (!values.isEmpty())` guard exists, defensive programming suggests verifying before calling `getFirst()` to prevent future refactoring errors. opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [356-360] -ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__"); -if (boVal != null && !boVal.isNull()) { - baseOffset = boVal.longValue(); +if (!values.isEmpty()) { + ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__"); + if (boVal != null && !boVal.isNull()) { + baseOffset = boVal.longValue(); + } } Suggestion importance[1-10]: 3 __ Why: The outer `if (!values.isEmpty())` guard at line 329 already protects against empty `values`, making this additional check redundant. The suggestion adds defensive programming but offers minimal practical benefit.	Low

Suggestions up to commit 219a704

Category	Suggestion	Impact
Possible issue	Ensure ThreadLocal cleanup always executes The `ThreadLocal` cleanup only happens inside the `if` `(CalcitePlanContext.stripNullColumns.get())` block. If `stripNullColumns` is `true` but `values.isEmpty()` causes the pivot logic to be skipped, the cleanup still runs. However, if `stripNullColumns` is `true` and an exception is thrown before the `try` block is entered (e.g., during column index scanning), the cleanup still runs via `finally`. But if `stripNullColumns` is never set to `true` for a timewrap query (e.g., due to a bug in the visitor), the ThreadLocal values for `timewrapUnitName`, `timewrapSeries`, and `timewrapTimeFormat` set in `visitTimewrap` will never be cleaned up. Move the cleanup to an unconditional `finally` block outside the `if` check. opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [329-445] -if (CalcitePlanContext.stripNullColumns.get()) { - try { - ... - } finally { - CalcitePlanContext.stripNullColumns.set(false); - CalcitePlanContext.timewrapUnitName.set(null); - CalcitePlanContext.timewrapSeries.set(null); - CalcitePlanContext.timewrapTimeFormat.set(null); +try { + if (CalcitePlanContext.stripNullColumns.get()) { + String unitInfo = CalcitePlanContext.timewrapUnitName.get(); + if (unitInfo != null && !values.isEmpty()) { + // ... pivot logic ... + } } +} finally { + CalcitePlanContext.stripNullColumns.set(false); + CalcitePlanContext.timewrapUnitName.set(null); + CalcitePlanContext.timewrapSeries.set(null); + CalcitePlanContext.timewrapTimeFormat.set(null); } Suggestion importance[1-10]: 7 __ Why: The cleanup of `timewrapUnitName`, `timewrapSeries`, and `timewrapTimeFormat` is only inside the `if (CalcitePlanContext.stripNullColumns.get())` block. If `stripNullColumns` is never set to `true` (e.g., due to a bug in `visitTimewrap`), these ThreadLocal values will leak across requests. Moving cleanup to an unconditional outer `finally` block is a valid correctness improvement.	Medium
Possible issue	Prevent stale ThreadLocal state leaks These `ThreadLocal` variables are never initialized with a default value, meaning `get()` returns `null` unless explicitly set. If a query runs on a thread that previously processed a timewrap query and the `finally` block in `buildResultSet` was skipped (e.g., due to an exception before `stripNullColumns` was set to `true`), stale values could leak into subsequent queries. Initialize them with `ThreadLocal.withInitial(() -> null)` or ensure cleanup happens unconditionally. core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java [51-57] -public static final ThreadLocal<String> timewrapUnitName = new ThreadLocal<>(); -public static final ThreadLocal<String> timewrapSeries = new ThreadLocal<>(); -public static final ThreadLocal<String> timewrapTimeFormat = new ThreadLocal<>(); +public static final ThreadLocal<String> timewrapUnitName = ThreadLocal.withInitial(() -> null); +public static final ThreadLocal<String> timewrapSeries = ThreadLocal.withInitial(() -> null); +public static final ThreadLocal<String> timewrapTimeFormat = ThreadLocal.withInitial(() -> null); Suggestion importance[1-10]: 3 __ Why: `ThreadLocal.withInitial(() -> null)` is functionally equivalent to `new ThreadLocal<>()` since both return `null` by default. The real concern about stale values is valid but is better addressed by fixing the cleanup scope (suggestion 2), not by changing initialization.	Low
General	Warn users about unimplemented series mode The `series=exact` mode silently falls back to `series=short` naming (`_s`) without any warning or error. Users specifying `series=exact` with a `time_format` will receive unexpected column names with no indication that their parameter was ignored. This should either throw a `SemanticCheckException` indicating the feature is not yet supported, or log a warning so users are aware of the fallback behavior. opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [482-495] case "exact" -> { - // series=exact: prefix_<formatted_date> - String timeFormat = CalcitePlanContext.timewrapTimeFormat.get(); - if (timeFormat == null) timeFormat = "%Y-%m-%d"; - // Compute period start timestamp: reference - absolutePeriod * spanSeconds - // absolutePeriod is in span units, need to convert to seconds - long spanSec = - TimewrapUtils.spanToSeconds( - org.opensearch.sql.ast.expression.SpanUnit.of(singular.toUpperCase()), 1) - * spanValue; - // Not enough info to compute exact date here — fall back to short naming - // TODO: pass reference epoch + span for exact date computation - yield prefix + "_s" + absolutePeriod; + // series=exact is not yet fully implemented; fall back to short naming with a warning + log.warn("timewrap series=exact is not yet supported; falling back to series=short naming"); + yield "s" + absolutePeriod; } Suggestion importance[1-10]: 5 __ Why: The `series=exact` mode silently falls back to `series=short` naming without any user-visible indication, which is a usability issue. Adding a warning log or throwing an exception would make the unimplemented behavior explicit, though the `improved_code` also changes the `yield` value from `prefix + "_s" + absolutePeriod` to `"s" + absolutePeriod` (dropping the prefix), which appears to be a bug in the suggestion itself.	Low
General	Robustly handle missing base offset value `baseOffset` is read only from the first row, but `base_offset` is computed per-row in the Calcite plan (it's a window function result). While it should be constant across all rows for a given query, relying on the first row without validation could silently produce wrong column names if the value varies. Additionally, if the first row has a null `base_offset`, `baseOffset` defaults to `0`, which may produce incorrect period naming without any warning. opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [356-360] long baseOffset = 0; -ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__"); -if (boVal != null && !boVal.isNull()) { - baseOffset = boVal.longValue(); +boolean baseOffsetFound = false; +for (ExprValue row : values) { + ExprValue boVal = row.tupleValue().get("__base_offset__"); + if (boVal != null && !boVal.isNull()) { + baseOffset = boVal.longValue(); + baseOffsetFound = true; + break; + } +} +if (!baseOffsetFound) { + log.warn("timewrap: __base_offset__ not found or null in result rows, defaulting to 0"); } Suggestion importance[1-10]: 3 __ Why: The suggestion to scan all rows for `__base_offset__` is a minor robustness improvement, but since `__base_offset__` is a window function result that should be constant across all rows, reading from the first row is acceptable. The improved code is functionally nearly identical for the expected case.	Low

Suggestions up to commit e0a03f0

Category	Suggestion	Impact
Possible issue	Missing bounds check on input field list If `fieldNames` is empty (e.g., the upstream `timechart` produced no non-metadata fields), calling `fieldNames.get(0)` will throw an `IndexOutOfBoundsException`. Similarly, if `valueFieldNames` is empty, the subsequent pivot logic will produce incorrect results. A guard check should be added to validate the input schema before proceeding. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3110-3113] List<String> fieldNames = b.peek().getRowType().getFieldNames().stream().filter(f -> !isMetadataField(f)).toList(); +if (fieldNames.size() < 2) { + throw new SemanticCheckException( + "timewrap requires a timechart with at least one aggregation field, but got: " + fieldNames); +} String tsFieldName = fieldNames.get(0); List<String> valueFieldNames = fieldNames.subList(1, fieldNames.size()); Suggestion importance[1-10]: 6 __ Why: This is a valid defensive programming suggestion that prevents an `IndexOutOfBoundsException` when `timewrap` is used without a preceding `timechart` or with an empty result. The improved code provides a clear error message and correctly validates the input schema.	Low
Possible issue	Base offset read only from potentially null first row The `baseOffset` is read only from the first row (`values.getFirst()`), but `base_offset` is supposed to be constant across all rows. If the first row happens to have a null `base_offset` (e.g., due to a null timestamp in the first row), `baseOffset` defaults to `0`, which will produce incorrect absolute column names for all columns. The code should scan for the first non-null `base_offset` value across all rows. opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [349-355] long baseOffset = 0; -if (!values.isEmpty()) { - ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__"); +for (ExprValue row : values) { + ExprValue boVal = row.tupleValue().get("__base_offset__"); if (boVal != null && !boVal.isNull()) { baseOffset = boVal.longValue(); + break; } } Suggestion importance[1-10]: 5 __ Why: The suggestion correctly identifies that reading `__base_offset__` only from the first row could fail if that row has a null timestamp. Scanning for the first non-null value is more robust, though in practice `__base_offset__` should be constant across all rows as it's a window aggregate.	Low
General	Initialize ThreadLocal with default value for safety Using `ThreadLocal` variables in a shared execution context can cause state leakage between queries if the `finally` block in the execution engine is not always reached (e.g., if an exception occurs before `buildResultSet` is called). The `timewrapUnitName` ThreadLocal is not initialized with a default value, which means `get()` returns `null` and requires null checks everywhere. Consider initializing it with `ThreadLocal.withInitial(() -> null)` and ensuring cleanup happens in a more robust location, such as a try-finally wrapping the entire query execution, not just inside `buildResultSet`. core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java [45-51] public static final ThreadLocal<Boolean> stripNullColumns = ThreadLocal.withInitial(() -> false); -public static final ThreadLocal<String> timewrapUnitName = new ThreadLocal<>(); +public static final ThreadLocal<String> timewrapUnitName = ThreadLocal.withInitial(() -> null); Suggestion importance[1-10]: 3 __ Why: The suggestion to use `ThreadLocal.withInitial(() -> null)` is functionally equivalent to `new ThreadLocal<>()` since both return `null` by default. The real concern about state leakage is valid but the fix doesn't address the cleanup robustness issue mentioned in the suggestion content.	Low
General	Hardcoded timestamp field names limit flexibility The `isTimestampField` method hardcodes only `@timestamp` and `timestamp` as valid timestamp field names. This means `extractTimestampUpperBound` will silently return `null` for any other timestamp field name, causing `align=end` to fall back to `now` without warning. The timestamp field name should be derived from the actual query context (e.g., the first field of the timechart output) rather than being hardcoded. core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [312-318] private static boolean isTimestampField(UnresolvedExpression expr) { if (expr instanceof Field field) { String name = field.getField().toString(); - return "@timestamp".equals(name) \|\| "timestamp".equals(name); + // Match common timestamp field names; ideally this should be derived from query context + return "@timestamp".equals(name) \|\| "timestamp".equals(name) \|\| name.endsWith("timestamp"); } return false; } Suggestion importance[1-10]: 3 __ Why: The concern about hardcoded field names is valid, but the proposed fix of adding `name.endsWith("timestamp")` is a partial workaround rather than a proper solution. The suggestion acknowledges the real fix should derive the field name from query context, making the improved code only marginally better.	Low

Suggestions up to commit 6276cba

Category	Suggestion	Impact
Possible issue	ThreadLocal state may leak between queries on error These `ThreadLocal` variables are static fields on a shared class and are only reset in the `finally` block of `buildResultSet`. If an exception occurs before `buildResultSet` is reached, or if the query is cancelled, the ThreadLocals will retain stale values and corrupt subsequent queries on the same thread. Consider using a try/finally pattern that resets these values immediately after the plan is built, or reset them at the start of each query execution. core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java [45-51] public static final ThreadLocal<Boolean> stripNullColumns = ThreadLocal.withInitial(() -> false); public static final ThreadLocal<String> timewrapUnitName = new ThreadLocal<>(); +public static void resetTimewrapState() { + stripNullColumns.set(false); + timewrapUnitName.set(null); +} + Suggestion importance[1-10]: 6 __ Why: The `stripNullColumns` and `timewrapUnitName` ThreadLocals are only reset in the `finally` block of `buildResultSet`, but if an exception occurs before that point, stale values could corrupt subsequent queries on the same thread. The suggestion is valid but the `improved_code` only adds a helper method without actually changing the reset pattern, making it a partial fix.	Low
Possible issue	Missing guard against empty field list causes potential crash The code assumes the first non-metadata field is always the timestamp field. If the upstream `timechart` output has a different column order, or if `fieldNames` is empty after filtering, this will either produce incorrect results or throw an `IndexOutOfBoundsException`. Add a guard for the empty case and consider validating that the first field is actually a timestamp type. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3110-3113] List<String> fieldNames = b.peek().getRowType().getFieldNames().stream().filter(f -> !isMetadataField(f)).toList(); + if (fieldNames.isEmpty()) { + throw new SemanticCheckException("timewrap: no fields found in timechart output"); + } String tsFieldName = fieldNames.get(0); List<String> valueFieldNames = fieldNames.subList(1, fieldNames.size()); Suggestion importance[1-10]: 5 __ Why: If `fieldNames` is empty after filtering metadata fields, `fieldNames.get(0)` will throw an `IndexOutOfBoundsException`. Adding an explicit guard with a meaningful error message improves robustness and debuggability.	Low
General	Upper bound extraction stops too early in AST walk The method only checks the first `Filter` node encountered in the AST walk and returns immediately. If the first child is not a `Filter` but a subsequent child is (e.g., there is a `Sort` or `Project` between `Timewrap` and the `Filter`), the upper bound will not be found and `align=end` will silently fall back to `now`. The loop should continue walking past non-Filter nodes rather than returning only on the first Filter found. core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [282-291] public static Long extractTimestampUpperBound(Timewrap node) { Node current = node; while (current != null && !current.getChild().isEmpty()) { current = current.getChild().get(0); if (current instanceof Filter filter) { - return findUpperBound(filter.getCondition()); + Long bound = findUpperBound(filter.getCondition()); + if (bound != null) return bound; } } return null; } Suggestion importance[1-10]: 5 __ Why: The current code returns `null` immediately after the first `Filter` node even if `findUpperBound` returns `null`, potentially missing a valid upper bound in a deeper `Filter`. The improved code continues walking past non-matching filters, which is more robust for `align=end` behavior.	Low
General	Silent fallback produces incorrect column names when offset missing `baseOffset` defaults to `0` if `base_offset` is missing or null. A `baseOffset` of `0` will cause all period columns to be named as if the reference is the same as the latest data point, silently producing wrong column names (e.g., `latest_day` instead of `5days_before`) without any error. This should at minimum log a warning, or throw if the column is expected but missing. opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [349-355] long baseOffset = 0; if (!values.isEmpty()) { ExprValue boVal = values.getFirst().tupleValue().get("__base_offset__"); if (boVal != null && !boVal.isNull()) { baseOffset = boVal.longValue(); + } else { + // __base_offset__ missing: column naming will be relative, not absolute + log.warn("timewrap: __base_offset__ not found in result; column names may be incorrect"); } } Suggestion importance[1-10]: 4 __ Why: A silent default of `0` for `baseOffset` when `__base_offset__` is missing will produce incorrect column names without any indication of the problem. Adding a warning log is a reasonable improvement, though the `improved_code` only adds a log statement rather than a stronger error signal.	Low

Suggestions up to commit e0a03f0

Category	Suggestion	Impact
Possible issue	Add validation to prevent index out of bounds If `fieldNames` is empty (e.g., the upstream `timechart` produced no non-metadata fields), `fieldNames.get(0)` will throw an `IndexOutOfBoundsException`. Similarly, if there are no value fields, `valueFieldNames` will be empty and subsequent operations like `valueFieldNames.get(0)` will fail. Add a validation check to provide a clear error message. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3110-3113] List<String> fieldNames = b.peek().getRowType().getFieldNames().stream().filter(f -> !isMetadataField(f)).toList(); +if (fieldNames.size() < 2) { + throw new SemanticCheckException( + "timewrap requires a timechart with at least one aggregation field"); +} String tsFieldName = fieldNames.get(0); List<String> valueFieldNames = fieldNames.subList(1, fieldNames.size()); Suggestion importance[1-10]: 6 __ Why: This is a valid defensive check — if `timewrap` is used without a preceding `timechart` or with an empty result, `fieldNames.get(0)` would throw an `IndexOutOfBoundsException`. The improved code provides a clear error message instead, improving debuggability.	Low
Possible issue	Ensure ThreadLocal state is always cleaned up Using `static` `ThreadLocal` variables in a shared context can cause state leakage between concurrent queries if the `finally` block in the execution engine is not always reached (e.g., due to exceptions before `buildResultSet`). Consider resetting these thread-locals in a more robust lifecycle hook, or ensure they are always cleaned up. Additionally, `timewrapUnitName` has no initial value, which means `get()` returns `null` and requires null checks everywhere it is used. core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java [45-51] public static final ThreadLocal<Boolean> stripNullColumns = ThreadLocal.withInitial(() -> false); -public static final ThreadLocal<String> timewrapUnitName = new ThreadLocal<>(); +public static final ThreadLocal<String> timewrapUnitName = ThreadLocal.withInitial(() -> null); Suggestion importance[1-10]: 4 __ Why: The suggestion about `timewrapUnitName` having no initial value is valid but minor — `ThreadLocal.withInitial(() -> null)` is functionally equivalent to `new ThreadLocal<>()` since both return `null` by default. The concern about state leakage is real but the `improved_code` only changes `new ThreadLocal<>()` to `ThreadLocal.withInitial(() -> null)`, which doesn't actually address the cleanup concern.	Low
General	Fix fragile column name parsing with ambiguous separators The `unitInfo` format expected by `renameTimewrapColumn` is `"spanValue\|singular\|plural\|_before"` (4 parts split by `\|`), but the value being set here is `"spanValue\|singular\|plural\|_before"` where `unitBaseName` returns `"value\|singular\|plural"` and `"\|_before"` is appended — this produces `"value\|singular\|plural\|_before"` which is correct. However, the suffix `_before` in `nameSuffix` is parsed as `parts[3]` which would be `"_before"`, but the column names end with `"_before"` (with underscore). The column name format is `prefix_N_before`, so `nameSuffix` should be `"_before"` and the check `name.endsWith(nameSuffix)` would match. Verify that the separator between the period number and `_before` in the column name is consistent with what `renameTimewrapColumn` expects — currently columns are named like `sum(requests)_3_before` but the rename logic strips `before` and then splits on the last to get the period number, which would incorrectly split `sum(requests)_3` at the last underscore giving prefix=`sum(requests)` and period=`3`. This works only if the value field name contains no trailing underscore-number pattern, but could silently misparse for field names like `my_field_3`. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3104-3105] +// Use a distinct separator that won't appear in field names, e.g., "__before__" CalcitePlanContext.timewrapUnitName.set( - TimewrapUtils.unitBaseName(node.getUnit(), node.getValue()) + "\|_before"); + TimewrapUtils.unitBaseName(node.getUnit(), node.getValue()) + "\|__before__"); Suggestion importance[1-10]: 5 __ Why: The suggestion raises a valid concern about ambiguous parsing when field names contain underscore-number patterns, but the `improved_code` only changes the separator string without addressing the actual parsing logic in `renameTimewrapColumn`. The fix is incomplete and would require coordinated changes in the execution engine as well.	Low
General	Remove hardcoded timestamp field name assumption The `isTimestampField` method hardcodes only `@timestamp` and `timestamp` as valid timestamp field names. This means `extractTimestampUpperBound` will silently return `null` for any other timestamp field name (e.g., `event_time`, `ts`), causing `timewrap` to fall back to `now` even when `align=end` is specified. The timestamp field name should be derived from the actual schema rather than hardcoded. core/src/main/java/org/opensearch/sql/calcite/utils/TimewrapUtils.java [312-318] +// Consider passing the actual timestamp field name from the Timewrap node or its child schema +// instead of hardcoding. As a minimal fix, make the check case-insensitive and more general: private static boolean isTimestampField(UnresolvedExpression expr) { if (expr instanceof Field field) { - String name = field.getField().toString(); - return "@timestamp".equals(name) \|\| "timestamp".equals(name); + String name = field.getField().toString().toLowerCase(); + return name.contains("timestamp") \|\| name.equals("ts") \|\| name.equals("time"); } return false; } Suggestion importance[1-10]: 4 __ Why: The concern about hardcoded field names is valid, but the proposed fix of using `contains("timestamp")` is overly broad and could match unintended fields. The suggestion acknowledges the proper fix (deriving from schema) but doesn't implement it, making the `improved_code` a potentially worse solution.	Low

github-actions · 2026-03-18T19:09:55Z

Persistent review updated to latest commit 6276cba

github-actions · 2026-03-18T20:32:58Z

Persistent review updated to latest commit e0a03f0

github-actions · 2026-03-19T20:23:32Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 4ba32b8.

Path	Line	Severity	Description
opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java	477	low	renameTimewrapColumn is defined but never called anywhere in the diff — dead code that references SpanUnit.of(singular.toUpperCase()) with a value derived from user-controlled query input. Not directly exploitable through the parser constraints, but the unused method adds unnecessary attack surface.
opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java	333	low	ThreadLocal variables (stripNullColumns, timewrapUnitName, timewrapSeries, timewrapTimeFormat) are set during query planning in visitTimewrap but only cleaned up inside buildResultSet's try/finally. If buildResultSet is never reached on an error path, ThreadLocals persist for subsequent requests on the same pooled thread, potentially leaking query configuration state across unrelated user queries.
opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java	516	low	The series=exact branch silently falls back to series=short behavior with no warning or error (TODO comment acknowledges this). Users requesting exact date-formatted column names receive short numeric names instead, which is a silent behavioral discrepancy but not exploitable.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 0 | Medium: 0 | Low: 3

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-19T20:24:33Z

Persistent review updated to latest commit 219a704

Implement the timewrap command that reshapes timechart output by wrapping each time period into a separate data series, enabling day-over-day, week-over-week, and other recurring interval comparisons. Signed-off-by: Jialiang Li <jialiang.li@hey.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>

Add timewrap_test to SHOW TABLES expected output (25 tables). Signed-off-by: Jialiang Li <jialiang.li@hey.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>

…etic - Extract all timewrap helper methods to TimewrapUtils.java in calcite/utils/ - Add precise EXTRACT-based period computation for month/quarter/year - Add cumDaysBeforeMonth with leap year CASE expression for precise quarter offset - Month/quarter/year period assignment now uses calendar arithmetic instead of approximate fixed-length conversions Signed-off-by: Jialiang Li <jialiang.li@hey.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>

- Remove Calcite PIVOT from timewrap: no more MAX_PERIODS limit or crash risk - Add post-processing pivot in execution engine: dynamically builds columns from actual data using HashMap grouping - Add series parameter: series=relative (default), series=short (s0, s1) - Add series=exact grammar support (falls back to short at runtime) - Add time_format grammar support for series=exact - Benchmark: 1000 period columns in ~50ms (Calcite PIVOT crashed at 1000) - 32 IT tests, all using verifySchema + verifyDataRows Signed-off-by: Jialiang Li <jialiang.li@hey.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>

github-actions · 2026-06-25T17:05:26Z

Persistent review updated to latest commit 4ba32b8

The timewrap pivot (turning the unpivoted [display_ts, value, __base_offset__, __period__] rows into Splunk-style period columns) was done as post-processing in OpenSearchExecutionEngine.buildResultSet, gated on CalcitePlanContext thread-locals. The analytics-engine route executes the RelNode via AnalyticsExecutionEngine and never reaches that code, so timewrap queries came back unpivoted (CalciteTimewrapCommandIT 0/32 on the analytics route). Extract the pivot into a shared core helper (TimewrapPivot) and call it from both execution engines so they produce identical output. AnalyticsExecutionEngine captures the timewrap signals at execute() entry via TimewrapSignals because the result callback runs on a different worker thread than the planning thread that set the thread-locals. CalciteTimewrapCommandIT: 32/32 on both the Calcite/v2 and analytics-engine routes. Signed-off-by: Kai Huang <ahkcs@amazon.com>

ahkcs changed the title ~~Add PPL timewrap command for time-period comparison~~ Support PPL timewrap command Mar 18, 2026

ahkcs added PPL Piped processing language feature labels Mar 18, 2026

opensearch-project deleted a comment from github-actions Bot Mar 18, 2026

ahkcs force-pushed the feature/ppl-timewrap-command branch from 6276cba to e0a03f0 Compare March 18, 2026 20:30

ahkcs mentioned this pull request Mar 19, 2026

[RFC] PPL timewrap command #5237

Open

ahkcs added 4 commits June 25, 2026 10:00

Update metadata.rst doctest for timewrap_test index

57495d0

Add timewrap_test to SHOW TABLES expected output (25 tables). Signed-off-by: Jialiang Li <jialiang.li@hey.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>

ahkcs force-pushed the feature/ppl-timewrap-command branch from 219a704 to 4ba32b8 Compare June 25, 2026 17:03

ahkcs added this to PPL 2026 Roadmap Jun 25, 2026

github-project-automation Bot moved this to Todo in PPL 2026 Roadmap Jun 25, 2026

ahkcs removed this from PPL 2026 Roadmap Jun 25, 2026

ahkcs marked this pull request as ready for review June 25, 2026 17:43

ahkcs requested review from dai-chen, joshuali925, mengweieric and ps48 as code owners June 25, 2026 17:43

ahkcs requested review from LantaoJin, RyanL1997, Swiddis, acarbonetto, anirudha, noCharger, penghuo, qianheng-aws, songkant-aws, vamsimanohar, ykmr1224 and yuancu as code owners June 25, 2026 17:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Support PPL timewrap command #5241

Support PPL timewrap command #5241
ahkcs wants to merge 5 commits into
opensearch-project:mainfrom
ahkcs:feature/ppl-timewrap-command

ahkcs commented Mar 18, 2026

Uh oh!

github-actions Bot commented Mar 18, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Mar 18, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Mar 18, 2026

Uh oh!

github-actions Bot commented Mar 18, 2026

Uh oh!

github-actions Bot commented Mar 19, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Mar 19, 2026

Uh oh!

github-actions Bot commented Jun 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

ahkcs commented Mar 18, 2026

Description

Key features

Implementation

Known limitations

Testing

Issues Resolved

Check List

Uh oh!

github-actions Bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit 4ba32b8)

Uh oh!

github-actions Bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions Bot commented Mar 18, 2026

Uh oh!

github-actions Bot commented Mar 18, 2026

Uh oh!

github-actions Bot commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Analyzer ❗

Uh oh!

github-actions Bot commented Mar 19, 2026

Uh oh!

github-actions Bot commented Jun 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented Mar 18, 2026 •

edited

Loading

(Review updated until commit `4ba32b8`)

github-actions Bot commented Mar 18, 2026 •

edited

Loading

github-actions Bot commented Mar 19, 2026 •

edited

Loading