Skip to content

Time Unit Unification for bin/stats#4450

Merged
Swiddis merged 2 commits into
opensearch-project:mainfrom
ahkcs:feature/unified-time-units-4397
Dec 10, 2025
Merged

Time Unit Unification for bin/stats#4450
Swiddis merged 2 commits into
opensearch-project:mainfrom
ahkcs:feature/unified-time-units-4397

Conversation

@ahkcs

@ahkcs ahkcs commented Oct 7, 2025

Copy link
Copy Markdown
Collaborator

Description

Implements unified time unit support across PPL commands with case-sensitive handling.

Key Changes:

  • Fixed case sensitivity for bin command: Uppercase M = month, lowercase m = minute
  • Added subsecond test cases to bin command: us, cs, ds
  • Documented limitations: stats command cannot support subsecond units (OpenSearch calendar intervals)

Related Issues

Resolves #4397

@dai-chen dai-chen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we have single place for span parsing logic (currently in AstExpressionBuilder.visitSpan?) instead of trying to parse span string everywhere? Any difference between span in bin options and span elsewhere?

@ahkcs ahkcs force-pushed the feature/unified-time-units-4397 branch from dbcf2ba to 77d1596 Compare October 15, 2025 19:50
@ahkcs

ahkcs commented Oct 15, 2025

Copy link
Copy Markdown
Collaborator Author

Shall we have single place for span parsing logic (currently in AstExpressionBuilder.visitSpan?) instead of trying to parse span string everywhere? Any difference between span in bin options and span elsewhere?

Thanks @dai-chen for the suggestion, however bin span is indeed different with the span elsewhere
For Span Types:

  • stats/timechart: Numeric + Time only
  • bin: Numeric + Time + Logarithmic (log2, log10, 2.5log3)

I think it's still necessary to separate the implementation of bin span and other span

@dai-chen dai-chen added enhancement New feature or request PPL Piped processing language backport 2.19-dev labels Oct 22, 2025
@opensearch-trigger-bot

Copy link
Copy Markdown
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

@ahkcs ahkcs force-pushed the feature/unified-time-units-4397 branch from 77d1596 to 0fe134c Compare November 19, 2025 20:12
dai-chen
dai-chen previously approved these changes Nov 21, 2025
@dai-chen

Copy link
Copy Markdown
Collaborator

Shall we have single place for span parsing logic (currently in AstExpressionBuilder.visitSpan?) instead of trying to parse span string everywhere? Any difference between span in bin options and span elsewhere?

Thanks @dai-chen for the suggestion, however bin span is indeed different with the span elsewhere For Span Types:

  • stats/timechart: Numeric + Time only
  • bin: Numeric + Time + Logarithmic (log2, log10, 2.5log3)

I think it's still necessary to separate the implementation of bin span and other span

I see. We can consider how to extract the common logic later.

@opensearch-trigger-bot

Copy link
Copy Markdown
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

# Conflicts:
#	integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java

# Conflicts:
#	docs/user/ppl/cmd/bin.rst
#	docs/user/ppl/cmd/stats.rst
#	integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
@ahkcs ahkcs force-pushed the feature/unified-time-units-4397 branch from 0fe134c to dfb0a18 Compare December 9, 2025 22:27
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@coderabbitai

coderabbitai Bot commented Dec 9, 2025

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for "M" as a month unit alias in time span parameters.
    • Expanded time unit options with new short forms: secs, mins, hrs.
    • Enhanced subsecond time unit support for milliseconds, microseconds, centiseconds, and deciseconds.
  • Documentation

    • Updated bin command documentation to reflect newly supported time unit aliases.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

This change standardizes time unit handling in the binning utility to support case-sensitive unit parsing. It introduces consistent mapping for month (M) and minute (m), adds subsecond unit support (microseconds, centiseconds, deciseconds), and consolidates unit normalization logic through a public SpanParser API while updating related code and documentation.

Changes

Cohort / File(s) Summary
Core Binning Utilities
core/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java, core/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.java, core/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.java
Introduces case-sensitive time unit mapping (M for months, m for minutes), expands subsecond unit support (us, cs, ds), and refactors unit normalization to consolidate logic in SpanParser.getNormalizedUnit(). Updated TimeSpanHelper to use public API instead of private helper. Removes toLowerCase call from input normalization.
Documentation
docs/user/ppl/cmd/bin.md
Added "M" as an alias for month unit in bin command span parameter options.
Integration Tests
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
Added testBinCaseSensitivity_mon_vs_M() to validate month unit consistency and testBinWithSubsecondUnits() to validate subsecond span binning behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Case-sensitivity rules: Verify that SpanParser correctly applies case-sensitive suffix matching for M, m, us, cs, ds and case-insensitive matching for all other units while preserving word-boundary semantics.
  • API consolidation: Confirm TimeSpanHelper correctly uses SpanParser.getNormalizedUnit() across all methods and that the updated unit comparisons ("d" and "months") are correct.
  • Unit mapping completeness: Ensure TimeUnitRegistry.getConfig() handles all case-sensitive units consistently and the expanded mappings (secs, mins, hrs) align with the rest of the system.
  • Test coverage: Validate that new tests adequately cover case-sensitivity and subsecond unit binning scenarios.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Time Unit Unification for bin/stats' accurately summarizes the main change: implementing unified time unit support with case-sensitive handling across PPL commands.
Description check ✅ Passed The description clearly relates to the changeset, explaining the key changes in case sensitivity for the bin command, subsecond unit support, and documentation of stats limitations.
Linked Issues check ✅ Passed The PR implements all coding requirements from issue #4397: case-sensitive time units (m=minute, M=month), added subsecond unit support (us, cs, ds), and updated relevant parsing logic across SpanParser, TimeSpanHelper, and TimeUnitRegistry.
Out of Scope Changes check ✅ Passed All changes are directly related to implementing unified time unit support: core parsing logic updates, test additions for case sensitivity and subsecond units, and documentation updates align with the stated objectives.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/user/ppl/cmd/bin.md (1)

21-21: Month unit alias now matches parser; consider an explicit m vs M note

Adding M as an alias for month is consistent with the new case-sensitive behavior and the bin tests. To make the case-sensitivity more obvious to users, consider adding a brief note near the time-unit list that explicitly states m = minute and M = month, so readers don't have to infer it from the two separate bullets.

core/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java (1)

52-56: SpanParser unit handling matches requirements; consider centralizing the case-sensitive unit set

The additions of "M" -> "months" in NORMALIZED_UNITS, the case-preserving entries in UNIT_LOOKUP, and the updated extractTimeUnit (case-sensitive for M, m, us, cs, ds, case-insensitive for others with longest-match and boundary checks) correctly implement the desired m vs M and subsecond behavior for bin spans.

For maintainability, you might:

  • Extract the set { "M", "m", "us", "cs", "ds" } into a shared constant (or utility) and reuse it both here and in TimeUnitRegistry.getConfig` to avoid divergence.
  • Optionally precompute String lowerSpan = spanStr.toLowerCase(Locale.ROOT); once in extractTimeUnit and reuse it in the case-insensitive branch to avoid repeated allocations.

Behavior is fine as-is; these would just reduce duplication and make future unit additions safer.

Also applies to: 65-77, 146-178

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5f963a0 and 4461387.

📒 Files selected for processing (5)
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java (3 hunks)
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.java (3 hunks)
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.java (3 hunks)
  • docs/user/ppl/cmd/bin.md (1 hunks)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java

⚙️ CodeRabbit configuration file

**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)

  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure methods are under 20 lines with single responsibility
  • Verify proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Verify integration tests are in correct module (integ-test/)

  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify test coverage for new business logic

  • Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor

  • Verify SQL generation and optimization paths
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java
🧠 Learnings (2)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: CodeQL-Scan (java)
  • GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (3)
core/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.java (1)

17-40: Case-sensitive handling for M/m and subsecond units in TimeUnitRegistry looks consistent

The updated UNIT_MAPPING plus the getConfig branch for M, m, us, cs, and ds align with the intended semantics: m as minutes, M as months, and explicit configs for subsecond units, while keeping other aliases (e.g., secs, mins, hrs) case-insensitive. This matches the SpanParser behavior and should give consistent unit handling across commands.

Also applies to: 46-47, 55-60, 65-79

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java (1)

1080-1160: New bin integration tests for months and subsecond units are well targeted

testBinCaseSensitivity_mon_vs_M and testBinWithSubsecondUnits exercise the key new behaviors: unified month handling for 1M vs 1mon and support for ms, us, cs, and ds spans, including schema expectations. The assertions are precise and consistent with the rest of this IT class, giving good regression coverage for the time-unit changes.

core/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.java (1)

31-42: TimeSpanHelper now correctly reuses SpanParser normalization for time units

Switching shouldApplyAligntime, createAlignedTimeSpan, and createStandardTimeSpan to use SpanParser.extractTimeUnit and SpanParser.getNormalizedUnit removes duplicated normalization logic and keeps bin/stats alignment decisions consistent with the new unit semantics (d vs months, m vs M, subsecond units). The conditions for when to ignore aligntime (days/months) remain intact, and the code stays straightforward.

Also applies to: 59-72, 83-96

@Swiddis Swiddis merged commit 5bb2747 into opensearch-project:main Dec 10, 2025
38 checks passed
@opensearch-trigger-bot

Copy link
Copy Markdown
Contributor

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4450-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5bb274740685a57d1798e70ab43f6859e3d7ee81
# Push it to GitHub
git push --set-upstream origin backport/backport-4450-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-4450-to-2.19-dev.

ahkcs added a commit to ahkcs/sql that referenced this pull request Dec 10, 2025
ahkcs added a commit to ahkcs/sql that referenced this pull request Dec 10, 2025
(cherry picked from commit 5bb2747)
Signed-off-by: Kai Huang <ahkcs@amazon.com>
LantaoJin pushed a commit that referenced this pull request Dec 11, 2025
(cherry picked from commit 5bb2747)

Signed-off-by: Kai Huang <ahkcs@amazon.com>
@LantaoJin LantaoJin added the backport-manually Filed a PR to backport manually. label Dec 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport 2.19-dev backport-failed backport-manually Filed a PR to backport manually. enhancement New feature or request PPL Piped processing language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Inconsistent time unit support across PPL commands

4 participants