Time Unit Unification for bin/stats#4450
Conversation
dbcf2ba to
77d1596
Compare
Thanks @dai-chen for the suggestion, however bin span is indeed different with the span elsewhere
I think it's still necessary to separate the implementation of bin span and other span |
|
This PR is stalled because it has been open for 2 weeks with no activity. |
77d1596 to
0fe134c
Compare
I see. We can consider how to extract the common logic later. |
|
This PR is stalled because it has been open for 2 weeks with no activity. |
Signed-off-by: Kai Huang <ahkcs@amazon.com> # Conflicts: # integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java # Conflicts: # docs/user/ppl/cmd/bin.rst # docs/user/ppl/cmd/stats.rst # integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
0fe134c to
dfb0a18
Compare
Signed-off-by: Kai Huang <ahkcs@amazon.com>
📝 WalkthroughSummary by CodeRabbitRelease Notes
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughThis change standardizes time unit handling in the binning utility to support case-sensitive unit parsing. It introduces consistent mapping for month ( Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/user/ppl/cmd/bin.md (1)
21-21: Month unit alias now matches parser; consider an explicit m vs M noteAdding
Mas an alias for month is consistent with the new case-sensitive behavior and the bin tests. To make the case-sensitivity more obvious to users, consider adding a brief note near the time-unit list that explicitly statesm= minute andM= month, so readers don't have to infer it from the two separate bullets.core/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java (1)
52-56: SpanParser unit handling matches requirements; consider centralizing the case-sensitive unit setThe additions of
"M" -> "months"inNORMALIZED_UNITS, the case-preserving entries inUNIT_LOOKUP, and the updatedextractTimeUnit(case-sensitive forM,m,us,cs,ds, case-insensitive for others with longest-match and boundary checks) correctly implement the desiredmvsMand subsecond behavior for bin spans.For maintainability, you might:
- Extract the set
{ "M", "m", "us", "cs", "ds" } into a shared constant (or utility) and reuse it both here and inTimeUnitRegistry.getConfig` to avoid divergence.- Optionally precompute
String lowerSpan = spanStr.toLowerCase(Locale.ROOT);once inextractTimeUnitand reuse it in the case-insensitive branch to avoid repeated allocations.Behavior is fine as-is; these would just reduce duplication and make future unit additions safer.
Also applies to: 65-77, 146-178
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
core/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java(3 hunks)core/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.java(3 hunks)core/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.java(3 hunks)docs/user/ppl/cmd/bin.md(1 hunks)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java(1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.javacore/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.javacore/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.javacore/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.javacore/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify test coverage for new business logic
- Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Verify SQL generation and optimization paths
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.javacore/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.javacore/src/main/java/org/opensearch/sql/calcite/utils/binning/SpanParser.java
🧠 Learnings (2)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (3)
core/src/main/java/org/opensearch/sql/calcite/utils/binning/time/TimeUnitRegistry.java (1)
17-40: Case-sensitive handling for M/m and subsecond units inTimeUnitRegistrylooks consistentThe updated
UNIT_MAPPINGplus thegetConfigbranch forM,m,us,cs, anddsalign with the intended semantics:mas minutes,Mas months, and explicit configs for subsecond units, while keeping other aliases (e.g.,secs,mins,hrs) case-insensitive. This matches the SpanParser behavior and should give consistent unit handling across commands.Also applies to: 46-47, 55-60, 65-79
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java (1)
1080-1160: New bin integration tests for months and subsecond units are well targeted
testBinCaseSensitivity_mon_vs_MandtestBinWithSubsecondUnitsexercise the key new behaviors: unified month handling for1Mvs1monand support forms,us,cs, anddsspans, including schema expectations. The assertions are precise and consistent with the rest of this IT class, giving good regression coverage for the time-unit changes.core/src/main/java/org/opensearch/sql/calcite/utils/binning/handlers/TimeSpanHelper.java (1)
31-42: TimeSpanHelper now correctly reuses SpanParser normalization for time unitsSwitching
shouldApplyAligntime,createAlignedTimeSpan, andcreateStandardTimeSpanto useSpanParser.extractTimeUnitandSpanParser.getNormalizedUnitremoves duplicated normalization logic and keeps bin/stats alignment decisions consistent with the new unit semantics (dvsmonths,mvsM, subsecond units). The conditions for when to ignore aligntime (days/months) remain intact, and the code stays straightforward.Also applies to: 59-72, 83-96
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4450-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5bb274740685a57d1798e70ab43f6859e3d7ee81
# Push it to GitHub
git push --set-upstream origin backport/backport-4450-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
(cherry picked from commit 5bb2747)
(cherry picked from commit 5bb2747) Signed-off-by: Kai Huang <ahkcs@amazon.com>
Description
Implements unified time unit support across PPL commands with case-sensitive handling.
Key Changes:
bincommand: Uppercase M = month, lowercase m = minuteRelated Issues
Resolves #4397