[#11017] improvement(clickhouse): Support date partition transforms #11018

Merged
diqiu50 merged 7 commits into main from yuqi/clickhouse-partition-transforms on May 15, 2026

@yuqi1129 yuqi1129 commented May 9, 2026

What changes were proposed in this pull request?

Support ClickHouse table creation with Gravitino date partition transforms:

  • Transforms.year(column) -> PARTITION BY toYear(column)
  • Transforms.month(column) -> PARTITION BY toYYYYMM(column)
  • Transforms.day(column) -> PARTITION BY toDate(column)
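
The mapping above can be sketched as a small standalone Java lookup. This is only an illustration, not the code from this PR: the class `PartitionClauseSketch`, its `NAME_TO_FUNCTION` table, and the method `partitionBy` are hypothetical names, transforms are represented by plain strings rather than Gravitino transform objects, and identifier quoting is omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical standalone sketch; not the Gravitino implementation.
final class PartitionClauseSketch {
  // Transform name -> ClickHouse function, as listed in the PR description.
  private static final Map<String, String> NAME_TO_FUNCTION = new LinkedHashMap<>();

  static {
    NAME_TO_FUNCTION.put("year", "toYear");
    NAME_TO_FUNCTION.put("month", "toYYYYMM");
    NAME_TO_FUNCTION.put("day", "toDate");
  }

  // Renders a PARTITION BY clause for one transform applied to one column.
  // Assumption: the column name needs no quoting or escaping.
  static String partitionBy(String transformName, String column) {
    String function = NAME_TO_FUNCTION.get(transformName);
    if (function == null) {
      throw new IllegalArgumentException("Unsupported partition transform: " + transformName);
    }
    return "PARTITION BY " + function + "(" + column + ")";
  }

  public static void main(String[] args) {
    // Print the clause for each supported transform on a sample column.
    for (String name : NAME_TO_FUNCTION.keySet()) {
      System.out.println(name + " -> " + partitionBy(name, "event_time"));
    }
  }
}
```

For example, `partitionBy("month", "event_time")` yields `PARTITION BY toYYYYMM(event_time)`, while an unlisted name such as `"hour"` is rejected with an exception.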

This PR also removes the stale identity-only helper in ClickHouseTableOperations and uses the shared ClickHouse SQL utility path for partition expression rendering.

Why are the changes needed?

ClickHouse catalog table creation previously only supported identity partitioning, while loading existing ClickHouse tables already recognized toYear, toYYYYMM, and toDate partition expressions.

Fix: #11017

Does this PR introduce any user-facing change?

Yes. Users can now create ClickHouse MergeTree-family tables with year, month, and day partition transforms through Gravitino.

How was this patch tested?

./gradlew :catalogs-contrib:catalog-jdbc-clickhouse:spotlessApply
./gradlew :catalogs-contrib:catalog-jdbc-clickhouse:test --tests org.apache.gravitino.catalog.clickhouse.operations.TestClickHouseTableOperations -PskipITs
./gradlew :catalogs-contrib:catalog-jdbc-clickhouse:test --tests org.apache.gravitino.catalog.clickhouse.integration.test.CatalogClickHouseIT.testCreateAndLoadWithPartitionSortAndIndexes -PskipTests -PskipDockerTests=false

Copilot AI review requested due to automatic review settings May 9, 2026 11:42

Copilot AI left a comment

Pull request overview

Adds support in the ClickHouse JDBC catalog for rendering Gravitino date-based partition transforms into ClickHouse PARTITION BY expressions during table creation, aligning create-table behavior with the existing SHOW CREATE TABLE parsing support.

Changes:

  • Extend ClickHouse partition expression rendering to support year, month, and day transforms (toYear, toYYYYMM, toDate).
  • Remove the stale identity-only partition helper from ClickHouseTableOperations and route partition rendering through the shared SQL utility.
  • Add/adjust unit and integration tests to validate create + load roundtrip with month partitioning.
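
The create-and-load round trip depends on reading a ClickHouse partition expression back into a transform. A rough standalone sketch of that reverse direction follows. `PartitionExpressionParserSketch`, `parse`, and the two-element string-array result are hypothetical, and the catalog's actual SHOW CREATE TABLE parsing may well differ; here a bare column name is assumed to mean identity partitioning.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone sketch; not the Gravitino implementation.
final class PartitionExpressionParserSketch {
  // Matches toYear(col), toYYYYMM(col), or toDate(col), with optional backticks.
  private static final Pattern FUNCTION_CALL =
      Pattern.compile("^(toYear|toYYYYMM|toDate)\\(\\s*`?([A-Za-z_][A-Za-z0-9_]*)`?\\s*\\)$");
  // Matches a bare column name, treated here as identity partitioning.
  private static final Pattern BARE_COLUMN = Pattern.compile("^`?([A-Za-z_][A-Za-z0-9_]*)`?$");

  // Returns {transformName, columnName} for a single partition expression.
  static String[] parse(String expression) {
    String trimmed = expression.trim();
    Matcher call = FUNCTION_CALL.matcher(trimmed);
    if (call.matches()) {
      String name;
      switch (call.group(1)) {
        case "toYear":
          name = "year";
          break;
        case "toYYYYMM":
          name = "month";
          break;
        default: // the pattern only leaves toDate here
          name = "day";
          break;
      }
      return new String[] {name, call.group(2)};
    }
    Matcher column = BARE_COLUMN.matcher(trimmed);
    if (column.matches()) {
      return new String[] {"identity", column.group(1)};
    }
    throw new IllegalArgumentException("Unrecognized partition expression: " + expression);
  }

  public static void main(String[] args) {
    String[] parsed = parse("toYYYYMM(event_time)");
    System.out.println(parsed[0] + "(" + parsed[1] + ")");
  }
}
```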

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| catalogs-contrib/catalog-jdbc-clickhouse/src/main/java/org/apache/gravitino/catalog/clickhouse/operations/ClickHouseTableSqlUtils.java | Implement year/month/day partition transform rendering and centralize single-column validation. |
| catalogs-contrib/catalog-jdbc-clickhouse/src/main/java/org/apache/gravitino/catalog/clickhouse/operations/ClickHouseTableOperations.java | Remove local partition-expression helper; use shared SQL utility for PARTITION BY rendering. |
| catalogs-contrib/catalog-jdbc-clickhouse/src/test/java/org/apache/gravitino/catalog/clickhouse/operations/TestClickHouseTableOperations.java | Add unit assertions for PARTITION BY toYear/toYYYYMM/toDate generation. |
| catalogs-contrib/catalog-jdbc-clickhouse/src/test/java/org/apache/gravitino/catalog/clickhouse/integration/test/CatalogClickHouseIT.java | Update integration test to create/load a table using month(event_time) partitioning and assert the loaded transform. |

@yuqi1129 yuqi1129 changed the title from "[#11017] feat(clickhouse): Support date partition transforms" to "[#11017] improvement(clickhouse): Support date partition transforms" May 9, 2026
@yuqi1129 yuqi1129 requested a review from Copilot May 9, 2026 12:34

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

github-actions Bot commented May 9, 2026

Code Coverage Report

Overall Project 66.08% +0.19% 🟢
Files changed 85.73% 🟢

Module Coverage
aliyun 1.72% 🔴
api 47.13% 🟢
authorization-common 85.96% 🟢
aws 1.08% 🔴
azure 2.47% 🔴
catalog-common 10.2% 🔴
catalog-fileset 80.02% 🟢
catalog-glue 83.41% 🟢
catalog-hive 81.83% 🟢
catalog-jdbc-clickhouse 80.02% +10.02% 🟢
catalog-jdbc-common 43.93% 🟢
catalog-jdbc-doris 80.28% 🟢
catalog-jdbc-hologres 54.03% 🟢
catalog-jdbc-mysql 79.23% 🟢
catalog-jdbc-oceanbase 78.38% 🟢
catalog-jdbc-postgresql 82.05% 🟢
catalog-jdbc-starrocks 78.27% 🟢
catalog-kafka 77.01% 🟢
catalog-lakehouse-generic 45.14% 🟢
catalog-lakehouse-hudi 79.1% 🟢
catalog-lakehouse-iceberg 86.98% 🟢
catalog-lakehouse-paimon 76.85% 🟢
catalog-model 77.72% 🟢
cli 44.51% 🟢
client-java 77.96% 🟢
common 50.0% 🟢
core 82.29% 🟢
filesystem-hadoop3 76.97% 🟢
flink 0.0% 🔴
flink-common 43.17% 🟢
flink-runtime 0.0% 🔴
gcp 14.12% 🔴
hadoop-common 10.39% 🔴
hive-metastore-common 46.83% 🟢
iceberg-common 55.46% 🟢
iceberg-rest-server 69.61% 🟢
idp-basic 94.68% 🟢
integration-test-common 0.0% 🔴
jobs 66.17% 🟢
lance-common 19.95% 🔴
lance-rest-server 62.78% 🟢
lineage 53.02% 🟢
optimizer 82.95% 🟢
optimizer-api 21.95% 🔴
server 85.83% 🟢
server-common 71.23% 🟢
spark 32.79% 🔴
spark-common 39.09% 🔴
trino-connector 35.14% 🔴
Files
Module File Coverage
catalog-jdbc-clickhouse ClickHouseTableSqlUtils.java 96.83% 🟢
catalog-jdbc-clickhouse ClickHouseTableOperations.java 83.57% 🟢


Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

…g and broaden test coverage

- Collapse the if-else chain in `toPartitionExpression` into a switch
  expression; the default branch now subsumes the previous
  `isSupportedPartitionTransform` guard.
- Reword the misleading "single column partitioning" precondition;
  ClickHouse does support multi-column partitions via tuple(), the
  real constraint is per-transform.
- Add identity / year / day round-trip integration coverage so the
  refactor in #11017 does not drop the original identity path.
- Add a tuple multi-transform unit test to lock the
  `tuple(toYear(c1), toDate(c1))` output.
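
The switch-based rendering and the tuple output described in this commit message can be illustrated together. Below is a minimal sketch, assuming Java 14 or later for switch expressions. Only the method name `toPartitionExpression` and the expected `tuple(toYear(c1), toDate(c1))` output come from the message; the `TupleTransformSketch` and `Part` classes, the `toPartitionBy` helper, and the string-based transform names are hypothetical, and the body is not the merged code.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical standalone sketch; not the Gravitino implementation.
final class TupleTransformSketch {
  // A transform name paired with the single column it applies to.
  static final class Part {
    final String name;
    final String column;

    Part(String name, String column) {
      this.name = name;
      this.column = column;
    }
  }

  // One switch expression; the default branch rejects unsupported transforms,
  // so no separate "is supported" guard is needed.
  static String toPartitionExpression(Part part) {
    return switch (part.name) {
      case "identity" -> part.column;
      case "year" -> "toYear(" + part.column + ")";
      case "month" -> "toYYYYMM(" + part.column + ")";
      case "day" -> "toDate(" + part.column + ")";
      default -> throw new UnsupportedOperationException(
          "Unsupported partition transform: " + part.name);
    };
  }

  // A single transform is rendered as-is; several are wrapped in tuple(...).
  static String toPartitionBy(List<Part> parts) {
    if (parts.isEmpty()) {
      throw new IllegalArgumentException("At least one partition transform is required");
    }
    if (parts.size() == 1) {
      return toPartitionExpression(parts.get(0));
    }
    return parts.stream()
        .map(TupleTransformSketch::toPartitionExpression)
        .collect(Collectors.joining(", ", "tuple(", ")"));
  }

  public static void main(String[] args) {
    // Prints: tuple(toYear(c1), toDate(c1))
    System.out.println(toPartitionBy(List.of(new Part("year", "c1"), new Part("day", "c1"))));
  }
}
```

Keeping the per-transform check inside `toPartitionExpression` means a multi-transform partition is validated element by element, which matches the reworded precondition above.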

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@yuqi1129 yuqi1129 requested a review from diqiu50 May 14, 2026 09:33
@yuqi1129 yuqi1129 self-assigned this May 14, 2026

@diqiu50 diqiu50 left a comment

LGTM

@diqiu50 diqiu50 merged commit 6402759 into main May 15, 2026
30 of 33 checks passed
Development

Successfully merging this pull request may close these issues.

[Improvement] Support ClickHouse partition transforms when creating tables

3 participants