[Improvement] Support ClickHouse partition transforms when creating tables #11017

@yuqi1129

What would you like to be improved?

ClickHouse table creation currently supports only identity partitioning when converting Gravitino partition transforms to a PARTITION BY clause.

For example, Transforms.identity("created_at") can be converted to:

PARTITION BY `created_at`

But function-based transforms such as year, month, and day are rejected during table creation, even though ClickHouse supports common date/time partition expressions like:

PARTITION BY toYear(created_at)
PARTITION BY toYYYYMM(created_at)
PARTITION BY toDate(created_at)

This limits users who want to create MergeTree-family ClickHouse tables with time-based partitioning through Gravitino.

How should we improve?

Support ClickHouse partition expression generation for Gravitino time transforms when creating tables, at least:

  • Transforms.year(column) -> PARTITION BY toYear(column)
  • Transforms.month(column) -> PARTITION BY toYYYYMM(column)
  • Transforms.day(column) -> PARTITION BY toDate(column)

The implementation should keep the existing validation behavior for unsupported transforms and nested fields, and add unit/integration test coverage for creating ClickHouse tables with these partition transforms.
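The requested mapping can be sketched as follows. This is a minimal, self-contained illustration, not Gravitino's actual ClickHouse converter: the class `ClickHousePartitionSketch`, the method `toPartitionBy`, and the use of plain strings in place of Gravitino's transform objects are all assumptions made for the example. It also assumes the column is backtick-quoted in every case, as in the identity example above.

```java
import java.util.Locale;

/** Hypothetical sketch only; names and structure are not taken from the Gravitino code base. */
final class ClickHousePartitionSketch {

  private ClickHousePartitionSketch() {}

  /**
   * Builds a PARTITION BY clause from a transform name and the parts of its field name.
   * More than one part means a nested field, which is rejected here to mirror the
   * existing validation described in this issue.
   */
  static String toPartitionBy(String transformName, String... fieldName) {
    if (transformName == null) {
      throw new IllegalArgumentException("Transform name must not be null");
    }
    if (fieldName == null || fieldName.length != 1) {
      throw new IllegalArgumentException(
          "ClickHouse partitioning does not support nested fields");
    }
    // Assumption: simple backtick quoting; escaping of backticks inside names is not handled.
    String column = "`" + fieldName[0] + "`";

    switch (transformName.toLowerCase(Locale.ROOT)) {
      case "identity":
        return "PARTITION BY " + column;
      case "year":
        return "PARTITION BY toYear(" + column + ")";
      case "month":
        return "PARTITION BY toYYYYMM(" + column + ")";
      case "day":
        return "PARTITION BY toDate(" + column + ")";
      default:
        // Any other transform keeps being rejected.
        throw new IllegalArgumentException(
            "Unsupported partition transform for ClickHouse: " + transformName);
    }
  }
}
```

For example, `toPartitionBy("month", "created_at")` yields a clause using `toYYYYMM`, while a transform outside the four listed cases or a two-part field name such as `("year", "event", "created_at")` throws `IllegalArgumentException`.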

Labels: 1.3.0 (Release v1.3.0), improvement (Improvements on everything)
