Skip to content

feat: enhance clickhouse type support#846

Open
dhimanAbhi wants to merge 3 commits into
clidey:mainfrom
dhimanAbhi:feat/enhance-clickhouse-type-support
Open

feat: enhance clickhouse type support#846
dhimanAbhi wants to merge 3 commits into
clidey:mainfrom
dhimanAbhi:feat/enhance-clickhouse-type-support

Conversation

@dhimanAbhi
Copy link
Copy Markdown
Contributor

Overview

This PR addresses issue #815 by adding support for several ClickHouse-specific data types that were previously causing crashes when users attempted to browse tables containing them. The changes ensure that WhoDB can now safely display and interact with tables containing geometry types (LineString, MultiLineString, Ring, Geometry), temporal types (Time, Time64), aggregate functions (AggregateFunction, SimpleAggregateFunction), QBit, Variant, Dynamic, and the ML-focused BFloat16 type.

Signed-off-by: Abhishek Dhiman <abhi2002dhiman@gmail.com>
Signed-off-by: Abhishek Dhiman <abhi2002dhiman@gmail.com>
@modelorona
Copy link
Copy Markdown
Collaborator

Have you played around with a sample table to see that these types are handled correctly? We have the pre-existing docker setup https://github.com/clidey/whodb/blob/main/dev/docker-compose.yml which you can use to run the clickhouse container. The container is populated with this data script https://github.com/clidey/whodb/blob/main/dev/sample-data/clickhouse/data.sql in which you can extend the data_types table with these types and try it out. Same with testing import/export, modifying their values, mock data generation.

@dhimanAbhi
Copy link
Copy Markdown
Contributor Author

Hi @modelorona ,
Yes, I tested each column type by creating sample tables using queries from the official ClickHouse docs - https://clickhouse.com/docs/sql-reference/data-types. All of them displayed without any issues. I tested on both ClickHouse Cloud and the open source version (I did run into a few issues creating tables with Time, Time64 and QBit columns in cloud version so I had to switch Clickhouse's OSS to test these types).

I'll attach screenshots by today or tomorrow and then we can move forward!
Thanks for the review.

@dhimanAbhi
Copy link
Copy Markdown
Contributor Author

Hi @modelorona , attaching here some screenshots as proof manifests, let me know your thoughts on these 😄
Thank you!

AggregateFunction

agg_example

BFloat16

bf16_example

Dynamic

dynamic_example

Geometry

geo_example

Geo Linestring

geo_linestring

Geo MultiLineString

geo_multilinestring

Geo Ring

geo_ring

QBit

qbit_example

SimpleAggregateFunction

simple_agg_example

Time

time_example

Time64

time64_example

Variant

variant_example

@modelorona
Copy link
Copy Markdown
Collaborator

Thanks for uploading! I'm a bit swamped atm and aim to review this + your other LM Studio PR this week.

Copy link
Copy Markdown

@renezander030 renezander030 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @dhimanAbhi,

Coverage of the new types in ShouldHandleColumnType looks right against the table in #815. A few code-level points from the diff:

  1. The new explicit scanner block in GetColumnScanner:
if strings.HasPrefix(upper, "AGGREGATEFUNCTION(") ||
    strings.HasPrefix(upper, "SIMPLEAGGREGATEFUNCTION(") ||
    strings.HasPrefix(upper, "VARIANT(") ||
    strings.HasPrefix(upper, "QBIT(") ||
    upper == "DYNAMIC" ||
    upper == "RING" ||
    upper == "LINESTRING" ||
    upper == "MULTILINESTRING" ||
    upper == "GEOMETRY" {
    var v any
    return &v
}

var value any
return &value

Both branches return *any. The explicit block is equivalent to the default path, so the if can be dropped without changing behavior. If the intent was to use a different scanner for these (e.g. new(string) to force string conversion from the driver), it'd be worth doing that explicitly so the special-case is meaningful.

  1. getColumnsFromSystemColumns is defined in this diff but I don't see it called anywhere in the changed files. If it's groundwork for a follow-up, fine; if it's leftover from an earlier iteration, worth removing so the diff matches what's actually exercised.

  2. BFLOAT16 added to FloatTypes is the right bucket; the only thing to flag is that an 8-bit-mantissa value will render with very low precision in any column-aware display, but that's a display layer concern, not blocking.

  3. Geometry types (RING, LINESTRING, MULTILINESTRING, GEOMETRY) fall to the generic *any scanner — the screenshots show they don't crash, which solves the immediate issue, but rendering will be whatever the clickhouse-go driver returns rather than WKT. Worth a comment in the issue thread on whether a follow-up should normalize geo output.

The PR fixes the crash described in #815 cleanly. The points above are mostly trim/clarity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants