| 1 | +# Client V2 Apache Arrow Example |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This module contains a runnable example demonstrating how to use `client-v2` |
| 6 | +to insert and read data using the [Apache Arrow Stream](https://clickhouse.com/docs/en/interfaces/formats/#arrowstream) |
| 7 | +format (`ClickHouseFormat.ArrowStream`). |
| 8 | + |
| 9 | +The example shows: |
| 10 | + |
| 11 | +- writing a batch into ClickHouse from an Arrow `VectorSchemaRoot` via |
| 12 | + `ArrowStreamWriter`, using `Decimal256Vector` and `TimeStampMilliVector`; |
| 13 | +- reading rows back from ClickHouse with `ArrowStreamReader` and copying them |
| 14 | + into another table by streaming the same `VectorSchemaRoot` straight back to |
| 15 | + `client.insert(...)`. |
| 16 | + |
| 17 | +Unlike the other examples in this repository, this one is built with **Gradle** |
| 18 | +(JDK 17 toolchain) and is not part of the Maven multi-module build. |
| 19 | + |
| 20 | +## Requirements |
| 21 | + |
| 22 | +- JDK 17 or newer (the Gradle toolchain will fetch one if it is missing). |
| 23 | +- A running ClickHouse server reachable from the machine running the example. |
| 24 | + |
| 25 | +Apache Arrow needs access to direct memory and a few internal JDK APIs. The |
| 26 | +`--add-opens` flags below are already set in `applicationDefaultJvmArgs` in |
| 27 | +`build.gradle.kts`, so running the example through Gradle needs no extra setup: |
| 28 | + |
| 29 | +```text |
| 30 | +--add-opens=java.base/java.nio=ALL-UNNAMED |
| 31 | +--add-opens=java.base/sun.nio.ch=ALL-UNNAMED |
| 32 | +--add-opens=java.base/jdk.internal.misc=ALL-UNNAMED |
| 33 | +``` |
| 34 | + |
| 35 | +If you run the produced jar manually, pass these flags to the JVM yourself. |
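When launching the jar by hand, it can help to confirm that the JVM was actually started with the flags listed above. The following is a minimal sketch, not part of the example itself: the class name `ArrowOpensCheck` is hypothetical, and it relies only on the standard `java.lang.Module` API (JDK 9+) to ask whether `java.base` opens each package to the calling code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the example module.
public class ArrowOpensCheck {
    // Packages named by the --add-opens flags listed in this README.
    static final String[] PACKAGES = {"java.nio", "sun.nio.ch", "jdk.internal.misc"};

    /** Reports, per package, whether java.base opens it to this class's module. */
    static Map<String, Boolean> check() {
        Module javaBase = Object.class.getModule();
        Module caller = ArrowOpensCheck.class.getModule();
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String pkg : PACKAGES) {
            result.put(pkg, javaBase.isOpen(pkg, caller));
        }
        return result;
    }

    public static void main(String[] args) {
        check().forEach((pkg, open) -> System.out.println(
                open ? pkg + " is open"
                     : pkg + " is NOT open; add --add-opens=java.base/" + pkg + "=ALL-UNNAMED"));
    }
}
```

Running this class with the same JVM arguments as the example jar shows which of the three flags, if any, are missing.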
| 36 | + |
| 37 | +## How to Run |
| 38 | + |
| 39 | +From this directory: |
| 40 | + |
| 41 | +```shell |
| 42 | +./gradlew run |
| 43 | +``` |
| 44 | + |
| 45 | +Connection properties can be supplied as system properties: |
| 46 | + |
| 47 | +- `-DchEndpoint` - Endpoint URL to connect to (default: `http://localhost:8123`) |
| 48 | +- `-DchUser` - ClickHouse user name (default: `default`) |
| 49 | +- `-DchPassword` - ClickHouse user password (default: empty) |
| 50 | +- `-DchDatabase` - ClickHouse database name (default: `default`) |
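As an illustration of how the properties and defaults listed above could be resolved in Java, here is a minimal sketch using only `java.util.Properties`. The class `ConnectionSettings` and its field names are hypothetical, and the example's real source may read these properties differently.

```java
import java.util.Properties;

// Hypothetical holder for the four connection properties; not the example's actual code.
public class ConnectionSettings {
    final String endpoint;
    final String user;
    final String password;
    final String database;

    /** Reads each property, falling back to the default documented in this README. */
    ConnectionSettings(Properties props) {
        this.endpoint = props.getProperty("chEndpoint", "http://localhost:8123");
        this.user = props.getProperty("chUser", "default");
        this.password = props.getProperty("chPassword", "");
        this.database = props.getProperty("chDatabase", "default");
    }

    /** Convenience factory that reads the JVM-wide system properties (the -D flags). */
    static ConnectionSettings fromSystemProperties() {
        return new ConnectionSettings(System.getProperties());
    }

    public static void main(String[] args) {
        ConnectionSettings s = fromSystemProperties();
        // The password is deliberately not printed.
        System.out.println(s.user + "@" + s.endpoint + "/" + s.database);
    }
}
```

Taking a `Properties` argument in the constructor keeps the defaults testable without touching global JVM state.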
| 51 | + |
| 52 | +Example with custom connection properties: |
| 53 | + |
| 54 | +```shell |
| 55 | +./gradlew run \ |
| 56 | + -DchEndpoint=http://localhost:8123 \ |
| 57 | + -DchUser=default \ |
| 58 | + -DchPassword= \ |
| 59 | + -DchDatabase=default |
| 60 | +``` |
| 61 | + |
| 62 | +To see the wire-level data flow, set the SLF4J log level to `DEBUG`: |
| 63 | + |
| 64 | +```shell |
| 65 | +./gradlew run -Dorg.slf4j.simpleLogger.defaultLogLevel=DEBUG |
| 66 | +``` |
| 67 | + |
| 68 | +## Executable Example |
| 69 | + |
| 70 | +`com.clickhouse.examples.arrow_format.ReadWriteArrow` |
| 71 | + |
| 72 | +- Creates table `arrow_example (ts DateTime64, val1 Decimal(76,39))` and |
| 73 | + inserts a batch of 10 000 rows from Arrow vectors using |
| 74 | + `ClickHouseFormat.ArrowStream`. |
| 75 | +- Creates tables `arrow_read_example` and `arrow_read_example_copy` |
| 76 | + (`ts DateTime(3), val1 Decimal(76,62)`), populates the first one, reads it |
| 77 | + back with `ArrowStreamReader`, and streams each batch into the copy table. |
| 78 | + |
| 79 | +The example truncates the demo tables on every run, so it is safe to rerun. |
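The tables above use `Decimal(76, …)` columns, and the write path uses `Decimal256Vector`. A precision of 76 is the largest that a 256-bit signed integer can hold, which is likely why the 256-bit vector is used rather than the 128-bit one. The sketch below checks that arithmetic with `BigInteger`; the class name `Decimal256Fit` is hypothetical and the code does not depend on Arrow or ClickHouse.

```java
import java.math.BigInteger;

// Hypothetical helper illustrating the digit/bit arithmetic; not part of the example module.
public class Decimal256Fit {
    /** Largest unscaled value with the given number of decimal digits: 10^digits - 1. */
    static BigInteger maxUnscaled(int digits) {
        return BigInteger.TEN.pow(digits).subtract(BigInteger.ONE);
    }

    /** True if every value with that many digits fits a signed two's-complement integer of the given width. */
    static boolean fitsSigned(int digits, int bits) {
        // One bit is reserved for the sign, so the magnitude may use at most bits - 1.
        return maxUnscaled(digits).bitLength() <= bits - 1;
    }

    public static void main(String[] args) {
        System.out.println("38 digits in 128 bits: " + fitsSigned(38, 128));
        System.out.println("76 digits in 128 bits: " + fitsSigned(76, 128));
        System.out.println("76 digits in 256 bits: " + fitsSigned(76, 256));
        System.out.println("77 digits in 256 bits: " + fitsSigned(77, 256));
    }
}
```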