| 1 | +--- |
| 2 | +title: Linking ClickHouse as a source |
| 3 | +sidebar: Docs |
| 4 | +showTitle: true |
| 5 | +availability: |
| 6 | + free: full |
| 7 | + selfServe: full |
| 8 | + enterprise: full |
| 9 | +sourceId: ClickHouse |
| 10 | +--- |
| 11 | + |
| 12 | +The ClickHouse connector can link your ClickHouse database tables to PostHog. ClickHouse databases are often very large, so we stream the data in Arrow batches to keep memory bounded. |
| 13 | + |
| 14 | +To link ClickHouse: |
| 15 | + |
| 16 | +1. In PostHog, go to the [Data pipeline page](https://app.posthog.com/data-management/sources) and open the sources tab
| 17 | +2. Click **New source** and select ClickHouse |
| 18 | +3. Enter your database connection details: |
| 19 | + - **Host:** The hostname or IP address of your ClickHouse server, such as `play.clickhouse.com` or `123.132.1.100`.
| 20 | + - **Port:** The HTTP(S) port your ClickHouse server is listening on. The default is `8443` for HTTPS and `8123` for HTTP. |
| 21 | + - **Database:** The name of the database you want to sync. The default is `default`. |
| 22 | + - **User:** The username with read permissions on the database. |
| 23 | + - **Password:** The password for the user (optional). |
| 24 | + - **Use HTTPS?:** Whether to connect over HTTPS. Default is enabled. |
| 25 | + - **Verify SSL certificate?:** Whether to verify the server's SSL certificate. Default is enabled. Disable if your server uses a self-signed certificate. |
| 26 | +4. If you need to connect through an SSH tunnel, enable and configure it (optional): |
| 27 | + - **Tunnel host:** The hostname of your SSH server. |
| 28 | + - **Tunnel port:** The port your SSH server is listening on. |
| 29 | + - **Authentication type:** |
| 30 | + - For password authentication, enter your SSH username and password. |
| 31 | + - For key-based authentication, enter your SSH username, private key, and optional passphrase. |
| 32 | +5. Click **Next** |
| 33 | + |
| 34 | +The data warehouse then starts syncing your ClickHouse data. You can see details and progress in the [sources tab](https://app.posthog.com/data-management/sources). |
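
Before linking, it can help to confirm that the connection details work from outside PostHog. The following is a minimal sketch, not PostHog's implementation: it assumes the standard ClickHouse HTTP interface, and the helper names `build_base_url` and `check_connection` are hypothetical. It applies the same defaults as the form above (`8443` for HTTPS, `8123` for HTTP, database `default`).

```python
import ssl
import urllib.parse
import urllib.request


def build_base_url(host, port=None, use_https=True):
    """Build the HTTP(S) base URL, falling back to the default ports from the form."""
    if port is None:
        port = 8443 if use_https else 8123
    scheme = "https" if use_https else "http"
    return f"{scheme}://{host}:{port}/"


def check_connection(host, user, password="", database="default",
                     port=None, use_https=True, verify_ssl=True, timeout=10):
    """Run `SELECT 1` over the ClickHouse HTTP interface and return the raw response."""
    url = build_base_url(host, port, use_https)
    url += "?" + urllib.parse.urlencode({"database": database})
    request = urllib.request.Request(
        url,
        data=b"SELECT 1",  # sending a body makes this a POST request
        headers={"X-ClickHouse-User": user, "X-ClickHouse-Key": password},
    )
    context = None
    if use_https:
        context = ssl.create_default_context()
        if not verify_ssl:
            # Mirrors disabling "Verify SSL certificate?" for self-signed certificates.
            context.check_hostname = False
            context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return response.read().decode().strip()
```

Calling `check_connection("your-host", "your-user", "your-password")` should return `1` if the host, port, and credentials are valid. This sketch does not cover the SSH tunnel option.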
| 35 | + |
| 36 | +> **Permissions:** The ClickHouse source only requires read permissions on the database and tables you intend to sync, plus read access to `system.tables` and `system.columns` for schema discovery. |
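
As a rough illustration of the permissions described above, the sketch below generates `GRANT` statements for a read-only user. The helper `read_only_grants` and the example user name `posthog_reader` are hypothetical, and the statements grant access to the whole database for simplicity; you may prefer to grant `SELECT` on individual tables instead.

```python
def read_only_grants(user, database):
    """Return GRANT statements giving `user` read access for syncing `database`."""
    return [
        # Read access to the tables you intend to sync (whole database here).
        f"GRANT SELECT ON `{database}`.* TO `{user}`",
        # Read access to the system tables used for schema discovery.
        f"GRANT SELECT ON system.tables TO `{user}`",
        f"GRANT SELECT ON system.columns TO `{user}`",
    ]


for statement in read_only_grants("posthog_reader", "default"):
    print(statement + ";")
```

Run the printed statements as a ClickHouse user that is allowed to grant privileges.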
| 37 | +
| 38 | +## Supported table engines |
| 39 | + |
| 40 | +PostHog can sync data from any ClickHouse table engine, but row counts are only shown for engines where they can be obtained without a full scan:
| 41 | + |
| 42 | +- **MergeTree family** (including `ReplacingMergeTree`, `SummingMergeTree`, etc.) — full support including accurate row counts from `system.tables.total_rows`. |
| 43 | +- **Distributed tables** — row counts come from a distributed `SELECT count()`. |
| 44 | +- **MaterializedView** — resolves to the underlying `TO` target table or `.inner_id.<uuid>` inner table for row counts. |
| 45 | +- **View** — synced on demand. Row count shown as "Skipped" because counting would require a full scan. |
| 46 | +- **Memory, Buffer, Log, Kafka, URL, and other no-counter engines** — synced on demand. Row count shown as "Skipped". |
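
The engine list above can be summarized as a small lookup. This is only a sketch of the documented behavior, with a hypothetical function name and return labels; it assumes MergeTree-family engines can be recognized by an engine name ending in `MergeTree`.

```python
def row_count_strategy(engine):
    """Map a ClickHouse engine name to the row-count behavior described above."""
    if engine.endswith("MergeTree"):
        # Covers MergeTree, ReplacingMergeTree, SummingMergeTree, and so on.
        return "system.tables.total_rows"
    if engine == "Distributed":
        return "distributed SELECT count()"
    if engine == "MaterializedView":
        return "row count of the resolved target or inner table"
    # View, Memory, Buffer, Log, Kafka, URL, and other engines without a counter.
    return "Skipped"
```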
| 47 | + |
| 48 | +## Incremental sync |
| 49 | + |
| 50 | +Incremental syncs are supported on integer (`Int8`–`Int256`, `UInt8`–`UInt256`) and temporal (`Date`, `Date32`, `DateTime`, `DateTime64`) cursor fields. |
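
To check whether a column could serve as a cursor field, you can compare its ClickHouse type against the list above. The sketch below uses a hypothetical helper and strips type parameters such as precision and timezone. It treats wrapped types like `Nullable(Int64)` as unsupported, which is an assumption, since wrappers are not covered above.

```python
import re

INTEGER_TYPES = {
    f"{prefix}{bits}"
    for prefix in ("Int", "UInt")
    for bits in (8, 16, 32, 64, 128, 256)
}
TEMPORAL_TYPES = {"Date", "Date32", "DateTime", "DateTime64"}


def is_supported_cursor_type(ch_type):
    """Return True if the type name (ignoring parameters) is in the supported list."""
    match = re.match(r"\w+", ch_type.strip())
    return match is not None and match.group(0) in INTEGER_TYPES | TEMPORAL_TYPES
```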
| 51 | + |
| 52 | +PostHog detects the primary key from the table's sorting key, as reported in `system.columns`. Because ClickHouse sorting keys are not guaranteed to be unique, every incremental sync first runs a bounded duplicate-key probe and fails if it finds duplicates on the chosen primary key.
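
If you want to check a table for duplicate keys yourself before enabling incremental sync, a query along the following lines may help. This is a sketch and not the probe PostHog runs: the function name, the `scan_limit` parameter, and the choice to bound the scan with an inner `LIMIT` are all assumptions.

```python
# Lists the sorting-key columns of a table; {database:String} and {table:String}
# are ClickHouse query parameters to be supplied by your client.
SORTING_KEY_QUERY = (
    "SELECT name FROM system.columns "
    "WHERE database = {database:String} AND table = {table:String} "
    "AND is_in_sorting_key = 1 ORDER BY position"
)


def duplicate_key_probe(database, table, key_columns, scan_limit=1_000_000):
    """Build a query returning one row if any key repeats within the scanned rows."""
    if not key_columns:
        raise ValueError("at least one key column is required")
    keys = ", ".join(f"`{column}`" for column in key_columns)
    return (
        f"SELECT {keys}, count() AS n "
        f"FROM (SELECT {keys} FROM `{database}`.`{table}` LIMIT {scan_limit}) "
        f"GROUP BY {keys} HAVING n > 1 LIMIT 1"
    )
```

An empty result means no duplicates were found in the scanned rows. Because the scan is bounded, it does not prove the whole table is free of duplicates.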
| 53 | + |
| 54 | +## Type handling |
| 55 | + |
| 56 | +ClickHouse's Arrow output does not support every type, so PostHog serializes the following to strings on the server side to keep the stream reliable: `UUID`, `IPv4`/`IPv6`, wide ints (`Int128`/`Int256`/`UInt128`/`UInt256`), `Enum8`/`Enum16`, `FixedString`, `Array`, `Map`, `Tuple`, `Nested`, `Variant`, `Dynamic`, `JSON`, and `Object`. |
| 57 | + |
| 58 | +`Nullable` and `LowCardinality` wrappers, `DateTime`/`DateTime64` precision and timezones, and `Decimal[32–256]` are all preserved natively. |
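
To predict which columns will arrive as strings, the rules above can be sketched as a type check. The helper names are hypothetical, and wrapping columns in `toString()` is an assumption made for illustration; the exact server-side expression PostHog uses is not specified here.

```python
import re

# Types listed above as serialized to strings.
STRINGIFIED_TYPES = {
    "UUID", "IPv4", "IPv6", "Int128", "Int256", "UInt128", "UInt256",
    "Enum8", "Enum16", "FixedString", "Array", "Map", "Tuple", "Nested",
    "Variant", "Dynamic", "JSON", "Object",
}
WRAPPERS = ("Nullable", "LowCardinality")


def base_type(ch_type):
    """Strip Nullable/LowCardinality wrappers and type parameters."""
    name = ch_type.strip()
    while True:
        for wrapper in WRAPPERS:
            prefix = wrapper + "("
            if name.startswith(prefix) and name.endswith(")"):
                name = name[len(prefix):-1].strip()
                break  # a wrapper was removed; check for another one
        else:
            break  # no wrapper left to remove
    match = re.match(r"[A-Za-z0-9]+", name)
    return match.group(0) if match else name


def needs_string_cast(ch_type):
    """Return True if the column type is in the serialized-to-string list."""
    return base_type(ch_type) in STRINGIFIED_TYPES


def select_expr(column, ch_type):
    """Return a SELECT expression, casting to string where the list above requires it."""
    if needs_string_cast(ch_type):
        return f"toString(`{column}`) AS `{column}`"
    return f"`{column}`"
```

For example, `select_expr("id", "Nullable(UUID)")` yields a `toString` cast, while `select_expr("ts", "DateTime64(3, 'UTC')")` leaves the column untouched.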
| 59 | + |
| 60 | +import InboundIpAddresses from '../_snippets/inbound-ip-addresses.mdx' |
| 61 | + |
| 62 | +<InboundIpAddresses /> |