docs/integrations/data-ingestion/etl-tools/fivetran/index.md
</div>
## Key features {#key-features}

- **ClickHouse Cloud compatible**: use your ClickHouse Cloud database as a Fivetran destination.
- **SaaS deployment model**: fully managed by Fivetran, no need to manage your own infrastructure.
- **Automatic schema creation**: destination tables and databases are created automatically based on the source schema.
- **History Mode (SCD Type 2)**: preserves the complete history of all record versions for point-in-time analysis and audit trails.
- **Retries on network failures**: transient network errors are retried with exponential backoff; duplicates from retries are handled by `ReplacingMergeTree`.
- **Configurable batch sizes**: adapt Fivetran to your particular use case by tuning write, select, mutation, and hard delete batch sizes via a JSON configuration file.

## Limitations {#limitations}

- Schema migrations are not supported yet, but we are working on it.
- Adding, removing, or modifying primary key columns is not supported.
- Custom ClickHouse settings on `CREATE TABLE` statements are not supported.
- Role-based grants are not fully supported. The connector's grants check only queries direct user grants. Use [direct grants](/integrations/fivetran/troubleshooting#role-based-grants) instead.

## Ownership and support model {#ownership}

The ClickHouse Fivetran destination has a split ownership model:

- **ClickHouse** develops and maintains the [destination connector code](https://github.com/ClickHouse/clickhouse-fivetran-destination).
- **Fivetran** hosts the connector and is responsible for data movement, pipeline scheduling, and source connectors.

Both Fivetran and ClickHouse provide support for the Fivetran ClickHouse destination. For general inquiries, we recommend reaching out to Fivetran, as they are the experts on the Fivetran platform. For any ClickHouse-specific questions or issues, our support team is happy to help. Create a [support ticket](/about-us/support) to ask a question or report an issue.

The ClickHouse Cloud destination supports an optional JSON configuration file for advanced use cases. This file allows you to fine-tune destination behavior by overriding the default settings that control batch sizes, parallelism, connection pools, and request timeouts.

> NOTE: This configuration is entirely optional. If no file is uploaded, the destination uses
> sensible defaults that work well for most use cases.

The file must be valid JSON and conform to the schema described below.
If you need to modify the configuration after the initial setup, you can edit the destination configurations in the Fivetran dashboard and upload an updated file.
The configuration file has a top-level section:

```json
{
  "destination_configurations": { ... }
}
```

Inside it, you can specify the following settings, which control the internal behavior of the ClickHouse destination connector itself. These settings affect how the connector processes data before sending it to ClickHouse.

| Setting | Type | Default | Allowed Range | Description |
|---------|------|---------|---------------|-------------|
| `write_batch_size` | integer | `100000` | 5,000 – 100,000 | Number of rows per batch for insert, update, and replace operations. |
| `select_batch_size` | integer | `1500` | 200 – 1,500 | Number of rows per batch for `SELECT` queries used during updates. |
| `mutation_batch_size` | integer | `1500` | 200 – 1,500 | Number of rows per batch for `ALTER TABLE UPDATE` mutations in history mode. Lower it if you are running into overly large SQL statements. |
| `hard_delete_batch_size` | integer | `1500` | 200 – 1,500 | Number of rows per batch for hard delete operations in history mode. Lower it if you are running into overly large SQL statements. |

All fields are optional. If a field is not specified, the default value is used.
If a value is outside the allowed range, the destination reports an error during sync.
Unknown fields do not cause errors; they are ignored with a logged warning, which allows forward compatibility when new settings are added.

Example:

```json
{
  "destination_configurations": {
    "write_batch_size": 50000,
    "select_batch_size": 200
  }
}
```
## Type transformation mapping {#type-mapping}

The Fivetran ClickHouse destination maps Fivetran data types to the corresponding ClickHouse data types.

\* BINARY, XML, and JSON are stored as [String](/sql-reference/data-types/string) because ClickHouse's `String` type can represent an arbitrary set of bytes. The destination adds a column comment to indicate the original data type. The ClickHouse [JSON](/sql-reference/data-types/newjson) data type is not used as it was marked as obsolete and never recommended for production usage.
:::

## Destination tables {#table-structure}

The ClickHouse Cloud destination uses the [Replacing](/engines/table-engines/mergetree-family/replacingmergetree) engine type of the [SharedMergeTree](/cloud/reference/shared-merge-tree) family (specifically, `SharedReplacingMergeTree`), versioned by the `_fivetran_synced` column.

Every column except primary (ordering) keys and Fivetran metadata columns is created as [Nullable(T)](/sql-reference/data-types/nullable), where `T` is a ClickHouse Cloud type based on the [type transformation mapping](#type-mapping).

Every destination table includes the following metadata columns:

| Column | Type | Description |
|--------|------|-------------|
| `_fivetran_synced` | `DateTime64(9, 'UTC')` | Timestamp when the record was synced by Fivetran. Used as the version column for `SharedReplacingMergeTree`. |
| `_fivetran_deleted` | `Bool` | Soft delete marker. Set to `true` when the source record is deleted. |
| `_fivetran_id` | `String` | Auto-generated unique identifier. Only present when the source table has no primary keys. |
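
These metadata columns can be used directly in queries. As a sketch, assuming a destination table named `users` (the table name is an assumption for illustration), you could inspect the most recently synced rows while excluding soft-deleted records:

```sql
-- Sketch: most recently synced, non-deleted rows.
-- `users` is an assumed destination table name.
SELECT *
FROM users
WHERE _fivetran_deleted = false
ORDER BY _fivetran_synced DESC
LIMIT 10
```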

### Single primary key in the source table {#single-pk}

For example, a source table `users` has a primary key column `id` (`INT`) and a regular column `name` (`STRING`). The destination table will be defined as follows:

```sql
CREATE TABLE `users`
...
ORDER BY id
SETTINGS index_granularity = 8192
```

In this case, the `id` column is chosen as the table sorting key.

### Multiple primary keys in the source table {#multiple-pk}

If the source table has multiple primary keys, they are used in order of their appearance in the Fivetran source table definition.

For example, consider a source table `items` with primary key columns `id` (`INT`) and `name` (`STRING`), plus an additional regular column `description` (`STRING`). The destination table will be defined as follows:

```sql
CREATE TABLE `items`
...
ORDER BY (id, name)
SETTINGS index_granularity = 8192
```

In this case, the `id` and `name` columns are chosen as the table sorting keys.

### No primary keys in the source table {#no-pk}

If the source table has no primary keys, a unique identifier is added by Fivetran as a `_fivetran_id` column. Consider an `events` table that only has `event` (`STRING`) and `timestamp` (`LOCALDATETIME`) columns in the source. The destination table in that case is defined as follows:

```sql
CREATE TABLE events
...
ORDER BY _fivetran_id
SETTINGS index_granularity = 8192
```

Since `_fivetran_id` is unique and there are no other primary key options, it is used as the table sorting key.

### Selecting the latest version of the data without duplicates {#deduplication}

`SharedReplacingMergeTree` performs background data deduplication [only during merges at an unknown time](/engines/table-engines/mergetree-family/replacingmergetree). However, you can select the latest version of the data without duplicates ad hoc using the `FINAL` keyword together with the [`select_sequential_consistency`](/operations/settings/settings#select_sequential_consistency) setting.

See also [Duplicate records with ReplacingMergeTree](/integrations/fivetran/troubleshooting#duplicate-records) in the troubleshooting guide.
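
For example, a deduplicated read against the `users` table from the earlier example might look like the following sketch (the `id` and `name` columns come from that example; adjust for your own schema):

```sql
-- Deduplicate at read time with FINAL and exclude soft-deleted rows.
SELECT id, name
FROM users FINAL
WHERE _fivetran_deleted = false
SETTINGS select_sequential_consistency = 1
```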

## Retries on network failures {#retries}

Transient network errors are retried with exponential backoff. Duplicate rows that retried inserts may introduce are deduplicated by the `ReplacingMergeTree` engine.


## Debugging Fivetran syncs {#debugging}

When diagnosing sync failures:

- Check the ClickHouse `system.query_log` for server-side issues.
- Request Fivetran connector process logs for client-side issues.

For connector bugs, [create a GitHub issue](https://github.com/ClickHouse/clickhouse-fivetran-destination/issues) or contact [ClickHouse Support](/about-us/support).