53 changes: 53 additions & 0 deletions docs/en/connectors/sink/Iceberg.md
@@ -80,6 +80,23 @@ libfb303-xxx.jar
| data_save_mode | Enum | no | APPEND_DATA | the data save mode, please refer to `data_save_mode` below |
| custom_sql | string | no | - | Custom `delete` data sql for data save mode. e.g: `delete from ... where ...` |
| iceberg.table.commit-branch | string | no | - | Default branch for commits |
| krb5_path | string | no | /etc/krb5.conf | The path of `krb5.conf`, used for Kerberos authentication. |
| kerberos_principal | string | no | - | The principal for Kerberos authentication. |
| kerberos_keytab_path | string | no | - | The keytab file path for Kerberos authentication. |

## Sink Option descriptions

### krb5_path [string]

The path of `krb5.conf`, used for Kerberos authentication.

### kerberos_principal [string]

The principal for Kerberos authentication.

### kerberos_keytab_path [string]

The keytab file path for Kerberos authentication.

## Task Example

@@ -234,6 +251,42 @@ sink {
}
```

### Kerberos Authentication

The following example demonstrates how to configure the Iceberg sink with Kerberos authentication when using a Hadoop catalog on HDFS:

```hocon
sink {
  Iceberg {
    catalog_name = "seatunnel_test"
    iceberg.catalog.config = {
      type = "hadoop"
      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
    }
    namespace = "seatunnel_namespace"
    table = "iceberg_sink_table"
    iceberg.table.write-props = {
      write.format.default = "parquet"
      write.target-file-size-bytes = 536870912
    }
    krb5_path = "/etc/krb5.conf"
    kerberos_principal = "hive/your_host@EXAMPLE.COM"
    kerberos_keytab_path = "/path/to/your.keytab"
    iceberg.table.primary-keys = "id"
    iceberg.table.partition-keys = "f_datetime"
    iceberg.table.upsert-mode-enabled = true
    iceberg.table.schema-evolution-enabled = true
    case_sensitive = true
  }
}
```

Description:

- `krb5_path`: The path to the `krb5.conf` file used for Kerberos authentication.
- `kerberos_principal`: The principal for Kerberos authentication in the format `primary/instance@REALM`.
- `kerberos_keytab_path`: The keytab file path for Kerberos authentication.
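
Since the options table lists `/etc/krb5.conf` as the default for `krb5_path`, a configuration that relies on that default could presumably omit the option. The following is a minimal sketch of that variation, not a tested configuration; the warehouse URI, principal, and keytab path are placeholders carried over from the example above.

```hocon
sink {
  Iceberg {
    catalog_name = "seatunnel_test"
    iceberg.catalog.config = {
      type = "hadoop"
      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
    }
    namespace = "seatunnel_namespace"
    table = "iceberg_sink_table"
    # krb5_path is omitted on purpose: the options table gives /etc/krb5.conf as its default.
    # Placeholder principal in primary/instance@REALM form.
    kerberos_principal = "hive/your_host@EXAMPLE.COM"
    # Placeholder keytab location; it must be readable on the node running the job.
    kerberos_keytab_path = "/path/to/your.keytab"
  }
}
```

Set `krb5_path` explicitly whenever the Kerberos client configuration lives somewhere other than the default path.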

### Multiple table

#### example1
39 changes: 35 additions & 4 deletions docs/en/connectors/source/Iceberg.md
@@ -89,10 +89,13 @@ libfb303-xxx.jar
| end_snapshot_id | long | no | - | Instructs this scan to look for changes up to a particular snapshot (inclusive). |
| use_snapshot_id | long | no | - | Instructs this scan to use the given snapshot ID. |
| use_snapshot_timestamp | long | no | - | Instructs this scan to use the most recent snapshot as of the given time, expressed as a timestamp in milliseconds since the Unix epoch. |
| stream_scan_strategy | enum | no | FROM_LATEST_SNAPSHOT | Starting strategy for stream mode execution. Defaults to `FROM_LATEST_SNAPSHOT` if no value is specified. The optional values are:<br/>TABLE_SCAN_THEN_INCREMENTAL: Do a regular table scan, then switch to incremental mode.<br/>FROM_LATEST_SNAPSHOT: Start incremental mode from the latest snapshot, inclusive.<br/>FROM_EARLIEST_SNAPSHOT: Start incremental mode from the earliest snapshot, inclusive.<br/>FROM_SNAPSHOT_ID: Start incremental mode from a snapshot with a specific ID, inclusive.<br/>FROM_SNAPSHOT_TIMESTAMP: Start incremental mode from a snapshot with a specific timestamp, inclusive. |
| increment.scan-interval | long | no | 2000 | The interval between incremental scans, in milliseconds. |
| common-options | | no | - | Source plugin common parameters, please refer to [Source Common Options](../common-options/source-common-options.md) for details. |
| query | String | no | - | The SELECT statement used to query the Iceberg data. It mustn't contain the table name, and doesn't support aliases. For example: `select * from table where f1 > 100`, `select fn from table where f1 > 100`. Support for the LIKE syntax is currently limited: the LIKE pattern shouldn't start with `%`. A supported example is: `select f1 from t where f2 like 'tom%'` |
| krb5_path | string | no | /etc/krb5.conf | The path to the `krb5.conf` file for Kerberos authentication. |
| kerberos_principal | string | no | - | The principal for Kerberos authentication. |
| kerberos_keytab_path | string | no | - | The path to the keytab file for Kerberos authentication. |


## Task Example
@@ -149,7 +152,7 @@ source {
query = "select fn from table where f1 > 100"
}
]
plugin_output = "iceberg"
}
}
@@ -191,13 +194,41 @@ source {
warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
}
catalog_type = "hive"
namespace = "your_iceberg_database"
table = "your_iceberg_table"
}
}
```

### Kerberos Authentication

The following example demonstrates how to configure Kerberos authentication for the Iceberg source when using a Hadoop catalog on HDFS:

```hocon
source {
  Iceberg {
    catalog_name = "seatunnel"
    iceberg.catalog.config = {
      type = "hadoop"
      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
    }
    namespace = "your_iceberg_database"
    table = "your_iceberg_table"
    krb5_path = "/etc/krb5.conf"
    kerberos_principal = "hive/your_host@EXAMPLE.COM"
    kerberos_keytab_path = "/path/to/your.keytab"
    plugin_output = "iceberg_kerberos"
  }
}
```

Description:

- `krb5_path`: The path to the `krb5.conf` file for Kerberos authentication.
- `kerberos_principal`: The principal for Kerberos authentication, in the format of `primary/instance@REALM`.
- `kerberos_keytab_path`: The path to the keytab file for Kerberos authentication.
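
To show how the three options might sit in a complete job, here is a rough sketch that reads from one Kerberos-secured Iceberg table and writes to another. The `env` block (`parallelism`, `job.mode`), the sink table name `your_iceberg_table_copy`, and the idea of sharing one principal and keytab between source and sink are assumptions made for illustration. None of this comes from the change itself, and the job has not been run.

```hocon
# Assumed job-level settings; adjust to your deployment.
env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  Iceberg {
    catalog_name = "seatunnel"
    iceberg.catalog.config = {
      type = "hadoop"
      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
    }
    namespace = "your_iceberg_database"
    table = "your_iceberg_table"
    krb5_path = "/etc/krb5.conf"
    kerberos_principal = "hive/your_host@EXAMPLE.COM"
    kerberos_keytab_path = "/path/to/your.keytab"
  }
}

sink {
  Iceberg {
    catalog_name = "seatunnel"
    iceberg.catalog.config = {
      type = "hadoop"
      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
    }
    namespace = "your_iceberg_database"
    # Hypothetical target table name, used only for this sketch.
    table = "your_iceberg_table_copy"
    # Assumes the same principal and keytab are valid for writes as well as reads.
    krb5_path = "/etc/krb5.conf"
    kerberos_principal = "hive/your_host@EXAMPLE.COM"
    kerberos_keytab_path = "/path/to/your.keytab"
  }
}
```

Each connector block carries its own Kerberos options in this sketch, so the source and sink could point at different principals or keytabs if the cluster requires it.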

### Column Projection

```hocon
55 changes: 54 additions & 1 deletion docs/zh/connectors/sink/Iceberg.md
@@ -80,6 +80,23 @@ libfb303-xxx.jar
| data_save_mode | Enum | no | APPEND_DATA | The data save mode, please refer to `data_save_mode` below |
| custom_sql | string | no | - | Custom `delete` data SQL statement, used for the data save mode. e.g: `delete from ... where ...` |
| iceberg.table.commit-branch | string | no | - | Default branch for commits |
| krb5_path | string | no | /etc/krb5.conf | The path of the `krb5.conf` file, used for Kerberos authentication. |
| kerberos_principal | string | no | - | The principal for Kerberos authentication. |
| kerberos_keytab_path | string | no | - | The keytab file path for Kerberos authentication. |

## Sink Option Descriptions

### krb5_path [string]

The path of the `krb5.conf` file, used for Kerberos authentication.

### kerberos_principal [string]

The principal for Kerberos authentication.

### kerberos_keytab_path [string]

The keytab file path for Kerberos authentication.

## Task Example

@@ -207,6 +224,42 @@ sink {
}
```

### Kerberos Authentication

The following example demonstrates how to configure Kerberos authentication for the Iceberg Sink when using Hadoop Catalog and HDFS:

```hocon
sink {
  Iceberg {
    catalog_name = "seatunnel_test"
    iceberg.catalog.config = {
      type = "hadoop"
      warehouse = "hdfs://your_cluster/tmp/seatunnel/iceberg/"
    }
    namespace = "seatunnel_namespace"
    table = "iceberg_sink_table"
    iceberg.table.write-props = {
      write.format.default = "parquet"
      write.target-file-size-bytes = 536870912
    }
    krb5_path = "/etc/krb5.conf"
    kerberos_principal = "hive/your_host@EXAMPLE.COM"
    kerberos_keytab_path = "/path/to/your.keytab"
    iceberg.table.primary-keys = "id"
    iceberg.table.partition-keys = "f_datetime"
    iceberg.table.upsert-mode-enabled = true
    iceberg.table.schema-evolution-enabled = true
    case_sensitive = true
  }
}
```

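Description:

- `krb5_path`: The path to the `krb5.conf` file used for Kerberos authentication.
- `kerberos_principal`: The principal for Kerberos authentication, in the format `primary/instance@REALM`.
- `kerberos_keytab_path`: The keytab file path for Kerberos authentication.

### Multiple table (multi-table writes)

#### Example 1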
@@ -223,7 +276,7 @@ source {
url = "jdbc:mysql://127.0.0.1:3306/seatunnel"
username = "root"
password = "******"
table-names = ["seatunnel.role","seatunnel.user","galileo.Bucket"]
}
}