From 061f56d14e546fc46ef1d666aa5fa65c32a49006 Mon Sep 17 00:00:00 2001
From: jrmccluskey
Date: Thu, 23 Apr 2026 15:05:17 +0000
Subject: [PATCH] Update managed-io.md for release 2.73.0-RC2.

---
 .../content/en/documentation/io/managed-io.md | 866 +++++++++---------
 1 file changed, 433 insertions(+), 433 deletions(-)

diff --git a/website/www/site/content/en/documentation/io/managed-io.md b/website/www/site/content/en/documentation/io/managed-io.md
index 0e365eeb23e3..46f20c5e11ab 100644
--- a/website/www/site/content/en/documentation/io/managed-io.md
+++ b/website/www/site/content/en/documentation/io/managed-io.md
@@ -58,31 +58,6 @@ and Beam SQL is invoked via the Managed API under the hood.
Read Configuration Write Configuration - - ICEBERG - - table (str)
- catalog_name (str)
- catalog_properties (map[str, str])
- config_properties (map[str, str])
- drop (list[str])
- filter (str)
- keep (list[str])
- - - table (str)
- catalog_name (str)
- catalog_properties (map[str, str])
- config_properties (map[str, str])
- direct_write_byte_limit (int32)
- drop (list[str])
- keep (list[str])
- only (str)
- partition_fields (list[str])
- table_properties (map[str, str])
- triggering_frequency_seconds (int32)
- - ICEBERG_CDC @@ -134,10 +109,37 @@ and Beam SQL is invoked via the Managed API under the hood. - POSTGRES + ICEBERG + + table (str)
+ catalog_name (str)
+ catalog_properties (map[str, str])
+ config_properties (map[str, str])
+ drop (list[str])
+ filter (str)
+ keep (list[str])
+ + + table (str)
+ catalog_name (str)
+ catalog_properties (map[str, str])
+ config_properties (map[str, str])
+ direct_write_byte_limit (int32)
+ drop (list[str])
+ keep (list[str])
+ only (str)
+ partition_fields (list[str])
+ table_properties (map[str, str])
+ triggering_frequency_seconds (int32)
+ + + + MYSQL jdbc_url (str)
+ connection_init_sql (list[str])
connection_properties (str)
+ disable_auto_commit (boolean)
fetch_size (int32)
location (str)
num_partitions (int32)
@@ -151,6 +153,7 @@ and Beam SQL is invoked via the Managed API under the hood. jdbc_url (str)
autosharding (boolean)
batch_size (int64)
+ connection_init_sql (list[str])
connection_properties (str)
location (str)
password (str)
@@ -159,29 +162,10 @@ and Beam SQL is invoked via the Managed API under the hood. - BIGQUERY - - kms_key (str)
- query (str)
- row_restriction (str)
- fields (list[str])
- table (str)
- - - table (str)
- drop (list[str])
- keep (list[str])
- kms_key (str)
- only (str)
- triggering_frequency_seconds (int64)
- - - - SQLSERVER + POSTGRES jdbc_url (str)
connection_properties (str)
- disable_auto_commit (boolean)
fetch_size (int32)
location (str)
num_partitions (int32)
@@ -203,10 +187,9 @@ and Beam SQL is invoked via the Managed API under the hood. - MYSQL + SQLSERVER jdbc_url (str)
- connection_init_sql (list[str])
connection_properties (str)
disable_auto_commit (boolean)
fetch_size (int32)
@@ -222,7 +205,6 @@ and Beam SQL is invoked via the Managed API under the hood. jdbc_url (str)
autosharding (boolean)
batch_size (int64)
- connection_init_sql (list[str])
connection_properties (str)
location (str)
password (str)
@@ -230,12 +212,30 @@ and Beam SQL is invoked via the Managed API under the hood. write_statement (str)
+ + BIGQUERY + + kms_key (str)
+ query (str)
+ row_restriction (str)
+ fields (list[str])
+ table (str)
+ + + table (str)
+ drop (list[str])
+ keep (list[str])
+ kms_key (str)
+ only (str)
+ triggering_frequency_seconds (int64)
+ + ## Configuration Details -### `ICEBERG` Read +### `ICEBERG_CDC` Read
@@ -310,6 +310,28 @@ and Beam SQL is invoked via the Managed API under the hood. SQL-like predicate to filter data at scan time. Example: "id > 5 AND status = 'ACTIVE'". Uses Apache Calcite syntax: https://calcite.apache.org/docs/reference.html + + + + + + + + + + -
+ from_snapshot + + int64 + + Starts reading from this snapshot ID (inclusive). +
+ from_timestamp + + int64 + + Starts reading from the first snapshot (inclusive) that was created after this timestamp (in milliseconds). +
keep @@ -321,155 +343,154 @@ and Beam SQL is invoked via the Managed API under the hood. A subset of column names to read exclusively. If null or empty, all columns will be read.
-
- -### `ICEBERG` Write - -
- - - - + + + +
ConfigurationTypeDescription + poll_interval_seconds + + int32 + + The interval at which to poll for new snapshots. Defaults to 60 seconds. +
- table + starting_strategy str - A fully-qualified table identifier. You may also provide a template to write to multiple dynamic destinations, for example: `dataset.my_{col1}_{col2.nested}_table`. + The source's starting strategy. Valid options are: "earliest" or "latest". Can be overridden by setting a starting snapshot or timestamp. Defaults to earliest for batch, and latest for streaming.
- catalog_name + streaming - str + boolean - Name of the catalog containing the table. + Enables streaming reads, where source continuously polls for snapshots forever.
- catalog_properties + to_snapshot - map[str, str] + int64 - Properties used to set up the Iceberg catalog. + Reads up to this snapshot ID (inclusive).
- config_properties + to_timestamp - map[str, str] + int64 - Properties passed to the Hadoop Configuration. + Reads up to the latest snapshot (inclusive) created before this timestamp (in milliseconds).
+
+ +### `KAFKA` Write + +
+ + + + + +
ConfigurationTypeDescription
- direct_write_byte_limit + bootstrap_servers - int32 + str - For a streaming pipeline, sets the limit for lifting bundles into the direct write path. + A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping; this list only affects the initial hosts used to discover the full set of servers. Format: `host1:port1,host2:port2,...`
- drop + format - list[str] + str - A list of field names to drop from the input record before writing. Is mutually exclusive with 'keep' and 'only'. + The encoding format for the data stored in Kafka. Valid options are: RAW,JSON,AVRO,PROTO
- keep + topic - list[str] + str - A list of field names to keep in the input record. All other fields are dropped before writing. Is mutually exclusive with 'drop' and 'only'. + n/a
- only + file_descriptor_path str - The name of a single record field that should be written. Is mutually exclusive with 'keep' and 'drop'. + The path to the Protocol Buffer File Descriptor Set file. This file is used for schema definition and message serialization.
- partition_fields + message_name - list[str] + str - Fields used to create a partition spec that is applied when tables are created. For a field 'foo', the available partition transforms are: - -- `foo` -- `truncate(foo, N)` -- `bucket(foo, N)` -- `hour(foo)` -- `day(foo)` -- `month(foo)` -- `year(foo)` -- `void(foo)` - -For more information on partition transforms, please visit https://iceberg.apache.org/spec/#partition-transforms. + The name of the Protocol Buffer message to be used for schema extraction and data conversion.
- table_properties + producer_config_updates map[str, str] - Iceberg table properties to be set on the table when it is created. -For more information on table properties, please visit https://iceberg.apache.org/docs/latest/configuration/#table-properties. + A list of key-value pairs that act as configuration parameters for Kafka producers. Most of these configurations will not be needed, but if you need to customize your Kafka producer, you may use this. See a detailed list: https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html
- triggering_frequency_seconds + schema - int32 + str - For a streaming pipeline, sets the frequency at which snapshots are produced. + n/a
-### `ICEBERG_CDC` Read +### `KAFKA` Read
@@ -480,162 +501,251 @@ For more information on table properties, please visit https://iceberg.apache.or + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
- table + bootstrap_servers str - Identifier of the Iceberg table. + A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form `host1:port1,host2:port2,...`
- catalog_name + topic str - Name of the catalog containing the table. + n/a
- catalog_properties + allow_duplicates - map[str, str] + boolean - Properties used to set up the Iceberg catalog. + Whether the Kafka read allows duplicates.
- config_properties + confluent_schema_registry_subject + + str + + n/a +
+ confluent_schema_registry_url + + str + + n/a +
+ consumer_config_updates map[str, str] - Properties passed to the Hadoop Configuration. + A list of key-value pairs that act as configuration parameters for Kafka consumers. Most of these configurations will not be needed, but if you need to customize your Kafka consumer, you may use this. See a detailed list: https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html
- drop + file_descriptor_path - list[str] + str - A subset of column names to exclude from reading. If null or empty, all columns will be read. + The path to the Protocol Buffer File Descriptor Set file. This file is used for schema definition and message serialization.
- filter + format str - SQL-like predicate to filter data at scan time. Example: "id > 5 AND status = 'ACTIVE'". Uses Apache Calcite syntax: https://calcite.apache.org/docs/reference.html + The encoding format for the data stored in Kafka. Valid options are: RAW,STRING,AVRO,JSON,PROTO
- from_snapshot + message_name - int64 + str - Starts reading from this snapshot ID (inclusive). + The name of the Protocol Buffer message to be used for schema extraction and data conversion.
- from_timestamp + offset_deduplication - int64 + boolean + + Whether the redistribute uses offset deduplication mode. +
+ redistribute_by_record_key + + boolean + + Whether the redistribute is keyed by the Kafka record key. +
+ redistribute_num_keys + + int32 + + The number of keys for redistributing Kafka inputs. +
+ redistributed + + boolean + + Whether the Kafka read should be redistributed. +
+ schema + + str + + The schema in which the data is encoded in the Kafka topic. For AVRO data, this is a schema defined with AVRO schema syntax (https://avro.apache.org/docs/1.10.2/spec.html#schemas). For JSON data, this is a schema defined with JSON-schema syntax (https://json-schema.org/). If a URL to Confluent Schema Registry is provided, then this field is ignored, and the schema is fetched from Confluent Schema Registry. +
+
+ +### `ICEBERG` Read + +
+ + + + + + + + +
ConfigurationTypeDescription
+ table + + str - Starts reading from the first snapshot (inclusive) that was created after this timestamp (in milliseconds). + Identifier of the Iceberg table.
- keep + catalog_name - list[str] + str - A subset of column names to read exclusively. If null or empty, all columns will be read. + Name of the catalog containing the table.
- poll_interval_seconds + catalog_properties - int32 + map[str, str] - The interval at which to poll for new snapshots. Defaults to 60 seconds. + Properties used to set up the Iceberg catalog.
- starting_strategy + config_properties - str + map[str, str] - The source's starting strategy. Valid options are: "earliest" or "latest". Can be overriden by setting a starting snapshot or timestamp. Defaults to earliest for batch, and latest for streaming. + Properties passed to the Hadoop Configuration.
- streaming + drop - boolean + list[str] - Enables streaming reads, where source continuously polls for snapshots forever. + A subset of column names to exclude from reading. If null or empty, all columns will be read.
- to_snapshot + filter - int64 + str - Reads up to this snapshot ID (inclusive). + SQL-like predicate to filter data at scan time. Example: "id > 5 AND status = 'ACTIVE'". Uses Apache Calcite syntax: https://calcite.apache.org/docs/reference.html
- to_timestamp + keep - int64 + list[str] - Reads up to the latest snapshot (inclusive) created before this timestamp (in milliseconds). + A subset of column names to read exclusively. If null or empty, all columns will be read.
-### `KAFKA` Write +### `ICEBERG` Write
@@ -646,251 +756,285 @@ For more information on table properties, please visit https://iceberg.apache.or + + + + + -
- bootstrap_servers + table str - A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. | Format: host1:port1,host2:port2,... + A fully-qualified table identifier. You may also provide a template to write to multiple dynamic destinations, for example: `dataset.my_{col1}_{col2.nested}_table`.
- format + catalog_name str - The encoding format for the data stored in Kafka. Valid options are: RAW,JSON,AVRO,PROTO + Name of the catalog containing the table.
- topic + catalog_properties - str + map[str, str] - n/a + Properties used to set up the Iceberg catalog.
- file_descriptor_path + config_properties - str + map[str, str] - The path to the Protocol Buffer File Descriptor Set file. This file is used for schema definition and message serialization. + Properties passed to the Hadoop Configuration.
- message_name + direct_write_byte_limit - str + int32 - The name of the Protocol Buffer message to be used for schema extraction and data conversion. + For a streaming pipeline, sets the limit for lifting bundles into the direct write path.
- producer_config_updates + drop - map[str, str] + list[str] - A list of key-value pairs that act as configuration parameters for Kafka producers. Most of these configurations will not be needed, but if you need to customize your Kafka producer, you may use this. See a detailed list: https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html + A list of field names to drop from the input record before writing. Is mutually exclusive with 'keep' and 'only'.
- schema + keep + + list[str] + + A list of field names to keep in the input record. All other fields are dropped before writing. Is mutually exclusive with 'drop' and 'only'. +
+ only str - n/a + The name of a single record field that should be written. Is mutually exclusive with 'keep' and 'drop'.
-
+ + + partition_fields + + + list[str] + + + Fields used to create a partition spec that is applied when tables are created. For a field 'foo', the available partition transforms are: -### `KAFKA` Read +- `foo` +- `truncate(foo, N)` +- `bucket(foo, N)` +- `hour(foo)` +- `day(foo)` +- `month(foo)` +- `year(foo)` +- `void(foo)` -
- - - - - +For more information on partition transforms, please visit https://iceberg.apache.org/spec/#partition-transforms. + +
ConfigurationTypeDescription
- bootstrap_servers + table_properties - str + map[str, str] - A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form `host1:port1,host2:port2,...` + Iceberg table properties to be set on the table when it is created. +For more information on table properties, please visit https://iceberg.apache.org/docs/latest/configuration/#table-properties.
- topic + triggering_frequency_seconds - str + int32 - n/a + For a streaming pipeline, sets the frequency at which snapshots are produced.
+
+ +### `MYSQL` Read + +
+ + + + + +
ConfigurationTypeDescription
- allow_duplicates + jdbc_url - boolean + str - If the Kafka read allows duplicates. + Connection URL for the JDBC source.
- confluent_schema_registry_subject + connection_init_sql - str + list[str] - n/a + Sets the connection initialization SQL statements used by the driver. Only MySQL and MariaDB support this.
- confluent_schema_registry_url + connection_properties str - n/a + Used to set connection properties passed to the JDBC driver not already defined as standalone parameter (e.g. username and password can be set using parameters above accordingly). Format of the string must be "key1=value1;key2=value2;".
- consumer_config_updates + disable_auto_commit - map[str, str] + boolean - A list of key-value pairs that act as configuration parameters for Kafka consumers. Most of these configurations will not be needed, but if you need to customize your Kafka consumer, you may use this. See a detailed list: https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html + Whether to disable auto commit on read. Defaults to true if not provided. The need for this config varies depending on the database platform. Informix requires this to be set to false while Postgres requires this to be set to true.
- file_descriptor_path + fetch_size - str + int32 - The path to the Protocol Buffer File Descriptor Set file. This file is used for schema definition and message serialization. + Overrides the size of the data fetched and loaded into memory on each database call. Use this ONLY if the default value causes memory errors.
- format + location str - The encoding format for the data stored in Kafka. Valid options are: RAW,STRING,AVRO,JSON,PROTO + Name of the table to read from.
- message_name + num_partitions - str + int32 - The name of the Protocol Buffer message to be used for schema extraction and data conversion. + The number of partitions to split the read into.
- offset_deduplication + output_parallelization boolean - If the redistribute is using offset deduplication mode. + Whether to reshuffle the resulting PCollection so results are distributed to all workers.
- redistribute_by_record_key + partition_column - boolean + str - If the redistribute keys by the Kafka record key. + Name of a column of numeric type that will be used for partitioning.
- redistribute_num_keys + password - int32 + str - The number of keys for redistributing Kafka inputs. + Password for the JDBC source.
- redistributed + read_query - boolean + str - If the Kafka read should be redistributed. + SQL query used to query the JDBC source.
- schema + username str - The schema in which the data is encoded in the Kafka topic. For AVRO data, this is a schema defined with AVRO schema syntax (https://avro.apache.org/docs/1.10.2/spec.html#schemas). For JSON data, this is a schema defined with JSON-schema syntax (https://json-schema.org/). If a URL to Confluent Schema Registry is provided, then this field is ignored, and the schema is fetched from Confluent Schema Registry. + Username for the JDBC source.
-### `POSTGRES` Write +### `MYSQL` Write
@@ -932,6 +1076,17 @@ For more information on table properties, please visit https://iceberg.apache.or n/a + + + + +
+ connection_init_sql + + list[str] + + Sets the connection initialization SQL statements used by the driver. Only MySQL and MariaDB support this. +
connection_properties @@ -990,7 +1145,7 @@ For more information on table properties, please visit https://iceberg.apache.or
-### `POSTGRES` Read +### `POSTGRES` Write
@@ -1007,73 +1162,51 @@ For more information on table properties, please visit https://iceberg.apache.or str - - - - - - - - - - @@ -1089,30 +1222,30 @@ For more information on table properties, please visit https://iceberg.apache.or
- Connection URL for the JDBC source. -
- connection_properties - - str - - Used to set connection properties passed to the JDBC driver not already defined as standalone parameter (e.g. username and password can be set using parameters above accordingly). Format of the string must be "key1=value1;key2=value2;". -
- fetch_size - - int32 - - This method is used to override the size of the data that is going to be fetched and loaded in memory per every database call. It should ONLY be used if the default value throws memory errors. + Connection URL for the JDBC sink.
- location + autosharding - str + boolean - Name of the table to read from. + If true, enables using a dynamically determined number of shards to write.
- num_partitions + batch_size - int32 + int64 - The number of partitions + n/a
- output_parallelization + connection_properties - boolean + str - Whether to reshuffle the resulting PCollection so results are distributed to all workers. + Used to set connection properties passed to the JDBC driver not already defined as standalone parameter (e.g. username and password can be set using parameters above accordingly). Format of the string must be "key1=value1;key2=value2;".
- partition_column + location str - Name of a column of numeric type that will be used for partitioning. + Name of the table to write to.
- read_query + username str - SQL query used to query the JDBC source. + Username for the JDBC sink.
- username + write_statement str - Username for the JDBC source. + SQL query used to insert records into the JDBC sink.
-### `BIGQUERY` Read +### `POSTGRES` Read
@@ -1123,135 +1256,112 @@ For more information on table properties, please visit https://iceberg.apache.or - - - - - -
- kms_key - - str - - Use this Cloud KMS key to encrypt your data -
- query + jdbc_url str - The SQL query to be executed to read from the BigQuery table. + Connection URL for the JDBC source.
- row_restriction + connection_properties str - Read only rows that match this filter, which must be compatible with Google standard SQL. This is not supported when reading via query. + Used to set connection properties passed to the JDBC driver not already defined as standalone parameter (e.g. username and password can be set using parameters above accordingly). Format of the string must be "key1=value1;key2=value2;".
- fields + fetch_size - list[str] + int32 - Read only the specified fields (columns) from a BigQuery table. Fields may not be returned in the order specified. If no value is specified, then all fields are returned. Example: "col1, col2, col3" + Overrides the size of the data fetched and loaded into memory on each database call. Use this ONLY if the default value causes memory errors.
- table + location str - The fully-qualified name of the BigQuery table to read from. Format: [${PROJECT}:]${DATASET}.${TABLE} + Name of the table to read from.
-
- -### `BIGQUERY` Write - -
- - - - - -
ConfigurationTypeDescription
- table + num_partitions - str + int32 - The bigquery table to write to. Format: [${PROJECT}:]${DATASET}.${TABLE} + The number of partitions to split the read into.
- drop + output_parallelization - list[str] + boolean - A list of field names to drop from the input record before writing. Is mutually exclusive with 'keep' and 'only'. + Whether to reshuffle the resulting PCollection so results are distributed to all workers.
- keep + partition_column - list[str] + str - A list of field names to keep in the input record. All other fields are dropped before writing. Is mutually exclusive with 'drop' and 'only'. + Name of a column of numeric type that will be used for partitioning.
- kms_key + password str - Use this Cloud KMS key to encrypt your data + Password for the JDBC source.
- only + read_query str - The name of a single record field that should be written. Is mutually exclusive with 'keep' and 'drop'. + SQL query used to query the JDBC source.
- triggering_frequency_seconds + username - int64 + str - Determines how often to 'commit' progress into BigQuery. Default is every 5 seconds. + Username for the JDBC source.
@@ -1490,7 +1600,7 @@ For more information on table properties, please visit https://iceberg.apache.or
-### `MYSQL` Read +### `BIGQUERY` Read
@@ -1501,140 +1611,63 @@ For more information on table properties, please visit https://iceberg.apache.or - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- jdbc_url - - str - - Connection URL for the JDBC source. -
- connection_init_sql - - list[str] - - Sets the connection init sql statements used by the Driver. Only MySQL and MariaDB support this. -
- connection_properties - - str - - Used to set connection properties passed to the JDBC driver not already defined as standalone parameter (e.g. username and password can be set using parameters above accordingly). Format of the string must be "key1=value1;key2=value2;". -
- disable_auto_commit - - boolean - - Whether to disable auto commit on read. Defaults to true if not provided. The need for this config varies depending on the database platform. Informix requires this to be set to false while Postgres requires this to be set to true. -
- fetch_size - - int32 - - This method is used to override the size of the data that is going to be fetched and loaded in memory per every database call. It should ONLY be used if the default value throws memory errors. -
- location + kms_key str - Name of the table to read from. -
- num_partitions - - int32 - - The number of partitions -
- output_parallelization - - boolean - - Whether to reshuffle the resulting PCollection so results are distributed to all workers. + Use this Cloud KMS key to encrypt your data
- partition_column + query str - Name of a column of numeric type that will be used for partitioning. + The SQL query to be executed to read from the BigQuery table.
- password + row_restriction str - Password for the JDBC source. + Read only rows that match this filter, which must be compatible with Google standard SQL. This is not supported when reading via query.
- read_query + fields - str + list[str] - SQL query used to query the JDBC source. + Read only the specified fields (columns) from a BigQuery table. Fields may not be returned in the order specified. If no value is specified, then all fields are returned. Example: "col1, col2, col3"
- username + table str - Username for the JDBC source. + The fully-qualified name of the BigQuery table to read from. Format: [${PROJECT}:]${DATASET}.${TABLE}
-### `MYSQL` Write +### `BIGQUERY` Write
@@ -1645,101 +1678,68 @@ For more information on table properties, please visit https://iceberg.apache.or - - - - - - - - - - - - - - -
- jdbc_url + table str - Connection URL for the JDBC sink. -
- autosharding - - boolean - - If true, enables using a dynamically determined number of shards to write. -
- batch_size - - int64 - - n/a + The BigQuery table to write to. Format: [${PROJECT}:]${DATASET}.${TABLE}
- connection_init_sql + drop list[str] - Sets the connection init sql statements used by the Driver. Only MySQL and MariaDB support this. -
- connection_properties - - str - - Used to set connection properties passed to the JDBC driver not already defined as standalone parameter (e.g. username and password can be set using parameters above accordingly). Format of the string must be "key1=value1;key2=value2;". + A list of field names to drop from the input record before writing. Is mutually exclusive with 'keep' and 'only'.
- location + keep - str + list[str] - Name of the table to write to. + A list of field names to keep in the input record. All other fields are dropped before writing. Is mutually exclusive with 'drop' and 'only'.
- password + kms_key str - Password for the JDBC source. + Use this Cloud KMS key to encrypt your data
- username + only str - Username for the JDBC source. + The name of a single record field that should be written. Is mutually exclusive with 'keep' and 'drop'.
- write_statement + triggering_frequency_seconds - str + int64 - SQL query used to insert records into the JDBC sink. + Determines how often to 'commit' progress into BigQuery. Default is every 5 seconds.