docs/source/dataflow_spec_ref_cdc.rst
23 additions & 22 deletions
@@ -18,7 +18,7 @@ The ``cdcSettings`` and ``cdcSnapshotSettings`` enable and pass configuration in
 - See :ref:`cdcSnapshotSettings` for more information.

 cdcSettings
-~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~

 The ``cdcSettings`` object contains the following properties:

@@ -36,7 +36,7 @@ The ``cdcSettings`` object contains the following properties:
 - The column name specifying the logical order of CDC events in the source data. Delta Live Tables uses this sequencing to handle change events that arrive out of order.
 * - **scd_type**
 - ``string``
-- Whether to store records as SCD type 1 or SCD type 2. Set to ``1`` for SCD type 1 or 2 for SCD type ``2``.
+- Whether to store records as SCD type 1 or SCD type 2. Set to ``1`` for SCD type 1 or ``2`` for SCD type 2.
 * - **apply_as_deletes**
 - ``string``
 - (*optional*) Specifies when a CDC event should be treated as a DELETE rather than an upsert.
If ``recursiveFileLookup`` is set to ``true``, ensure that the ``path`` parameter is compatible with recursive directory traversal. When using the ``{version}`` placeholder, place it in the directory portion of the path rather than the filename (e.g. ``/data/{version}/file.parquet``). When using regex named capture groups, the pattern spans the full relative path from the first dynamic segment, so ``recursiveFileLookup`` must be ``true`` if the version spans multiple directory levels.

+The ``source`` object contains the following properties for ``table`` based sources:
+
+.. list-table::
+:header-rows: 1
+
+* - Parameter
+- Type
+- Description
+* - **table**
+- ``string``
+- The table name to load the source data from, as either a 2-part ``schema.table`` (resolving to *<pipeline_target>*.schema.table) or 3-part ``catalog.schema.table`` identifier.
+* - **versionColumn**
+- ``string``
+- The column name to use for versioning.
+* - **startingVersion**
+- ``string`` or ``integer``
+- (*optional*) The version to start processing from.
+* - **selectExp**
+- ``list``
+- (*optional*) A list of select expressions to apply to the source data.
+
 .. _file-path-patterns:

 File Path Patterns
@@ -235,23 +256,3 @@ File Path Patterns

 See ``samples/bronze_sample/src/dataflows/feature_samples/dataflowspec/historical_snapshot_files_datetime_recursive_and_partitioned_regex_main.json`` for a complete working example.

-The ``source`` object contains the following properties for ``table`` based sources:
-
-.. list-table::
-:header-rows: 1
-
-* - Parameter
-- Type
-- Description
-* - **table**
-- ``string``
-- The table name to load the source data from.
-* - **versionColumn**
-- ``string``
-- The column name to use for versioning.
-* - **startingVersion**
-- ``string`` or ``integer``
-- (*optional*) The version to start processing from.
-* - **selectExp**
-- ``list``
-- (*optional*) A list of select expressions to apply to the source data.
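
The updated ``table`` description distinguishes 2-part and 3-part identifiers. The rule it states can be sketched in Python as follows. This is only an illustration: the function name ``resolve_table_name``, the ``pipeline_target`` argument, and the sample values in ``source`` are hypothetical, and the framework's own resolution logic may differ (for example, in how it treats quoted names that contain dots).

```python
def resolve_table_name(table: str, pipeline_target: str) -> str:
    """Expand a 2-part name with the pipeline target; pass 3-part names through."""
    parts = table.split(".")
    if len(parts) == 2:
        # schema.table -> <pipeline_target>.schema.table
        return f"{pipeline_target}.{table}"
    if len(parts) == 3:
        # catalog.schema.table is already fully qualified
        return table
    raise ValueError(f"expected a 2-part or 3-part identifier, got {table!r}")


# Hypothetical ``source`` object using the four documented properties.
# The values are made up for illustration only.
source = {
    "table": "sales.orders_snapshot",
    "versionColumn": "snapshot_date",
    "startingVersion": "2024-01-01",
    "selectExp": ["*"],
}

resolved = resolve_table_name(source["table"], pipeline_target="main")
print(resolved)
```

A 1-part name is rejected here because the documentation describes only the 2-part and 3-part forms; whether the framework accepts other forms is not stated in the diff.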
self.logger.debug(f"CDC Snapshot: Skipping version {version} because it is less than or equal to the latest snapshot version {latest_snapshot_version}")
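
The debug message above implies that a snapshot version is skipped when it is less than or equal to the latest snapshot version already processed. A minimal sketch of that comparison follows. It assumes versions are mutually comparable values (for example, all integers), processes them in ascending order, and treats a missing latest version as "nothing processed yet"; the helper name ``pending_versions`` and these choices are assumptions, not the framework's code.

```python
import logging

logger = logging.getLogger("cdc_snapshot_sketch")


def pending_versions(versions, latest_snapshot_version):
    """Return the versions still to process, in ascending order."""
    pending = []
    for version in sorted(versions):
        # Skip anything at or below the latest processed snapshot version.
        if latest_snapshot_version is not None and version <= latest_snapshot_version:
            logger.debug(
                "CDC Snapshot: Skipping version %s because it is less than or "
                "equal to the latest snapshot version %s",
                version,
                latest_snapshot_version,
            )
            continue
        pending.append(version)
    return pending
```

For instance, ``pending_versions([3, 1, 2, 5], 2)`` keeps only versions ``3`` and ``5``.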