fix(cdc_snapshot): add Spark binaryFile fallback for dbutils.fs.ls() and fix parquet directory enumeration in historical CDC snapshot (#83)
* fix: add Spark binaryFile fallback for dbutils.fs.ls() and fix parquet directory enumeration in historical CDC snapshot
- _list_files now tries dbutils.fs.ls() first; falls back to Spark
binaryFile on any exception (e.g. Py4JSecurityException in Serverless
with Restricted Access / SEG)
- Fix bug where dbutils.fs().ls() was called instead of dbutils.fs.ls()
- binaryFile fallback stops at .parquet directories and deduplicates
part files so each snapshot version is counted once
- dbutils path also guards against recursing into .parquet directories
(trailing "/" stripped before the endswith check)
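
A minimal sketch of the fallback and deduplication described above, not the code from this change. The names `snapshot_root` and `list_snapshots` are hypothetical, and the two listing callables are stand-ins (assumed to return plain path strings) for a `dbutils.fs.ls()` based lister and a Spark binaryFile based lister, so the example runs without Databricks or Spark:

```python
from typing import Callable, Iterable, List


def snapshot_root(path: str) -> str:
    """Collapse a path to its first ".parquet" segment, if it has one.

    The trailing "/" is stripped first so that directory paths such as
    "/data/v1.parquet/" still match the ".parquet" suffix check.
    """
    parts = path.rstrip("/").split("/")
    for i, segment in enumerate(parts):
        if segment.endswith(".parquet"):
            # Stop at the .parquet directory; do not descend into part files.
            return "/".join(parts[: i + 1])
    return "/".join(parts)


def list_snapshots(
    root: str,
    primary: Callable[[str], Iterable[str]],   # hypothetical dbutils-based lister
    fallback: Callable[[str], Iterable[str]],  # hypothetical binaryFile-based lister
) -> List[str]:
    """Try the primary lister; on any exception use the fallback lister."""
    try:
        paths = list(primary(root))
    except Exception:
        # Assumption: any failure (e.g. a security exception) triggers the fallback.
        paths = list(fallback(root))

    # A dict keeps first-seen order while dropping duplicate snapshot roots,
    # so many part files under one .parquet directory count once.
    unique = {}
    for p in paths:
        unique.setdefault(snapshot_root(p), None)
    return list(unique)
```

Passing the listers in as arguments keeps the deduplication logic testable on its own; the real change may structure this differently.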
* bump version
* resolve merge and improve log message
* bump version
---------
Co-authored-by: rederik76 <rederik76@gmail.com>