Merged
10 changes: 1 addition & 9 deletions delta-sharing/change-data-feed/Oracle Delta Sharing CDF.ipynb
@@ -18,14 +18,6 @@
"This notebook reads **Change Data Feed (CDF)** from an Oracle Autonomous Database\n",
"that publishes a Delta Sharing endpoint, and displays the raw change rows.\n",
"\n",
"## Why a custom REST approach?\n",
"\n",
"| Problem | Root cause | Workaround |\n",
"|---|---|---|\n",
"| `spark.read.format(\"deltaSharing\")` throws `InvocationTargetException` | Spark's Java Delta Sharing connector is incompatible with Oracle endpoints on serverless compute | Use the Python `delta_sharing` REST client instead |\n",
"| `load_table_changes_as_pandas()` throws `KeyError: '_commit_timestamp'` | Oracle's file-level CDF omits the `_commit_timestamp` column that the library expects | Call the REST API directly and parse with `DeltaSharingReader._to_pandas()` |\n",
"| `spark.read.parquet(*urls)` throws `UNSUPPORTED_FILE_SYSTEM` | Spark can't read HTTPS pre-signed URLs from Oracle object storage | Download via HTTP with `_to_pandas()`, then convert to Spark DataFrame |\n",
"\n",
"## Oracle's file-level CDF\n",
"\n",
"Oracle implements CDF at the **file level**, not the row level. When *any* row in a\n",
@@ -247,4 +239,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}
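
For readers of this diff, the "call the REST API directly" workaround named in the removed table can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. It assumes the endpoint follows the open Delta Sharing protocol's table-changes route (`/shares/{share}/schemas/{schema}/tables/{table}/changes`) and returns newline-delimited JSON. The helper names `changes_url` and `parse_changes_response`, the sample host, and the sample payload are all hypothetical. No network call is made here; fetching the URL with a bearer token and reading the Parquet files is left out.

```python
import json
from urllib.parse import quote, urlencode


def changes_url(endpoint, share, schema, table, starting_version):
    """Build a table-changes URL (hypothetical helper, assumed route)."""
    path = "/shares/{}/schemas/{}/tables/{}/changes".format(
        quote(share, safe=""), quote(schema, safe=""), quote(table, safe="")
    )
    query = urlencode({"startingVersion": starting_version})
    return endpoint.rstrip("/") + path + "?" + query


def parse_changes_response(body):
    """Split an NDJSON response body into (action, file_info) pairs.

    Lines describing the protocol or table metadata are skipped; only
    file actions ("add", "cdf", "remove") are collected.
    """
    files = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        for action in ("add", "cdf", "remove"):
            if action in record:
                files.append((action, record[action]))
    return files


# Made-up response body, used only to exercise the parser.
sample_body = "\n".join([
    json.dumps({"protocol": {"minReaderVersion": 1}}),
    json.dumps({"metaData": {"id": "hypothetical-table-id"}}),
    json.dumps({"add": {"url": "https://example.invalid/part-0.parquet", "version": 1}}),
    json.dumps({"remove": {"url": "https://example.invalid/part-1.parquet", "version": 2}}),
])

sample_files = parse_changes_response(sample_body)
sample_url = changes_url("https://example.invalid/delta-sharing/", "s", "sc", "t", 0)
```

Each returned pair carries the pre-signed `url` of one changed file, which is why a file-level feed needs a separate download step before the rows can be loaded into a DataFrame.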