Commit 656275e
fix: add S3 timeouts to GlueCatalog to prevent PyIceberg thread pool hangs
PyIceberg's ExecutorFactory creates a singleton ThreadPoolExecutor. During
plan_files(), PyIceberg uses executor.map() on that pool to read Iceberg
manifest files from S3 in parallel, and plan_files() is invoked even by
Polars' native Iceberg reader path at collect() time.
When an S3 connection enters the CLOSE_WAIT state (the server has closed
the connection but the client has not), the PyArrow S3FileSystem read
blocks indefinitely because no request timeout is configured. As a result,
executor.map() never returns, collect() never completes, and the Dagster
step hangs forever while holding a concurrency slot.
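
The failure mode can be reproduced with the standard library alone. The sketch below is not PyIceberg code: `read_manifest`, `plan_files`, and the file names are hypothetical stand-ins, and a `threading.Event` that is never set plays the role of the stuck S3 connection. It shows how one blocked task stalls the consumer of `executor.map()`, and how a bounded wait turns the stall into an exception.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def read_manifest(path, stuck, request_timeout=None):
    """Hypothetical stand-in for one S3 manifest read."""
    if path.endswith("stuck.avro"):
        # Event.wait() returns False when the timeout elapses first.
        # With request_timeout=None it would block forever, which is
        # the unbounded read described in the commit message.
        if not stuck.wait(timeout=request_timeout):
            raise TimeoutError(f"read of {path} exceeded {request_timeout}s")
    return f"contents of {path}"


def plan_files(paths, request_timeout):
    """Read all manifests in parallel, as executor.map() would."""
    never_set = threading.Event()  # simulates a connection that never answers
    with ThreadPoolExecutor(max_workers=4) as pool:
        # list() consumes the results in order; an exception raised in a
        # worker is re-raised here when its result is reached.
        return list(
            pool.map(lambda p: read_manifest(p, never_set, request_timeout), paths)
        )


if __name__ == "__main__":
    print(plan_files(["a.avro", "b.avro"], request_timeout=0.1))
    try:
        plan_files(["a.avro", "stuck.avro"], request_timeout=0.1)
    except TimeoutError as exc:
        print(f"failed fast instead of hanging: {exc}")
```

Calling `plan_files()` with `request_timeout=None` and a stuck path would never return, so the demo only exercises the bounded case.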
Setting s3.connect-timeout and s3.request-timeout on the GlueCatalog
causes PyArrowFileIO to pass these values to pyarrow.fs.S3FileSystem,
bounding stuck S3 reads to 120 seconds. Affected steps will now fail
with a recoverable TimeoutError rather than hanging indefinitely.
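
The changed lines themselves are not reproduced on this page, so the following is only a sketch of what such a configuration might look like. The property keys `s3.connect-timeout` and `s3.request-timeout` come from the commit message, and the 120-second request timeout matches the bound it states; the helper names, the catalog name, and the 10-second connect timeout are assumptions made for illustration.

```python
def glue_catalog_properties(connect_timeout_s=10.0, request_timeout_s=120.0):
    """Build catalog properties with bounded S3 timeouts (values in seconds).

    The connect timeout default is an illustrative assumption; only the
    120-second read bound is stated in the commit message.
    """
    return {
        "type": "glue",
        "s3.connect-timeout": str(connect_timeout_s),
        "s3.request-timeout": str(request_timeout_s),
    }


def load_glue_catalog(name="default"):
    """Hypothetical helper: load a Glue catalog with the timeouts applied."""
    # Imported lazily so the properties helper works without pyiceberg installed.
    from pyiceberg.catalog import load_catalog

    return load_catalog(name, **glue_catalog_properties())


if __name__ == "__main__":
    print(glue_catalog_properties())
```

Keeping the properties in a small helper makes the timeout values easy to assert on in a unit test without AWS credentials or a live Glue catalog.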
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 4c5651f commit 656275e
1 file changed
Lines changed: 8 additions & 1 deletion
Diff summary: original line 109 replaced by new lines 109-116 (line contents not preserved).