Commit 656275e

blarghmatey and Copilot committed
fix: add S3 timeouts to GlueCatalog to prevent PyIceberg thread pool hangs
PyIceberg's ExecutorFactory creates a singleton ThreadPoolExecutor that uses executor.map() to read Iceberg manifest files from S3 in parallel during plan_files(), which is invoked even by Polars' native Iceberg reader path at collect() time.

When an S3 connection enters CLOSE_WAIT state (server closed connection, client not yet), the PyArrow S3FileSystem read blocks indefinitely because no request timeout is configured. This causes executor.map() to never return, collect() to never complete, and the Dagster step to hang forever holding a concurrency slot.

Setting s3.connect-timeout and s3.request-timeout on the GlueCatalog causes PyArrowFileIO to pass these values to pyarrow.fs.S3FileSystem, bounding stuck S3 reads to 120 seconds. Affected steps will now fail with a recoverable TimeoutError rather than hanging indefinitely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
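
The hang described in the commit message can be sketched with the standard library alone. This is not PyIceberg's code: `read_manifest` and `read_all` are hypothetical names, the stalled connection is simulated with a `threading.Event` that is never set, and `TimeoutError` is chosen for illustration (the exception actually surfaced by PyArrow may differ). The point is only that `executor.map()` cannot finish until every task finishes, so a bounded read turns a hang into an error.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional

# Hypothetical stand-in for a stalled S3 connection: an Event that is never set.
_never_ready = threading.Event()


def read_manifest(path: str, request_timeout: Optional[float]) -> str:
    """Simulate one manifest read; paths containing 'stuck' never receive data."""
    if "stuck" in path:
        # With request_timeout=None this wait would block forever,
        # which mirrors a read that has no request timeout configured.
        got_data = _never_ready.wait(timeout=request_timeout)
        if not got_data:
            raise TimeoutError(f"read of {path} exceeded {request_timeout}s")
    return f"contents of {path}"


def read_all(paths: List[str], request_timeout: Optional[float]) -> List[str]:
    """Read all paths in parallel; one stuck task stalls the whole call."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() yields results in input order and re-raises a task's exception
        # when its result is reached, so a bounded read surfaces as an error.
        return list(pool.map(lambda p: read_manifest(p, request_timeout), paths))
```

Calling `read_all(["a.avro", "stuck.avro"], request_timeout=None)` would never return in this sketch, while passing a finite timeout makes the same call raise after roughly that many seconds.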
1 parent 4c5651f commit 656275e

1 file changed

Lines changed: 8 additions & 1 deletion

packages/ol-orchestrate-lib/src/ol_orchestrate/lib/glue_helper.py

@@ -106,7 +106,14 @@ def get_dbt_model_as_dataframe(database_name: str, table_name: str) -> pl.LazyFr
         KeyError: If the table metadata doesn't contain the expected fields
         boto3 exceptions: If the AWS Glue API call fails
     """
-    glue = GlueCatalog("default", client=boto3.client("glue", region_name="us-east-1"))
+    glue = GlueCatalog(
+        "default",
+        client=boto3.client("glue", region_name="us-east-1"),
+        **{
+            "s3.connect-timeout": "10",
+            "s3.request-timeout": "120",
+        },
+    )
     table = glue.load_table(f"{database_name}.{table_name}")
 
     return table.to_polars()
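
The diff passes the timeouts through `**{...}` because keys such as `s3.connect-timeout` contain dots and hyphens and so cannot be written as ordinary keyword arguments. A minimal sketch of that pattern follows, under stated assumptions: `s3_timeout_properties` is a hypothetical helper that does not exist in this repository, and `make_catalog` is a stand-in for any constructor with a `**properties` catch-all, not the real `GlueCatalog`.

```python
from typing import Any, Dict


def s3_timeout_properties(
    connect_seconds: float = 10, request_seconds: float = 120
) -> Dict[str, str]:
    """Hypothetical helper: build the two timeout properties as string values."""
    if connect_seconds <= 0 or request_seconds <= 0:
        raise ValueError("timeouts must be positive")
    return {
        # Values are strings, matching the form used in the diff above.
        "s3.connect-timeout": f"{connect_seconds:g}",
        "s3.request-timeout": f"{request_seconds:g}",
    }


def make_catalog(name: str, client: Any = None, **properties: str) -> Dict[str, str]:
    """Stand-in constructor (assumption): returns the properties it received."""
    return dict(properties)


# Dict unpacking is the only way to pass keys that are not valid identifiers.
props = s3_timeout_properties()
received = make_catalog("default", client=None, **props)
```

Keeping the values in one helper would make it easier to reuse the same limits wherever a catalog is constructed, though the commit itself inlines them.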
