Commit 7be6cff
fix: finalize PyArrow S3 threads after Iceberg read in instructor_onboarding (#2221)
PyArrowFileIO initializes the C++ AWS SDK, which spawns non-daemon threads.
These threads block subprocess exit, so the Dagster step subprocess hangs
after STEP_SUCCESS and the multiprocess executor never logs
'parent process exiting'.
Calling pa_fs.finalize_s3() after collect(), the last PyArrow S3 read in
this asset, shuts down the C++ thread pool and allows clean subprocess exit.
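
The pattern described above can be sketched roughly as follows. This is an illustration only, not the code from this commit: the helper name `read_then_finalize_s3` and the injectable `finalize` parameter are hypothetical, and the `try`/`finally` wrapper is a design variant chosen here so that finalization also runs when the read raises. The only real library call is `pyarrow.fs.finalize_s3()`.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def _finalize_pyarrow_s3() -> None:
    # Imported lazily so this module loads even where PyArrow is absent.
    import pyarrow.fs as pa_fs

    # Shuts down the S3 subsystem (and its C++ threads) for this process.
    pa_fs.finalize_s3()


def read_then_finalize_s3(
    read: Callable[[], T],
    finalize: Callable[[], None] = _finalize_pyarrow_s3,
) -> T:
    """Run `read` (hypothetically, the last S3 read in the process),
    then finalize S3 whether or not the read succeeded."""
    try:
        return read()
    finally:
        finalize()
```

A call site might look like `df = read_then_finalize_s3(lambda: lazy_frame.collect())`, where `lazy_frame` is a placeholder for whatever object performs the Iceberg read. This should only wrap the final S3 read, since S3 access from the same process is generally not expected to work after finalization.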
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 639d305 commit 7be6cff
1 file changed
Lines changed: 13 additions & 1 deletion
| Original file lines | Diff (new file) lines | Change |
|---|---|---|
| (none) | 13 | 1 line added |
| 80 | 81-88 | 1 line deleted, 8 lines added |
| (none) | 90-93 | 4 lines added |