docs/user-guide/performance.md: 22 additions & 0 deletions
@@ -217,6 +217,28 @@ Lower concurrency values may be beneficial when:
 - Memory is constrained (each concurrent operation requires buffer space)
 - Using Zarr within a parallel computing framework (see below)
 
+### Thread pool size (`threading.max_workers`)
+
+When synchronous Zarr code calls async operations internally, Zarr uses a
+`ThreadPoolExecutor` to run those coroutines. The `threading.max_workers`
+configuration option controls the maximum number of worker threads in that pool.
+By default it is `None`, which lets Python choose the pool size (typically
+`min(32, os.cpu_count() + 4)`).
+
+You can set it explicitly when you want more predictable resource usage:
+
+```python
+import zarr
+
+zarr.config.set({'threading.max_workers': 8})
+```
+
+Reducing this value can help avoid overloading the event loop when Zarr is used
+inside a parallel computing framework such as Dask that already manages its own
+thread pool (see the Dask section below). Increasing it may improve throughput
+in CPU-bound workloads where many synchronous-to-async dispatches happen
+concurrently.
+
 ### Using Zarr with Dask
 
 [Dask](https://www.dask.org/) is a popular parallel computing library that works well with Zarr for processing large arrays. When using Zarr with Dask, it's important to consider the interaction between Dask's thread pool and Zarr's concurrency settings.
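
To illustrate what a `max_workers` cap means in practice, here is a minimal standard-library sketch. It does not use Zarr at all: `default_pool_size` and `peak_concurrency` are hypothetical helper names introduced for this illustration, and the sleeping tasks are only a stand-in for blocking work. It assumes the default-size formula quoted in the added text, with a guard for `os.cpu_count()` returning `None`.

```python
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


def default_pool_size() -> int:
    """Pool size from the formula quoted in the docs text.

    os.cpu_count() can return None, so fall back to 1 in that case.
    """
    return min(32, (os.cpu_count() or 1) + 4)


def peak_concurrency(max_workers: Optional[int], n_tasks: int = 64) -> int:
    """Run n_tasks short blocking tasks and return the most that ran at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def task() -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for blocking work (hypothetical workload)
        with lock:
            running -= 1

    # The executor never starts more than max_workers threads.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for _ in range(n_tasks):
            pool.submit(task)
    return peak


if __name__ == "__main__":
    print("default pool size:", default_pool_size())
    print("peak with max_workers=8:", peak_concurrency(8))
```

With `max_workers=8`, the reported peak never exceeds 8 regardless of how many tasks are submitted, which is the "predictable resource usage" the added section refers to. Whether a smaller or larger cap helps a real Zarr workload depends on the workload and is not shown by this sketch.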