Commit 060f63d

committed
1.0.1
1 parent 343ff86 commit 060f63d

1 file changed

Lines changed: 23 additions & 16 deletions

File tree

README.md

@@ -3,9 +3,9 @@
33
Make multi-threaded concurrency backward- and forward-compatible for the free-threaded future of Python.
44

55

6-
### Multi-Threading In and Out of Free-Threading
6+
## Multi-Threading In and Out of Free-Threading
77

8-
The following is a table of performance results for the execution of a function across each row of NumPy array, with `python3.14t` and `python3.14`, and with and without using a `ThreadPoolExecutor`. Performance improves with `python3.14t` but degrades with `python3.14`.
8+
The following is a table of performance results for the execution of a function across each row of a NumPy array ([code](#example)), with (no [GIL](https://docs.python.org/3/glossary.html#term-global-interpreter-lock)) `python3.14t` and (GIL enabled) `python3.14`, and with and without `ThreadPoolExecutor`. Performance improves with `python3.14t` but degrades with `python3.14`.
99

1010
|Interpreter |Executor |Duration|
1111
|------------|------------------------------|--------|
@@ -14,7 +14,7 @@ The following is a table of performance results for the execution of a function
1414
|python3.14 |None |🟡 0.544 |
1515
|python3.14 |ThreadPoolExecutor |🔴 2.231 |
1616

17-
`ConditionalThreadPoolExecutor` lets a single interface get the best result in both contexts.
17+
`ConditionalThreadPoolExecutor` provides a single interface to get the best result in either context.
1818

1919
|Interpreter |Executor |Duration|
2020
|------------|------------------------------|--------|
@@ -24,28 +24,27 @@ The following is a table of performance results for the execution of a function
2424
|python3.14 |ConditionalThreadPoolExecutor |🟡 0.532 |
2525

2626

27-
### Introduction
27+
## Introduction
2828

29-
The new free-threaded version of Python (with the [GIL](https://docs.python.org/3/glossary.html#term-global-interpreter-lock) disabled) offers extraordinary improvement in performance of CPU-bound processes. Upgrading your code to take advantage of this performance, however, is problematic. The same multi-threaded code, if run with the GIL enabled, can actually perform significantly worse than single-threaded execution. Worse, even when using a free-threaded interpreter, importing an incompatible C-extension will automatically re-enable the GIL.
29+
The new free-threaded version of Python (with the GIL disabled) offers extraordinary performance improvements for multi-threaded, CPU-bound processes. Upgrading your code to take advantage of this performance, however, is problematic. The same multi-threaded code, if run with the GIL enabled, can actually perform significantly worse than single-threaded execution. Even when using a free-threaded interpreter, importing an incompatible C extension will automatically re-enable the GIL.
3030

31-
For code that will run in multiple interpreters, we need interfaces that perform multi-threaded processing only when the GIL is disabled.
31+
For code that will run across many interpreters with or without the GIL, we need interfaces that perform multi-threaded processing only when the GIL is disabled.
3232

3333
The `conditional-futures` package provides `ConditionalThreadPoolExecutor`, a drop-in replacement for `ThreadPoolExecutor` that adapts based on the runtime state of the GIL.
3434

35-
When running under free-threaded Python with the GIL disabled `ConditionalThreadPoolExecutor` behaves like a normal thread pool. When running under a GIL-enabled build, it falls back on single-threaded execution, potentially avoiding a significant degradation in performance. The same implementation thus offers optimal performance in all contexts.
35+
When running under free-threaded Python with the GIL disabled, `ConditionalThreadPoolExecutor` behaves like a normal thread pool. When running under a GIL-enabled build, it falls back to single-threaded execution, potentially avoiding a significant degradation in performance. The same implementation offers optimal performance in all contexts.
3636

3737
Note that, even with the GIL enabled, multi-threading can perform well for I/O-bound processes. `ConditionalThreadPoolExecutor` is appropriate only for CPU-bound processes that perform worse with the GIL.
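
The adaptive behavior described above can be illustrated with a minimal sketch. This is not the `conditional-futures` implementation: `gil_enabled`, `SerialExecutor`, and `make_executor` are hypothetical names introduced only for illustration, and the sketch assumes the GIL state is checked with `sys._is_gil_enabled()`, which is available in CPython 3.13 and later.

```python
import sys
from concurrent.futures import Executor, Future, ThreadPoolExecutor


def gil_enabled() -> bool:
    """Return True if the GIL is active in the running interpreter."""
    # sys._is_gil_enabled() exists on CPython 3.13+; where it is missing,
    # this sketch assumes an interpreter that always runs with the GIL.
    probe = getattr(sys, "_is_gil_enabled", None)
    return True if probe is None else probe()


class SerialExecutor(Executor):
    """Hypothetical fallback: run each task immediately in the calling thread."""

    def submit(self, fn, /, *args, **kwargs):
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            # Store the error so it surfaces from result(), as with a real pool.
            future.set_exception(exc)
        return future


def make_executor(max_workers=None) -> Executor:
    """Hypothetical factory: use a thread pool only when the GIL is disabled."""
    if gil_enabled():
        return SerialExecutor()
    return ThreadPoolExecutor(max_workers=max_workers)


# The calling code is identical in both contexts.
with make_executor(max_workers=4) as ex:
    print(list(ex.map(lambda x: x * x, range(5))))  # prints [0, 1, 4, 9, 16]
```

Because the base `Executor.map` is built on `submit`, the serial fallback needs to define only `submit` to support the same `map` call used throughout this README.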
3838

3939

40-
### Example
40+
## Example
4141

42-
Function application on the rows of a 2D NumPy array can prove the benefits of both free-threaded Python and the need for `ConditionalThreadPoolExecutor`.
42+
The performance of function application on the rows of a 2D NumPy array can be used to show both the benefits of free-threaded Python and the need for `ConditionalThreadPoolExecutor`.
4343

4444
First, using the free-threaded build of Python 3.14, we can create an array and apply a function to each row of that array. The IPython `%time` magic is used to measure duration.
4545

4646
```python
4747
$ python3.14t
48-
>>> import numpy as np
4948
>>> array = np.arange(100_000_000).reshape(100_000, 1_000)
5049
>>> func = lambda row: (row[row % 2 == 0]**2).sum()
5150
>>> %time _ = np.fromiter((func(row) for row in array), dtype=float, count=array.shape[0])
@@ -56,19 +55,17 @@ Wall time: 581 ms
5655
Using `ConditionalThreadPoolExecutor` with this GIL-disabled build of Python, we can take advantage of multi-threaded performance on a CPU-bound process. The same routine runs in roughly 60% of the time (352 ms versus 581 ms):
5756

5857
```python
59-
>>> from conditional_futures import ConditionalThreadPoolExecutor
6058
>>> with ConditionalThreadPoolExecutor(max_workers=4) as ex:
6159
... %time _ = np.fromiter(ex.map(func, array), dtype=float, count=array.shape[0])
6260
...
6361
CPU times: user 1.31 s, sys: 98 ms, total: 1.41 s
6462
Wall time: 352 ms
6563
```
6664

67-
Now, if using the standard Python 3.14 interpreter (with the GIL enabled), we can see detrimental performance using when using the standard `ThreadPoolExecutor`: the same operation takes six times as long!
65+
Now, with the standard Python 3.14 interpreter (GIL enabled), `ThreadPoolExecutor` degrades performance: the same operation takes 2.33 s, about four times the single-threaded duration and more than six times the 352 ms measured with `python3.14t`!
6866

6967
```python
7068
$ python3.14
71-
>>> from concurrent.futures import ThreadPoolExecutor
7269
>>> array = np.arange(100_000_000).reshape(100_000, 1_000)
7370
>>> func = lambda row: (row[row % 2 == 0] ** 2).sum()
7471
>>> with ThreadPoolExecutor(max_workers=4) as ex:
@@ -78,11 +75,10 @@ CPU times: user 1.9 s, sys: 2.21 s, total: 4.12 s
7875
Wall time: 2.33 s
7976
```
8077

81-
Using `ConditionalThreadPoolExecutor` we can have one implementation that performs optimally in both contexts. Running the same code with the GIL enabled, `ConditionalThreadPoolExecutor` does not perform as well as in `python3.14t` but provides the best option available, single-threaded performance.
78+
Using `ConditionalThreadPoolExecutor`, one implementation performs optimally in both contexts. Running the same code with `python3.14`, `ConditionalThreadPoolExecutor` does not perform as well as with `python3.14t`, but provides the best option available: single-threaded performance.
8279

8380

8481
```python
85-
>>> from conditional_futures import ConditionalThreadPoolExecutor
8682
>>> with ConditionalThreadPoolExecutor(max_workers=4) as ex:
8783
... %time _ = np.fromiter(ex.map(func, array), dtype=float, count=array.shape[0])
8884
...
@@ -91,9 +87,20 @@ Wall time: 533 ms
9187
```
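
The timings above rely on the IPython `%time` magic. Outside IPython, a comparable measurement might be taken with `time.perf_counter`, as in the sketch below. The `timed_map` helper and the `sum_even_squares` workload are hypothetical names, and the workload is a pure-Python stand-in for the NumPy lambda, so its absolute timings will not match the figures reported here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_map(executor, func, rows):
    """Apply func to each row via executor.map; return (results, seconds)."""
    start = time.perf_counter()
    results = list(executor.map(func, rows))
    return results, time.perf_counter() - start


def sum_even_squares(row):
    # Pure-Python stand-in for: (row[row % 2 == 0] ** 2).sum()
    return sum(v * v for v in row if v % 2 == 0)


# 200 rows of 1,000 consecutive integers each.
rows = [range(i, i + 1_000) for i in range(0, 200_000, 1_000)]

with ThreadPoolExecutor(max_workers=4) as ex:
    results, seconds = timed_map(ex, sum_even_squares, rows)

print(f"{len(results)} rows in {seconds:.3f} s")
```

Since `ConditionalThreadPoolExecutor` is described as a drop-in replacement for `ThreadPoolExecutor`, it should be possible to pass either executor to the same helper and compare the two durations under each interpreter.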
9288

9389

94-
### Installation
90+
## Installation
9591

9692
```bash
9793
pip install conditional-futures
9894
```
9995

96+
97+
## What is New in `conditional-futures`
98+
99+
### 1.0.1
100+
101+
Extended documentation.
102+
103+
104+
### 1.0.0
105+
106+
Initial release.
