Commit e33f34a

fix: thread-safety hardening of Configuration (cache and in-place mutation races) (#620)
1 parent 48ce9dc commit e33f34a

11 files changed

Lines changed: 157 additions & 13 deletions

docs/releases/3.1.1.md

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+# StateChart 3.1.1
+
+*May 15, 2026*
+
+## Bug fixes in 3.1.1
+
+### Thread-safety hardening of the configuration cache
+
+Two races in `Configuration` (introduced indirectly by the cache + no-copy
+design in 3.1.0) have been fixed. Both surfaced under concurrent reads of
+`machine.configuration` while another thread is sending events to the same
+state machine instance, a scenario explicitly supported by the sync engine.
+
+1. **Cache read race.** `Configuration.states` checked
+   `self._cached is not None` and then returned `self._cached`. Another
+   thread invalidating between the check and the return could cause the
+   property to return `None`, leading to a `TypeError` in callers that
+   iterate the result (e.g., `list(machine.configuration)`). The getter now
+   snapshots the cache fields locally before the freshness check.
+   [#620](https://github.com/fgmacedo/python-statemachine/pull/620).
+
+2. **In-place mutation race.** `Configuration.add()` and
+   `Configuration.discard()` mutated the `OrderedSet` stored on the model
+   in place and rewrote the same reference. A concurrent reader iterating
+   `.configuration` could observe a partially mutated set (raising
+   `RuntimeError: Set changed size during iteration`) or read back a stale
+   cached resolution missing the new state. Both methods now use
+   copy-on-write, producing a fresh `OrderedSet` per call. This affects
+   only `StateChart` (where `atomic_configuration_update=False` is the
+   default to support parallel regions). The atomic update path used by
+   `StateMachine` was never affected.
+   [#620](https://github.com/fgmacedo/python-statemachine/pull/620).
+
+Both fixes are covered by new stress tests in
+`tests/test_threading.py::TestThreadSafety`:
+`test_concurrent_send_and_read_configuration` and
+`test_concurrent_parallel_region_send_and_read`, plus a deterministic
+copy-on-write contract test `test_add_discard_produce_fresh_orderedset`.
+
+### Performance impact
+
+Copy-on-write in `add()` / `discard()` reintroduces an O(n) shallow copy of
+the active configuration on every state entry and exit. For the typical
+configuration sizes used in practice (1–7 states), this is sub-microsecond.
+
+Measured on macOS / Python 3.14, pytest-benchmark median, vs 3.1.0:
+
+| Benchmark                          | 3.1.0       | 3.1.1       | Δ      |
+|------------------------------------|-------------|-------------|--------|
+| `test_parallel_region_events`      | 175.2 μs    | 184.5 μs    | +5.3%  |
+| `test_many_transitions_reset`      | 125.9 μs    | 139.5 μs    | +10.9% |
+| `test_guarded_transitions`         | 70.0 μs     | 75.7 μs     | +8.2%  |
+| `test_history_pause_resume`        | 88.4 μs     | 91.4 μs     | +3.4%  |
+| `test_many_transitions_full_cycle` | 156.9 μs    | 162.1 μs    | +3.3%  |
+| `test_flat_self_transition`        | 38.7 μs     | 39.1 μs     | +1.0%  |
+
+Overall 4.7x–7.7x event throughput improvement vs 3.0.0 (declared in
+[3.1.0 release notes](3.1.0.md)) is unchanged.
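
As a cross-check on the Δ column of the benchmark table above, the percentages can be recomputed from the published medians. This is only a sketch: `MEDIANS_US`, `percent_change`, and `deltas` are names made up for this note, and the inputs are the rounded medians from the table. Recomputation gives +10.8% and +8.1% where the table lists +10.9% and +8.2%, presumably because the table was derived from unrounded medians (an assumption, not something the commit states).

```python
from typing import Dict, Tuple

# Medians in microseconds copied from the release-notes table: (3.1.0, 3.1.1).
MEDIANS_US: Dict[str, Tuple[float, float]] = {
    "test_parallel_region_events": (175.2, 184.5),
    "test_many_transitions_reset": (125.9, 139.5),
    "test_guarded_transitions": (70.0, 75.7),
    "test_history_pause_resume": (88.4, 91.4),
    "test_many_transitions_full_cycle": (156.9, 162.1),
    "test_flat_self_transition": (38.7, 39.1),
}


def percent_change(before: float, after: float) -> float:
    """Relative change of `after` versus `before`, in percent."""
    return (after - before) / before * 100.0


def deltas(medians: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Percent change per benchmark, rounded to one decimal place."""
    return {
        name: round(percent_change(before, after), 1)
        for name, (before, after) in medians.items()
    }


if __name__ == "__main__":
    for name, delta in deltas(MEDIANS_US).items():
        print(f"{name:<34} {delta:+.1f}%")
```

Every recomputed value is within 0.1 percentage point of the listed figure, so the table is consistent with the stated medians.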

docs/releases/index.md

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ Requires Python 3.9+.
 ```{toctree}
 :maxdepth: 2
 
+3.1.1
 3.1.0
 3.0.0
 

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "python-statemachine"
-version = "3.1.0"
+version = "3.1.1"
 description = "Python Finite State Machines made easy."
 authors = [{ name = "Fernando Macedo", email = "fgmacedo@gmail.com" }]
 maintainers = [{ name = "Fernando Macedo", email = "fgmacedo@gmail.com" }]

statemachine/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 
 __author__ = """Fernando Macedo"""
 __email__ = "fgmacedo@gmail.com"
-__version__ = "3.1.0"
+__version__ = "3.1.1"
 
 __all__ = [
     "StateChart",

statemachine/configuration.py

Lines changed: 14 additions & 6 deletions
@@ -73,8 +73,13 @@ def values(self) -> OrderedSet[Any]:
     def states(self) -> "OrderedSet[State]":
         """The set of currently active :class:`State` instances (cached)."""
         raw = self.value
-        if self._cached is not None and self._cached_value is raw:
-            return self._cached
+        # Snapshot the cache fields locally — another thread may call
+        # ``_invalidate()`` between the freshness check and the return,
+        # so reading ``self._cached`` twice would risk returning ``None``.
+        cached = self._cached
+        cached_value = self._cached_value
+        if cached is not None and cached_value is raw:
+            return cached
         if raw is None:
             return OrderedSet()
 
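
The race fixed in this hunk is a check-then-act bug: the old getter read `self._cached` twice, and an invalidation between the two reads could change the answer. The sketch below reproduces that shape deterministically in a single thread. `CachedView` and its `interleave` hook are hypothetical and are not the library's `Configuration`; the hook simply runs at the point where another thread could be scheduled.

```python
from typing import Callable, Optional, Tuple


class CachedView:
    """Hypothetical stand-in for a cached getter guarded by an identity check."""

    def __init__(self, raw: object, resolved: Tuple[str, ...]) -> None:
        self._raw = raw
        self._cached: Optional[Tuple[str, ...]] = resolved
        self._cached_value: object = raw
        # Hook that runs where another thread could be scheduled (assumption).
        self.interleave: Callable[[], None] = lambda: None

    def invalidate(self) -> None:
        self._cached = None
        self._cached_value = None

    def racy_get(self) -> Optional[Tuple[str, ...]]:
        # Check-then-act: self._cached is read twice.
        if self._cached is not None and self._cached_value is self._raw:
            self.interleave()    # a concurrent invalidate() may land here
            return self._cached  # second read, possibly None by now
        return ()

    def safe_get(self) -> Tuple[str, ...]:
        # Read each field once; a later invalidation cannot change the locals.
        cached = self._cached
        cached_value = self._cached_value
        if cached is not None and cached_value is self._raw:
            self.interleave()
            return cached
        return ()


def demo() -> tuple:
    """Return (racy result, safe result) with invalidation forced mid-call."""
    raw = object()
    racy = CachedView(raw, ("l1", "r1"))
    racy.interleave = racy.invalidate
    safe = CachedView(raw, ("l1", "r1"))
    safe.interleave = safe.invalidate
    return racy.racy_get(), safe.safe_get()


if __name__ == "__main__":
    print(demo())  # (None, ('l1', 'r1'))
```

With the invalidation forced between the check and the return, the two-read version hands back `None`, while the snapshot version still returns the value it validated.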

@@ -92,14 +97,17 @@ def states(self, new_configuration: "OrderedSet[State]"):
     # -- Incremental mutation (used by the engine) -----------------------------
 
     def add(self, state: "State"):
-        """Add *state* to the configuration."""
-        values = self._read_from_model()
+        """Add *state* to the configuration (copy-on-write for thread safety)."""
+        # Copy so we never mutate the OrderedSet still held by concurrent
+        # readers or by the cache identity check. ``_read_from_model`` may
+        # return the same ref stored on the model.
+        values = OrderedSet(self._read_from_model())
         values.add(state.value)
         self._write_to_model(values)
 
     def discard(self, state: "State"):
-        """Remove *state* from the configuration."""
-        values = self._read_from_model()
+        """Remove *state* from the configuration (copy-on-write for thread safety)."""
+        values = OrderedSet(self._read_from_model())
         values.discard(state.value)
         self._write_to_model(values)
 
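
To see why copying before mutating matters, the following sketch contrasts the two write strategies using a built-in `set` as a stand-in for `OrderedSet` (an assumption made so the example needs only the standard library). `Model`, `add_in_place`, `add_copy_on_write`, and `read_during_write` are hypothetical names. The "concurrent" write is simulated in a single thread by running it in the middle of the reader's loop.

```python
from typing import Callable, List, Set, Union


class Model:
    """Hypothetical model holding the active configuration values."""

    def __init__(self) -> None:
        self.config: Set[str] = {"l1", "r1"}


def add_in_place(model: Model, value: str) -> None:
    values = model.config        # same object a reader may be iterating
    values.add(value)
    model.config = values        # rewrites the same reference


def add_copy_on_write(model: Model, value: str) -> None:
    values = set(model.config)   # fresh shallow copy, O(n)
    values.add(value)
    model.config = values        # publish a new object; the old one is untouched


def read_during_write(
    writer: Callable[[Model, str], None],
) -> Union[List[str], str]:
    """Iterate the configuration while `writer` runs after the first item."""
    model = Model()
    seen: List[str] = []
    try:
        for index, item in enumerate(model.config):
            seen.append(item)
            if index == 0:
                writer(model, "l2")  # stands in for another thread's send()
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"
    return sorted(seen)


if __name__ == "__main__":
    print(read_during_write(add_in_place))
    print(read_during_write(add_copy_on_write))
```

The in-place writer makes the reader fail with `RuntimeError: Set changed size during iteration`, while the copy-on-write writer lets the reader finish over the object it started with. The price is the O(n) copy per write that the release notes describe.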

statemachine/locale/en/LC_MESSAGES/statemachine.po

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 #
 msgid ""
 msgstr ""
-"Project-Id-Version: 3.1.0\n"
+"Project-Id-Version: 3.1.1\n"
 "Report-Msgid-Bugs-To: fgmacedo@gmail.com\n"
 "POT-Creation-Date: 2026-05-15 12:08-0300\n"
 "PO-Revision-Date: 2026-02-24 14:31-0300\n"

statemachine/locale/hi_IN/LC_MESSAGES/statemachine.po

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 #
 msgid ""
 msgstr ""
-"Project-Id-Version: 3.1.0\n"
+"Project-Id-Version: 3.1.1\n"
 "Report-Msgid-Bugs-To: fgmacedo@gmail.com\n"
 "POT-Creation-Date: 2026-05-15 12:08-0300\n"
 "PO-Revision-Date: 2024-06-07 17:41-0300\n"

statemachine/locale/pt_BR/LC_MESSAGES/statemachine.po

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 #
 msgid ""
 msgstr ""
-"Project-Id-Version: 3.1.0\n"
+"Project-Id-Version: 3.1.1\n"
 "Report-Msgid-Bugs-To: fgmacedo@gmail.com\n"
 "POT-Creation-Date: 2026-05-15 12:08-0300\n"
 "PO-Revision-Date: 2024-06-07 17:41-0300\n"

statemachine/locale/zh_CN/LC_MESSAGES/statemachine.po

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 #
 msgid ""
 msgstr ""
-"Project-Id-Version: 3.1.0\n"
+"Project-Id-Version: 3.1.1\n"
 "Report-Msgid-Bugs-To: fgmacedo@gmail.com\n"
 "POT-Creation-Date: 2026-05-15 12:08-0300\n"
 "PO-Revision-Date: 2024-06-07 17:41-0300\n"

tests/test_threading.py

Lines changed: 77 additions & 0 deletions
@@ -134,6 +134,22 @@ class CyclingMachine(StateChart):
 
         return CyclingMachine()
 
+    @pytest.fixture()
+    def parallel_machine(self):
+        class TwoRegions(StateChart):
+            class both(State.Parallel):
+                class left(State.Compound):
+                    l1 = State(initial=True)
+                    l2 = State()
+                    tick_l = l1.to(l2) | l2.to(l1)
+
+                class right(State.Compound):
+                    r1 = State(initial=True)
+                    r2 = State()
+                    tick_r = r1.to(r2) | r2.to(r1)
+
+        return TwoRegions()
+
     @pytest.mark.parametrize("num_threads", [4, 8])
     def test_concurrent_sends_no_lost_events(self, cycling_machine, num_threads):
         """All events sent concurrently must be processed — none lost."""
@@ -294,6 +310,67 @@ def reader():
 
         assert not errors, f"Thread errors: {errors}"
 
+    def test_concurrent_parallel_region_send_and_read(self, parallel_machine):
+        """Reading configuration while parallel-region events mutate it must not raise.
+
+        Regresses an in-place mutation of the model's ``OrderedSet`` during
+        ``Configuration.add()`` / ``discard()``: a concurrent reader iterating
+        ``.configuration`` could crash with
+        ``RuntimeError: Set changed size during iteration`` or briefly observe
+        a stale cached set missing the newly entered state.
+        """
+        num_senders = 4
+        events_per_sender = 400
+        barrier = threading.Barrier(num_senders + 1)
+        stop_event = threading.Event()
+        errors = []
+
+        def sender(event_name):
+            try:
+                barrier.wait(timeout=5)
+                for _ in range(events_per_sender):
+                    parallel_machine.send(event_name)
+            except Exception as e:
+                errors.append(e)
+
+        def reader():
+            barrier.wait(timeout=5)
+            while not stop_event.is_set():
+                try:
+                    # Force resolution + iteration each loop.
+                    _ = list(parallel_machine.configuration)
+                    _ = [s.id for s in parallel_machine.configuration]
+                except Exception as e:
+                    errors.append(e)
+
+        senders = []
+        # Alternate tick_l / tick_r across threads so both regions mutate concurrently.
+        for i in range(num_senders):
+            event = "tick_l" if i % 2 == 0 else "tick_r"
+            senders.append(threading.Thread(target=sender, args=(event,)))
+        reader_thread = threading.Thread(target=reader)
+
+        for t in senders:
+            t.start()
+        reader_thread.start()
+
+        for t in senders:
+            t.join(timeout=30)
+        stop_event.set()
+        reader_thread.join(timeout=5)
+
+        assert not errors, f"Thread errors: {errors}"
+
+    def test_add_discard_produce_fresh_orderedset(self, parallel_machine):
+        """``add`` / ``discard`` must produce a fresh ``OrderedSet`` ref each call.
+
+        Pins the copy-on-write contract independently of timing: otherwise a
+        reader holding the prior ref could observe mid-mutation.
+        """
+        snapshot = parallel_machine._config.values
+        parallel_machine.send("tick_l")
+        assert parallel_machine._config.values is not snapshot
+
 
 async def test_regression_443_with_modifications_for_async_engine():
     """

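
The stress test added in this file follows a common harness shape: a barrier releases all threads at once, a stop event ends the reader, and every thread appends its exceptions to a shared list. The sketch below shows that shape with the standard library only. `CopyOnWriteConfig` and `run_stress` are hypothetical stand-ins rather than the library's classes, and the write lock is an assumption made here so that writers are serialized while readers stay lock-free.

```python
import threading
from typing import List, Set, Tuple


class CopyOnWriteConfig:
    """Hypothetical shared object: each write publishes a fresh set."""

    def __init__(self) -> None:
        self.values: Set[str] = {"l1", "r1"}
        self._write_lock = threading.Lock()  # assumption: writers are serialized

    def toggle(self, first: str, second: str) -> None:
        with self._write_lock:
            current = set(self.values)       # never mutate the published set
            if first in current:
                current.discard(first)
                current.add(second)
            else:
                current.discard(second)
                current.add(first)
            self.values = current            # publish a fresh object


def run_stress(
    num_senders: int = 4, events_per_sender: int = 400
) -> Tuple[List[Exception], Set[str]]:
    config = CopyOnWriteConfig()
    barrier = threading.Barrier(num_senders + 1)  # senders plus one reader
    stop_event = threading.Event()
    errors: List[Exception] = []

    def sender(first: str, second: str) -> None:
        try:
            barrier.wait(timeout=5)
            for _ in range(events_per_sender):
                config.toggle(first, second)
        except Exception as exc:
            errors.append(exc)

    def reader() -> None:
        try:
            barrier.wait(timeout=5)
            while not stop_event.is_set():
                _ = list(config.values)  # iterate whichever set is published
        except Exception as exc:
            errors.append(exc)

    pairs = [("l1", "l2"), ("r1", "r2")]  # alternate regions across senders
    senders = [
        threading.Thread(target=sender, args=pairs[i % 2])
        for i in range(num_senders)
    ]
    reader_thread = threading.Thread(target=reader)
    for thread in senders + [reader_thread]:
        thread.start()
    for thread in senders:
        thread.join(timeout=30)
    stop_event.set()
    reader_thread.join(timeout=5)
    return errors, config.values
```

Because each region receives an even number of toggles with the default arguments, the final configuration should match the initial one, which gives the harness a second check besides the empty error list.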