Commit 79a769a
[mypyc] Fix reference leak when setting unboxed refcounted attrs (#21657)
I ran into this memory leak, which can be reproduced with:
```python
from dataclasses import dataclass

@dataclass
class MyClass:
    v: int

c = MyClass(1 << 70)
```
This PR adds a fix and a test that fails without the fix. I am no expert
on mypyc internals, so I asked Claude Code to fix this bug. It is a
one-line change so I believe it will be easy to review.
From quick inspection it seems that the rest of the code has fewer
comments than what Claude wrote, so if you prefer, I'll remove the
verbose comment that this PR adds. Similarly for the test code, it can
be shortened if wanted.
<details>
<summary>Long explanation from Claude Code</summary>
### Description
A native attribute setter generated by mypyc over-increfs the stored value
when the attribute has a **refcounted unboxed type**: most importantly `int`
(`CPyTagged`), and also tuples with refcounted items.
For an unboxed type, `generate_setter` emitted:
```c
tmp = CPyTagged_FromObject(value); // already creates a NEW (owned) reference
CPyTagged_INCREF(tmp); // ...and then takes a second one
self->_v = tmp;
```
`CPyTagged_FromObject` already increfs in the heap-boxed case, so the setter
takes two references while the deallocator releases only one, leaking one
reference on every set through the setter. The other two branches of
`generate_setter` (`object` and the `emit_cast` path) are correct because they
produce a borrowed value and rely on the single `emit_inc_ref` to take
ownership; only the unboxed branch was inconsistent.
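The accounting above can be modelled with a toy reference counter. This is only a sketch in plain Python, not mypyc's generated C: `Boxed`, `from_object`, `setter`, `dealloc`, and `leaked_refs` are hypothetical names standing in for a heap-boxed int, the unboxing call, the generated setter, and the deallocator.

```python
class Boxed:
    """Toy stand-in for a heap-boxed int with an explicit reference count."""

    def __init__(self) -> None:
        self.refs = 1  # the caller's own reference


def from_object(value: Boxed, borrow: bool) -> Boxed:
    """Model unboxing: hands back a new owned reference unless borrowing."""
    if not borrow:
        value.refs += 1
    return value


def setter(value: Boxed, borrow: bool) -> None:
    """Model one setter branch: unbox, then take a single reference."""
    tmp = from_object(value, borrow)
    tmp.refs += 1  # models the single emit_inc_ref before the store


def dealloc(value: Boxed) -> None:
    """Model the deallocator, which releases exactly one reference."""
    value.refs -= 1


def leaked_refs(borrow: bool) -> int:
    """References left after one set/dealloc cycle, beyond the caller's own."""
    v = Boxed()
    setter(v, borrow)
    dealloc(v)
    return v.refs - 1


print(leaked_refs(borrow=False))  # 1: owned unbox plus incref leaks one
print(leaked_refs(borrow=True))   # 0: borrowed unbox plus incref balances
```

In this model the only difference between the leaking and the balanced case is whether the unboxing step hands back an owned or a borrowed value.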
### Why this shows up with dataclasses
This is reached whenever an attribute is set from interpreted code. The clearest
real-world case is the `__init__` that the stdlib `dataclasses` module
synthesizes for a mypyc-compiled `@dataclass`: its `self.v = v` runs as
interpreted code and goes through the generated descriptor setter. So every
constructed instance of a compiled dataclass with a heap-boxed `int` field
(value ≥ 2**62) leaked one `PyLong`, causing silent, unbounded growth in
long-lived programs that build many such objects (e.g. 64-bit ids).
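To see that the synthesized `__init__` stores fields through ordinary attribute assignment, here is a small plain-Python sketch (not compiled with mypyc). The `calls` list and the `__setattr__` override are illustrative additions used only to observe the store; in a compiled class the same assignment would reach the generated setter instead.

```python
from dataclasses import dataclass

calls = []  # names of attributes stored on Point instances


@dataclass
class Point:
    v: int

    def __setattr__(self, name, value):
        # Record every attribute store, then perform it normally.
        calls.append(name)
        object.__setattr__(self, name, value)


Point(1 << 70)
print(calls)  # ['v']: the generated __init__ ran `self.v = v`
```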
It does not depend on `slots`, `frozen`, `eq`, field count, or field
position; a hand-written native `__init__` is unaffected because it stores via
`SetAttr` rather than the descriptor. Small (inline-tagged) ints, floats, and
object-typed fields are also unaffected since they aren't refcounted through
this path.
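As a rough aid for deciding which values are affected, the sketch below classifies an int as inline-tagged or heap-boxed. It assumes a 64-bit build and takes the 2**62 threshold from the text above; the symmetric lower bound and the names `SHORT_INT_MAX`, `SHORT_INT_MIN`, and `is_heap_boxed` are assumptions for illustration, not mypyc API.

```python
# Assumed 64-bit layout: values of 2**62 and above are heap-boxed (per the
# description above); the negative bound is assumed to mirror it.
SHORT_INT_MAX = (1 << 62) - 1
SHORT_INT_MIN = -(1 << 62)


def is_heap_boxed(n: int) -> bool:
    """Return True if n falls outside the assumed inline-tagged range."""
    return not (SHORT_INT_MIN <= n <= SHORT_INT_MAX)


print(is_heap_boxed(1))        # False: small ints are stored inline
print(is_heap_boxed(1 << 70))  # True: the value used in the reproducer
```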
### Reproducer
```python
from dataclasses import dataclass
import sys

@dataclass
class C:
    v: int

big = 1 << 70  # heap-boxed int (>= 2**62), so it is refcounted
base = sys.getrefcount(big)
xs = [C(big) for _ in range(1000)]
del xs
print(sys.getrefcount(big) - base)  # before: 1000  after: 0
```
### Fix
Unbox with `borrow=True` in the setter's unboxed branch, so all three branches
of `generate_setter` produce a borrowed value and the single `emit_inc_ref`
takes exactly one owned reference. `borrow=True` is a no-op for non-refcounted
unboxed types (`float`, fixed-width ints, `bool`) and is propagated correctly
through `RTuple` unboxing, which fixes the analogous leak for tuple-typed
attributes too.
### Tests
Added `testNativeAttrSetterRefcountLeak` to `mypyc/test-data/run-classes.test`,
covering a boxed-int dataclass field, an unboxed `Tuple[int, int]` field, and
setter re-assignment. It fails before the fix (expected 100 live refs,
got 200) and passes after.
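The live-reference check described above can be sketched roughly as follows. This is not the test added by the PR: the helper `live_refs` and its parameters are hypothetical, and the class here is plain interpreted Python, so it only shows the shape of the measurement.

```python
import sys
from dataclasses import dataclass


@dataclass
class C:
    v: int


def live_refs(shift: int, count: int) -> tuple[int, int]:
    """Return (extra refs while instances are alive, extra refs after del)."""
    big = 1 << shift  # computed at run time, so it is an ordinary heap int
    base = sys.getrefcount(big)
    xs = [C(big) for _ in range(count)]
    alive = sys.getrefcount(big) - base  # one reference per live instance
    del xs
    after = sys.getrefcount(big) - base  # should drop back to the baseline
    return alive, after


print(live_refs(70, 100))
```

With an uncompiled class on CPython this should report `(100, 0)`; per the description above, the affected compiled setter produced 200 live references instead of 100.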
</details>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 629f456 commit 79a769a
2 files changed
Lines changed: 67 additions & 1 deletion
| File | Original lines | New lines | Change |
|---|---|---|---|
| First file | 1214 | 1214-1222 | 1 line removed, 9 lines added |
| Second file | (none) | 6009-6066 | 58 lines added |