[mypyc] Fix memory leak on property setter call#21095
[mypyc] Fix memory leak on property setter call#21095p-sawicki merged 9 commits intopython:masterfrom
Conversation
| # to avoid leaking. | ||
| src_rtype = op.src.type | ||
| if src_rtype.is_refcounted: | ||
| self.emitter.emit_dec_ref(src, src_rtype) |
There was a problem hiding this comment.
A potentially cleaner approach would be to adjust stolen of SetAttr for properties -- i.e. don't steal the value if it's a property. Thoughts?
There was a problem hiding this comment.
ah good call, this way the decref is visible in the IR which makes for better tests.
i've removed the leak test from the run test as it seemed a little flaky when running it multiple times in a row.
There was a problem hiding this comment.
looks like the failures were actually an implementation mistake - using ClassIR.methods during IR generation may not work in multimodule compilation because the SetAttr constructor might run before the property setter method is added in the other module.
i've changed it to use ClassIR.method_decls instead which is populated in prepare.py so it's safe to use.
`is_native_attr_ref()` uses `has_attr(name) and not get_method(name)` to decide if an attribute access can borrow. `has_attr()` is populated during preparation (always complete), but `get_method()` checks `ir.methods` which is populated during compilation. When cross-module classes haven't been compiled yet, `get_method()` returns None for property getters, incorrectly allowing borrowing. Property getters return new owned references, so skipping the DECREF leaks one reference per call. The fix checks `ir.attributes` directly (struct fields only, always populated during preparation). Properties are never in `ir.attributes`, so they always return False. This is the getter-side counterpart to python#21095.
## Issue
Mypyc can incorrectly borrow references from property getters (owned
references).
The borrow mechanism is safe for struct field access since they return a
pointer into the object's memory; Property getters on the other hand are
method calls that return new owned references i.e the caller must
`DECREF` them. If mypyc mistakes a property for a struct field, it skips
the `DECREF`, leaking one reference per call.
I think this affects any expression context that enables borrowing
(comparisons, `isinstance`, `is None` checks) when the borrowed
expression is a property access on a cross-module class. We discovered
this through OOM issues in SQLGlot, where `isinstance(self.this, Func)`
(`this` is a `@property`) leaked on every call.
## Repro
```Python3
# base.py:
class Bar:
pass
class Foo:
def __init__(self) -> None:
self.obj: object = Bar()
@Property
def val(self) -> object:
return self.obj
# derived.py:
from base import Foo, Bar
def check(foo: Foo) -> bool:
return isinstance(foo.val, Bar)
# test_leak.py:
import sys
from base import Foo
from derived import check
foo = Foo()
init = sys.getrefcount(foo.obj)
for _ in range(100):
check(foo)
print(f"Leaked refs: {sys.getrefcount(foo.obj) - init}")
```
Compile both with mypyc passing in `PYTHONHASHSEED=3` prints `Leaked
refs: 100`.
## Root cause
1. `is_native_attr_ref()` decides if an attribute access can safely
borrow by checking `has_attr(name) and not get_method(name)` i.e "if
there's an attribute but no method, it's a struct field".
2. Although `has_attr()` is always fully populated, `get_method()`
depends on whether the class's module (`base` in this case) has been
compiled yet.
3. Modules within an SCC are compiled in nondeterministic order; If the
module defining the property hasn't been compiled yet -> `get_method()`
returns `None` -> `is_native_attr_ref` incorrectly treats the property
as a borrowed struct field.
I think there's a broader issue here: Even when `derived.py` imports
from `base.py` (no cycle), mypyc can place them in the same SCC and
compile derived before base. Since `ir.methods` is only populated when a
module's function bodies are compiled, any code that reads `ir.methods`
(i.e `get_methods`) during compilation of a different module in the same
SCC could have similar order-dependent bugs.
## The suggested fix
Check `ir.attributes` directly (struct fields only, always populated)
instead of the unreliable `get_method`. I believe this is the
getter-side counterpart to #21095 which fixed the same class of bug for
property setters.
Calling a property setter method results in a memory leak because the setter method internally increfs the passed value and there is no corresponding decref on the caller side. Probably because regular attribute set steals the value so there is no need for a decref there. So add the decref after calling a property setter.
The leak can be observed either by manually looking at memory use in
topwhen running a loop like this one: