Skip to content

Commit 7d7bf7e

Browse files
author
Dani Pinyol
committed
fix: rare crash due to memoty race condition
The vectors PYJLVALUES and PYJLFREEVALUES are modified without any protection against GC-triggered re-entrancy. _pyjl_dealloc(o1) or PyJuliaValue_SetValue └→ push!(PYJLFREEVALUES, idx) # begins resize (sets internal flag) └→ vector must grow → allocates → triggers Julia GC └→ GC collects a Py object → runs py_finalizer └→ enqueue(ptr) → PyGILState_Check()==1 (same thread holds GIL) └→ Py_DecRef(ptr) → refcount hits 0 └→ _pyjl_dealloc(o2) └→ push!(PYJLFREEVALUES, idx2) ← BOOM: flag already set ConcurrencyViolationError! 1. push! needs to grow the vector (exceeds capacity), so it allocates 2. The allocation triggers Julia GC, which runs finalizers 3. A finalizer calls Py_DecRef (because enqueue sees the GIL is held on this same thread), dropping refcount to 0, which triggers _pyjl_dealloc re-entrantly on the same vector This is especially likely during shutdown (at_jl_exit → jl_atexit_hook), because jl_gc_run_all_finalizers runs finalizers on the calling thread (the main thread, which holds the GIL). Many Py objects are finalized at once, repeatedly pushing to PYJLFREEVALUES, and each push that triggers a reallocation can cause the re-entrant chain. Disabled Julia's GC during the critical vector modifications, so that push!/pop! cannot trigger allocation-based GC, which prevents the finalizer chain from re-entering. Added a SpinLock as defense-in-depth against true multi-thread races.
1 parent 8f7baa5 commit 7d7bf7e

File tree

1 file changed

+16
-0
lines changed

1 file changed

+16
-0
lines changed

src/JlWrap/C.jl

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,8 @@ const PyJuliaBase_Type = Ref(C.PyNULL)
2525
const PYJLVALUES = []
2626
# unused indices in PYJLVALUES
2727
const PYJLFREEVALUES = Int[]
28+
# lock protecting PYJLVALUES and PYJLFREEVALUES from concurrent modification
29+
const PYJL_LOCK = Threads.SpinLock()
2830

2931
function _pyjl_new(t::C.PyPtr, ::C.PyPtr, ::C.PyPtr)
3032
alloc = C.PyType_GetSlot(t, C.Py_tp_alloc)
@@ -39,8 +41,16 @@ end
3941
function _pyjl_dealloc(o::C.PyPtr)
4042
idx = C.@ft UnsafePtr{PyJuliaValueObject}(o).value[]
4143
if idx != 0
44+
# Disable GC to prevent push! from triggering a GC that runs finalizers
45+
# re-entrantly (the finalizer chain: push! → alloc → GC → py_finalizer →
46+
# enqueue → Py_DecRef → _pyjl_dealloc → push! on same vector →
47+
# ConcurrencyViolationError). The lock guards against true multi-thread races.
48+
prev = Base.GC.enable(false)
49+
lock(PYJL_LOCK)
4250
PYJLVALUES[idx] = nothing
4351
push!(PYJLFREEVALUES, idx)
52+
unlock(PYJL_LOCK)
53+
Base.GC.enable(prev)
4454
end
4555
(C.@ft UnsafePtr{PyJuliaValueObject}(o).weaklist[!]) == C.PyNULL || C.PyObject_ClearWeakRefs(o)
4656
freeptr = C.PyType_GetSlot(C.Py_Type(o), C.Py_tp_free)
@@ -375,13 +385,19 @@ PyJuliaValue_SetValue(_o, @nospecialize(v)) = Base.GC.@preserve _o begin
375385
o = C.asptr(_o)
376386
idx = C.@ft UnsafePtr{PyJuliaValueObject}(o).value[]
377387
if idx == 0
388+
# Disable GC to prevent push!/pop! from triggering a GC whose finalizers
389+
# re-entrantly resize the same vectors (see _pyjl_dealloc comment).
390+
prev = Base.GC.enable(false)
391+
lock(PYJL_LOCK)
378392
if isempty(PYJLFREEVALUES)
379393
push!(PYJLVALUES, v)
380394
idx = length(PYJLVALUES)
381395
else
382396
idx = pop!(PYJLFREEVALUES)
383397
PYJLVALUES[idx] = v
384398
end
399+
unlock(PYJL_LOCK)
400+
Base.GC.enable(prev)
385401
C.@ft UnsafePtr{PyJuliaValueObject}(o).value[] = idx
386402
else
387403
PYJLVALUES[idx] = v

0 commit comments

Comments
 (0)