Skip to content

Commit 887d200

Browse files
committed
dialog: re-read state under lock before bye_on_timeout (GH-3835)
Fix A (timer removal under lock) and Fix B (cached state read at top of dlg_ontimeout) from commit 196b51f are insufficient for a third race path: 1. Timer fires, dlg_ontimeout reads dlg_state=CONFIRMED under lock 2. BYE arrives on another worker, next_state_dlg transitions to DELETED (timer removal inside lock returns >0: already dequeued) 3. Timer handler still sees cached dlg_state==CONFIRMED 4. Timer enters bye_on_timeout: dlg_end_dlg() + unref_dlg(1) 5. BYE worker completes its unref chain, destroys dialog 6. Timer callbacks access freed memory -> bogus ref -1 Fix: re-read dlg->state under the hash lock immediately before each critical decision point (rt_on_timeout route and bye_on_timeout path). Ref: #3835
1 parent 196b51f commit 887d200

1 file changed

Lines changed: 20 additions & 0 deletions

File tree

modules/dialog/dlg_handlers.c

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2487,6 +2487,15 @@ void dlg_ontimeout(struct dlg_tl *tl)
24872487
}
24882488
}
24892489

2490+
/* Re-read state under lock before running on_timeout route.
2491+
* The cached dlg_state from the top of this function may be stale
2492+
* if a BYE arrived between our initial read and now - a concurrent
2493+
* worker may have transitioned the dialog to DELETED.
2494+
* (GH-3835, third race path) */
2495+
dlg_lock(d_table, d_entry);
2496+
dlg_state = dlg->state;
2497+
dlg_unlock(d_table, d_entry);
2498+
24902499
if (do_expire_actions
24912500
&& ref_script_route_check_and_update(dlg->rt_on_timeout)
24922501
&& dlg_state<DLG_STATE_DELETED) {
@@ -2515,6 +2524,17 @@ void dlg_ontimeout(struct dlg_tl *tl)
25152524
* here, as the following code will do this for us later */
25162525
}
25172526

2527+
/* Re-check state under lock immediately before acting on it.
2528+
* The cached dlg_state may be stale if a BYE worker transitioned
2529+
* the dialog to DELETED between our initial read and now. Without
2530+
* this re-read, the timer handler enters the bye_on_timeout path,
2531+
* calls dlg_end_dlg() + unref, while the BYE worker also completes
2532+
* its unref chain - resulting in "bogus ref -1 with cnt 1".
2533+
* (GH-3835, third race path) */
2534+
dlg_lock(d_table, d_entry);
2535+
dlg_state = dlg->state;
2536+
dlg_unlock(d_table, d_entry);
2537+
25182538
if ((dlg->flags&DLG_FLAG_BYEONTIMEOUT) &&
25192539
(dlg_state==DLG_STATE_CONFIRMED_NA || dlg_state==DLG_STATE_CONFIRMED)) {
25202540

0 commit comments

Comments
 (0)