Skip to content

Commit a66d482

Browse files
nicopaussclaude
andcommitted
thr-job: fix TSan race between worker exit and main-thread globals.
Workers set `alive=false` inside thr_info_cleanup (from pthread_cleanup_pop) before thr_hooks_wrapper calls thr_detach, which runs per-thread exit hooks such as log_shutdown_thread -> mem_stack_pool_wipe -> mp_ifree (reads mem_pool_libc.free). Main's busy-wait in thr_shutdown can therefore observe alive=false and resume while the worker still accesses globals in thr_detach, with no happens-before. Move the alive store into a dedicated thr_mark_dead exit hook registered at the tail of thr_hooks_g.exit_cbs so it runs after every other per-thread exit hook. The atomic release store now happens-before the main thread's acquire load, covering all tear-down work. ``` ================== WARNING: ThreadSanitizer: data race (pid=738787) Write of size 8 at 0x555556b4c4e0 by main thread: #0 __z_str_block_invoke_2 tests/zchk-str.c:107:28 (zchk+0x9cf192) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #1 z_str tests/zchk-str.c:116:7 (zchk+0x9cb5ca) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #2 z_run src/core/z.blk:1545:9 (zchk+0x5d92c8) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #3 main tests/zchk.c:1206:12 (zchk+0x706599) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) Previous read of size 8 at 0x555556b4c4e0 by thread T4: #0 mp_ifree src/core/mem.blk:351:11 (zchk+0x1189890) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #1 mem_pool_wipe src/core/mem-priv.h:70:5 (zchk+0x1182f03) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #2 mem_stack_pool_wipe src/core/mem-stack.c:611:5 (zchk+0x1182cb6) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #3 log_shutdown_thread src/core/log.c:1157:9 (zchk+0x11734e4) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #4 thr_detach src/core/thr.c:50:9 (zchk+0x11dad05) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #5 thr_hooks_wrapper src/core/thr.c:90:5 (zchk+0x11db21c) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) Location is global 'mem_pool_libc' of size 64 at 0x555556b4c4c0 (zchk+0x15f84e0) Thread T4 (tid=738600, finished) created by main thread at: #0 pthread_create <null> (zchk+0x40a1ff) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #1 thr_create src/core/thr.c:111:11 (zchk+0x11db027) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #2 thr_fork_threads src/core/thr-job.blk:969:21 (zchk+0x11d6634) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #3 thr_initialize src/core/thr-job.blk:1140:5 (zchk+0x11d6eb5) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #4 module_require_internal src/core/module.c:323:9 (zchk+0x118e1fd) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #5 module_require src/core/module.c:335:5 (zchk+0x118cf19) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #6 z_thrjobs tests/zchk-thrjob.blk:598:5 (zchk+0xa7c0e2) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #7 z_run src/core/z.blk:1545:9 (zchk+0x5d92c8) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) #8 main tests/zchk.c:1206:12 (zchk+0x706599) (BuildId: dfdf1f6253fa8ff7dbda901d2ec5a372fc4e3092) SUMMARY: ThreadSanitizer: data race tests/zchk-str.c:107:28 in __z_str_block_invoke_2 ================== ``` Change-Id: Ie58fc71343a5d09374d7e85e8651df4a39d790f0 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Priv-Id: 06e9bf4d91b91b670766cab1818facf14a6c3fb4
1 parent 8699f19 commit a66d482

1 file changed

Lines changed: 22 additions & 0 deletions

File tree

src/core/thr-job.blk

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -792,10 +792,32 @@ static void thr_info_cleanup(void *unused)
792792
self_g->ncache.qnode.next = n->qnode.next;
793793
free(n);
794794
}
795+
}
796+
797+
/* Mark the current worker thread as dead. Registered as an exit hook at the
798+
* tail of thr_hooks_g.exit_cbs so it runs after all other per-thread exit
799+
* hooks (e.g. log_shutdown_thread, t_pool_wipe). Setting `alive = false`
800+
* last ensures that main's busy-wait on `alive` in thr_shutdown establishes
801+
* a happens-before with every access the exiting thread made inside
802+
* thr_detach; otherwise TSan reports races between worker tear-down and
803+
* subsequent main-thread globals accesses.
804+
*/
805+
static void thr_mark_dead(void)
806+
{
807+
if (!self_g) {
808+
return;
809+
}
795810
atomic_thread_fence(memory_order_acq_rel);
796811
atomic_store(&self_g->alive, false);
797812
}
798813

814+
static struct thr_ctor thr_mark_dead_ctor = { .cb = &thr_mark_dead };
815+
816+
static __attribute__((constructor)) void thr_mark_dead_register(void)
817+
{
818+
dlist_add_tail(&thr_hooks_g.exit_cbs, &thr_mark_dead_ctor.link);
819+
}
820+
799821
static int thr_sleep_until(thr_evc_t *ec, int timeout, bool (^cond)(void))
800822
{
801823
uint64_t key = thr_ec_get(ec);

0 commit comments

Comments
 (0)