Fix data race on should_stop_ flag in LLM runner (pytorch#18652)
should_stop_ is written from the caller thread via stop() and read from
the inference thread in the generate loop. Concurrent access to a plain
bool without synchronization is a data race, which is undefined behavior
per the C++ standard. The compiler is then free to hoist the read out of
the loop, so the inference thread may never observe the write, for
example on ARM targets.
Change bool to std::atomic<bool> with relaxed memory ordering, which is
sufficient for a simple cancellation flag and has negligible overhead.
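
A minimal sketch of the pattern described above, not the actual runner
code: the Runner class, the generate(int) signature, and the step counter
are hypothetical stand-ins, and only the should_stop_ flag and stop()
names come from this commit.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the LLM runner; only the flag handling
// mirrors what the commit message describes.
class Runner {
 public:
  // Called from the caller thread to request cancellation.
  void stop() { should_stop_.store(true, std::memory_order_relaxed); }

  // Runs on the inference thread. Returns the number of steps completed
  // before the step limit was reached or a stop was requested.
  int generate(int max_steps) {
    assert(max_steps >= 0);
    int steps = 0;
    // The atomic load must be performed on every iteration, so the
    // compiler cannot hoist it out of the loop as it could with a
    // plain bool.
    while (steps < max_steps &&
           !should_stop_.load(std::memory_order_relaxed)) {
      ++steps;  // placeholder for decoding one token
    }
    return steps;
  }

 private:
  std::atomic<bool> should_stop_{false};
};
```

Relaxed ordering guarantees atomicity and eventual visibility of the flag
but does not order other memory accesses around it. That is enough here
only because the flag carries no data besides "stop now"; if stop() also
had to publish other state to the inference thread, release/acquire
ordering would be the safer choice.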