
Commit 048c11e

superdav42 and claude committed
Fix GH-8739: opcache_reset()/invalidate() races via epoch-based reclamation
opcache_reset() and opcache_invalidate() destroy shared memory (hash table
memset, interned strings restore, SHM watermark reset) while concurrent
reader threads/processes still hold pointers into that memory. This causes
"zend_mm_heap corrupted" crashes under concurrent load in both ZTS
(FrankenPHP, parallel) and FPM configurations.

A prior fix attempt (PR #14803) wrapped cache lookups in
zend_shared_alloc_lock(), which was rejected by Dmitry Stogov because
readers must remain lock-free.

This patch introduces epoch-based reclamation (EBR), a proven lock-free
pattern also used by RCU in the Linux kernel and by Crossbeam in Rust:

- Readers publish their epoch on request start (one atomic store) and
  clear it on request end (one atomic store). No locks acquired.
- Writers (opcache_reset path) increment the global epoch and defer the
  actual hash/interned-string cleanup until all pre-epoch readers have
  completed. The deferred reset completes at the next request boundary
  after the drain is complete.
- The existing immediate-cleanup fast path is retained for the case when
  accel_is_inactive() reports no active readers.

Additional safety:

- When the per-slot pool (256 slots) is exhausted, readers fall back to
  an aggregate overflow counter that unconditionally blocks deferred
  reclamation while non-zero. This preserves safety for installations
  with more concurrent readers than slots.
- UNASSIGNED vs OVERFLOW sentinels distinguish "slot not yet attempted"
  from "allocation failed", avoiding per-request atomic contention on
  the slot-next counter once slots are exhausted.
- A full memory barrier after zend_accel_hash_clean()'s memset ensures
  readers on weak-memory architectures see the zeroed table before any
  subsequent insert.
- A defensive ZCG(locked) check in accel_try_complete_deferred_reset()
  prevents the ZEND_ASSERT(!ZCG(locked)) failure that would otherwise
  fire if the completer is re-entered while the SHM lock is held.
Fixes GH-8739
Fixes GH-14471
Fixes GH-18517

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent fbd3017 commit 048c11e

File tree

7 files changed: +542 -0 lines changed


NEWS

Lines changed: 3 additions & 0 deletions
@@ -34,6 +34,9 @@ PHP NEWS
   . Fixed bug GH-21593 (Borked function JIT JMPNZ smart branch). (ilutov)
   . Fixed bug GH-21460 (COND optimization regression). (Dmitry, Arnaud)
   . Fixed faulty returns out of zend_try block in zend_jit_trace(). (ilutov)
+  . Fixed bug GH-8739 (opcache_reset()/opcache_invalidate() race causes
+    zend_mm_heap corrupted under concurrent load in ZTS and FPM).
+    (superdav42)
 
 - OpenSSL:
   . Fix a bunch of memory leaks and crashes on edge cases. (ndossche)

ext/opcache/ZendAccelerator.c

Lines changed: 298 additions & 0 deletions
@@ -136,6 +136,8 @@ static zend_result (*orig_post_startup_cb)(void);
 
 static zend_result accel_post_startup(void);
 static zend_result accel_finish_startup(void);
+static void zend_reset_cache_vars(void);
+static void accel_interned_strings_restore_state(void);
 
 static void preload_shutdown(void);
 static void preload_activate(void);
@@ -402,6 +404,228 @@ static inline void accel_unlock_all(void)
 #endif
 }
 
+/* ============================================================
+ * Epoch-based reclamation (EBR) for safe opcache_reset()/invalidate()
+ *
+ * Background:
+ * ZCSG(hash) and the interned strings buffer are read lock-free by
+ * reader threads, but mutated destructively by opcache_reset()/
+ * opcache_invalidate() (memset of hash, restore of interned strings).
+ * This causes zend_mm_heap corruption under concurrent load: a reader
+ * that obtained a pointer just before the writer ran may follow it
+ * after the writer freed/zeroed the underlying memory (GH-8739,
+ * GH-14471, GH-18517).
+ *
+ * Approach:
+ * Readers publish their "epoch" on request start (one atomic store)
+ * and clear it on request end (one atomic store). Writers do not
+ * immediately reclaim memory; instead they bump the global epoch and
+ * defer the destructive cleanup until every reader that was active
+ * when the writer ran has completed its request. This is the same
+ * pattern used by RCU in the Linux kernel and by Crossbeam in Rust.
+ *
+ * Constraints satisfied:
+ * - Lock-free reads preserved: the per-request cost is two cache-line
+ *   padded atomic stores. Readers never block on writers.
+ * - Bounded delay: deferred cleanup completes as soon as all readers
+ *   active at write time finish their requests.
+ * - Safe under slot exhaustion: readers that fail to obtain a per-
+ *   reader slot increment ZCSG(epoch_overflow_active) instead, which
+ *   unconditionally blocks deferred reclamation while non-zero.
+ * ============================================================ */
+
+void accel_epoch_init(void)
+{
+	int i;
+
+	/* Start at epoch 1 so that drain_epoch=0 is a sentinel meaning
+	 * "no reset has ever been deferred". */
+	ZCSG(current_epoch) = 1;
+	ZCSG(drain_epoch) = 0;
+	ZCSG(reset_deferred) = false;
+	ZCSG(epoch_slot_next) = 0;
+	ZCSG(epoch_overflow_active) = 0;
+
+	for (i = 0; i < ACCEL_EPOCH_MAX_SLOTS; i++) {
+		ZCSG(epoch_slots)[i].epoch = ACCEL_EPOCH_INACTIVE;
+	}
+}
+
+static int32_t accel_epoch_alloc_slot(void)
+{
+	/* Atomically claim the next slot. Once a slot is claimed, it stays
+	 * with this thread/process for the lifetime of the SHM segment.
+	 * If the per-slot pool is exhausted, return SLOT_OVERFLOW so the
+	 * caller can fall back to the aggregate overflow counter. */
+	int32_t slot;
+
+#if defined(ZEND_WIN32)
+	slot = (int32_t)InterlockedIncrement((volatile LONG *)&ZCSG(epoch_slot_next)) - 1;
+#elif defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7))
+	slot = __atomic_fetch_add(&ZCSG(epoch_slot_next), 1, __ATOMIC_SEQ_CST);
+#elif defined(__GNUC__)
+	slot = __sync_fetch_and_add(&ZCSG(epoch_slot_next), 1);
+#else
+	slot = ZCSG(epoch_slot_next)++;
+#endif
+
+	if (slot < 0 || slot >= ACCEL_EPOCH_MAX_SLOTS) {
+		return ACCEL_EPOCH_SLOT_OVERFLOW;
+	}
+	return slot;
+}
+
+void accel_epoch_enter(void)
+{
+	int32_t slot = ZCG(epoch_slot);
+
+	if (UNEXPECTED(slot == ACCEL_EPOCH_SLOT_UNASSIGNED)) {
+		slot = accel_epoch_alloc_slot();
+		ZCG(epoch_slot) = slot;
+	}
+
+	if (EXPECTED(slot >= 0)) {
+		uint64_t epoch = ACCEL_ATOMIC_LOAD_64(&ZCSG(current_epoch));
+		ZCG(local_epoch) = epoch;
+		/* Release-store: ensures any subsequent reads of shared OPcache
+		 * data are seen as part of this published epoch. */
+		ACCEL_ATOMIC_STORE_64(&ZCSG(epoch_slots)[slot].epoch, epoch);
+	} else {
+		/* Slot allocation failed (overflow). Track this reader via the
+		 * aggregate counter so deferred reclamation knows to wait. */
+		ACCEL_ATOMIC_INC_32(&ZCSG(epoch_overflow_active));
+		ZCG(local_epoch) = 0;
+	}
+}
+
+void accel_epoch_leave(void)
+{
+	int32_t slot = ZCG(epoch_slot);
+
+	if (EXPECTED(slot >= 0)) {
+		ACCEL_ATOMIC_STORE_64(&ZCSG(epoch_slots)[slot].epoch, ACCEL_EPOCH_INACTIVE);
+	} else if (slot == ACCEL_EPOCH_SLOT_OVERFLOW) {
+		ACCEL_ATOMIC_DEC_32(&ZCSG(epoch_overflow_active));
+	}
+	/* SLOT_UNASSIGNED: enter() was never called for this thread/process
+	 * (e.g. file_cache_only path); nothing to release. */
+}
+
+static uint64_t accel_min_active_epoch(void)
+{
+	uint64_t min_epoch = ACCEL_EPOCH_INACTIVE;
+	int i;
+
+	for (i = 0; i < ACCEL_EPOCH_MAX_SLOTS; i++) {
+		uint64_t e = ACCEL_ATOMIC_LOAD_64(&ZCSG(epoch_slots)[i].epoch);
+		if (e < min_epoch) {
+			min_epoch = e;
+		}
+	}
+	return min_epoch;
+}
+
+bool accel_deferred_reset_pending(void)
+{
+	return ZCSG(reset_deferred);
+}
+
+void accel_try_complete_deferred_reset(void)
+{
+	uint64_t drain_epoch;
+	uint64_t min_epoch;
+	uint32_t overflow;
+
+	/* Lock-free fast-path check: if no reset is deferred, return
+	 * immediately. The check is racy but correct — the worst case is
+	 * that we miss completing a just-deferred reset on this request,
+	 * and the next request picks it up. */
+	if (!ZCSG(reset_deferred)) {
+		return;
+	}
+
+	/* Defensive: if this thread is already holding the SHM lock (e.g.
+	 * because we are nested inside the existing restart_pending path or
+	 * a leftover lock from an earlier failure), skip — re-entering
+	 * zend_shared_alloc_lock() would assert in debug builds. The next
+	 * request boundary will retry. */
+	if (ZCG(locked)) {
+		return;
+	}
+
+	overflow = ACCEL_ATOMIC_LOAD_32(&ZCSG(epoch_overflow_active));
+	if (overflow > 0) {
+		/* At least one reader exists that we cannot precisely track.
+		 * It may hold pointers into shared memory we are about to
+		 * reclaim, so we must wait until it leaves. */
+		return;
+	}
+
+	drain_epoch = ACCEL_ATOMIC_LOAD_64(&ZCSG(drain_epoch));
+	min_epoch = accel_min_active_epoch();
+
+	if (min_epoch <= drain_epoch) {
+		/* Some pre-drain reader is still publishing an old epoch. */
+		return;
+	}
+
+	/* All pre-drain readers have left and no overflow readers are
+	 * active. Take the SHM lock and complete the cleanup. The lock
+	 * also serializes us against any other thread that may be racing
+	 * to complete the same deferred reset. */
+	zend_shared_alloc_lock();
+
+	if (ZCSG(reset_deferred)) {
+		/* Re-check overflow under the SHM lock. A new overflow reader
+		 * could have entered between our load and acquiring the lock,
+		 * but new readers always observe current_epoch (which is >
+		 * drain_epoch), so they are not at risk from the cleanup we
+		 * are about to perform. The overflow check above was therefore
+		 * sufficient — but the re-check on min_active_epoch protects
+		 * against any readers that published drain_epoch since then. */
+		min_epoch = accel_min_active_epoch();
+		if (min_epoch > ACCEL_ATOMIC_LOAD_64(&ZCSG(drain_epoch))) {
+			zend_accel_error(ACCEL_LOG_DEBUG,
+				"Completing deferred opcache reset (drain_epoch=%" PRIu64
+				", min_active_epoch=%" PRIu64 ")",
+				drain_epoch, min_epoch);
+
+			accel_restart_enter();
+
+			zend_map_ptr_reset();
+			zend_reset_cache_vars();
+			zend_accel_hash_clean(&ZCSG(hash));
+
+			if (ZCG(accel_directives).interned_strings_buffer) {
+				accel_interned_strings_restore_state();
+			}
+
+			zend_shared_alloc_restore_state();
+
+			if (ZCSG(preload_script)) {
+				preload_restart();
+			}
+
+#ifdef HAVE_JIT
+			zend_jit_restart();
+#endif
+
+			ZCSG(accelerator_enabled) = ZCSG(cache_status_before_restart);
+			if (ZCSG(last_restart_time) < ZCG(request_time)) {
+				ZCSG(last_restart_time) = ZCG(request_time);
+			} else {
+				ZCSG(last_restart_time)++;
+			}
+
+			ZCSG(reset_deferred) = false;
+
+			accel_restart_leave();
+		}
+	}
+
+	zend_shared_alloc_unlock();
+}
+
 /* Interned strings support */
 
 /* O+ disables creation of interned strings by regular PHP compiler, instead,
@@ -2692,6 +2916,19 @@ ZEND_RINIT_FUNCTION(zend_accelerator)
 		ZCG(counted) = false;
 	}
 
+	/* Epoch-based reclamation (GH-8739): publish this reader's epoch
+	 * BEFORE any check or use of shared OPcache data. If a writer
+	 * subsequently defers a reset, our published epoch ensures it
+	 * waits for us to leave before reclaiming any memory we may have
+	 * read. */
+	accel_epoch_enter();
+
+	/* If a previous reset is awaiting reader drain, attempt to complete
+	 * it now. Safe because our just-published epoch is > drain_epoch. */
+	if (UNEXPECTED(ZCSG(reset_deferred))) {
+		accel_try_complete_deferred_reset();
+	}
+
 	if (ZCSG(restart_pending)) {
 		zend_shared_alloc_lock();
 		if (ZCSG(restart_pending)) { /* check again, to ensure that the cache wasn't already cleaned by another process */
@@ -2735,6 +2972,43 @@ ZEND_RINIT_FUNCTION(zend_accelerator)
 					ZCSG(last_restart_time)++;
 				}
 				accel_restart_leave();
+			} else if (!ZCSG(reset_deferred)) {
+				/* Active readers detected (legacy flock check is non-empty)
+				 * AND no reset is already deferred. Convert the pending
+				 * restart into a deferred reset: bump the global epoch
+				 * (so new readers publish a higher epoch and won't be
+				 * waited on), record drain_epoch, and disable the cache
+				 * until cleanup completes. The actual hash/SHM cleanup
+				 * happens in accel_try_complete_deferred_reset() called
+				 * from the next request boundary, after all readers from
+				 * the old epoch have left. (GH-8739) */
+				zend_accel_error(ACCEL_LOG_DEBUG,
+					"Deferring opcache restart: active readers detected");
+				ZCSG(restart_pending) = false;
+
+				switch (ZCSG(restart_reason)) {
+					case ACCEL_RESTART_OOM:
+						ZCSG(oom_restarts)++;
+						break;
+					case ACCEL_RESTART_HASH:
+						ZCSG(hash_restarts)++;
+						break;
+					case ACCEL_RESTART_USER:
+						ZCSG(manual_restarts)++;
+						break;
+				}
+
+				/* Record drain_epoch as the CURRENT epoch (the one we
+				 * and any other active reader published). Then bump
+				 * current_epoch so subsequent readers publish a higher
+				 * epoch and don't block the eventual cleanup. */
+				ACCEL_ATOMIC_STORE_64(&ZCSG(drain_epoch),
+					ACCEL_ATOMIC_LOAD_64(&ZCSG(current_epoch)));
+				ACCEL_ATOMIC_INC_64(&ZCSG(current_epoch));
+
+				ZCSG(cache_status_before_restart) = ZCSG(accelerator_enabled);
+				ZCSG(accelerator_enabled) = false;
+				ZCSG(reset_deferred) = true;
 			}
 		}
 		zend_shared_alloc_unlock();
@@ -2789,6 +3063,22 @@ zend_result accel_post_deactivate(void)
 		return SUCCESS;
 	}
 
+	/* Leave the epoch — this thread/process no longer holds references
+	 * to shared OPcache data. Must happen BEFORE attempting to complete
+	 * a deferred reset, so our own slot doesn't block the drain check. */
+	if (!file_cache_only && accel_shared_globals) {
+		SHM_UNPROTECT();
+		accel_epoch_leave();
+
+		/* If a deferred reset is pending, this may be the request that
+		 * completes the drain. Try it now to keep latency low — the
+		 * try_complete function is a no-op if drain isn't yet complete. */
+		if (UNEXPECTED(ZCSG(reset_deferred))) {
+			accel_try_complete_deferred_reset();
+		}
+		SHM_PROTECT();
+	}
+
 	zend_shared_alloc_safe_unlock(); /* be sure we didn't leave cache locked */
 	accel_unlock_all();
 	ZCG(counted) = false;
@@ -2936,6 +3226,11 @@ static zend_result zend_accel_init_shm(void)
 
 	zend_reset_cache_vars();
 
+	/* Initialize epoch-based reclamation tracking. Must happen after the
+	 * memset of accel_shared_globals so we don't immediately overwrite
+	 * the values. (GH-8739) */
+	accel_epoch_init();
+
 	ZCSG(oom_restarts) = 0;
 	ZCSG(hash_restarts) = 0;
 	ZCSG(manual_restarts) = 0;
@@ -2962,6 +3257,9 @@ static void accel_globals_ctor(zend_accel_globals *accel_globals)
 	memset(accel_globals, 0, sizeof(zend_accel_globals));
 	accel_globals->key = zend_string_alloc(ZCG_KEY_LEN, true);
 	GC_MAKE_PERSISTENT_LOCAL(accel_globals->key);
+	/* No epoch slot has been allocated to this thread/process yet. The
+	 * first call to accel_epoch_enter() will atomically claim one. */
+	accel_globals->epoch_slot = ACCEL_EPOCH_SLOT_UNASSIGNED;
 }
 
 static void accel_globals_dtor(zend_accel_globals *accel_globals)
