Harden CodeCacheArray and profiler against signal-handler races (#413)
* Harden CodeCacheArray and profiler against signal-handler races
- Guard CodeCacheArray readers against NULL during concurrent add()
- Use CAS loop in add() to prevent exceeding MAX_NATIVE_LIBS
- Make at() non-blocking (return NULL instead of spinning)
- Cache anchor() in getJavaTraceAsync() to eliminate TOCTOU
- Add NULL guards on all anchor() dereferences in signal context
- Stop at first NULL in writeNativeLibraries() to avoid skipping entries
- Skip patching sanitizer runtime libs to prevent heap corruption
- Add NULL guard for lib->name() in patch_library_unlocked()
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Block profiling signals around RoutineInfo new/delete
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Address PR feedback: fix at() indentation, simplify writeNativeLibraries
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Harden J9+ASAN tests: forkEvery=1 and CI retry
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* ASAN test timeout, aarch64 retry parity, gdb watchdog
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix arm64 lockups: remove RT signal blocking, cap FP-chain and anchor recovery
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Address PR feedback: ACQUIRE ordering, assert in add(), comment fix
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add arm64 ordering, concurrent iteration, and comment precision rules
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Pointer-first CodeCacheArray::add() eliminates NULL window
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
AGENTS.md (17 additions, 0 deletions)
@@ -357,6 +357,17 @@ The profiler uses a sophisticated double-buffered storage system for call traces
- **Atomic Operations**: Instance ID management and counter updates use atomics
- **Memory Allocation**: Minimize malloc() in hot paths, use pre-allocated containers
### Atomic Memory Ordering (Critical for arm64)
arm64 has a weakly-ordered memory model (unlike x86 TSO). Incorrect ordering causes real lockups on arm64 that never reproduce on x86.
- **Cross-thread reads**: Always use `__ATOMIC_ACQUIRE` for loads that must see stores from another thread. Never use `__ATOMIC_RELAXED` for cross-thread visibility unless you can prove no ordering dependency exists.
- **Cross-thread writes**: Use `__ATOMIC_RELEASE` for stores that must be visible to other threads. Pair with `__ATOMIC_ACQUIRE` loads.
- **Count + pointer patterns**: When a data structure publishes a count and a separate pointer (two-phase add), both the count load and pointer load need acquire semantics so the reader sees the pointer store that preceded the count increment.
- **Default stance**: When in doubt, use acquire/release. The performance cost is negligible; the correctness cost of relaxed ordering bugs is enormous (silent arm64-only lockups).
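
The release/acquire pairing described above can be sketched as follows. This is a minimal illustration, not code from the profiler: `Config`, `g_published`, `publish()` and `read_interval()` are hypothetical names, and the sketch assumes a GCC or Clang toolchain that provides the `__atomic` builtins.

```cpp
// Hypothetical payload and publication slot; names are illustrative only.
struct Config {
    int sample_interval_ms;
};

static Config* g_published = nullptr;

// Writer side: initialize the object first, then publish the pointer with
// RELEASE so the field write cannot be reordered after the pointer store.
void publish(Config* cfg, int interval_ms) {
    cfg->sample_interval_ms = interval_ms;
    __atomic_store_n(&g_published, cfg, __ATOMIC_RELEASE);
}

// Reader side: the ACQUIRE load pairs with the RELEASE store above. If the
// pointer is visible, the field written before the store is visible too.
// Returns -1 when nothing has been published yet.
int read_interval() {
    Config* cfg = __atomic_load_n(&g_published, __ATOMIC_ACQUIRE);
    return cfg != nullptr ? cfg->sample_interval_ms : -1;
}
```

In real use `publish()` and `read_interval()` would run on different threads. With a relaxed load in the reader, a weakly-ordered CPU would be allowed to observe the pointer before the field it guards.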
### Concurrent Data Structure Iteration
- **NULL gaps**: When iterating a concurrent array (e.g., `CodeCacheArray`), always NULL-check each slot: an index can already be covered by the count while its pointer has not been stored yet.
- **Cursor advancement**: Never permanently advance an iteration cursor past NULL gaps. Stop at the first NULL or track the last contiguous non-NULL entry, so missing entries are retried on the next pass.
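
A rough sketch of these two rules, assuming a count-first append-only array. `SlotArray`, `Lib`, `kMaxSlots` and `process_from()` are hypothetical stand-ins and do not reproduce the actual `CodeCacheArray` code; the `__atomic` builtins are the GCC/Clang ones.

```cpp
// Hypothetical stand-in for a concurrent append-only array; not the
// profiler's actual implementation.
constexpr int kMaxSlots = 4;

struct Lib {
    const char* name;
};

struct SlotArray {
    Lib* slots[kMaxSlots] = {};
    int count = 0;

    // Reserve an index with a CAS loop so count never exceeds kMaxSlots,
    // then store the pointer. Between the two steps a reader can observe
    // a reserved slot that still holds NULL.
    bool add(Lib* lib) {
        int idx = __atomic_load_n(&count, __ATOMIC_ACQUIRE);
        do {
            if (idx >= kMaxSlots) return false;  // array is full
        } while (!__atomic_compare_exchange_n(&count, &idx, idx + 1, /*weak=*/true,
                                              __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE));
        __atomic_store_n(&slots[idx], lib, __ATOMIC_RELEASE);
        return true;
    }

    // Non-blocking read: returns NULL if the slot is out of range or its
    // pointer has not been stored yet.
    Lib* at(int idx) {
        if (idx < 0 || idx >= __atomic_load_n(&count, __ATOMIC_ACQUIRE)) return nullptr;
        return __atomic_load_n(&slots[idx], __ATOMIC_ACQUIRE);
    }
};

// Processes entries starting at `cursor` and returns the new cursor.
// Stops at the first NULL so an unpublished slot is retried on the next pass.
int process_from(SlotArray& arr, int cursor, int& processed) {
    int n = __atomic_load_n(&arr.count, __ATOMIC_ACQUIRE);
    while (cursor < n) {
        Lib* lib = arr.at(cursor);
        if (lib == nullptr) break;  // gap: do not advance past it
        ++processed;
        ++cursor;
    }
    return cursor;
}
```

A caller keeps the returned cursor between passes. Because the cursor never moves beyond a NULL slot, an entry whose pointer is stored later is picked up by the next call instead of being skipped for good.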
### 64-bit Trace ID System
- **Collision Avoidance**: Instance-based IDs prevent collisions across storage swaps
- **JFR Compatibility**: 64-bit IDs work with JFR constant pool indices
@@ -705,3 +716,9 @@ The CI caches JDKs via `.github/workflows/cache_java.yml`. When adding a new JDK
- Always provide tests for bug fixes - test fails before the fix, passes after the fix
- All code should strive to be lean in resource consumption and easy to follow;
  do not shy away from factoring self-contained code out into shorter functions with explicit names
### C/C++ Code Style
- **Indentation**: Match the exact indentation style of the surrounding code in each file. Do not introduce inconsistent indentation — reviewers will flag it.
- **Minimal complexity**: Do not split inline logic into separate helper functions unless the helpers are reused or the original is genuinely hard to follow. Unnecessary splits add indirection without value.
- **Comment precision**: Comments explaining "why" must reference concrete mechanisms (e.g., "ASAN's allocator lock is reentrant" not "internal bookkeeping"). Vague comments get challenged in review. Every claim in a comment must be verifiable from the code or documented behavior of the referenced system (ASAN, glibc NPTL, HotSpot, etc.).
- **No speculative comments**: Do not claim a system (HotSpot, glibc, ASAN) uses a specific mechanism unless you are certain. If unsure, describe the observable symptom instead of guessing the cause.