Fix OpenBLAS atfork SIGSEGV during Kit startup #5693
base: develop
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| Fixed | ||
| ^^^^^ | ||
|
|
||
| * Fixed a ``SIGSEGV`` crash during Kit startup caused by NumPy's bundled | ||
| OpenBLAS ``pthread_atfork`` handler. When ``import torch`` (or any | ||
| transitive NumPy import) runs before :class:`AppLauncher` creates the | ||
| :class:`~isaacsim.SimulationApp`, OpenBLAS spawns worker threads and | ||
| registers ``blas_thread_shutdown_`` as a child-side ``atfork`` handler. | ||
| Kit's ``libomni.platforminfo.plugin`` then calls ``fork()`` during | ||
| startup; in the child process the handler tries to ``pthread_join`` | ||
| threads that no longer exist, causing a segmentation fault. The fix | ||
| sets ``OPENBLAS_NUM_THREADS=1`` (via ``setdefault``) before the library | ||
| is loaded so that no worker threads are created and the handler is a | ||
| safe no-op. Both :mod:`app_launcher` (for standalone scripts) and | ||
| ``tools/conftest.py`` (for CI test subprocesses) are patched. |
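
The ordering constraint described in the changelog entry can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `apply_openblas_guard` is hypothetical, and the only behaviour taken from the changelog is that `OPENBLAS_NUM_THREADS` is defaulted to `1` with `setdefault` before anything that loads OpenBLAS is imported.

```python
import os


def apply_openblas_guard(environ=os.environ):
    """Default OPENBLAS_NUM_THREADS to "1" unless a value is already set.

    Hypothetical helper: setdefault leaves an explicit user choice untouched
    and returns whichever value ends up in the mapping.
    """
    return environ.setdefault("OPENBLAS_NUM_THREADS", "1")


# The guard must run before the first import that loads OpenBLAS,
# because the library reads the variable when it is loaded.
apply_openblas_guard()

# Imports such as numpy or torch would go below this line.
```

Because `setdefault` is used, a user who exports `OPENBLAS_NUM_THREADS=8` before launching keeps that value; the guard only fills in a default.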
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -316,6 +316,14 @@ def run_individual_tests(test_files, workspace_root, isaacsim_ci): | |
| file_name = os.path.basename(test_file) | ||
| env = os.environ.copy() | ||
| env["PYTHONFAULTHANDLER"] = "1" | ||
| # Prevent OpenBLAS fork-safety crash: when NumPy or SciPy is imported | ||
| # before Kit starts, OpenBLAS spawns a worker-thread pool and registers | ||
| # a pthread_atfork handler (blas_thread_shutdown_). Kit's platform-info | ||
| # plugin calls fork() during startup; in the child the handler tries to | ||
| # pthread_join threads that no longer exist → SIGSEGV. Limiting | ||
| # OpenBLAS to a single thread before the subprocess starts avoids the | ||
| # crash because no worker threads are created and the handler is a no-op. | ||
| env.setdefault("OPENBLAS_NUM_THREADS", "1") | ||
|
Contributor
The guard is injected only when tests are dispatched through |
| timeout = test_settings.PER_TEST_TIMEOUTS.get(file_name, test_settings.DEFAULT_TIMEOUT) | ||
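
For readers who want to check the effect of the hunk above in isolation, here is a small sketch of the same pattern outside `conftest.py`. The function names `build_test_env` and `child_sees` are hypothetical and exist only for this illustration; the two environment variables are the ones set in the diff.

```python
import os
import subprocess
import sys


def build_test_env(base_env=None):
    """Return a copy of the environment with the guards from the diff applied."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHONFAULTHANDLER"] = "1"
    # setdefault keeps any value the caller already exported.
    env.setdefault("OPENBLAS_NUM_THREADS", "1")
    return env


def child_sees(env):
    """Spawn a Python subprocess and report the value it observes."""
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ['OPENBLAS_NUM_THREADS'])"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `child_sees(build_test_env())` shows what a test subprocess would inherit. The guard only reaches processes launched with this environment, not tests started by other means.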
`OPENBLAS_NUM_THREADS=1` is set unconditionally at module scope, so every process that imports `app_launcher` (including users running batch physics computations, inverse-kinematics solves, or any heavy NumPy/SciPy workload) silently loses multi-threaded BLAS performance. The fork-safety hazard only materialises during startup when Kit calls `fork()`, so on hardware where the issue does not reproduce, users pay the single-thread tax with no benefit. A narrower alternative would be to reset the env var (or the pool) only when a fork is about to occur, e.g. via `os.register_at_fork`, but if the performance cost is accepted for Isaac Lab's GPU-first workloads this is fine as-is.
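
To make the `os.register_at_fork` suggestion concrete, here is a minimal sketch of how such a hook is registered and when it runs. The handler `_before_fork` and the helper `fork_once` are hypothetical, and the handler only records that it was called; what it would do to the OpenBLAS pool is left open. One caveat: as far as the Python documentation describes it, these hooks run for forks started through Python (for example `os.fork()`). A `fork()` issued directly from native code, as in the Kit plugin described in this PR, would presumably bypass them, so this may not cover the crash on its own. The example is Unix-only.

```python
import os

calls = []


def _before_fork():
    # Hypothetical hook: a real implementation would shrink or shut down
    # the BLAS thread pool here, before the process is duplicated.
    calls.append("before")


# Runs in the parent immediately before a Python-initiated fork.
os.register_at_fork(before=_before_fork)


def fork_once():
    """Fork via os.fork(), let the child exit at once, and return the log."""
    pid = os.fork()
    if pid == 0:
        # Child: exit without running any further Python cleanup.
        os._exit(0)
    os.waitpid(pid, 0)
    return list(calls)
```

The design trade-off is the one raised in the comment above: a fork-time hook keeps multi-threaded BLAS available for normal work, at the cost of depending on the fork going through the interpreter.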