3. Fix leaked semaphore warnings on macOS (Python 3.10+) #7

Open
musicalplatypus wants to merge 4 commits into TexasInstruments:main from musicalplatypus:pr/macos-semaphore-fixes

@musicalplatypus

Summary

Fixes the "UserWarning: resource_tracker: There appear to be N leaked semaphore objects" warnings that occur on macOS when training completes. The warnings appear because PyTorch DataLoader workers are not explicitly shut down before the process exits.

Root Cause

On macOS with Python ≤3.11, multiprocessing.resource_tracker uses set.remove() instead of set.discard(). This causes KeyError tracebacks when loky (the default joblib backend, which scikit-learn uses) calls unregister() for semaphores that were registered in child processes.
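
The difference between the two set methods can be illustrated with a plain Python set. This is only a minimal sketch of the behavior described above, not the resource_tracker code itself; the helper names and the semaphore names are hypothetical.

```python
def unregister_strict(registry, name):
    """Mimic set.remove(): raises KeyError when the name is not tracked."""
    registry.remove(name)


def unregister_tolerant(registry, name):
    """Mimic set.discard(): silently ignores names that are not tracked."""
    registry.discard(name)


# Hypothetical registry holding one tracked semaphore name.
tracked = {"/mp-example"}

# Unregistering a name this registry never saw is harmless with discard() ...
unregister_tolerant(tracked, "/mp-unknown")

# ... but raises KeyError with remove().
try:
    unregister_strict(tracked, "/mp-unknown")
    strict_error = None
except KeyError as exc:
    strict_error = exc
```

Here `strict_error` ends up holding the KeyError, while the tolerant call leaves the registry unchanged.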

Changes

  1. Explicit DataLoader worker shutdown — Added ._shutdown_workers() calls in all train.py scripts (classification, regression, forecasting, anomaly detection) and all test_onnx.py scripts
  2. Centralized cleanup in train_base.py — Created cleanup_data_loaders() utility and integrated it into the shared training base
  3. Resource tracker unregistration — Added _unregister_semaphores() to properly clean up POSIX semaphores before process exit
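
A minimal sketch of what such a shutdown helper might look like, assuming each loader keeps its persistent worker iterator in the private `_iterator` attribute and that the iterator exposes `_shutdown_workers()`. Both are private PyTorch details that may change between releases, and this is not necessarily the code in train_base.py. The helper is duck-typed, so it does not import torch.

```python
import gc


def shutdown_data_loaders(*loaders):
    """Best-effort shutdown of DataLoader worker processes.

    Returns the number of iterators whose workers were shut down cleanly.
    """
    shut_down = 0
    for loader in loaders:
        # Assumption: a persistent-worker iterator lives in `_iterator`.
        iterator = getattr(loader, "_iterator", None)
        if iterator is None:
            continue  # loader is None, or it never started workers
        shutdown = getattr(iterator, "_shutdown_workers", None)
        if callable(shutdown):
            try:
                shutdown()
                shut_down += 1
            except Exception:
                # Cleanup must never mask the result of the training run.
                pass
        # Drop the reference so the iterator's queues can be collected.
        loader._iterator = None
    gc.collect()
    return shut_down
```

It would be called once after training or evaluation finishes, for example `shutdown_data_loaders(train_loader, val_loader)`, where the loader names are placeholders.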

Files Changed (13 files)

  • tinyml-tinyverse/tinyml_tinyverse/references/common/train_base.py — core fix
  • tinyml-tinyverse/tinyml_tinyverse/references/common/__init__.py — export
  • train.py scripts — per-task-type cleanup
  • test_onnx.py scripts — evaluation cleanup

Testing

  • Verified on macOS 14 with Python 3.10, 3.11 — warnings eliminated
  • No impact on Linux/Windows (the fix only takes effect on the macOS-specific code path)
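
One way such a platform gate could be written is sketched below. The function name is hypothetical, and the version bound follows the "Python ≤3.11" statement in the root cause rather than the actual check in this PR.

```python
import sys


def needs_semaphore_workaround(platform=sys.platform,
                               version=sys.version_info[:2]):
    """Return True only on macOS with Python 3.11 or older."""
    return platform == "darwin" and tuple(version) <= (3, 11)


# Defaults describe the running interpreter; explicit arguments allow
# checking other combinations without switching machines.
running_here = needs_semaphore_workaround()
```

On Linux and Windows the function returns False, so any cleanup guarded by it is skipped there.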

t5fkg8d44d-beep and others added 4 commits April 7, 2026 07:17
…kers

persistent_workers=True keeps worker processes alive across epochs, but on
macOS (spawn start method) the resource_tracker detects unreleased semaphores
at exit. Add shutdown_data_loaders() to explicitly terminate workers after
training completes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cripts

The previous fix (f3cc7c0) added shutdown_data_loaders() to train.py files
but missed all test_onnx scripts. These are the last step in the pipeline
and create DataLoaders with num_workers=8 without cleanup, causing
macOS resource_tracker to report leaked semaphores at process exit.

Add shutdown_data_loaders() calls to all 6 test_onnx files and harden
the function itself with try/except and gc.collect() to ensure worker
processes fully release their semaphores before exit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enhanced shutdown_data_loaders to explicitly break the iterator's
references to multiprocessing Queue/Lock/Event objects after calling
_shutdown_workers().  Without this, the POSIX named semaphores inside
those objects survive until Python's resource_tracker atexit handler,
which reports them as leaked.  By clearing the references eagerly,
CPython's refcount-based deallocation calls sem_unlink immediately.

Also added the missing shutdown_data_loaders call in
image_classification/train.py, which was the only train.py that
did not shut down its DataLoaders.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous approach (breaking Queue references for gc) didn't work
because on Python <=3.11, _multiprocessing.SemLock's C dealloc only
calls sem_close() — it never calls resource_tracker.unregister().
The resource_tracker therefore reports every semaphore as leaked
regardless of cleanup efforts.  (Fixed in Python 3.12+.)

New approach: after _shutdown_workers() joins all worker processes,
walk the iterator's Queue and Event objects to find their internal
SemLock names, then explicitly call resource_tracker.unregister()
for each one.  This removes them from the resource_tracker's
registry so its atexit handler has nothing to warn about.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
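
The approach in the last commit message can be sketched roughly as follows. The attribute names in `_CHILD_ATTRS` are private CPython multiprocessing internals (an assumption that may not hold on every version), the function names are hypothetical, and this is not necessarily how `_unregister_semaphores()` is implemented in the PR. `resource_tracker.unregister(name, rtype)` is the standard-library call. It is injected as a parameter here so the walk can be exercised without touching the real tracker.

```python
from multiprocessing import resource_tracker

# Private attributes under which CPython's multiprocessing objects keep
# their locks and semaphores (assumption: implementation details).
_CHILD_ATTRS = (
    "_rlock", "_wlock", "_sem",                    # Queue
    "_cond", "_flag",                              # Event
    "_lock", "_sleeping_count",                    # Condition
    "_woken_count", "_wait_semaphore",             # Condition
)


def collect_semlock_names(obj, _seen=None):
    """Return the set of named semaphores reachable from obj."""
    seen = set() if _seen is None else _seen
    names = set()
    if obj is None or id(obj) in seen:
        return names
    seen.add(id(obj))
    # Lock-like objects wrap a low-level SemLock in `_semlock`; its name
    # can be None (for example under the fork start method).
    name = getattr(getattr(obj, "_semlock", None), "name", None)
    if name is not None:
        names.add(name)
    for attr in _CHILD_ATTRS:
        names |= collect_semlock_names(getattr(obj, attr, None), seen)
    return names


def unregister_semaphores(objects, unregister=resource_tracker.unregister):
    """Unregister every named semaphore reachable from the given objects."""
    names = set()
    for obj in objects:
        names |= collect_semlock_names(obj)
    for name in sorted(names):
        unregister(name, "semaphore")
    return names
```

This should only run after the workers have been joined. If a name is unregistered a second time later, an affected Python version would hit the same KeyError described under Root Cause.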
@musicalplatypus musicalplatypus changed the title from "Fix leaked semaphore warnings on macOS (Python 3.10+)" to "3. Fix leaked semaphore warnings on macOS (Python 3.10+)" on Apr 7, 2026
Adithya-Thonse pushed a commit that referenced this pull request Jun 12, 2026
Merge in TINYML-ALGO/tinyml-agent-skills from 2026/pranav_a to main

* commit '68686b36536d7dd8f01c96fe41150f555e281aac':
  improving readme
Adithya-Thonse pushed a commit that referenced this pull request Jun 12, 2026
Merge in TINYML-ALGO/tinyml-tensorlab from 2026/pranav to main

* commit '0b9c09a9e11e8a2e77f67d74244cfbf3ffcf1d3b':
  minor note
  added plugin installation cmds to readme
  added plugin installation cmds to readme
  added plugin installation cmds to readme
  fixing tiny ml name
  built source for agent skill docs
  adding tinyml-agent-skill docs + reference to git_update_all.sh
Adithya-Thonse added a commit that referenced this pull request Jun 12, 2026
de8af16d Pull request #45: https://jira.itg.ti.com/browse/TINYML_ALGO-698
REVERT: e48ef1a Pull request #14: TINYML_ALGO-711: fixing readme
REVERT: 16fc6a6 TINYML_ALGO-711: fixing readme
REVERT: e3639d2 Pull request #13: removing pycache
REVERT: f8bb3b7 removing pycache
REVERT: dd38428 Pull request #12: restructuring agent skill
REVERT: ff02a0e restructuring agent skill
REVERT: d26c6a5 Pull request #11: fixing tiny ml name
REVERT: 640ffd3 fixing tiny ml name
REVERT: 4ee3a19 Pull request #10: 2026/pranav a
REVERT: be83fc6 minor fixes
REVERT: e3a5700 removed assets, included autoMP quant
REVERT: 1af575a Pull request #9: correcting npu devices list
REVERT: 31e9eb1 correcting npu devices list
REVERT: 59b209b Pull request #8: improving readme
REVERT: 8c3260b improving readme
REVERT: 668916f Pull request #7: improving readme
REVERT: 68686b3 improving readme
REVERT: 814316e Pull request #6: fixes to readme and marketplace json
REVERT: e4bc0b4 fixes to readme and marketplace json
REVERT: 6a64208 Pull request #5: fixes to readme
REVERT: 0f9c868 fixes to readme
REVERT: 52f95ff Pull request #4: 2026/pranav a
REVERT: 443295d fixes to readme
REVERT: 1881112 fixes to readme and marketplace json
REVERT: 229ab57 Pull request #3: 2026/pranav a
REVERT: 6519104 minor readme fix
REVERT: 38e9f9f minor readme fix
REVERT: db81f81 Pull request #2: minor readme fix
REVERT: 1c0737a minor readme fix
REVERT: 0a0c02d Pull request #1: minor readme fix
REVERT: b682335 minor readme fix
REVERT: 062eb39 Initial Commit

git-subtree-dir: tinyml-agent-skills
git-subtree-split: de8af16d9e23de3e9bda3d811a0ebdece1178260