[None][test] CBTS sitecustomize: fix nested-pytest hang + argv detection
Two fixes in jenkins/scripts/cbts/coverage_utils/sitecustomize.py.
1. Use sys.orig_argv instead of sys.argv for pytest detection.
When invoked as `python -m pytest`, sys.argv at sitecustomize-run
time is ['-m', ...] -- the pytest path is injected later by the
runpy machinery. The previous check `any("pytest" in a for a in
sys.argv[:2])` therefore returned False for every `python -m pytest`
process, including pytest main itself, causing daemon threads to
spawn where they shouldn't.
2. Detect the nested-pytest layer used by
tests/integration/defs/test_unittests.py::test_unittests_v2 (an
outer pytest test that subprocess-spawns `python -m pytest
<unittest_file>`), and in that layer skip the periodic-save /
marker-poll / mpi-watcher daemon threads while still synchronously
applying install_mpi_pool_patch. The 100ms-loop daemons in the
middle layer were contending with the zmq IPC threads talking to
MPI workers, pushing LLM() fixture setup past the 120s
pytest-timeout on A30-PyTorch-1 (128 timeouts in
unittest/_torch/sampler/test_logits_logprobs.py). MPI workers
still get the widened env via the synchronous patch, so worker
coverage (the bulk of py_executor.py data) is preserved.
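The argv difference in item 1 can be reproduced in isolation. This is a minimal sketch, not the code in sitecustomize.py: `mentions_pytest` and its `limit` parameter are hypothetical names, and the two argv lists are illustrative values for a `python -m pytest <file>` launch. It assumes Python 3.10+ for sys.orig_argv and falls back to sys.argv otherwise.

```python
import sys


def mentions_pytest(argv, limit=3):
    """Hypothetical helper: substring check over the first `limit` arguments."""
    return any("pytest" in arg for arg in argv[:limit])


# Illustrative values for `python -m pytest tests/unit/test_x.py`.
# While sitecustomize runs, sys.argv[0] is still the "-m" placeholder.
argv_at_site_time = ["-m", "tests/unit/test_x.py"]
# sys.orig_argv keeps the command line as it was passed to the interpreter.
orig_argv_example = ["python", "-m", "pytest", "tests/unit/test_x.py"]

# sys.orig_argv exists on Python 3.10+; older interpreters only have sys.argv.
current_argv = getattr(sys, "orig_argv", sys.argv)
running_under_pytest = mentions_pytest(current_argv)
```

The check on `argv_at_site_time` misses pytest, while the same check on `orig_argv_example` finds it, which is the behavior gap described in item 1.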
Validated locally on H100 with a minimal repro: a nested pytest spawns
LLM(TinyLlama), the inner pytest has zero cbts-* daemon threads, the
MPI patch is applied, the MPI worker starts full coverage, and LLM
init + generate complete normally.
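One way to check the "zero cbts-* daemon threads" condition from inside the inner pytest is to list live threads by name. This is a sketch under the assumption that the daemons are named with a `cbts-` prefix, as the wording above suggests; `cbts_thread_names` and `assert_no_cbts_threads` are hypothetical helpers, not part of the repro.

```python
import threading


def cbts_thread_names(prefix="cbts-"):
    """Return the sorted names of live threads whose name starts with `prefix`."""
    return sorted(
        t.name for t in threading.enumerate() if t.name.startswith(prefix)
    )


def assert_no_cbts_threads():
    """Raise AssertionError if any cbts-prefixed thread is still alive."""
    names = cbts_thread_names()
    assert not names, f"unexpected cbts daemon threads: {names}"
```

Calling `assert_no_cbts_threads()` in a test body or fixture of the inner layer would fail loudly if one of the periodic-save, marker-poll, or mpi-watcher threads had been started there.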
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>