This repository was archived by the owner on Mar 3, 2026. It is now read-only.

Some tests will hang when running tests with pytest on a TPU VM #375

@erichuang-cienet

When I use pytest to run all tests in the repository on a TPU VM, some of them hang. The hanging tests are as follows:

  • torchprime/launcher/test_run_model.py
  • torchprime/tests/test_parallelism_utils.py
  • torchprime/tests/test_system_check.py
  • torchprime/torch_xla_models/tests/test_assume_pure.py
  • torchprime/torch_xla_models/tests/test_deepseek_v3.py
  • torchprime/torch_xla_models/tests/test_llama.py
  • torchprime/torch_xla_models/tests/test_llama4.py
  • torchprime/torch_xla_models/tests/test_mixtral.py
  • torchprime/torch_xla_models/tests/test_model_loading_saving.py
  • torchprime/torch_xla_models/tests/test_sft_trainer.py
  • torchprime/torch_xla_models/tests/test_trainer.py

However, when I run the above tests individually, they do not hang, except for torchprime/launcher/test_run_model.py.
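One way to reproduce the "individually" runs systematically is to launch each test file in its own pytest process with a timeout, so that a hang is reported instead of blocking the run. This is only a sketch: the helper name `run_isolated`, the `runner` parameter, and the 600-second default are assumptions for illustration, not part of the repository.

```python
import subprocess
import sys


def run_isolated(test_files, timeout_s=600,
                 runner=(sys.executable, "-m", "pytest", "-v")):
    """Run each test file in a separate process and classify the outcome.

    Returns a dict mapping each path to "passed", "failed (exit N)",
    or "hang (timeout)".
    """
    results = {}
    for path in test_files:
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # once timeout_s seconds have elapsed.
            proc = subprocess.run([*runner, path], timeout=timeout_s)
        except subprocess.TimeoutExpired:
            results[path] = "hang (timeout)"
            continue
        if proc.returncode == 0:
            results[path] = "passed"
        else:
            results[path] = f"failed (exit {proc.returncode})"
    return results
```

For example, `run_isolated(["torchprime/launcher/test_run_model.py", "torchprime/tests/test_system_check.py"])` would run both files in fresh processes; comparing its report against a single `pytest -v` run helps separate files that hang on their own from files that hang only when sharing a process with other tests.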

I found that disabling the xla_tpu_use_enhanced_launch_barrier flag prevents these tests from hanging.

Command:

export LIBTPU_INIT_ARGS='--xla_tpu_use_enhanced_launch_barrier=false'
pytest -v
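The same workaround could also be applied from a top-level `conftest.py`, so that the flag does not have to be exported by hand. The sketch below makes two assumptions: that this `conftest.py` is imported before anything initializes libtpu (for example before `torch_xla` is imported), and that appending to an existing `LIBTPU_INIT_ARGS` value is acceptable. The helper name `ensure_libtpu_flag` is hypothetical.

```python
import os

# Flag taken from the command above.
FLAG = "--xla_tpu_use_enhanced_launch_barrier=false"


def ensure_libtpu_flag(env=os.environ, flag=FLAG):
    """Add `flag` to LIBTPU_INIT_ARGS unless a flag of that name is already set.

    Returns the resulting LIBTPU_INIT_ARGS value. An existing setting of the
    same flag is left untouched, so an explicit user choice wins.
    """
    name = flag.split("=", 1)[0]
    current = env.get("LIBTPU_INIT_ARGS", "")
    tokens = current.split()
    if any(tok.split("=", 1)[0] == name for tok in tokens):
        return current
    updated = " ".join([*tokens, flag])
    env["LIBTPU_INIT_ARGS"] = updated
    return updated


# Assumption: this runs at conftest import time, before libtpu is initialized.
ensure_libtpu_flag()
```

Whether this takes effect depends on import order; if a plugin or another module has already initialized the TPU runtime, exporting the variable in the shell as shown above remains the safer route.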

Environment:

TPU VM: v6e-8
Python 3.11
torch 2.9.0.dev20250825+cpu
torch-xla 2.9.0+git8243a25
