ImportError / TypeError on Windows with AMD ROCm PyTorch due to torch.distributed dependency

### System Info

- Transformers version: 5.9.0
- Platform: Windows-11-10.0.26200-SP0
- Python version: 3.12.10
- PyTorch version: 2.9.1+rocm7.2.1
- CUDA/ROCm available: True
- GPU Name: AMD Radeon RX 7600

### Who can help?

@ivarflakstad 

### Information

- [ ] The official example scripts
- [ ] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction

When running PyTorch on Windows with an AMD GPU (ROCm/DirectML), `torch.distributed` is incomplete: core networking components such as `FileStore`, `Store`, and `_DistributedBackendOptions` are missing. This is by design on the PyTorch side.
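
To check which of these names are absent on a given install, a small probe like the one below may help. It is only a diagnostic sketch: `missing_distributed_names` is a hypothetical helper, not part of PyTorch or `transformers`, and the default name list is simply the one mentioned above.

```python
import importlib


def missing_distributed_names(
    names=("FileStore", "Store", "_DistributedBackendOptions"),
):
    """Return the subset of `names` that torch.distributed does not expose.

    Hypothetical helper for diagnosis only. If torch.distributed cannot be
    imported at all, every name is reported as missing.
    """
    try:
        dist = importlib.import_module("torch.distributed")
    except ImportError:
        return list(names)
    return [name for name in names if not hasattr(dist, name)]


if __name__ == "__main__":
    print("Missing from torch.distributed:", missing_distributed_names())
```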

Newer versions of `transformers` unconditionally import `torch.distributed.tensor.device_mesh` in `transformers/generation/continuous_batching/distributed.py`. On these platforms this breaks even basic imports such as `CLIPSegProcessor` or `CLIPSegForImageSegmentation`, so these models cannot be used on Windows + AMD setups.

Traceback encountered:
```
File "...\transformers\generation\continuous_batching\distributed.py", line 19, in <module>
  from torch.distributed.tensor.device_mesh import DeviceMesh
ImportError: cannot import name 'FileStore' from 'torch.distributed'
```
Additionally, mocking the module so that the import resolves to `None` leads to `TypeError: unsupported operand type(s) for |: 'NoneType'`, caused by the `DeviceMesh | None` type hint in the same file.
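
The `TypeError` can be reproduced without `transformers` at all. The sketch below assumes only that the mocked name ends up bound to `None`; the function names are made up for illustration. On Python 3.12, annotations are evaluated when the `def` statement runs, so the union expression is evaluated at import time, whereas a quoted annotation is stored as a string and never evaluated.

```python
# Stand-in for the mocked import: the name is bound to None instead of a class.
DeviceMesh = None


def eager_union_fails() -> bool:
    """Return True if evaluating `DeviceMesh | None` raises TypeError."""
    try:
        DeviceMesh | None  # same expression as the unquoted type hint
    except TypeError:
        return True
    return False


def quoted_hint_is_safe() -> bool:
    """Return True if a function with a quoted hint can be defined and called."""

    # The annotation is kept as a plain string, so `|` is never applied to None.
    def f(mesh: "DeviceMesh | None" = None):
        return mesh

    return f() is None


if __name__ == "__main__":
    print("eager union raises TypeError:", eager_union_fails())
    print("quoted hint works:", quoted_hint_is_safe())
```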


### Expected behavior

The `transformers` library should be able to import basic model pipelines (like `CLIPSegProcessor` and `CLIPSegForImageSegmentation`) without throwing errors, even on platforms where `torch.distributed` is missing or incomplete (such as PyTorch for AMD/ROCm on Windows).

Since distributed networking features are not required for single-GPU inference setups, missing components in `torch.distributed` should be handled gracefully (e.g., wrapped in a try/except block or guarded with a fallback), rather than causing a hard crash during initial module loading.

ImportError / TypeError on Windows with AMD ROCm PyTorch due to torch.distributed dependency #46170
