Summary
test/formatters/granite/test_intrinsics_formatters.py::test_run_transformers[query_clarification] hangs for 180+ seconds during torch.nn.Linear.forward (stuck in F.linear()) when running on CPU/MPS without GPU.
Context
Discovered during full test run for #1294 (aLoRA input_ids fix). This is a separate pre-existing issue.
Details
- Model:
ibm-granite/granite-4.1-3b
- Context: 4 conversation turns + 12 retrieved documents (~2000+ tokens)
- Generation:
max_completion_tokens: 512
- Hardware: Apple Silicon (MPS), no CUDA
This is a performance/resource limitation, not a code bug. The test is already gated on CI via gh_run == 1 xfail, but it blocks local CI-aware runs.
Proposed Fix
Add a @pytest.mark.slow marker or a CPU-specific skip to prevent it from blocking local test runs, or increase the timeout for this specific test case.
Summary
test/formatters/granite/test_intrinsics_formatters.py::test_run_transformers[query_clarification]hangs for 180+ seconds duringtorch.nn.Linear.forward(stuck inF.linear()) when running on CPU/MPS without GPU.Context
Discovered during full test run for #1294 (aLoRA input_ids fix). This is a separate pre-existing issue.
Details
ibm-granite/granite-4.1-3bmax_completion_tokens: 512This is a performance/resource limitation, not a code bug. The test is already gated on CI via
gh_run == 1xfail, but it blocks local CI-aware runs.Proposed Fix
Add a
@pytest.mark.slowmarker or a CPU-specific skip to prevent it from blocking local test runs, or increase the timeout for this specific test case.