test(formatters): query_clarification times out on CPU during full test run

## Summary

`test/formatters/granite/test_intrinsics_formatters.py::test_run_transformers[query_clarification]` hangs for 180+ seconds during `torch.nn.Linear.forward` (stuck in `F.linear()`) when running on CPU/MPS without GPU.

## Context

Discovered during full test run for #1294 (aLoRA input\_ids fix). This is a separate pre-existing issue.

## Details

- **Model:** `ibm-granite/granite-4.1-3b`
- **Context:** 4 conversation turns + 12 retrieved documents (~2000+ tokens)
- **Generation:** `max_completion_tokens: 512`
- **Hardware:** Apple Silicon (MPS), no CUDA

This is a performance/resource limitation, not a code bug. The test is already gated on CI via `gh_run == 1` xfail, but it blocks local CI-aware runs.

## Proposed Fix

Add a `@pytest.mark.slow` marker or a CPU-specific skip to prevent it from blocking local test runs, or increase the timeout for this specific test case.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(formatters): query_clarification times out on CPU during full test run #1295

Summary

Context

Details

Proposed Fix

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

test(formatters): query_clarification times out on CPU during full test run #1295

Description

Summary

Context

Details

Proposed Fix

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions