All LoRA sampler checkpoints permanently hang on sample_async() — base model works fine

## Bug Description

Calling `sample_async()` on a sampling client created from any LoRA checkpoint never returns; the call is cancelled by the 60s client-side timeout. Sampling from the base model works normally (~2.4s response time).

## Environment
- Tinker SDK: 0.18.2 (latest from PyPI)
- Base model: `Qwen/Qwen3-4B-Instruct-2507`
- LoRA rank: 16
- 5 training runs, 31 checkpoints total, all show "Never" expiry in console

## Reproduction

```python
from tinker import ServiceClient, ModelInput, SamplingParams
import asyncio

async def test():
    sc = ServiceClient()

    # Base model: sample_async() works, ~2.4s
    base_sampler = await sc.create_sampling_client_async(
        base_model="Qwen/Qwen3-4B-Instruct-2507"
    )

    # Any LoRA checkpoint: sample_async() hangs until the 60s timeout
    lora_sampler = await sc.create_sampling_client_async(
        model_path="tinker://0ac02541-f609-540f-860e-0bf885b7292e:train:0/sampler_weights/dpo_final"
    )
    # create_sampling_client_async() returns OK (0.5s),
    # but lora_sampler.sample_async() never returns: it times out after 60s with CancelledError

asyncio.run(test())
```

## Tested checkpoints (all hang)
- `tinker://0ac02541-f609-540f-860e-0bf885b7292e:train:0/sampler_weights/dpo_final` (created 23 min ago, 136 MB)
- `tinker://a735527d-a478-59b1-b510-e3b0211f4989:train:0/sampler_weights/dpo_final` (created 2 weeks ago, 136 MB)
- All show "Never" expiry in the Tinker console
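For anyone who wants to check several checkpoints in one run, here is a minimal timing harness, sketched with the standard library only. `probe`, `probe_all`, `fast_call` and `hanging_call` are hypothetical names introduced for illustration; the two stand-in coroutines take the place of real `sample_async()` calls and do not touch the network.

```python
import asyncio
import time


async def probe(name, make_call, timeout_s):
    """Await one call with a deadline; return (name, status, elapsed seconds)."""
    start = time.perf_counter()
    try:
        await asyncio.wait_for(make_call(), timeout=timeout_s)
        status = "ok"
    except asyncio.TimeoutError:
        status = "timeout"
    return name, status, time.perf_counter() - start


async def probe_all(calls, timeout_s):
    """Probe every (name, make_call) pair concurrently, preserving order."""
    return await asyncio.gather(*(probe(n, c, timeout_s) for n, c in calls))


# Stand-ins for the real sampling calls (hypothetical, no network involved).
async def fast_call():
    await asyncio.sleep(0.01)


async def hanging_call():
    await asyncio.Event().wait()  # the event is never set, so this never completes


if __name__ == "__main__":
    results = asyncio.run(
        probe_all([("base", fast_call), ("lora", hanging_call)], timeout_s=0.2)
    )
    for name, status, elapsed in results:
        print(f"{name}: {status} after {elapsed:.2f}s")
```

To run it against real checkpoints, each stand-in would be replaced by a zero-argument function that returns the corresponding `sample_async()` awaitable, with `timeout_s` raised to match the timeout under test.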

## Call stack at timeout
```
sample_async() → _sample_async_impl() → _APIFuture.result_async()
  → asyncio.wait_for(future, timeout=60) → CancelledError
```

The server returns no response at all, not even a 400/404/500 error. The request appears to be queued server-side and never processed.
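The `CancelledError` in the trace is consistent with how plain `asyncio.wait_for` behaves when its deadline passes: the awaited coroutine is cancelled, and the caller then receives a timeout error. The sketch below shows only that standard-library behavior, not the SDK's internals; `never_answers` and `demo` are hypothetical names, and the never-set event stands in for a request that gets no reply.

```python
import asyncio


async def never_answers(seen):
    """Stand-in for a request that never gets a reply (hypothetical)."""
    try:
        await asyncio.Event().wait()  # never set, so this waits forever
    except asyncio.CancelledError:
        seen.append("CancelledError")  # raised inside the awaited coroutine
        raise  # re-raise so the cancellation completes normally


async def demo(timeout_s=0.1):
    seen = []
    try:
        await asyncio.wait_for(never_answers(seen), timeout=timeout_s)
    except asyncio.TimeoutError:
        seen.append("TimeoutError")  # raised to the caller of wait_for
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))  # ['CancelledError', 'TimeoutError']
```

Under this reading, the exception type only says that the client gave up waiting; it carries no information about why the server never answered.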

## Related Issues
- GitHub Issue #234 ("Stuck when sampling") was closed on 2026-03-27 by @YujiaBao with the comment "Closing for now — if still experiencing, please reopen." That issue described a similar permanent hang in `sample_async()` (7200s in that case) in a CUA RL workflow. The underlying problem does not appear to be resolved.

## Impact
All inference via LoRA checkpoints is blocked. DPO/SFT training works fine and checkpoints save successfully, but sampling from trained checkpoints is completely unusable.
