auto-route candidate enumeration collapses multi-tier subscription harnesses to DefaultModel only (claude → opus-4.7 always wins, sonnet never scored)

## Summary

Under `policy=default` (the auto-route mode used when no `--harness` / `--provider` / `--model` is pinned), Fizeau collapses each subprocess harness to **one** scoring candidate — its `DefaultModel`. Multi-tier harnesses like `claude` (which advertises sonnet, opus, haiku as siblings under one auth/subscription) therefore only ever offer their default tier to the auto-route scorer. Sonnet and Haiku can never win on cost vs Opus even though the catalog has full power/cost metadata for all three.

Concrete symptom from a downstream caller (DDx): every `policy=default` dispatch lands on `claude/opus-4.7` because `registry.go:51` sets `DefaultModel = "opus-4.7"`. With a 2207-token implementation prompt at role=implementer, opus won with a score of `100.5`. When the same dispatch is rerun with `--model sonnet-4.6` pinned, sonnet scores **196.0** (on the OpenRouter route the pin produced; see Evidence), so it would have beaten opus by 95.5 points had it been enumerated.

## Evidence

Repro: a downstream `ddx work` dispatch with `policy=default` against the live catalog.

**Run 1 (no model pin, `policy=default`):** the `routing_decision` event includes exactly one claude candidate:

```json
{
  "harness": "claude",
  "model": "opus-4.7",
  "score": 100.5,
  "eligible": true,
  "score_components": {
    "base": 100, "context_headroom": 30, "cost": -22.5,
    "performance": -5, "power": -2, "quota_health": 5,
    "utilization": -5
  }
}
```

No row for sonnet — not even `eligible: false` with a `filter_reason`. Silent drop.
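As a sanity check on the opus row, the reported `score` matches the plain sum of its `score_components`. The sketch below assumes the scorer is purely additive, which the numbers are consistent with but which I have not confirmed in the source; `sumComponents` and `opusComponents` are names made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// opusComponents is the score_components object copied from the event above.
const opusComponents = `{
  "base": 100, "context_headroom": 30, "cost": -22.5,
  "performance": -5, "power": -2, "quota_health": 5,
  "utilization": -5
}`

// sumComponents adds up every numeric component in the JSON object.
// Assumption: the final score is a plain sum of these components.
func sumComponents(raw string) (float64, error) {
	var components map[string]float64
	if err := json.Unmarshal([]byte(raw), &components); err != nil {
		return 0, err
	}
	total := 0.0
	for _, v := range components {
		total += v
	}
	return total, nil
}

func main() {
	total, err := sumComponents(opusComponents)
	if err != nil {
		panic(err)
	}
	fmt.Printf("sum of components: %.1f\n", total) // prints "sum of components: 100.5"
}
```

The `cost` component (-22.5) is the largest penalty on the opus row, which is where a cheaper tier would be expected to gain most of its advantage.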

**Run 2 (same prompt, `--model sonnet-4.6` pinned):**

```
17:33:25 readiness route fiz/anthropic/claude-sonnet-4.6
         provider=openrouter reason=policy=default; score=196.0
```

Sonnet scores 196.0. So the cost-aware scorer works correctly when it can see sonnet. The pin also exposed a second bug: Fizeau routes the pinned sonnet through `fiz/openrouter` (the catalog's `openrouter_id`), **not** through the `claude` subscription harness — so pinning by model name does not produce a same-harness alternative tier even when one exists on the same auth/subscription.

## Root cause (tentative — pointers, not a patch)

The auto-route enumeration at `internal/harnesses/registry.go:51-52` and `service_routing.go:1172-1189` (v0.12.2 module path: `github.com/easel/fizeau/internal/harnesses/registry.go`, `service_routing.go`) looks roughly like this:

```go
// service_routing.go:1172-1189 (auto-route candidate add)
for _, h := range entries {
    if h.DefaultModel != "" {
        add(h.DefaultModel, true, status)      // adds opus-4.7 for claude
    }
    for _, modelID := range h.SupportedModels {
        add(modelID, true, status)              // SHOULD add sonnet-4.6 too
    }
    // ...
}
```

`subprocessHarnessModelIDs("claude", cfg)` at `service_models.go:91-101` returns the full `["sonnet", "sonnet-4.6", "opus", "opus-4.7", "claude-sonnet-4-6"]` set. So sonnet *should* be reaching the candidate pool. The evidence says it doesn't — either:

- The downstream aggregation/dedup is keeping one model per `{harness, provider}` key (picking the DefaultModel as representative), or
- The eligibility map keyed by `modelID` at line ~1170 is collapsing sonnet entries by family before the candidate list is emitted, or
- `h.SupportedModels` is being populated empty at the call site that builds `entries` (different from the `metadata_billing.go:71` path).

I didn't read deeply enough into `routing.Inputs` construction to pin down which. The `routing_decision` evidence is unambiguous, though: **no sonnet row at all** in the candidates array, not even excluded. A row with `eligible: false, filter_reason: ...` would be diagnostic; silent absence is consistent with "never enumerated."
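To make the first hypothesis concrete, here is a minimal sketch of a dedup step that would produce exactly this symptom. It is not Fizeau's code: `candidate`, `groupKey`, `collapseByHarness`, and the `"subscription"` provider value are all hypothetical, chosen only to show how keying by `{harness, provider}` drops sibling tiers without leaving an excluded row behind.

```go
package main

import "fmt"

// candidate is a hypothetical stand-in for a routing candidate.
type candidate struct {
	Harness   string
	Provider  string
	Model     string
	IsDefault bool
}

// groupKey is the suspected dedup key: one slot per {harness, provider}.
type groupKey struct {
	Harness  string
	Provider string
}

// collapseByHarness keeps a single candidate per groupKey, preferring the
// harness default. Every other tier vanishes with no filter_reason.
func collapseByHarness(in []candidate) []candidate {
	index := map[groupKey]int{}
	var out []candidate
	for _, c := range in {
		k := groupKey{Harness: c.Harness, Provider: c.Provider}
		i, seen := index[k]
		if !seen {
			index[k] = len(out)
			out = append(out, c)
			continue
		}
		if c.IsDefault && !out[i].IsDefault {
			out[i] = c // the default replaces whichever sibling came first
		}
	}
	return out
}

func main() {
	in := []candidate{
		{Harness: "claude", Provider: "subscription", Model: "sonnet-4.6"},
		{Harness: "claude", Provider: "subscription", Model: "opus-4.7", IsDefault: true},
	}
	for _, c := range collapseByHarness(in) {
		fmt.Println(c.Harness, c.Model) // prints "claude opus-4.7"
	}
}
```

If something of this shape exists downstream of the enumeration loop, widening the key to include the model ID, or emitting the dropped siblings as `eligible: false` rows, would at least make the collapse visible in `routing_decision`.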

## Catalog is **not** the gap

I checked the embedded manifest at `internal/modelcatalog/catalog/models.yaml`:

```yaml
sonnet-4.6:
  family: claude-sonnet
  power: 8
  cost_input_per_m: 3.0
  cost_output_per_m: 15.0
  context_window: 1000000
  surfaces:
    agent.anthropic: sonnet-4.6
    claude-code: sonnet-4.6
```

Complete. Power is set (8 vs opus 10), sonnet costs one fifth as much as opus ($3/$15 vs $15/$75 per million input/output tokens), and the `claude-code` surface matches. This rules out the `power_missing` exclusion path (the one that drops `openrouter/anthropic/claude-haiku-4.5` with `filter_reason: "power_missing"`).

## Why this matters

The whole point of cost-aware auto-routing is to let cheap models do cheap work and reserve expensive models for hard work. Today `policy=default` on a multi-tier subscription harness silently picks the harness's `DefaultModel` every time, which for `claude` is the most expensive tier:

- 2207-token implementation prompt → opus (would be sonnet if enumerated)
- short status-check prompt → opus (would be haiku if enumerated)

The cost gap is large: at catalog prices opus is 5× sonnet on both input ($15 vs $3 per million tokens) and output ($75 vs $15). For a project running `ddx work` continuously, defaulting every dispatch to opus is materially wrong.
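For a sense of scale, the sketch below prices the 2207-token prompt at the catalog list prices quoted in this issue. It counts input tokens only, since I have no output token count for that dispatch, and it assumes catalog list prices are what the scorer sees; `price` and `costUSD` are illustrative names, not Fizeau types.

```go
package main

import "fmt"

// price holds catalog list prices in USD per million tokens.
type price struct {
	InputPerM  float64
	OutputPerM float64
}

// costUSD returns the list-price cost of a single request.
func costUSD(p price, inputTokens, outputTokens int) float64 {
	return float64(inputTokens)/1e6*p.InputPerM +
		float64(outputTokens)/1e6*p.OutputPerM
}

func main() {
	sonnet := price{InputPerM: 3.0, OutputPerM: 15.0} // from the models.yaml excerpt
	opus := price{InputPerM: 15.0, OutputPerM: 75.0}  // opus prices as quoted in this issue

	const promptTokens = 2207 // the implementation prompt from the Summary
	s := costUSD(sonnet, promptTokens, 0)
	o := costUSD(opus, promptTokens, 0)
	fmt.Printf("sonnet $%.6f, opus $%.6f, ratio %.1fx\n", s, o, o/s)
}
```

Per request the absolute numbers are fractions of a cent, but the ratio holds for every dispatch, so the gap scales linearly with how often `ddx work` runs.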

## Repro

1. Configure Fizeau with the `claude` harness (subscription path, no model pin).
2. Send any execute request with `policy=default`, no `--harness`/`--provider`/`--model`.
3. Inspect the `routing_decision` event — only `claude/opus-4.7` appears under `claude`. No sonnet, no haiku.
4. Resend the same prompt with `--model sonnet-4.6`. The route goes to `fiz/openrouter/anthropic/claude-sonnet-4.6` with score 196.0, not to the claude subscription path.
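Step 3 can be automated with a small check over the captured event. This sketch assumes the event carries a top-level `candidates` array whose rows have the `harness` and `model` fields shown in the Evidence section; the wrapper shape and the `missingModels` helper are my assumptions, not a documented Fizeau schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// routingDecision models only the fields this check needs.
// Assumption: candidates live under a top-level "candidates" key.
type routingDecision struct {
	Candidates []struct {
		Harness string `json:"harness"`
		Model   string `json:"model"`
	} `json:"candidates"`
}

// missingModels reports which of the wanted models have no row at all
// (eligible or not) under the given harness.
func missingModels(event []byte, harness string, want []string) ([]string, error) {
	var d routingDecision
	if err := json.Unmarshal(event, &d); err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	for _, c := range d.Candidates {
		if c.Harness == harness {
			seen[c.Model] = true
		}
	}
	var missing []string
	for _, m := range want {
		if !seen[m] {
			missing = append(missing, m)
		}
	}
	return missing, nil
}

func main() {
	event := []byte(`{"candidates":[
		{"harness":"claude","model":"opus-4.7","score":100.5,"eligible":true}
	]}`)
	missing, err := missingModels(event, "claude", []string{"opus-4.7", "sonnet-4.6"})
	if err != nil {
		panic(err)
	}
	fmt.Println(missing) // prints "[sonnet-4.6]"
}
```

A non-empty result under `policy=default` reproduces the bug; after a fix, every tier should show up either as a scored row or as an `eligible: false` row with a `filter_reason`.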

## Suggested fix direction (not prescriptive)

For multi-tier subscription harnesses, enumerate one candidate per tier (opus, sonnet, haiku) with that tier's catalog power/cost, all routing through the same harness. Let cost-aware scoring pick the cheapest tier that meets `power_hint_fit` for the prompt. Today the harness behaves like "one model, take it or leave it"; it should behave like "one auth, multiple tiers."
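A minimal sketch of the "one auth, multiple tiers" behavior, under heavy assumptions: `tier` and `cheapestFit` are hypothetical, the power and cost values are the catalog numbers quoted above, and the hard `minPower` threshold stands in for whatever `power_hint_fit` really does inside the weighted scorer.

```go
package main

import "fmt"

// tier is a hypothetical per-tier candidate; every tier of a harness
// shares the same auth/subscription and routes through that harness.
type tier struct {
	Harness       string
	Model         string
	Power         int
	CostInputPerM float64
}

// cheapestFit returns the lowest-cost tier whose power is at least minPower.
// The second result is false when no tier is powerful enough.
func cheapestFit(tiers []tier, minPower int) (tier, bool) {
	var best tier
	found := false
	for _, t := range tiers {
		if t.Power < minPower {
			continue
		}
		if !found || t.CostInputPerM < best.CostInputPerM {
			best, found = t, true
		}
	}
	return best, found
}

func main() {
	claude := []tier{
		{Harness: "claude", Model: "opus-4.7", Power: 10, CostInputPerM: 15.0},
		{Harness: "claude", Model: "sonnet-4.6", Power: 8, CostInputPerM: 3.0},
	}
	for _, minPower := range []int{8, 9} {
		if t, ok := cheapestFit(claude, minPower); ok {
			fmt.Printf("minPower=%d -> %s/%s\n", minPower, t.Harness, t.Model)
		}
	}
}
```

With these inputs a power requirement of 8 selects sonnet and a requirement of 9 falls through to opus. The real scorer weighs more than power and cost, so this only illustrates the enumeration shape, not the scoring.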

If aggregation-by-harness is intentional for some other reason, the alternative is to make the harness configurable so that it exposes a separate `DefaultModel` per role or power band, and have DDx pass a power hint that selects the right one.

## Caller

DDx CLI v? — see `https://github.com/erik-labianca/ddx` (or wherever appropriate). The DDx side does not pre-resolve routing knobs (per CONTRACT-003 / FEAT-010) and passes `policy=default` through verbatim.


