Summary
Under policy=default (the auto-route mode used when no --harness / --provider / --model is pinned), Fizeau collapses each subprocess harness to one scoring candidate — its DefaultModel. Multi-tier harnesses like claude (which advertises sonnet, opus, haiku as siblings under one auth/subscription) therefore only ever offer their default tier to the auto-route scorer. Sonnet and Haiku can never win on cost vs Opus even though the catalog has full power/cost metadata for all three.
Concrete symptom from a downstream caller (DDx): every policy=default dispatch lands on claude/opus-4.7 because registry.go:51 sets DefaultModel = "opus-4.7". With a 2207-token implementation prompt at role=implementer, Opus won with a score of 100.5. When the same dispatch is rerun with --model sonnet-4.6 pinned, sonnet scores 196.0, so it would have beaten opus by 95.5 points if it had been enumerated.
Evidence
Repro: a downstream ddx work dispatch with policy=default against the live catalog.
Run 1 (no model pin, policy=default): the routing_decision event includes exactly one claude candidate:
{
  "harness": "claude",
  "model": "opus-4.7",
  "score": 100.5,
  "eligible": true,
  "score_components": {
    "base": 100, "context_headroom": 30, "cost": -22.5,
    "performance": -5, "power": -2, "quota_health": 5,
    "utilization": -5
  }
}
No row for sonnet — not even eligible: false with a filter_reason. Silent drop.
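As a sanity check on the row above, the listed components add up to the reported score. The sketch below assumes the scorer is purely additive (an assumption, not confirmed from the Fizeau source); the `candidate` struct and helper names are hypothetical and mirror only the fields visible in the event.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// candidate mirrors only the fields visible in the routing_decision row.
// Hypothetical type for illustration, not Fizeau's own struct.
type candidate struct {
	Harness         string             `json:"harness"`
	Model           string             `json:"model"`
	Score           float64            `json:"score"`
	Eligible        bool               `json:"eligible"`
	ScoreComponents map[string]float64 `json:"score_components"`
}

// run1Row is the claude candidate copied from Run 1.
const run1Row = `{
  "harness": "claude",
  "model": "opus-4.7",
  "score": 100.5,
  "eligible": true,
  "score_components": {
    "base": 100, "context_headroom": 30, "cost": -22.5,
    "performance": -5, "power": -2, "quota_health": 5,
    "utilization": -5
  }
}`

// parseCandidate decodes one candidate row.
func parseCandidate(raw string) (candidate, error) {
	var c candidate
	err := json.Unmarshal([]byte(raw), &c)
	return c, err
}

// componentSum adds every score component (assumes an additive scorer).
func componentSum(c candidate) float64 {
	total := 0.0
	for _, v := range c.ScoreComponents {
		total += v
	}
	return total
}

func main() {
	c, err := parseCandidate(run1Row)
	if err != nil {
		panic(err)
	}
	// prints: claude/opus-4.7 reported=100.5 summed=100.5
	fmt.Printf("%s/%s reported=%.1f summed=%.1f\n", c.Harness, c.Model, c.Score, componentSum(c))
}
```

The cost component (-22.5) is the largest penalty in the row, which fits the expectation that a cheaper tier would score higher if it were enumerated.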
Run 2 (same prompt, --model sonnet-4.6 pinned):
17:33:25 readiness route fiz/anthropic/claude-sonnet-4.6
provider=openrouter reason=policy=default; score=196.0
Sonnet scores 196.0. So the cost-aware scorer works correctly when it can see sonnet. The pin also exposed a second bug: Fizeau routes the pinned sonnet through fiz/openrouter (the catalog's openrouter_id), not through the claude subscription harness — so pinning by model name does not produce a same-harness alternative tier even when one exists on the same auth/subscription.
Root cause (tentative — pointers, not a patch)
The auto-route enumeration at internal/harnesses/registry.go:51-52 and service_routing.go:1172-1189 (v0.12.2 module path: github.com/easel/fizeau/internal/harnesses/registry.go, service_routing.go) goes:
// service_routing.go:1172-1189 (auto-route candidate add)
for _, h := range entries {
    if h.DefaultModel != "" {
        add(h.DefaultModel, true, status) // adds opus-4.7 for claude
    }
    for _, modelID := range h.SupportedModels {
        add(modelID, true, status) // SHOULD add sonnet-4.6 too
    }
    ...
}
subprocessHarnessModelIDs("claude", cfg) at service_models.go:91-101 returns the full ["sonnet", "sonnet-4.6", "opus", "opus-4.7", "claude-sonnet-4-6"] set. So sonnet should be reaching the candidate pool. The evidence says it doesn't — either:
- The downstream aggregation/dedup is keeping one model per {harness, provider} key (picking the DefaultModel as representative), or
- The eligibility map keyed by modelID at line ~1170 is collapsing sonnet entries by family before the candidate list is emitted, or
- h.SupportedModels is being populated empty at the call site that builds entries (different from the metadata_billing.go:71 path).
I didn't read deeply enough into routing.Inputs construction to pin down which. The routing_decision evidence is unambiguous, though: no sonnet row at all in the candidates array, not even excluded. A row with eligible: false, filter_reason: ... would be diagnostic; silent absence is consistent with "never enumerated."
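To make the first hypothesis concrete, here is a minimal sketch of how a dedup keyed by {harness, provider} would silently drop sonnet when the DefaultModel is added first. Everything here is hypothetical (the `entry` and `cand` types, `enumerate`, the key functions, and the "subscription" provider label); it illustrates the failure mode and is not Fizeau's actual code.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for a harness registry entry.
type entry struct {
	Name            string
	Provider        string
	DefaultModel    string
	SupportedModels []string
}

// cand is a hypothetical routing candidate.
type cand struct {
	Harness, Provider, Model string
}

// enumerate adds DefaultModel first, then SupportedModels, and keeps only
// the first candidate seen for each key produced by keyFn.
func enumerate(entries []entry, keyFn func(cand) string) []cand {
	seen := map[string]bool{}
	var out []cand
	add := func(h entry, model string) {
		c := cand{Harness: h.Name, Provider: h.Provider, Model: model}
		k := keyFn(c)
		if seen[k] {
			return // dropped without any eligible=false row
		}
		seen[k] = true
		out = append(out, c)
	}
	for _, h := range entries {
		if h.DefaultModel != "" {
			add(h, h.DefaultModel)
		}
		for _, m := range h.SupportedModels {
			add(h, m)
		}
	}
	return out
}

// byHarnessProvider collapses all tiers of a harness into one key.
func byHarnessProvider(c cand) string { return c.Harness + "|" + c.Provider }

// byHarnessModel keeps one candidate per tier.
func byHarnessModel(c cand) string { return c.Harness + "|" + c.Model }

// sampleEntries uses a placeholder provider label; the real value is unknown.
func sampleEntries() []entry {
	return []entry{{
		Name:            "claude",
		Provider:        "subscription",
		DefaultModel:    "opus-4.7",
		SupportedModels: []string{"sonnet-4.6", "opus-4.7"},
	}}
}

func main() {
	fmt.Println(len(enumerate(sampleEntries(), byHarnessProvider))) // prints 1 (opus-4.7 only)
	fmt.Println(len(enumerate(sampleEntries(), byHarnessModel)))    // prints 2 (opus-4.7, sonnet-4.6)
}
```

If the real pipeline behaves like the first key function anywhere between enumeration and the emitted candidates array, the observed output (a single opus-4.7 row, no excluded rows) is what you would expect.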
Catalog is not the gap
I checked the embedded manifest at internal/modelcatalog/catalog/models.yaml:
sonnet-4.6:
  family: claude-sonnet
  power: 8
  cost_input_per_m: 3.0
  cost_output_per_m: 15.0
  context_window: 1000000
  surfaces:
    agent.anthropic: sonnet-4.6
    claude-code: sonnet-4.6
Complete. Power is set (8 vs opus 10), cost is 5x cheaper than opus ($3/$15 vs $15/$75), the claude-code surface matches. This rules out the power_missing exclusion path (the one that drops openrouter/anthropic/claude-haiku-4.5 with filter_reason: "power_missing").
Why this matters
The whole point of cost-aware auto-routing is to let cheap models do cheap work and reserve expensive models for hard work. Today policy=default on a multi-tier subscription harness silently always picks the most expensive tier:
- 2207-token implementation prompt → opus (would be sonnet if enumerated)
- short status-check prompt → opus (would be haiku if enumerated)
The cost gap is large: opus is 5× sonnet on both input ($15 vs $3 per million tokens) and output ($75 vs $15 per million tokens). For a project running ddx work continuously, defaulting every dispatch to opus is materially wrong.
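For a sense of scale, the sketch below computes the input-side cost of the 2207-token prompt on each tier, using the per-million prices quoted above. Output cost is left out because the response length is not known from the evidence. The `price` type and `inputCostUSD` helper are illustrative names only.

```go
package main

import "fmt"

// price holds USD cost per million tokens, as listed in the catalog.
type price struct {
	InputPerM  float64
	OutputPerM float64
}

var (
	sonnet = price{InputPerM: 3.0, OutputPerM: 15.0}
	opus   = price{InputPerM: 15.0, OutputPerM: 75.0}
)

// inputCostUSD returns the input-token cost of one request.
func inputCostUSD(tokens int, p price) float64 {
	return float64(tokens) * p.InputPerM / 1e6
}

func main() {
	const promptTokens = 2207
	s := inputCostUSD(promptTokens, sonnet)
	o := inputCostUSD(promptTokens, opus)
	fmt.Printf("sonnet input: $%.6f\n", s)
	fmt.Printf("opus input:   $%.6f\n", o)
	fmt.Printf("input ratio:  %.1fx, output ratio: %.1fx\n",
		opus.InputPerM/sonnet.InputPerM, opus.OutputPerM/sonnet.OutputPerM)
}
```

Per dispatch the absolute difference is small, but the 5× multiplier applies to every request in a continuously running workload.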
Repro
- Configure Fizeau with the claude harness (subscription path, no model pin).
- Send any execute request with policy=default, no --harness/--provider/--model.
- Inspect the routing_decision event: only claude/opus-4.7 appears under claude. No sonnet, no haiku.
- Resend the same prompt with --model sonnet-4.6. The route goes to fiz/openrouter/anthropic/claude-sonnet-4.6 with score 196.0, not to the claude subscription path.
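The inspection step can be automated with a small check like the one below. It assumes the routing_decision event carries a top-level "candidates" array whose rows look like the one shown under Evidence; that wrapper field name and the `missingModels` helper are assumptions for illustration, not a documented Fizeau schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// routingDecision models only the fields this check needs.
// The "candidates" wrapper name is an assumption.
type routingDecision struct {
	Candidates []struct {
		Harness      string `json:"harness"`
		Model        string `json:"model"`
		Eligible     bool   `json:"eligible"`
		FilterReason string `json:"filter_reason"`
	} `json:"candidates"`
}

// missingModels returns the models in want that have no row at all
// (neither eligible nor excluded) for the given harness.
func missingModels(raw []byte, harness string, want []string) ([]string, error) {
	var ev routingDecision
	if err := json.Unmarshal(raw, &ev); err != nil {
		return nil, err
	}
	present := map[string]bool{}
	for _, c := range ev.Candidates {
		if c.Harness == harness {
			present[c.Model] = true
		}
	}
	var missing []string
	for _, m := range want {
		if !present[m] {
			missing = append(missing, m)
		}
	}
	return missing, nil
}

// sampleEvent reproduces the Run 1 shape: a single claude row.
const sampleEvent = `{"candidates":[
  {"harness":"claude","model":"opus-4.7","score":100.5,"eligible":true}
]}`

func main() {
	missing, err := missingModels([]byte(sampleEvent), "claude", []string{"opus-4.7", "sonnet-4.6"})
	if err != nil {
		panic(err)
	}
	fmt.Println("never enumerated:", missing) // prints: never enumerated: [sonnet-4.6]
}
```

A model that shows up with eligible: false would not be reported here, which is the distinction that matters: excluded rows are diagnosable, absent rows are not.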
Suggested fix direction (not prescriptive)
For multi-tier subscription harnesses, enumerate one candidate per tier (opus, sonnet, haiku) with that tier's catalog power/cost, all routing through the same harness. Let cost-aware scoring pick the cheapest tier that meets power_hint_fit for the prompt. Today the harness behaves like "one model, take it or leave it"; it should behave like "one auth, multiple tiers."
If aggregation-by-harness is intentional for some other reason, the alternative is to make the harness configurable to expose multiple DefaultModels per role/power-band, and have ddx pass a power hint that selects the right one.
Caller
DDx CLI v? — see https://github.com/erik-labianca/ddx (or wherever appropriate). The DDx side does not pre-resolve routing knobs (per CONTRACT-003 / FEAT-010) and passes policy=default through verbatim.