Commit f2e2d65

CreatmanCEO and claude committed
fix: latency-based routing to prefer Haiku over unreliable Gemini fallback
simple-shuffle randomly picked Gemini (503 overloaded) before trying Haiku.
latency-based-routing prefers the fastest responding model, with proper fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent a51dfd9 commit f2e2d65

1 file changed

Lines changed: 4 additions & 3 deletions

backend/services/llm_router.py

@@ -61,9 +61,10 @@ def create_router() -> Router:
 
     return Router(
         model_list=model_list,
-        routing_strategy="simple-shuffle",
-        num_retries=2,
-        timeout=30,
+        routing_strategy="latency-based-routing",
+        num_retries=3,
+        timeout=45,
+        allowed_fails=1,
     )
 
 
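To illustrate the behavior the commit message describes, here is a minimal standalone sketch of latency-based selection with a failure budget. It is not the router library's implementation. The `Deployment` class, `pick`, `call_with_retries`, and the `send` callback are hypothetical names introduced for illustration, and the rule that a deployment is skipped once its failure count exceeds `allowed_fails` is an assumption about how that setting is meant to work.

```python
from dataclasses import dataclass, field


@dataclass
class Deployment:
    """Hypothetical record of one model deployment's observed behavior."""
    name: str
    latencies: list = field(default_factory=list)  # seconds, successful calls only
    fails: int = 0

    def avg_latency(self) -> float:
        # Untried deployments report 0.0 so each one gets sampled at least once.
        return sum(self.latencies) / len(self.latencies) if self.latencies else 0.0


def pick(deployments, allowed_fails):
    """Return the lowest-latency deployment whose failures are within budget."""
    # Assumption: a deployment is skipped once fails exceeds allowed_fails.
    healthy = [d for d in deployments if d.fails <= allowed_fails]
    if not healthy:
        raise RuntimeError("all deployments are cooling down")
    return min(healthy, key=lambda d: d.avg_latency())


def call_with_retries(deployments, send, num_retries, allowed_fails):
    """Make up to 1 + num_retries attempts, re-picking after every failure."""
    last_error = None
    for _ in range(1 + num_retries):
        target = pick(deployments, allowed_fails)
        try:
            reply, seconds = send(target.name)  # send returns (reply, latency)
        except Exception as exc:
            target.fails += 1
            last_error = exc
            continue
        target.latencies.append(seconds)
        return target.name, reply
    raise RuntimeError("retries exhausted") from last_error
```

With `allowed_fails=1`, a deployment that keeps returning 503 is dropped from selection after its second failure, so a request with `num_retries=3` still has attempts left to reach the healthy model. A random shuffle has no such memory and can keep landing on the failing deployment.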
