Commit 4d4e1a9

fregataa and claude committed
fix: populate replica info for hc-opt-out routes in check_running_routes
The hc_pairs filter was gating ``routes_missing_replica`` as well, so revisions that omit ``service.health_check`` never had their ``replica_host``/``replica_port`` populated. ``sync_appproxy`` would then reject them indefinitely with "no replica connection info to sync", breaking the BA-5985 promise that opt-out routes start receiving traffic as soon as they reach RUNNING.

Split the two phases:

- Replica population now runs over all session-verified successes, hc-agnostic, since AppProxy registration needs host/port regardless of probing policy.
- RouteHealthRecord initialization stays hc-gated: opt-out revisions never get probed, so they don't need a Valkey record.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 5ce60f7 commit 4d4e1a9
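
The bug described in the commit message can be reproduced in miniature. The sketch below is not the project's code: `Route`, `missing_replica_before`, `missing_replica_after`, and the shape of `hc_configs` are hypothetical stand-ins, assuming only that a route carries a `revision_id` and an optional `replica_host`/`replica_port`. It compares which routes each version of the filter would pass on for replica population.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    """Hypothetical stand-in for RouteData; only the fields the commit mentions."""
    revision_id: str
    replica_host: Optional[str] = None
    replica_port: Optional[int] = None


def missing_replica_before(successes, hc_configs):
    # Old behaviour: the health-check filter ran first, so routes whose
    # revision has no health-check config never reached replica population.
    hc_routes = [r for r in successes if hc_configs.get(r.revision_id) is not None]
    return [r for r in hc_routes if not r.replica_host]


def missing_replica_after(successes):
    # New behaviour: every verified success is considered, with or without
    # a health-check config.
    return [r for r in successes if not r.replica_host]


successes = [Route("rev-with-hc"), Route("rev-opt-out")]
hc_configs = {"rev-with-hc": {"path": "/health"}}  # made-up config shape

before = [r.revision_id for r in missing_replica_before(successes, hc_configs)]
after = [r.revision_id for r in missing_replica_after(successes)]
print(before)  # ['rev-with-hc']
print(after)   # ['rev-with-hc', 'rev-opt-out']
```

With the old filter the opt-out revision is never selected, which matches the symptom in the message: its host and port stay empty and a later sync step has nothing to register.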

1 file changed

Lines changed: 16 additions & 12 deletions

src/ai/backend/manager/sokovan/deployment/route/executor.py

@@ -255,24 +255,28 @@ async def check_running_routes(self, routes: Sequence[RouteData]) -> RouteExecut
         if not successes:
             return RouteExecutionResult(successes=successes, errors=errors)

-        hc_configs = await self._deployment_repo.fetch_health_check_configs({
-            r.revision_id for r in successes
-        })
-        hc_pairs: list[_RouteWithHealthCheck] = [
-            _RouteWithHealthCheck(route=r, health_check=hc)
-            for r in successes
-            if (hc := hc_configs.get(r.revision_id)) is not None
-        ]
-
+        # Replica population is hc-agnostic — opt-out routes still need
+        # host/port for ``sync_appproxy`` to register them with AppProxy.
         # ``_populate_replica_info`` mutates ``replica_host``/``replica_port``
-        # in place so the rows it just populated flow into the Phase-4 filter.
-        routes_missing_replica = [p.route for p in hc_pairs if not p.route.replica_host]
+        # in place so populated rows flow into the hc-pair filter below.
+        routes_missing_replica = [r for r in successes if not r.replica_host]
         if routes_missing_replica:
             with RouteRecorderContext.shared_phase("populate_replica_info"):
                 with RouteRecorderContext.shared_step("fetch_kernel_connection_info"):
                     await self._populate_replica_info(routes_missing_replica)

-        pairs_with_replica = [p for p in hc_pairs if p.route.replica_host and p.route.replica_port]
+        # Only revisions that declared ``service.health_check`` get a
+        # RouteHealthRecord in Valkey; opt-out revisions never get probed.
+        hc_configs = await self._deployment_repo.fetch_health_check_configs({
+            r.revision_id for r in successes
+        })
+        pairs_with_replica: list[_RouteWithHealthCheck] = [
+            _RouteWithHealthCheck(route=r, health_check=hc)
+            for r in successes
+            if r.replica_host
+            and r.replica_port
+            and (hc := hc_configs.get(r.revision_id)) is not None
+        ]
         if pairs_with_replica:
             await self._ensure_health_records(pairs_with_replica)

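The post-commit ordering (populate replica info for every success, then select only health-checked routes for record creation) can be checked with a small regression-style sketch. Everything here is an assumption made for illustration: `Route`, the stubbed `populate_replica_info`, the free-function form of `check_running_routes`, and the hard-coded host/port are hypothetical and do not reflect the real repository or executor interfaces.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    """Hypothetical stand-in for RouteData."""
    revision_id: str
    replica_host: Optional[str] = None
    replica_port: Optional[int] = None


async def populate_replica_info(routes):
    # Stub: the real method would fetch kernel connection info.
    # It mutates the routes in place, as the diff's comment describes.
    for r in routes:
        r.replica_host, r.replica_port = "10.0.0.1", 8080


async def check_running_routes(successes, hc_configs):
    # Phase 1: hc-agnostic replica population over all successes.
    missing = [r for r in successes if not r.replica_host]
    if missing:
        await populate_replica_info(missing)
    # Phase 2: hc-gated selection; only routes that have a host, a port,
    # and a health-check config are returned for record creation.
    return [
        (r, hc)
        for r in successes
        if r.replica_host
        and r.replica_port
        and (hc := hc_configs.get(r.revision_id)) is not None
    ]


routes = [Route("rev-with-hc"), Route("rev-opt-out")]
pairs = asyncio.run(check_running_routes(routes, {"rev-with-hc": {"path": "/health"}}))

populated = [r.revision_id for r in routes if r.replica_host and r.replica_port]
record_ids = [r.revision_id for r, _ in pairs]
print(populated)   # ['rev-with-hc', 'rev-opt-out']
print(record_ids)  # ['rev-with-hc']
```

Both routes end up with connection info, but only the revision that declares a health check is selected for a health record, which is the split the commit message describes.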