fix(provision): honour caller deadline in postgres/mongo/queue k8s Provision (#53)
The redis k8s backend was fixed for the pro-provision-hang bug (#52), but the postgres, mongo, and queue backends still derived their provisioning context from context.Background(), discarding the caller's gRPC deadline. When the api caller's deadline fired (or the caller cancelled the RPC), the provisioner kept blocking for up to 5m on a wedged PVC/CSI attach and the api handler hung. This is the same class of bug that drove the e2e-prod flakiness, affecting db/nosql/queue instead of cache.
Each backend now derives provCtx from the incoming ctx with a 5m ceiling as a backstop (the effective deadline is the earlier of the two), mirroring provisionContext in redis/k8s.go. The api grants a generous provision deadline (provisionTimeout: 4m anon / 5m pro), so legitimate 30-90s pod startup is unaffected; only pathological hangs and early cancellations now fail fast. waitPodReady already checks ctx.Err() on each poll, and the shared server.mapError already maps context.DeadlineExceeded/Canceled to retryable gRPC codes, so no server change is needed. The rollback namespace-delete is now also bounded by a fresh 30s context derived from context.Background() (redis parity): cleanup still runs when the incoming ctx is already cancelled, and a wedged apiserver cannot pin the goroutine indefinitely.
Tests: TestProvision_HonoursCallerDeadline per backend: with a 300ms caller ctx, Provision fails fast (<30s) with an error wrapping context.DeadlineExceeded instead of blocking until the pod-ready ceiling. make gate is green.
Co-authored-by: Manas Srivastava <[email protected]>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>