Commit da495a1

feat(jobs): propagation_runner — eager retry consumer for pending_propagations
Adds the worker-side consumer of the new pending_propagations queue (api repo migration 058). Every 30s it picks eligible rows under FOR UPDATE SKIP LOCKED, dispatches by kind, and:

- success → applied_at = now() + propagation.applied audit row (INFO)
- failure → attempts++, next_attempt_at = now() + exp_backoff, last_error persisted, propagation.retrying audit row (DEBUG)
- maxAttempts → failed_at = now() + propagation.dead_lettered audit row + structured slog.Error (CRITICAL; NR alert keys on this)

Backoff schedule (cumulative ≈ 24h to dead-letter):
  1m, 5m, 15m, 30m, 1h, 2h, 4h, 8h, 16h, 24h

Kind registry (rule 18 / CLAUDE.md):
  tier_elevation → handleTierElevation iterates the team's active resources, calls provisioner RegradeResource per resource. Idempotent: provisioner's CONFIG GET / applied_conn_limit guard makes re-runs of an already-regraded resource a no-op.

The existing entitlement_reconciler remains the eventually-consistent 5-min sweep backstop; this runner is the eager event-driven retry that makes per-team retries durable + alert-able.

6 regression tests:

- EveryKindHasAHandler: rule 18 registry-iterating drift guard
- AppliesEligibleRow: happy path
- RetryOnFailure_PersistsBackoff: attempts++, backoff schedule, last_error
- DeadLettersAfterMaxAttempts: failed_at + audit row
- IdempotentReRun_AppliedRowSkipped: terminal rows invisible to picker
- BackoffSchedule_IsMonotonicAndClamps: pin schedule + clamp behaviour

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 3c05f31 commit da495a1

3 files changed

Lines changed: 1129 additions & 0 deletions
