
feat: multi-worker uvicorn with shared rate-limit backend (#68) #71

Merged
bk86a merged 12 commits into main from feat/multi-worker-uvicorn on May 1, 2026

Conversation


bk86a (Owner) commented May 1, 2026

Summary

Implements #68: multi-worker uvicorn behind a shared rate-limit backend.

  • New env vars: PC2NUTS_WORKERS (default 1), PC2NUTS_RATE_LIMIT_STORAGE_URI (default unset).
  • Startup hard-fails if PC2NUTS_WORKERS > 1 without a storage URI configured (Pydantic model validator), preventing the per-IP rate-limit cap from silently loosening; a sketch of the guard follows this list.
  • slowapi's in_memory_fallback_enabled=True handles transient backend outages with a per-process MemoryStorage fallback and exponential-backoff re-probing; it logs one WARNING per outage and an INFO line on recovery.
  • Default behaviour (single worker, in-memory) is byte-for-byte unchanged: when neither env var is set, the Limiter is constructed exactly as before.
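
As a rough sketch of that guard (assuming pydantic-settings and field names mirroring the env vars; the real validator lives in app/config.py and may differ):

```python
# Hypothetical sketch of the startup guard, not the literal app/config.py code.
from pydantic import model_validator
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_prefix="PC2NUTS_")

    workers: int = 1
    rate_limit_storage_uri: str | None = None  # e.g. redis://host:6379/0

    @model_validator(mode="after")
    def _require_shared_storage_for_multi_worker(self) -> "Settings":
        # Treat an empty string the same as unset (the "empty-string
        # aliasing" case the test plan mentions).
        uri = self.rate_limit_storage_uri or None
        if self.workers > 1 and uri is None:
            raise ValueError(
                "PC2NUTS_WORKERS > 1 requires PC2NUTS_RATE_LIMIT_STORAGE_URI: "
                "without shared storage each worker keeps its own counters "
                "and the per-IP cap silently loosens."
            )
        return self
```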

Spec: docs/superpowers/specs/2026-05-01-multi-worker-uvicorn-design.md
Plan: docs/superpowers/plans/2026-05-01-multi-worker-uvicorn.md

Acceptance criteria from #68

  • Dockerfile updated to launch N workers via PC2NUTS_WORKERS (shell-form CMD with exec uvicorn ... --workers ${PC2NUTS_WORKERS:-1})
  • Rate-limit behaviour with N workers documented in README.md (new "Multi-worker deployment" subsection under ## Configuration)
  • Memory usage measured against the container's allocated memory — operational follow-up post-deploy
  • Performance baseline re-run; docs/performance.md updated — operational follow-up post-deploy

Test plan

Automated (passing on this branch)

  • Existing rate-limit tests still pass (single-worker default branch is byte-for-byte unchanged: 178 passed)
  • tests/test_config.py::TestWorkersValidator proves the validator fires for WORKERS>1 with no URI (4 tests covering the truth table, including empty-string aliasing; the shape is sketched after this list)
  • tests/test_limiter.py::TestLimiterStorageSelection proves the storage URI is honoured and fallback flag is set
  • Docker smoke-test confirmed locally: build OK, single-worker startup OK, PC2NUTS_WORKERS=2 without storage URI fails immediately with the validator's error message naming both env vars
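
The truth-table test has roughly this shape (illustrative only; the real cases and names live in tests/test_config.py::TestWorkersValidator):

```python
# Illustrative truth-table test; the import path and case list are assumptions.
import pytest
from pydantic import ValidationError

from app.config import Settings  # assumed import path


@pytest.mark.parametrize(
    ("workers", "uri", "should_raise"),
    [
        (1, None, False),                       # default single-worker deploy
        (1, "redis://localhost:6379/0", False),
        (4, "redis://localhost:6379/0", False),
        (4, None, True),                        # the unsafe combination
        (4, "", True),                          # empty string aliases to unset
    ],
)
def test_workers_requires_storage_uri(workers, uri, should_raise):
    kwargs = {"workers": workers, "rate_limit_storage_uri": uri}
    if should_raise:
        with pytest.raises(ValidationError):
            Settings(**kwargs)
    else:
        Settings(**kwargs)
```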

Pre-merge manual (operational)

  • Deploy this branch image to the target hosting environment with PC2NUTS_WORKERS set to 2-4 and a real Redis configured via PC2NUTS_RATE_LIMIT_STORAGE_URI
  • Confirm /health returns 200 from all workers
  • Fire >120 requests in a minute from a single anonymous client; confirm the 120/minute cap is observed across workers (i.e. shared Redis storage works); a throwaway probe for this check is sketched after this list
  • Fire requests with a valid trusted-token bearer; confirm bypass still works (exempt_when=is_trusted_request runs before the storage call)
  • Briefly take Redis offline mid-traffic; confirm one WARNING log line, traffic continues to be served (fallback active), no 5xx burst
  • Bring Redis back; confirm one INFO "Rate limit storage recovered" line within ~30 s
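
A throwaway probe for the shared-cap check could look like this (the base URL and endpoint are placeholders for the real deployment):

```python
# Hypothetical probe script; BASE_URL and ENDPOINT are placeholders.
from collections import Counter

import httpx

BASE_URL = "https://staging.example.com"  # placeholder deployment URL
ENDPOINT = "/lookup"                      # assumed rate-limited endpoint

counts: Counter[int] = Counter()
with httpx.Client(base_url=BASE_URL, timeout=10) as client:
    for _ in range(130):  # 10 over the published 120/minute cap
        counts[client.get(ENDPOINT).status_code] += 1

# With shared Redis storage: ~120 x 200 and ~10 x 429 across all workers.
# With per-worker in-memory counters, 429s would not appear until ~120 x N.
print(dict(counts))
```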

Post-merge operational

  • Re-run perf baseline (scripts/perf_test.sh) and update docs/performance.md with the new RPS at the chosen worker count
  • Document measured per-worker resident-set size and confirm headroom against the container memory cap

Notes

  • Final review caught one supply-chain issue (commit d0baaeb): the initial requirements.txt declaration slowapi[redis]>=0.1.9,<1 resolves to redis<4 (slowapi 0.1.9's stale extra constraint), conflicting with the redis==7.4.0 lock pin. Switched to limits[redis]>=2.3 which exposes the modern redis>3,<8 constraint.
  • Spec was simplified mid-implementation when slowapi 0.1.9 turned out to ship the fail-degraded behaviour we'd designed (in_memory_fallback_enabled=True); a custom _FailDegradedStorage wrapper class was dropped from the design — see commit 7cf2d00.

🤖 Generated with Claude Code

bk86a and others added 12 commits May 1, 2026 11:38

Brainstormed design for issue #68. Two new opt-in env vars
(PC2NUTS_WORKERS, PC2NUTS_RATE_LIMIT_STORAGE_URI) drive multi-worker
uvicorn behind a fail-degraded shared rate-limit backend; defaults
preserve the current single-worker / in-memory deploy byte-for-byte.

Key decisions captured in the spec:
  - Option (a) Redis-backed slowapi (over edge-layer or per-process
    division), preserving the strict 120/min anonymous cap while keeping
    trusted-token bypass working.
  - Fail-degraded behaviour (option III): on Redis unavailability, fall
    back to per-worker in-memory storage for a 30 s window before
    re-probing. Logs once per outage.
  - Hard-fail at startup if PC2NUTS_WORKERS > 1 with no storage URI
    configured, so the cap can never silently loosen.
  - uvicorn --workers (not gunicorn); shell-form CMD in Dockerfile to
    expand the env var.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

While prepping the implementation plan, found that slowapi 0.1.9
already ships exactly the fail-degraded behaviour we'd designed:
Limiter(in_memory_fallback_enabled=True) routes to a per-process
MemoryStorage when the primary raises, logs once per outage, and
re-probes with exponential backoff (better than the fixed 30s window
we'd specified). Custom _FailDegradedStorage class is no longer
needed — drops a new module and four unit tests' worth of code we'd
have to maintain.

Spec sections 4.2, 4.3, 5, 6, 7, and 10 updated to reflect the
library-feature approach. Architecture and operator-visible behaviour
are unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Step-by-step TDD plan for issue #68 against the simplified spec at
docs/superpowers/specs/2026-05-01-multi-worker-uvicorn-design.md.

Seven tasks: Settings + validator, redis dep (ordered before the
limiter test that exercises it), limiter module extraction, Dockerfile,
README, CHANGELOG, final verification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

New Pydantic model validator hard-fails startup if PC2NUTS_WORKERS > 1
without PC2NUTS_RATE_LIMIT_STORAGE_URI configured, so the per-IP rate
limit can never silently loosen under multi-worker.

Defaults preserve current behaviour: workers=1, storage URI unset.

Pulls in redis-py at the version limits 5.8.0 expects, used only when
PC2NUTS_RATE_LIMIT_STORAGE_URI is set. Single-host deployers who never
configure shared storage pay the install-size cost but no runtime cost
(redis is imported by limits.storage.RedisStorage at Limiter
construction, and only when the storage URI is configured).

Without this, regenerating requirements.lock via the documented
'pip install -r requirements.txt && pip freeze > requirements.lock'
flow would drop the redis pin added in dc56d8c. Using slowapi's
[redis] extra (rather than pinning redis directly) keeps the
declaration aligned with our actual dependency and lets slowapi's
constraint chain choose the right transitive version of redis-py.

When PC2NUTS_RATE_LIMIT_STORAGE_URI is set, construct the Limiter with
that storage URI and in_memory_fallback_enabled=True so transient
backend outages fall back to per-process MemoryStorage. When unset,
construction is byte-for-byte the previous inline call.

slowapi's built-in fallback handles outage detection, once-per-outage
WARNING logging, and exponential-backoff recovery probes.
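
The construction this commit describes looks roughly like this (helper and import names are assumptions; per the commit, the unset branch matches the previous inline call):

```python
# Sketch of the storage-selection branch; is_trusted_request and the
# 120/minute cap come from the PR text, other names are assumed.
from slowapi import Limiter
from slowapi.util import get_remote_address

from app.config import settings  # assumed settings accessor


def build_limiter() -> Limiter:
    if settings.rate_limit_storage_uri:
        # Shared backend: counters live in Redis and are shared by all
        # workers; slowapi falls back to per-process MemoryStorage (with
        # a once-per-outage WARNING and backoff re-probing) if Redis drops.
        return Limiter(
            key_func=get_remote_address,
            storage_uri=settings.rate_limit_storage_uri,
            in_memory_fallback_enabled=True,
        )
    # Unset branch: the previous single-worker, in-memory construction.
    return Limiter(key_func=get_remote_address)
```

The trusted-token bypass stays at the route level (exempt_when=is_trusted_request in the test plan), which is why it runs before any storage call is made.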

One-line whitespace fix flagged by ruff format --check on Task 3's
new test file. CI runs format --check, so this would have broken the
lint job.

Switches CMD from exec-form to shell-form with 'exec uvicorn …' so
${PC2NUTS_WORKERS:-1} expands at container start while uvicorn remains
the foreground PID-1 process for proper SIGTERM handling.

Default of 1 preserves current single-worker behaviour. Multi-worker
mode also requires PC2NUTS_RATE_LIMIT_STORAGE_URI; the Settings
validator (added in feat(config)) refuses to start otherwise.
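
The resulting CMD line looks roughly like this (the uvicorn app path and port are assumptions):

```dockerfile
# Shell-form so ${PC2NUTS_WORKERS:-1} expands at container start; 'exec'
# replaces the shell, keeping uvicorn as PID 1 for clean SIGTERM handling.
CMD exec uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers ${PC2NUTS_WORKERS:-1}
```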

New 'Multi-worker deployment' subsection covers both env vars, the
startup-validation guard for the unsafe combination, and the slowapi
fail-degraded behaviour during a backend outage.

….txt (#68)

Reviewer caught that slowapi 0.1.9's [redis] extra pins
'redis>=3.4.1,<4.0.0' — a stale 6-year-old constraint that contradicts
the redis==7.4.0 lock pin. Any operator running the documented
'pip install -r requirements.txt && pip freeze > requirements.lock'
regeneration flow would silently downgrade to redis 3.5.3.

limits[redis] exposes the modern constraint
'redis!=4.5.2,!=4.5.3,<8.0.0,>3', which matches the lock pin and is what
we actually want. slowapi already depends on limits, so declaring
limits[redis] merely activates the redis extra on a package we install
anyway; we lose no functionality by dropping the slowapi[redis] extra
in favour of limits[redis].

Verified: pip install --dry-run -r requirements.txt now resolves redis
to 7.4.0 cleanly, matching requirements.lock.
bk86a merged commit 0973020 into main May 1, 2026
11 checks passed
bk86a deleted the feat/multi-worker-uvicorn branch May 1, 2026 10:31
bk86a added a commit that referenced this pull request May 1, 2026

PR #71 shipped multi-worker uvicorn behind a shared rate-limit backend.
This commit captures the re-run of scripts/perf_test.sh against the
post-#68 deployment so the open AC items on #68 ("memory headroom" and
"verify approximately N× headroom on /lookup") have measured numbers
rather than estimates.

Headlines:

- Realistic-corpus knee (Scenario B) moved from 30 → 35-38 RPS.
  Single-worker collapsed at 35 (p99 4.47 s); multi-worker absorbs 35
  cleanly (p99 150 ms) and only saturates between 35 and 40.
- Hot-key plateau (Scenario A, persistent connections) roughly doubled:
  ~30 → ~50 RPS, with p99 at saturation 2.5× lower.
- Recommended operating point unchanged at 27 RPS — Scenario E
  (3-min sustained) still meets the p99 ≤ 200 ms SLO. The win is
  headroom (~10% → ~30-40%), not the operating point itself.

The 1.6× rather than 2× scaling is consistent with shared-edge TLS
termination and Pydantic GIL contention being part of the cap, not
just per-worker compute. Documented in the methodology notes.

Also adds a new "Rate-limit shared-storage verification" subsection:
130 anonymous requests against the published 120/minute cap from a
single source IP yielded exactly 120 HTTP 200 responses and 10 HTTP
429s: conclusive evidence that the Redis sidecar is reachable from both
workers and that the cap is enforced globally rather than per-worker
(the failure mode the startup validator at app/config.py:42-50 exists
to prevent).

CHANGELOG entry under [Unreleased] summarises both the re-baseline and
the perf_test.sh fix from the previous commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>