chore: raise default rate limit from 60/minute to 120/minute
Doubles the per-IP cap, making it friendlier for casual users while staying
well below the measured aggregate ceiling (~30 RPS = ~1,800 req/min). At
2 RPS per IP, up to 15 simultaneous full-rate anonymous clients can coexist
without degradation; batch users still feel friction and are nudged toward
requesting a trusted token. Will be revisited when multi-worker (#68) ships
and the aggregate ceiling rises.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
README.md: 3 additions & 3 deletions
@@ -314,7 +314,7 @@ All settings are overridable via environment variables prefixed with `PC2NUTS_`:
 |`PC2NUTS_DB_CACHE_TTL_DAYS`|`30`| Days between automatic TERCET data refreshes. If the refresh fails, the service falls back to the previous data and sets `data_stale: true` in the health endpoint. |
 |`PC2NUTS_ESTIMATES_CSV`|`./tercet_missing_codes.csv`| Path to the estimates CSV. Loaded automatically at startup if the file exists. |
 |`PC2NUTS_EXTRA_SOURCES`|*(empty)*| Comma-separated list of ZIP URLs containing additional postal code data. Loaded after TERCET; entries overwrite TERCET data. |
-|`PC2NUTS_RATE_LIMIT`|`60/minute`| Rate limit for `/lookup` and `/pattern` endpoints. Uses [slowapi](https://github.com/laurentS/slowapi) syntax (e.g. `100/minute`, `5/second`). `/health` is exempt. |
+|`PC2NUTS_RATE_LIMIT`|`120/minute`| Rate limit for `/lookup` and `/pattern` endpoints. Uses [slowapi](https://github.com/laurentS/slowapi) syntax (e.g. `100/minute`, `5/second`). `/health` is exempt. The default leaves comfortable headroom under the measured aggregate ceiling (~30 RPS) — see [`docs/performance.md`](docs/performance.md) for the rationale. |
 |`PC2NUTS_STARTUP_TIMEOUT`|`300`| Maximum seconds allowed for initial data loading. If exceeded, the service starts with whatever data was loaded and sets `data_stale: true`. |
 |`PC2NUTS_TRUSTED_TOKENS`|`""` (empty — bypass disabled) | Comma-separated list of opaque tokens that bypass the per-IP rate limit when sent via `Authorization: Bearer <token>`. Continues to work as a union with the DB-backed registry below; set this only as a disaster-recovery fallback or for env-var-only deployments. See [Authentication & rate-limit bypass](#authentication--rate-limit-bypass) for the operator runbook. |
 |`PC2NUTS_TOKEN_DB_URL`|`""` (unset) | Connection string for the trusted-token database. Accepts both `https://…` and `libsql://…` (the latter is rewritten to `https://` automatically). Empty → DB-backed bypass disabled, falls back to env-var-only behaviour. |
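As context for the slowapi syntax that the `PC2NUTS_RATE_LIMIT` row references, here is a minimal wiring sketch. It is not the service's actual code; the route signatures, the `postcode` parameter, and the app layout are assumptions for illustration.

```python
# Hedged sketch: how a slowapi per-IP limit driven by PC2NUTS_RATE_LIMIT
# *could* be wired in a FastAPI app. Route names and the `postcode`
# parameter are illustrative placeholders, not taken from this repo.
import os

from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

RATE_LIMIT = os.getenv("PC2NUTS_RATE_LIMIT", "120/minute")  # slowapi limit-string syntax

limiter = Limiter(key_func=get_remote_address)  # key requests by client IP
app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/lookup")
@limiter.limit(RATE_LIMIT)  # rate-limited endpoint; exceeding the cap yields 429
async def lookup(request: Request, postcode: str = ""):
    return {"postcode": postcode}

@app.get("/health")  # no limiter decorator, so it stays exempt
async def health():
    return {"status": "ok"}
```

With wiring like this, requests past the limit get a `429` from slowapi's default handler, matching the `200` / `429` behaviour described in the table further down.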
@@ -328,7 +328,7 @@ All settings are overridable via environment variables prefixed with `PC2NUTS_`:
 
 ## Authentication & rate-limit bypass
 
-The service applies a per-IP rate limit (`60/minute` by default) to `/lookup` and `/pattern`. Trusted callers — operator-issued, manually distributed — can bypass this limit by presenting an `Authorization: Bearer <token>` header. `/health` stays anonymous.
+The service applies a per-IP rate limit (`120/minute` by default) to `/lookup` and `/pattern`. Trusted callers — operator-issued, manually distributed — can bypass this limit by presenting an `Authorization: Bearer <token>` header. `/health` stays anonymous.
 
 ### Configuration
@@ -413,7 +413,7 @@ Then remove that token from `PC2NUTS_TRUSTED_TOKENS` on the next config edit.
 
 | Request | Result |
 |---|---|
-| No `Authorization` header | Per-IP `60/minute` cap, normal `200` / `429` |
+| No `Authorization` header | Per-IP `120/minute` cap, normal `200` / `429` |
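The table row above shows the anonymous case; the bypass case would look roughly like the client sketch below. Only the `/lookup` path and the `Authorization: Bearer <token>` header come from the README; the base URL, the token environment variable, and the `postcode` query parameter are placeholders.

```python
# Hedged client example: calling /lookup with a trusted token.
# PC2NUTS_BASE_URL / PC2NUTS_CLIENT_TOKEN and the "postcode" parameter
# are illustrative, not part of the documented configuration.
import os
import requests

base_url = os.getenv("PC2NUTS_BASE_URL", "http://localhost:8000")
token = os.environ["PC2NUTS_CLIENT_TOKEN"]  # operator-issued, manually distributed

resp = requests.get(
    f"{base_url}/lookup",
    params={"postcode": "75001"},                  # placeholder parameter name
    headers={"Authorization": f"Bearer {token}"},  # bypasses the per-IP cap
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```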
docs/performance.md

-The current `60/minute` per-IP cap is therefore not the system bottleneck — the deployment can serve roughly **30× that volume in aggregate** before throughput plateaus. A single client could be permitted up to ~1,500/minute (25 RPS) without affecting overall headroom; the per-IP cap should be set well below the aggregate ceiling regardless.
+The per-IP cap is therefore not the system bottleneck — the deployment can serve roughly **15× the default `120/minute` cap in aggregate** before throughput plateaus. A single client could in principle be permitted up to ~1,500/minute (25 RPS) without affecting overall headroom; the per-IP cap is set well below the aggregate ceiling so that ~15 simultaneous full-rate clients can coexist without degradation.
 
 ---
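The headroom figures in the rewritten paragraph follow directly from the quoted numbers (about 30 RPS aggregate ceiling, `120/minute` per IP); a quick check:

```python
# Back-of-the-envelope check of the headroom claim above.
aggregate_ceiling_rps = 30            # measured plateau (~1,800 req/min)
per_ip_limit_rpm = 120                # default PC2NUTS_RATE_LIMIT
per_ip_rps = per_ip_limit_rpm / 60    # 2.0 RPS per IP

full_rate_clients = aggregate_ceiling_rps / per_ip_rps
print(f"Full-rate anonymous clients before the aggregate degrades: {full_rate_clients:.0f}")
# -> 15
```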
@@ -92,7 +92,7 @@ No drift over the 3-minute window. p99 stayed well under 200 ms throughout.
 
 ## Recommendations
 
-1. **Keep per-IP cap conservative relative to aggregate ceiling.** The current `60/minute` (1 RPS per IP) leaves comfortable headroom: even ~30 saturation-rate clients in parallel could sustain themselves before degrading the aggregate. No change needed unless trusted-token traffic patterns become heavy.
+1. **Per-IP cap set to `120/minute` (2 RPS per IP).** Chosen as 1/15 of the aggregate ceiling — up to 15 simultaneous full-rate anonymous clients can sustain themselves before the aggregate degrades. Friendlier UX for casual users (a small country's worth of postcodes finishes in roughly half the time it took at `60/minute`) while still tight enough that batch users feel the pressure to request a trusted token. Revisit when multi-worker (#68) ships and the aggregate ceiling rises.
 
 2. **Pick `p99 ≤ 200 ms` as the SLO** at the recommended 27 RPS operating point. The full 3-minute sustained run met this.
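As a rough illustration of recommendation 2, an SLO check over recorded latencies might look like the sketch below; the actual load-test harness is not part of this change, so the input format and sample values are assumed.

```python
# Hedged sketch: verifying the p99 <= 200 ms SLO from a list of per-request
# latencies in milliseconds. The sample values are placeholders.
from statistics import quantiles

def p99_ms(latencies_ms: list[float]) -> float:
    # 99th percentile taken from 100-quantile cut points
    return quantiles(latencies_ms, n=100)[98]

latencies_ms = [42.0, 55.3, 61.8, 120.4, 180.9]  # placeholder sample
assert p99_ms(latencies_ms) <= 200.0, "p99 SLO breached"
print(f"p99 = {p99_ms(latencies_ms):.1f} ms")
```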