Commit a03821a

bk86a and claude committed
chore: raise default rate limit from 60/minute to 120/minute

Doubles the friendliness of the per-IP cap for casual users while staying well below the measured aggregate ceiling (~30 RPS = ~1,800 req/min). At 2 RPS per IP, up to 15 simultaneous full-rate anonymous clients can coexist without degradation; batch users still feel friction and are nudged toward requesting a trusted token. Will be revisited when multi-worker (#68) ships and the aggregate ceiling rises.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
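
The headroom figures in the commit message follow from simple division, which a few lines of Python can check. This is only a sketch: the function name `max_full_rate_clients` is hypothetical, and the 30 RPS ceiling is the approximate measured figure quoted in the message.

```python
def max_full_rate_clients(aggregate_rps: float, per_ip_per_minute: int) -> int:
    """Number of clients that can each run at the full per-IP rate
    before their combined load reaches the aggregate ceiling."""
    per_ip_rps = per_ip_per_minute / 60  # e.g. 120/minute -> 2.0 RPS
    return int(aggregate_rps // per_ip_rps)


if __name__ == "__main__":
    print(max_full_rate_clients(30, 60))   # previous default of 60/minute
    print(max_full_rate_clients(30, 120))  # new default of 120/minute
```

Under these assumptions the old default admits 30 full-rate clients and the new one admits 15, matching the numbers stated above.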
1 parent 526f289 commit a03821a

5 files changed

Lines changed: 11 additions & 11 deletions

README.md

Lines changed: 3 additions & 3 deletions
@@ -314,7 +314,7 @@ All settings are overridable via environment variables prefixed with `PC2NUTS_`:
 | `PC2NUTS_DB_CACHE_TTL_DAYS` | `30` | Days between automatic TERCET data refreshes. If the refresh fails, the service falls back to the previous data and sets `data_stale: true` in the health endpoint. |
 | `PC2NUTS_ESTIMATES_CSV` | `./tercet_missing_codes.csv` | Path to the estimates CSV. Loaded automatically at startup if the file exists. |
 | `PC2NUTS_EXTRA_SOURCES` | *(empty)* | Comma-separated list of ZIP URLs containing additional postal code data. Loaded after TERCET; entries overwrite TERCET data. |
-| `PC2NUTS_RATE_LIMIT` | `60/minute` | Rate limit for `/lookup` and `/pattern` endpoints. Uses [slowapi](https://github.com/laurentS/slowapi) syntax (e.g. `100/minute`, `5/second`). `/health` is exempt. |
+| `PC2NUTS_RATE_LIMIT` | `120/minute` | Rate limit for `/lookup` and `/pattern` endpoints. Uses [slowapi](https://github.com/laurentS/slowapi) syntax (e.g. `100/minute`, `5/second`). `/health` is exempt. The default leaves comfortable headroom under the measured aggregate ceiling (~30 RPS) — see [`docs/performance.md`](docs/performance.md) for the rationale. |
 | `PC2NUTS_STARTUP_TIMEOUT` | `300` | Maximum seconds allowed for initial data loading. If exceeded, the service starts with whatever data was loaded and sets `data_stale: true`. |
 | `PC2NUTS_TRUSTED_TOKENS` | `""` (empty — bypass disabled) | Comma-separated list of opaque tokens that bypass the per-IP rate limit when sent via `Authorization: Bearer <token>`. Continues to work as a union with the DB-backed registry below; set this only as a disaster-recovery fallback or for env-var-only deployments. See [Authentication & rate-limit bypass](#authentication--rate-limit-bypass) for the operator runbook. |
 | `PC2NUTS_TOKEN_DB_URL` | `""` (unset) | Connection string for the trusted-token database. Accepts both `https://…` and `libsql://…` (the latter is rewritten to `https://` automatically). Empty → DB-backed bypass disabled, falls back to env-var-only behaviour. |
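
To compare a `PC2NUTS_RATE_LIMIT` value against the ceiling quoted in RPS, it helps to convert the rate string to requests per second. The sketch below does that for the simple `N/unit` form shown in the table. It is not slowapi's own parser, which may accept richer syntax; the helper name `rate_to_rps` and the supported unit list are assumptions for illustration.

```python
import re

# Seconds per unit for the simple "N/unit" form only (illustrative subset).
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def rate_to_rps(rate: str) -> float:
    """Convert a rate string such as '120/minute' to requests per second."""
    match = re.fullmatch(r"\s*(\d+)\s*/\s*(second|minute|hour|day)\s*", rate)
    if match is None:
        raise ValueError(f"unsupported rate string: {rate!r}")
    count, unit = int(match.group(1)), match.group(2)
    return count / _UNIT_SECONDS[unit]


if __name__ == "__main__":
    for value in ("60/minute", "120/minute", "5/second"):
        print(value, "->", rate_to_rps(value), "RPS")
```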
@@ -328,7 +328,7 @@ All settings are overridable via environment variables prefixed with `PC2NUTS_`:
 
 ## Authentication & rate-limit bypass
 
-The service applies a per-IP rate limit (`60/minute` by default) to `/lookup` and `/pattern`. Trusted callers — operator-issued, manually distributed — can bypass this limit by presenting an `Authorization: Bearer <token>` header. `/health` stays anonymous.
+The service applies a per-IP rate limit (`120/minute` by default) to `/lookup` and `/pattern`. Trusted callers — operator-issued, manually distributed — can bypass this limit by presenting an `Authorization: Bearer <token>` header. `/health` stays anonymous.
 
 ### Configuration
 
@@ -413,7 +413,7 @@ Then remove that token from `PC2NUTS_TRUSTED_TOKENS` on the next config edit.
 
 | Request | Result |
 |---|---|
-| No `Authorization` header | Per-IP `60/minute` cap, normal `200` / `429` |
+| No `Authorization` header | Per-IP `120/minute` cap, normal `200` / `429` |
 | `Authorization: Bearer <valid_token>` | Rate limit fully bypassed; `token_id=<8hex>` appended to access log |
 | `Authorization: Bearer <unknown_token>` | `401 Unauthorized` |
 | `Authorization: <not Bearer>` or malformed | `400 Bad Request` |
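
The request/result table above can be read as a small decision function. The following is a minimal sketch of that mapping, not the service's actual middleware: `classify_request` and its return labels are hypothetical, and the case-sensitive match on `Bearer` and the treatment of an empty token as malformed are assumptions the README does not confirm.

```python
from __future__ import annotations


def classify_request(authorization: str | None, valid_tokens: set[str]) -> str:
    """Map an Authorization header value to the outcome listed in the table."""
    if authorization is None:
        # Anonymous caller: the per-IP cap applies (200 or 429).
        return "rate_limited"
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    if scheme != "Bearer" or not token:
        # Wrong scheme or missing token counts as malformed (assumption).
        return "400 Bad Request"
    if token in valid_tokens:
        return "bypass"
    return "401 Unauthorized"


if __name__ == "__main__":
    tokens = {"test-token-aaa"}
    print(classify_request(None, tokens))
    print(classify_request("Bearer test-token-aaa", tokens))
    print(classify_request("Bearer nope", tokens))
    print(classify_request("Basic abc", tokens))
```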

app/config.py

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ class Settings(BaseSettings):
     token_db_url: str = ""
     token_db_auth_token: str = ""
     token_refresh_seconds: int = Field(default=60, ge=1)
-    rate_limit: str = _defaults.get("rate_limit", "60/minute")
+    rate_limit: str = _defaults.get("rate_limit", "120/minute")
     rate_limit_headers: bool = _defaults.get("rate_limit_headers", True)
     cache_max_age: int = _defaults.get("cache_max_age", 3600)
     startup_timeout: int = 300
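
The default appears in two places (`app/config.py` and `app/settings.json`), so it is worth spelling out how the effective value is probably resolved. The sketch below assumes that `_defaults` is a dict loaded from `app/settings.json` and that an environment variable with the `PC2NUTS_` prefix takes precedence, as the README states. The function `resolve_rate_limit` is hypothetical and only mimics that order; it is not the project's `Settings` class.

```python
import os
from typing import Mapping

HARDCODED_DEFAULT = "120/minute"  # fallback literal from app/config.py


def resolve_rate_limit(
    defaults: Mapping[str, object],
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the effective rate limit: env var, then JSON default, then literal."""
    env_value = environ.get("PC2NUTS_RATE_LIMIT")
    if env_value is not None:
        return env_value
    # `defaults` stands in for the assumed contents of app/settings.json.
    return str(defaults.get("rate_limit", HARDCODED_DEFAULT))


if __name__ == "__main__":
    print(resolve_rate_limit({}, {}))
    print(resolve_rate_limit({"rate_limit": "90/minute"}, {}))
    print(resolve_rate_limit({"rate_limit": "90/minute"},
                             {"PC2NUTS_RATE_LIMIT": "5/second"}))
```

If this reading is right, changing only one of the two files would leave the other as a stale fallback, which is presumably why the commit updates both.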

app/settings.json

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@
 "nuts1": 0.90
 },
 "approximate_min_confidence": 0.1,
-"rate_limit": "60/minute",
+"rate_limit": "120/minute",
 "rate_limit_headers": true,
 "cache_max_age": 3600
 }

docs/performance.md

Lines changed: 2 additions & 2 deletions
@@ -15,7 +15,7 @@
 >
 > **Recommended operating point: 27 RPS (~1,620/min), p99 < 200 ms.**
 
-The current `60/minute` per-IP cap is therefore not the system bottleneck — the deployment can serve roughly **30× that volume in aggregate** before throughput plateaus. A single client could be permitted up to ~1,500/minute (25 RPS) without affecting overall headroom; the per-IP cap should be set well below the aggregate ceiling regardless.
+The per-IP cap is therefore not the system bottleneck — the deployment can serve roughly **15× the default `120/minute` cap in aggregate** before throughput plateaus. A single client could in principle be permitted up to ~1,500/minute (25 RPS) without affecting overall headroom; the per-IP cap is set well below the aggregate ceiling so that ~15 simultaneous full-rate clients can coexist without degradation.
 
 ---
 
@@ -92,7 +92,7 @@ No drift over the 3-minute window. p99 stayed well under 200 ms throughout.
 
 ## Recommendations
 
-1. **Keep per-IP cap conservative relative to aggregate ceiling.** The current `60/minute` (1 RPS per IP) leaves comfortable headroom: even ~30 saturation-rate clients in parallel could sustain themselves before degrading the aggregate. No change needed unless trusted-token traffic patterns become heavy.
+1. **Per-IP cap set to `120/minute` (2 RPS per IP).** Chosen as 1/15 of the aggregate ceiling — up to 15 simultaneous full-rate anonymous clients can sustain themselves before the aggregate degrades. Friendlier UX for casual users (a small country's worth of postcodes finishes in roughly half the time it took at `60/minute`) while still tight enough that batch users feel the pressure to request a trusted token. Revisit when multi-worker (#68) ships and the aggregate ceiling rises.
 
 2. **Pick `p99 ≤ 200 ms` as the SLO** at the recommended 27 RPS operating point. The full 3-minute sustained run met this.
 
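
The "roughly half the time" remark in recommendation 1 can be made concrete with a quick calculation. The batch size below is a hypothetical figure chosen for illustration, not a number taken from the benchmark, and the estimate assumes a client that sends requests at exactly the capped rate.

```python
def batch_minutes(n_requests: int, per_minute_cap: int) -> float:
    """Minutes needed to send n_requests when pacing exactly at the per-IP cap."""
    return n_requests / per_minute_cap


if __name__ == "__main__":
    n = 3000  # hypothetical batch size, for illustration only
    print(batch_minutes(n, 60))   # at the previous 60/minute default
    print(batch_minutes(n, 120))  # at the new 120/minute default
```

For this example size, the run drops from 50 minutes to 25, which is still long enough that a heavy batch user would have reason to request a trusted token.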
tests/test_api.py

Lines changed: 4 additions & 4 deletions
@@ -224,11 +224,11 @@ def test_health_ignores_malformed_header(self, trusted_client):
         assert resp.status_code == 200
 
     def test_valid_token_bypasses_rate_limit(self, trusted_client):
-        """Default rate limit is 60/minute. With a valid bypass token, more
-        than 60 requests in tight succession all return 200. Without bypass,
-        request 61+ would 429."""
+        """Default rate limit is 120/minute. With a valid bypass token, more
+        than 120 requests in tight succession all return 200. Without bypass,
+        request 121+ would 429."""
         headers = {"Authorization": "Bearer test-token-aaa"}
-        for i in range(80):
+        for i in range(150):
             resp = trusted_client.get(
                 "/lookup",
                 params={"postal_code": "10115", "country": "DE"},
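
The loop count has to exceed the new limit for the test to stay meaningful, which is why it moves from 80 to 150. The toy simulation below illustrates the docstring's claim that request 121 and later would get `429` without a bypass. It uses a simple fixed-window counter as a stand-in; slowapi's actual limiting strategy may differ, and `FixedWindowLimiter` and `simulate` are hypothetical names that do not appear in the test suite.

```python
class FixedWindowLimiter:
    """Toy fixed-window counter: at most `limit` requests per window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = 0.0
        self.count = 0

    def allow(self, now: float) -> bool:
        # Start a fresh window once the current one has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


def simulate(n_requests: int, limit: int = 120, bypass: bool = False) -> list[int]:
    """Return the status code for each of n_requests sent 10 ms apart."""
    limiter = FixedWindowLimiter(limit, window_seconds=60.0)
    statuses = []
    for i in range(n_requests):
        allowed = bypass or limiter.allow(now=i * 0.01)
        statuses.append(200 if allowed else 429)
    return statuses


if __name__ == "__main__":
    anonymous = simulate(150)
    print(anonymous.count(200), anonymous.count(429))
    print(set(simulate(150, bypass=True)))
```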
