Commit dd24882

Write tradeoffs md file

1 parent a8493d2 commit dd24882


docs/tradeoffs.md

Lines changed: 146 additions & 1 deletion
# Design Tradeoffs
This document records the engineering decisions made during design and implementation.
---
---

## Algorithm: fixed window vs token bucket vs sliding window

**Chosen: sliding window counter**

### Fixed window (rejected)
Fixed windows divide time into discrete buckets (e.g. 12:00-12:01, 12:01-12:02).
The limit applies per bucket. Flaw: a client can fire `limit` requests at 12:00:59
and another `limit` at 12:01:00 - a burst of `2 × limit` within two seconds, defeating
the intent of the quota. This is known as the "boundary burst" problem.
### Token bucket (rejected as primary)

Tokens refill at a constant rate. A client can consume tokens as fast as they accumulate,
or save them up for a burst of up to `capacity`. Advantages: smooth burst handling, and a simple
mental model for a "sustained rate".
Disadvantages for this use case:

- Storing `tokens` and `last_refill` as floats in Redis and performing arithmetic in Lua
  introduces floating-point imprecision that is harder to audit for correctness.
- A quota of "100 requests/minute" is harder to explain with a token bucket than with a
  sliding window, which maps directly to "count requests in the last 60 seconds".
- The `token_bucket.lua` script is kept in `internal/store/scripts/lua`.
### Sliding window counter (chosen)

Stores each request timestamp as a member in a Redis sorted set, scored by timestamp.
On each decision:

1. Remove members with score < `new_window` (expired requests).
2. Count the remaining members (requests within the active window).
3. Allow if the count is below the limit; reject otherwise.
Correctness is easy to reason about: "at most N timestamps exist in the set at any time".
Memory is bounded to `limit` entries per key (rejected requests are never added).

**Tradeoff:** sorted sets use more memory per entry (~50-80 bytes) than a simple counter.
For high-limit keys (e.g. 10,000 req/min), this is measurable. If memory becomes a concern,
switch to a token bucket or an approximate counter approach.
---

## Atomicity: application-level locking vs WATCH/MULTI/EXEC vs Redis Lua

**Chosen: Redis Lua scripts (EVALSHA)**

### Application-level locking (rejected)
A read-modify-write in Go application code:

```
count = GET key
if count < limit:
    SET key (count + 1)
    return allow
```

This is a TOCTOU (time-of-check to time-of-use) race across replicas. Two replicas can both
read `count = 99` (limit: 100), both decide to allow, and both write `100` - admitting 101 requests.
Incorrect by design.
### WATCH/MULTI/EXEC - optimistic transactions (rejected)

Redis's optimistic locking aborts and retries the transaction if a watched key
changes between WATCH and EXEC. Under high concurrency - many replicas targeting
the same popular API key - the retry rate can grow without bound, causing p99 latency
spikes. Each retry is a full round-trip.
### Redis Lua scripts (chosen)

Redis executes Lua scripts atomically in its single-threaded interpreter.
The script runs to completion before any other command is processed.
This serialises all decisions at the Redis server, not in the application layer.

Benefits:

- Single round-trip (EVALSHA sends only the 40-char SHA digest, not the script text)
- No retries, no contention-dependent latency growth
- The correctness argument is simple: Redis processes one Lua call at a time

The correctness proof is in `tests/integration/concurrency_test.go`:
N goroutines firing simultaneously against a single key, asserting that exactly
`limit` requests are admitted.
---

## Failure behaviour: fail-open vs fail-closed

**Chosen: fail-open**
When Redis is unreachable, `AllowRequest` returns `(true, error)` - the request
is allowed through and the error is recorded in `ratelimiter_redis_errors_total`.

**Rationale:** This service enforces API quotas, not security boundaries.
The cost of over-admitting requests during a Redis outage (a few extra requests
reach the upstream API) is lower than the cost of dropping all traffic (a total
service outage from the perspective of the callers).
**When to choose fail-closed:** If the rate limiter enforces a billing limit,
prevents credential stuffing, or protects a resource with strict capacity limits,
fail-closed is correct. Change this line in `limiter.go`:

```go
return true, fmt.Errorf("store error (failing open): %w", err)
// to:
return false, fmt.Errorf("store error (failing closed): %w", err)
```
**Observability:** Alert on `ratelimiter_redis_errors_total > 0` to detect and
measure fail-open windows. The Grafana dashboard includes this panel.

---
## Stateless Replicas
Multiple service replicas are safe because no rate-limit state lives in the Go process.
All state is in Redis. A request landing on any replica gets the same correct decision,
because the decision is made atomically by the Lua script on Redis.
**What this requires:**

- Redis must be reachable from all replicas (handled by the Redis Service in k8s)
- Redis must not be sharded such that the same key could land on different shards
  without coordination. For very high scale, Redis Cluster with `{key}` hashtags
  can ensure per-key routing to a single shard.
**HPA implication:** Because replicas are stateless, the HPA can scale them freely
without worrying about state migration. See `deploy/k8s/hpa.yml`.

---
## Memory Growth in Redis

Each admitted request adds one member to the sorted set for the API key.
Rejected requests are not recorded (the Lua script returns 0 before ZADD).

Memory per key ≈ `limit × 70 bytes` (approximate sorted-set entry size).

At a 100 req/min limit: ~7 KB per key.
At a 10,000 req/min limit: ~700 KB per key.
At 10,000 active keys × a 100 req/min limit: ~70 MB total - acceptable.
The `ZREMRANGEBYSCORE` call at the top of every Lua invocation keeps each set
bounded to the active window. The `PEXPIRE` call ensures idle keys (no requests
for a full window duration) are evicted automatically by Redis.

Without `PEXPIRE`, sorted sets for inactive keys would persist indefinitely.
