Commit be1c889

vahidlazio, claude, andreas-karlsson, and nicklasl authored
feat(cloudflare): add telemetry collection and /metrics endpoint (#400)
Co-authored-by: vahidlazio <vahidlazio@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Andreas Karlsson <andreask@spotify.com>
Co-authored-by: Nicklas Lundin <nicklasl@spotify.com>
1 parent 7cf1d1f commit be1c889

15 files changed

Lines changed: 660 additions & 105 deletions

Cargo.lock

Lines changed: 2 additions & 0 deletions
Some generated files are not rendered by default.

confidence-cloudflare-resolver/Cargo.toml

Lines changed: 2 additions & 0 deletions
@@ -28,5 +28,7 @@ worker = { version= "0.6.1", features=['queue'] }
 base64 = "0.22.1"
 once_cell = "1.19"
 prost = "0.13"
+arc-swap = "1"
+js-sys = "0.3"
 serde = { version = "1.0.219" }
 serde_json = "1.0.85"

confidence-cloudflare-resolver/deployer/README.md

Lines changed: 28 additions & 0 deletions
@@ -63,6 +63,7 @@ The deployer automatically:
 | `WRANGLER_DEPLOY_MESSAGE` | Value passed to `wrangler deploy --message` |
 | `WRANGLER_DEPLOY_ARGS` | Additional newline-separated arguments passed to `wrangler deploy` |
 | `WRANGLER_DEPLOY_ARGS_FILE` | Path to a file containing additional `wrangler deploy` arguments, one argument per line |
+| `ENABLE_METRICS` | Set to create a KV namespace and enable the `/metrics` Prometheus endpoint. Requires a [KV store](https://developers.cloudflare.com/kv/platform/pricing/) |

 ### Extending Wrangler Configuration

@@ -120,6 +121,33 @@ When integrating with the Cloudflare resolver, you have two options:

 For more details on integration, including code examples using the [`@spotify-confidence/sdk`](https://github.com/spotify/confidence-sdk-js), see the [Confidence documentation](https://confidence.spotify.com/docs/sdks/edge/cloudflare#cloudflare-workers).

+## Telemetry & Metrics
+
+The resolver collects telemetry and exposes a Prometheus-compatible `/metrics` endpoint using the same metric names as all other Confidence providers (`confidence_resolve_latency_microseconds`, `confidence_resolves_total`).
+
+### How latency is measured
+
+Cloudflare Workers freeze `Date.now()` and `performance.now()` during synchronous CPU work (Spectre mitigation). The resolver uses `scheduler.wait(0)` — a zero-delay yield to the runtime — to unfreeze the clock after each resolve. This provides 1ms resolution with no measurable overhead.
+
+### `/metrics` endpoint
+
+Requires authentication:
+
+```bash
+curl -H "Authorization: ClientSecret <your-client-secret>" \
+  https://<worker>.workers.dev/metrics
+```
+
+Returns Prometheus exposition format with:
+- `confidence_resolve_latency_microseconds` — histogram (sum, count, cumulative `le` buckets)
+- `confidence_resolves_total` — counter by resolve reason
+
+Metrics are accumulated in a [KV namespace](https://developers.cloudflare.com/kv/platform/pricing/) (`CONFIDENCE_METRICS_KV`). Set `ENABLE_METRICS` to have the deployer create the KV namespace and bind it to the Worker. Without it, the `/metrics` endpoint returns empty and no KV writes occur.
+
+### Backend telemetry
+
+Resolve rates and latency are always sent to the Confidence backend via `WriteFlagLogsRequest`, regardless of the `ENABLE_METRICS` setting. The `/metrics` endpoint and KV store are only needed for direct Prometheus scraping — backend telemetry flows through the queue consumer independently.
+
 ## Limitations

 * **Sticky assignments**: Not currently supported with the Cloudflare resolver. Flags with sticky assignment rules will return "flag not found".
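
Reviewer note: the README describes the latency metric as a histogram with sum, count, and cumulative `le` buckets. The following is a minimal shell sketch of how per-bucket counts become cumulative buckets in the Prometheus text format. The `render_histogram` helper, the bucket bounds, and the sample counts are hypothetical and are not taken from the resolver's implementation; only the metric name comes from the README.

```bash
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
# Usage: render_histogram NAME SUM "BOUND:COUNT" ...
# COUNT is the number of observations that fell into that bucket alone
# (not cumulative). The last pair should use +Inf as its bound.
render_histogram() {
    local name="$1" sum="$2"
    shift 2
    local cumulative=0 pair bound count
    printf '# TYPE %s histogram\n' "$name"
    for pair in "$@"; do
        bound="${pair%%:*}"   # text before the first colon
        count="${pair##*:}"   # text after the last colon
        # Each le bucket reports every observation <= its bound,
        # so the counts accumulate from the smallest bound upwards.
        cumulative=$((cumulative + count))
        printf '%s_bucket{le="%s"} %d\n' "$name" "$bound" "$cumulative"
    done
    printf '%s_sum %s\n' "$name" "$sum"
    # The +Inf bucket and _count both equal the total observation count.
    printf '%s_count %d\n' "$name" "$cumulative"
}

# Made-up sample: five resolves with latencies summing to 9900 microseconds.
render_histogram confidence_resolve_latency_microseconds 9900 \
    "1000:3" "5000:1" "+Inf:1"
```

With this sample input the `le="5000"` bucket reports 4 because it includes the three observations from the `le="1000"` bucket; this is what "cumulative" means in the README's description.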

confidence-cloudflare-resolver/deployer/script.sh

Lines changed: 58 additions & 0 deletions
@@ -366,6 +366,63 @@ else
     echo "⚠️ Could not check queue status (HTTP $QUEUE_STATUS)"
 fi

+# Create KV namespace for /metrics endpoint if it doesn't exist
+if [ -n "$WORKER_NAME_PREFIX" ]; then
+    KV_NAMESPACE_TITLE="${WORKER_NAME_PREFIX}-resolver-metrics"
+else
+    KV_NAMESPACE_TITLE="resolver-metrics"
+fi
+
+ENABLE_METRICS=${ENABLE_METRICS:=}
+if [ -z "$ENABLE_METRICS" ]; then
+    echo "ℹ️ ENABLE_METRICS not set; skipping KV namespace creation (/metrics endpoint disabled)"
+    KV_NAMESPACE_ID=""
+else
+
+echo "🔍 Checking if KV namespace '$KV_NAMESPACE_TITLE' exists..."
+KV_LIST=$(curl -sS -w "%{http_code}" \
+    -H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
+    "https://api.cloudflare.com/client/v4/accounts/${CLOUDFLARE_ACCOUNT_ID}/storage/kv/namespaces?per_page=100")
+KV_LIST_STATUS="${KV_LIST: -3}"
+KV_LIST_BODY="${KV_LIST%???}"
+
+KV_NAMESPACE_ID=""
+if [ "$KV_LIST_STATUS" = "200" ]; then
+    KV_NAMESPACE_ID=$(printf "%s" "$KV_LIST_BODY" | jq -r ".result[] | select(.title == \"${KV_NAMESPACE_TITLE}\") | .id" 2>/dev/null || true)
+fi
+
+if [ -z "$KV_NAMESPACE_ID" ]; then
+    echo "📦 KV namespace '$KV_NAMESPACE_TITLE' not found, creating..."
+    KV_CREATE_RESP=$(curl -sS -w "%{http_code}" -X POST \
+        -H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
+        -H "Content-Type: application/json" \
+        -d "{\"title\": \"${KV_NAMESPACE_TITLE}\"}" \
+        "https://api.cloudflare.com/client/v4/accounts/${CLOUDFLARE_ACCOUNT_ID}/storage/kv/namespaces")
+    KV_CREATE_STATUS="${KV_CREATE_RESP: -3}"
+    KV_CREATE_BODY="${KV_CREATE_RESP%???}"
+    if [ "$KV_CREATE_STATUS" = "200" ] || [ "$KV_CREATE_STATUS" = "201" ]; then
+        KV_NAMESPACE_ID=$(printf "%s" "$KV_CREATE_BODY" | jq -r '.result.id')
+        echo "✅ KV namespace '$KV_NAMESPACE_TITLE' created (id: $KV_NAMESPACE_ID)"
+    else
+        echo "⚠️ Failed to create KV namespace (HTTP $KV_CREATE_STATUS), /metrics will be unavailable"
+    fi
+else
+    echo "✅ KV namespace '$KV_NAMESPACE_TITLE' already exists (id: $KV_NAMESPACE_ID)"
+fi
+
+# Append KV binding to wrangler.toml if namespace was created
+if [ -n "$KV_NAMESPACE_ID" ]; then
+    cat >> wrangler.toml <<EOF
+
+[[kv_namespaces]]
+binding = "CONFIDENCE_METRICS_KV"
+id = "$KV_NAMESPACE_ID"
+EOF
+    echo "✅ Added CONFIDENCE_METRICS_KV binding to wrangler.toml"
+fi
+
+fi # end ENABLE_METRICS check
+
 # Update worker name and queue name in wrangler.toml if using prefix
 if [ -n "$WORKER_NAME_PREFIX" ]; then
     sed -i.tmp "s/^name = .*/name = \"$WORKER_NAME\"/" wrangler.toml
@@ -479,6 +536,7 @@ add_wrangler_deploy_args_from_lines "WRANGLER_DEPLOY_ARGS" "$WRANGLER_DEPLOY_ARG
 # only deploy if NO_DEPLOY is not set
 if test -z "$NO_DEPLOY"; then
     wrangler deploy "${WRANGLER_DEPLOY_ARGS_ARRAY[@]}"
+
 else
     echo "NO_DEPLOY is set, skipping deploy"
 fi
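
Reviewer note: the script captures body and status in one `curl -sS -w "%{http_code}"` call and then splits them with `${VAR: -3}` and `${VAR%???}`. The sketch below isolates that idiom so it can be tried without a network call. The helper names `split_status` and `split_body` and the sample response are hypothetical; the substring form requires bash rather than plain POSIX `sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating how a combined "<body><3-digit status>"
# string, as produced by curl -w "%{http_code}", can be split in bash.

split_status() {
    local resp="$1"
    # The space before -3 is required: without it bash would parse ":-"
    # as the "use default value" expansion instead of a negative offset.
    printf '%s' "${resp: -3}"
}

split_body() {
    local resp="$1"
    # %??? removes a suffix of exactly three characters (the status code).
    printf '%s' "${resp%???}"
}

# Simulated response; no request is sent here.
resp='{"result":{"id":"abc123"}}200'
echo "status=$(split_status "$resp")"
echo "body=$(split_body "$resp")"
```

This relies on curl writing the `-w` output after the body with nothing in between, which is why the script does not add a newline to the format string.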
