fix(guard): add metrics#2480
fix(guard): add metrics#2480MasterPtato wants to merge 1 commit into05-30-chore_expose_namespace_in_otel_attributesfrom
Conversation
There was a problem hiding this comment.
PR Summary
This PR adds comprehensive metrics tracking to the Rivet Guard service, focusing on monitoring system performance and request handling.
- Added Prometheus metrics in
metrics.rsfor tracking TCP connections, active requests, route cache size, rate limiters, and in-flight counters - Modified
server.rsto increment/decrement connection and request metrics at key lifecycle points - Changed rate limiting in
proxy_service.rsto use IP addresses instead of actor IDs for better tracking - Added metrics server configuration in
main.rswith a dedicated task for metrics collection - Configured Prometheus target for guard service on port 8091 with 15-second scrape interval
7 file(s) reviewed, 6 comment(s)
Edit PR Review Bot Settings | Greptile
| metrics::ACTIVE_REQUESTS_TOTAL.inc(); | ||
|
|
||
| let result = service_clone.process(req).await.map_err(|err| GlobalErrorWrapper{err}); | ||
|
|
||
| metrics::ACTIVE_REQUESTS_TOTAL.dec(); | ||
|
|
||
| result |
There was a problem hiding this comment.
style: Consider using a try-finally pattern to ensure metrics are always decremented, even if process() panics.
|
|
||
| // Accept TLS connection in a separate task to avoid ownership issues | ||
| tokio::spawn(async move { | ||
| metrics::TCP_CONNECTIONS_TOTAL.inc(); |
There was a problem hiding this comment.
style: TCP connection metric incremented before TLS handshake. Consider moving after successful handshake to avoid counting failed connections.
| endpoint: "http://127.0.0.1:8091".into(), | ||
| scrape_interval: 15, |
There was a problem hiding this comment.
logic: Port 8091 is already used by the worker service (see line 215). Consider using a different port for guard metrics to avoid conflicts.
| components::vector::PrometheusTarget { | ||
| endpoint: "http://127.0.0.1:8091".into(), | ||
| scrape_interval: 15, | ||
| }, |
There was a problem hiding this comment.
logic: Missing /metrics path in endpoint URL which is required for Prometheus scraping
| components::vector::PrometheusTarget { | |
| endpoint: "http://127.0.0.1:8091".into(), | |
| scrape_interval: 15, | |
| }, | |
| components::vector::PrometheusTarget { | |
| endpoint: "http://127.0.0.1:8091/metrics".into(), | |
| scrape_interval: 15, | |
| }, |
| // Start metrics server | ||
| let metrics_task = tokio::spawn(rivet_metrics::run_standalone(config.clone())); |
There was a problem hiding this comment.
logic: No cleanup/shutdown handling for metrics task if main server exits or Ctrl+C is received. Should implement graceful shutdown.
| Err(e) => bail!( | ||
| "Failed to build dynamic hostname actor routing regex: {}", | ||
| e | ||
| ), | ||
| }; |
There was a problem hiding this comment.
style: Error handling style is inconsistent with the static regex case below (lines 181-184). Consider using the same style for both cases.
Deploying rivet with
|
| Latest commit: |
bd93030
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://73a1cb0c.rivet.pages.dev |
| Branch Preview URL: | https://05-27-fix-guard-add-metrics.rivet.pages.dev |
6db518e to
240796d
Compare
6c38ea5 to
37074b9
Compare
37074b9 to
4d3b056
Compare
Deploying rivet-hub with
|
| Latest commit: |
bd93030
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://a72cdbb1.rivet-hub-7jb.pages.dev |
| Branch Preview URL: | https://05-27-fix-guard-add-metrics.rivet-hub-7jb.pages.dev |
Deploying rivet-studio with
|
| Latest commit: |
bd93030
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://2c56ae00.rivet-studio.pages.dev |
| Branch Preview URL: | https://05-27-fix-guard-add-metrics.rivet-studio.pages.dev |
4d3b056 to
bd93030
Compare
8f8d1e6 to
7b38fd6
Compare
7b38fd6 to
3115384
Compare
bd93030 to
617a367
Compare

Changes