---
title: Create Observability Dashboards
id: observability-dashboards
skillLevel: advanced
applicationPatternId: observability
summary: Design effective dashboards to visualize your Effect application metrics.
tags:
  - observability
  - dashboards
  - grafana
  - visualization
rule:
  description: Create focused dashboards that answer specific questions about system health.
author: PaulJPhilp
related:
  - observability-prometheus
  - observability-alerting
lessonOrder: 1
---
Design dashboards that answer specific questions about system health, performance, and user experience.
Good dashboards provide:
Quick health check - See problems at a glance
Trend analysis - Spot gradual degradation
Debugging aid - Correlate metrics during incidents
Capacity planning - Forecast resource needs
1. Service Overview Dashboard
import { Metric, MetricBoundaries } from "effect"

// ============================================
// Key metrics for the overview dashboard
// ============================================

// RED metrics (Rate, Errors, Duration)
const requestRate = Metric.counter("http_requests_total")
const errorRate = Metric.counter("http_errors_total")
const requestDuration = Metric.histogram(
  "http_request_duration_seconds",
  MetricBoundaries.fromIterable([0.01, 0.05, 0.1, 0.5, 1, 5])
)

// USE metrics (Utilization, Saturation, Errors)
const cpuUtilization = Metric.gauge("cpu_utilization_percent")
const memoryUsage = Metric.gauge("memory_usage_bytes")
const connectionPoolSize = Metric.gauge("connection_pool_active")

// Business metrics
const ordersProcessed = Metric.counter("orders_processed_total")
const revenueTotal = Metric.counter("revenue_dollars_total")
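As a usage sketch (building on the metric definitions above; handleRequest and respond are illustrative names, not framework APIs), the RED metrics can be recorded around each request:
import { Effect, Metric } from "effect"

// Record RED metrics around a (hypothetical) request handler
const handleRequest = <A, E, R>(respond: Effect.Effect<A, E, R>) =>
  Effect.gen(function* () {
    yield* Metric.increment(requestRate)
    const start = Date.now()
    const result = yield* respond.pipe(
      // Count failures without swallowing the error
      Effect.tapError(() => Metric.increment(errorRate))
    )
    // Histogram boundaries are in seconds, so convert from ms
    yield* Metric.update(requestDuration, (Date.now() - start) / 1000)
    return result
  })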
2. Grafana Dashboard JSON
{
  "title": "Effect Application Overview",
  "panels": [
    {
      "title": "Request Rate",
      "type": "timeseries",
      "targets": [
        {
          "expr": "rate(http_requests_total[5m])",
          "legendFormat": "{{method}} {{path}}"
        }
      ],
      "gridPos": { "x": 0, "y": 0, "w": 8, "h": 6 }
    },
    {
      "title": "Error Rate",
      "type": "timeseries",
      "targets": [
        {
          "expr": "rate(http_errors_total[5m]) / rate(http_requests_total[5m]) * 100",
          "legendFormat": "Error %"
        }
      ],
      "gridPos": { "x": 8, "y": 0, "w": 8, "h": 6 }
    },
    {
      "title": "P99 Latency",
      "type": "timeseries",
      "targets": [
        {
          "expr": "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))",
          "legendFormat": "P99"
        }
      ],
      "gridPos": { "x": 16, "y": 0, "w": 8, "h": 6 }
    },
    {
      "title": "Active Connections",
      "type": "gauge",
      "targets": [
        {
          "expr": "connection_pool_active",
          "legendFormat": "Connections"
        }
      ],
      "gridPos": { "x": 0, "y": 6, "w": 6, "h": 4 }
    }
  ]
}
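To load this JSON without clicking through the UI, one option is Grafana's HTTP API. A minimal sketch, assuming GRAFANA_URL and GRAFANA_TOKEN environment variables and the standard POST /api/dashboards/db endpoint:
import { Effect } from "effect"

// Push a dashboard definition to Grafana's dashboard API
const provisionDashboard = (dashboard: unknown) =>
  Effect.tryPromise(() =>
    fetch(`${process.env.GRAFANA_URL}/api/dashboards/db`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.GRAFANA_TOKEN}`
      },
      body: JSON.stringify({ dashboard, overwrite: true })
    })
  )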
3. SLO Dashboard Queries
// ============================================
// SLO-focused metrics (PromQL)
// ============================================
// Availability: % of successful requests
const availabilitySLO = `
sum(rate(http_requests_total{status!~"5.."}[5m]))
/
sum(rate(http_requests_total[5m]))
* 100
`
// Latency: % of requests under threshold
const latencySLO = `
sum(rate(http_request_duration_seconds_bucket{le="0.5"}[5m]))
/
sum(rate(http_request_duration_seconds_count[5m]))
* 100
`
// Error budget remaining
const errorBudget = `
1 - (
(1 - (sum(rate(http_requests_total{status!~"5.."}[30d])) / sum(rate(http_requests_total[30d]))))
/
(1 - 0.999) # 99.9% SLO target
)
`
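These expressions can also be evaluated outside Grafana, for example from a reporting script. A sketch against Prometheus's standard /api/v1/query endpoint, with PROMETHEUS_URL as an assumed environment variable:
import { Effect } from "effect"

// Evaluate a PromQL expression via the Prometheus HTTP API
const queryPrometheus = (expr: string) =>
  Effect.tryPromise(() =>
    fetch(
      `${process.env.PROMETHEUS_URL}/api/v1/query?query=${encodeURIComponent(expr)}`
    ).then((res) => res.json())
  )

// e.g. current availability as a percentage
const currentAvailability = queryPrometheus(availabilitySLO)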
4. Effect-Specific Dashboard
import { Effect, Metric, MetricBoundaries } from "effect"

// Effect runtime metrics
const fiberCount = Metric.gauge("effect_fibers_active")
const fiberCreated = Metric.counter("effect_fibers_created_total")
const effectDuration = Metric.histogram(
  "effect_duration_seconds",
  MetricBoundaries.fromIterable([0.01, 0.05, 0.1, 0.5, 1, 5])
)

// Service layer metrics
const serviceCallsTotal = Metric.counter("service_calls_total")
const serviceErrors = Metric.counter("service_errors_total")

// Instrument Effect programs with call, error, and duration metrics
const instrumentedProgram = <A, E, R>(
  name: string,
  effect: Effect.Effect<A, E, R>
) =>
  Effect.gen(function* () {
    yield* Metric.increment(serviceCallsTotal.pipe(Metric.tagged("service", name)))
    const startTime = Date.now()
    const result = yield* effect.pipe(
      Effect.tapError(() =>
        Metric.increment(serviceErrors.pipe(Metric.tagged("service", name)))
      )
    )
    // Duration is recorded only for successful calls; failures
    // short-circuit before this point
    const duration = (Date.now() - startTime) / 1000
    yield* Metric.update(
      effectDuration.pipe(Metric.tagged("service", name)),
      duration
    )
    return result
  })
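Usage sketch (fetchUser is a hypothetical stand-in for any effectful service call):
// Every invocation now feeds the service_calls_total,
// service_errors_total, and effect_duration_seconds panels
const fetchUser = Effect.succeed({ id: 1, name: "Ada" })

const program = instrumentedProgram("UserService.fetchUser", fetchUser)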
Putting it together, a typical overview layout:
┌──────────────────────────────────────────────────────────┐
│                    Service Overview                      │
├──────────────────┬──────────────────┬──────────────────┤
│   Request Rate   │    Error Rate    │   P99 Latency    │
│   ▄▄▄▄▄▄▄▄▄▄▄    │   ▄▄▄▄▄▄▄▄▄▄▄    │   ▄▄▄▄▄▄▄▄▄▄▄    │
├──────────────────┴──────────────────┴──────────────────┤
│                     Resource Usage                      │
├──────────────────┬──────────────────┬──────────────────┤
│     CPU: 45%     │  Memory: 2.1GB   │ Connections: 42  │
├──────────────────┴──────────────────┴──────────────────┤
│                     SLO Compliance                      │
├──────────────────┬──────────────────┬──────────────────┤
│   Availability   │   Latency SLO    │   Error Budget   │
│      99.95%      │      98.2%       │  75% remaining   │
└──────────────────┴──────────────────┴──────────────────┘
Common PromQL queries at a glance:
Metric          PromQL
Request rate    rate(http_requests_total[5m])
Error rate      rate(http_errors_total[5m]) / rate(http_requests_total[5m])
P50 latency     histogram_quantile(0.5, rate(http_request_duration_seconds_bucket[5m]))
P99 latency     histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))
Saturation      active / max (e.g. connection pool usage vs. capacity)
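To keep dashboards and alert rules from drifting apart, it can help to define each query once and reuse it everywhere; a sketch with an illustrative queries record:
// Single source of truth for PromQL used by dashboards and alerts
const queries = {
  requestRate: "rate(http_requests_total[5m])",
  errorRate:
    "rate(http_errors_total[5m]) / rate(http_requests_total[5m])",
  p99Latency:
    "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))"
} as const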
Best practices:
Start with RED - Rate, Errors, Duration
Add USE - Utilization, Saturation, Errors
Include SLOs - Show compliance
Group logically - Related metrics together
Use consistent time ranges - 5m, 1h, 24h