Monitoring
Griffen Fargo edited this page Apr 21, 2026 · 2 revisions
Self-hosted monitoring with Prometheus, Grafana, and Alertmanager for strut stacks.
strut monitoring deploy --env prod
strut monitoring add-target my-stack --env prod
strut monitoring alert-channel add email \
--to alerts@yourdomain.com \
--from monitoring@yourdomain.com \
--resend-api-key re_xxx
strut monitoring alert-channel test email
strut monitoring status
strut monitoring reload

| Component | Purpose | Default Port |
|---|---|---|
| Prometheus | Metrics collection, time-series DB, alert rules | 9090 |
| Grafana | Dashboards and visualization | 3000 |
| Alertmanager | Alert routing, grouping, notifications | 9093 |
| Node Exporter | System metrics (CPU, memory, disk, network) | 9100 |
| cAdvisor | Per-container resource metrics | 8080 |
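Once the stack is deployed, the default ports in the table can be probed directly. The following is a minimal sketch, not part of strut: the function name `check_monitoring_endpoints` is hypothetical, and the health paths are the upstream defaults of each component, assuming strut does not remap them.

```shell
# Probe each component's health endpoint on its default port.
# The host defaults to localhost; pass your VPS address as the first argument.
check_monitoring_endpoints() {
  local host="${1:-localhost}"
  local failed=0
  local name url
  while read -r name url; do
    # -f makes curl fail on HTTP errors; stdin is detached so curl cannot consume the list.
    if curl -fsS --max-time 5 -o /dev/null "$url" </dev/null; then
      echo "ok   $name"
    else
      echo "FAIL $name"
      failed=1
    fi
  done <<EOF
prometheus http://$host:9090/-/healthy
grafana http://$host:3000/api/health
alertmanager http://$host:9093/-/healthy
node-exporter http://$host:9100/metrics
cadvisor http://$host:8080/healthz
EOF
  return "$failed"
}
```

The function returns non-zero if any endpoint fails, so it can gate a later step in a script.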

strut monitoring deploy --env prod

Edit .monitoring-prod.env:
RESEND_API_KEY=re_xxx
ALERT_EMAIL_TO=alerts@yourdomain.com
ALERT_EMAIL_FROM=monitoring@yourdomain.com
GRAFANA_ADMIN_USER=admin
GRAFANA_ADMIN_PASSWORD=<secure-password>
# Optional
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/xxx

strut monitoring add-target my-stack --env prod
strut monitoring add-target another-stack --env prod

Open http://<vps-ip>:3000 and log in with the credentials from the env file.
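If the Grafana login fails, an unset or placeholder value in the env file is a likely cause. The sketch below is a hypothetical helper, not a strut command. It assumes the five non-optional variables listed in the env file are the ones strut requires, and it treats values wrapped in angle brackets (such as `<secure-password>`) as placeholders.

```shell
# Report required keys that are unset, empty, or still a <placeholder>.
# The key list mirrors the non-optional variables in .monitoring-prod.env (assumption).
check_monitoring_env() {
  local env_file="$1"
  local missing=0
  local key value
  for key in RESEND_API_KEY ALERT_EMAIL_TO ALERT_EMAIL_FROM \
             GRAFANA_ADMIN_USER GRAFANA_ADMIN_PASSWORD; do
    # Use the last KEY=value line in the file and keep everything after the first "=".
    value="$(grep -E "^${key}=" "$env_file" 2>/dev/null | tail -n 1 | cut -d= -f2- || true)"
    case "$value" in
      ''|'<'*'>')
        echo "unset or placeholder: $key"
        missing=1
        ;;
    esac
  done
  return "$missing"
}
```

For example, `check_monitoring_env .monitoring-prod.env` prints one line per problem key and returns non-zero if any were found.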
strut monitoring alert-channel add email \
--to alerts@yourdomain.com \
--from monitoring@yourdomain.com \
--resend-api-key re_xxx

SMTP settings: host smtp.resend.com, port 587, username resend, password set to your Resend API key, TLS required.
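This page does not show the Alertmanager configuration that strut generates from these settings. As a rough sketch only, a hand-written Alertmanager receiver using the same SMTP values might look like the fragment below. The receiver name `email` is an assumption; the field names are standard Alertmanager `email_configs` options.

```yaml
receivers:
  - name: email                        # hypothetical receiver name
    email_configs:
      - to: alerts@yourdomain.com
        from: monitoring@yourdomain.com
        smarthost: smtp.resend.com:587
        auth_username: resend
        auth_password: re_xxx          # the Resend API key
        require_tls: true
```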
strut monitoring alert-channel add slack \
--webhook-url https://hooks.slack.com/services/xxx

strut monitoring alert-channel add webhook \
--url https://your-service.com/alerts --method POST

strut monitoring alert-route critical email,slack
strut monitoring alert-route warning email
strut monitoring alert-route info slack

| Severity | Triggers | Examples |
|---|---|---|
| Critical | Immediate action | Service down, DB unreachable, disk >95% |
| Warning | Attention needed | CPU >80% for 5 min, memory >90%, disk >85% |
| Info | Informational | Backup completed, deployment successful |
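In Alertmanager terms, the routing commands and severity levels above correspond to a routing tree keyed on the `severity` label. The fragment below is a sketch of what an equivalent hand-written tree could look like, not the configuration strut actually writes. It assumes receivers named `email` and `slack` exist, and it uses `continue: true` so that critical alerts reach both.

```yaml
route:
  receiver: email              # fallback for alerts without a matching severity (assumption)
  routes:
    - matchers:
        - severity="critical"
      receiver: email
      continue: true           # keep evaluating so the next route also matches
    - matchers:
        - severity="critical"
      receiver: slack
    - matchers:
        - severity="warning"
      receiver: email
    - matchers:
        - severity="info"
      receiver: slack
```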
- ServiceDown: up == 0 for 2+ minutes → critical
- HighCPU: CPU >80% for 5+ minutes → warning
- HighMemory: Memory >90% for 5+ minutes → warning
- DiskSpaceLow: Disk <15% free for 5+ minutes → warning
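The rule definitions behind these built-in alerts are not reproduced on this page. For orientation, rules with the same thresholds could be written as below, assuming the standard Node Exporter metric names. The group name and the exact expressions are illustrative and may differ from what strut ships.

```yaml
groups:
  - name: default_alerts_sketch      # hypothetical group name
    rules:
      - alert: ServiceDown
        expr: up == 0
        for: 2m
        labels:
          severity: critical
      - alert: HighCPU
        # Percentage of non-idle CPU time per instance over the last 5 minutes.
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
      - alert: HighMemory
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 5m
        labels:
          severity: warning
      - alert: DiskSpaceLow
        expr: node_filesystem_avail_bytes / node_filesystem_size_bytes * 100 < 15
        for: 5m
        labels:
          severity: warning
```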
Create stacks/monitoring/prometheus/alerts/custom.yml:
groups:
  - name: custom_alerts
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High error rate detected"

Reload: strut monitoring reload
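A syntax error in a rules file can prevent Prometheus from loading it, so it can be worth validating the file before reloading. The sketch below wraps `promtool check rules`, which ships with Prometheus. The function name `validate_rules` is hypothetical, and the sketch assumes `promtool` is installed on the machine where you edit the file.

```shell
# Validate a Prometheus rules file with promtool.
# Defaults to the custom rules path used above; pass another path as the first argument.
validate_rules() {
  local rules_file="${1:-stacks/monitoring/prometheus/alerts/custom.yml}"
  if ! command -v promtool >/dev/null 2>&1; then
    echo "promtool not found on PATH" >&2
    return 127
  fi
  promtool check rules "$rules_file"
}
```

Chaining the two steps, as in `validate_rules && strut monitoring reload`, reloads only when the check passes.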
- Stack Overview — all stacks at a glance (health, resources, alerts, uptime)
- Stack Health — per-stack service availability, response times, error rates
- Resource Usage — CPU/memory/disk/network per service with trends
- Backup Status — success rate, last backup time, verification, storage
ssh ubuntu@<remote-vps>
docker run -d --name node-exporter --restart unless-stopped -p 9100:9100 prom/node-exporter
docker run -d --name cadvisor --restart unless-stopped -p 8080:8080 \
-v /:/rootfs:ro -v /var/run:/var/run:ro -v /sys:/sys:ro \
-v /var/lib/docker/:/var/lib/docker:ro gcr.io/cadvisor/cadvisor

autossh -M 0 -f -N \
-o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" \
-L 9100:localhost:9100 ubuntu@<remote-vps>

strut monitoring add-target my-stack --env prod --vps vps-2

strut monitoring update --env prod
strut monitoring restart --env prod
strut monitoring backup prometheus --env prod
strut monitoring backup grafana --env prod

- Use a strong Grafana admin password
- Don't expose metrics endpoints publicly
- Use SSH tunnels for cross-VPS (not open ports)
- Restrict metrics ports via firewall
- Secure webhook URLs and API keys
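One way to follow the "SSH tunnels, not open ports" advice on a remote VPS is to publish the exporter on the loopback interface only. Docker's `-p` flag accepts a bind address, and a port published on all interfaces can bypass host firewall rules such as ufw. The sketch below reuses the Node Exporter command from this page with a `127.0.0.1` bind; the function name is hypothetical.

```shell
# Start Node Exporter published on 127.0.0.1 only, so it is reachable
# through an SSH tunnel to localhost:9100 but not from the public network.
# Same image and options as the command above, except for the bind address.
run_node_exporter_loopback() {
  docker run -d --name node-exporter --restart unless-stopped \
    -p 127.0.0.1:9100:9100 prom/node-exporter
}
```

The autossh tunnel shown earlier forwards to `localhost:9100` on the remote host, so it continues to work with this binding.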
- Security Posture: strut posture runs a scheduled/CI security audit across every stack (placeholder secrets, exposed ports, missing resource limits, env files in git)
- Notifications: strut can fire Slack/Discord/webhook events on deploy.success, backup.success, health.fail, drift.detected, etc., independent of the monitoring stack. Useful when you want deploy pings without running Prometheus.
- Debugging: strut status-all gives a one-shot cross-stack overview without needing dashboards
strut · v0.1.0