@@ -7,6 +7,9 @@ A production-ready pipeline automation system built with:
 - **Redis** - State management, locks, caching
 - **PostgreSQL** - Persistence
 - **AI Safety Module** - Failure prediction & anomaly handling
+- **Prometheus + Grafana** - Metrics collection and dashboards
+- **ELK Stack (Elasticsearch, Logstash, Kibana)** - Centralized logging
+- **Sentry** - Error monitoring and tracing

 ## Architecture Overview

@@ -98,6 +101,10 @@ docker-compose up -d
 | FastAPI Docs | http://localhost:8000/api/docs | - |
 | PostgreSQL | localhost:5432 | airflow / airflow |
 | Redis | localhost:6379 | - |
+| Prometheus | http://localhost:9090 | - |
+| Grafana | http://localhost:3000 | admin / admin |
+| Kibana | http://localhost:5601 | - |
+| Elasticsearch | http://localhost:9200 | - |

 ### 4. Create Your First Pipeline

@@ -176,6 +183,26 @@ curl -X POST http://localhost:8000/api/executions/pipeline-xxx/execute
 | GET | `/metrics` | Dashboard metrics |
 | GET | `/insights` | AI insights |

+
+## Observability & Monitoring
+
+### Metrics (Prometheus)
+- The backend exposes Prometheus metrics at `/metrics` via `prometheus-fastapi-instrumentator`.
+- Prometheus scrapes `backend:8000/metrics` every 15 seconds.
+
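A quick way to check what the scrape target returns is to fetch `/metrics` and parse the Prometheus text exposition format. The sketch below is illustrative only: the function names are hypothetical, the parser handles just the common `name{labels} value` shape, and `http://localhost:8000/metrics` assumes the backend port is published on the host as in the service table.

```python
import re
from urllib.request import urlopen

# Matches `name{labels} value` or `name value`, with an optional trailing timestamp.
_SAMPLE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# Matches one `key="value"` pair inside the braces (escaped quotes allowed).
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and `# HELP` / `# TYPE` comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # Ignore lines this minimal parser does not understand.
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://localhost:8000/metrics", timeout=5.0):
    """Fetch and parse the metrics page (URL is an assumption, see above)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling `fetch_metrics()` against a running stack returns the same samples Prometheus would scrape, which helps confirm the instrumentation is active before opening Grafana.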
+### Dashboards (Grafana)
+- Grafana runs on port `3000` and can use Prometheus (`http://prometheus:9090`) as a data source.
+- Default credentials are `admin/admin`; change them in production.
+
+### Centralized Logging (ELK Stack)
+- Elasticsearch stores the indexed logs.
+- Logstash listens on `5000` (TCP JSON) and `5044` (Beats) and forwards events to Elasticsearch.
+- Kibana visualizes indices such as `flexiroaster-backend-*`.
+
+### Error Monitoring (Sentry)
+- Set `SENTRY_DSN` to enable Sentry exception capture and tracing for FastAPI.
+- Optional tuning: `SENTRY_ENVIRONMENT`, `SENTRY_TRACES_SAMPLE_RATE`, and `SENTRY_PROFILES_SAMPLE_RATE`.
+
 ## Configuration

 ### Environment Variables
@@ -188,6 +215,9 @@ curl -X POST http://localhost:8000/api/executions/pipeline-xxx/execute
 | `EXECUTOR_STAGE_TIMEOUT` | `120` | Stage timeout in seconds |
 | `AI_BLOCK_HIGH_RISK` | `false` | Block high-risk executions |
 | `AI_RISK_THRESHOLD_HIGH` | `0.7` | High risk threshold |
+| `SENTRY_DSN` | `""` | Enables Sentry when set |
+| `SENTRY_ENVIRONMENT` | `development` | Sentry environment label |
+| `SENTRY_TRACES_SAMPLE_RATE` | `0.1` | Fraction of requests traced |

 ### Airflow Variables
