> These instructions provide context for AI agents (GitHub Copilot, Copilot Chat, agentic workflows) working on this codebase. They describe architecture, conventions, patterns, constraints, and lessons learned so that an AI agent can make correct decisions without re-discovering project structure.

**What this is:** An AI-assisted, modular ETL (Extract, Transform, Load) platform where each data operation is an independent Flask microservice. Pipelines are orchestrated via Apache Airflow DAGs or via the AI agent (natural language → YAML → execution).

**Primary use case:** HR / People Analytics and E-commerce — the platform ships with production-ready pipelines for the IBM HR Attrition dataset and e-commerce order analytics, plus a weather API demo. Bundled demo datasets in `data/demo/` allow out-of-the-box testing.
@@ -93,8 +95,8 @@ All services propagate `X-Correlation-ID` header for end-to-end request tracing.
**cAdvisor**: `gcr.io/cadvisor/cadvisor:latest`, port 8088→8080, `--docker_only=true`. Provides per-container CPU/memory metrics scraped by Prometheus.

**Grafana provisioning**: datasource and dashboard are auto-loaded at startup from `prometheus/grafana/provisioning/`. No manual configuration needed. Dashboard uid: `etl-monitoring-v1`.

**Prometheus metric naming**: counters follow the pattern `{slug}_requests_total` / `{slug}_success_total` / `{slug}_error_total` where slug is the service key (e.g., `extract_csv_requests_total`). PromQL aggregation pattern: `{__name__=~".*_requests_total", job=~".+-service"}`.

**Airflow admin user**: created automatically at first boot by the Dockerfile CMD (idempotent — skipped if already exists). Credentials: `admin` / `admin`.
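
The Prometheus counter naming pattern described above can be sketched as a small helper. This is an illustrative sketch only, not the platform's actual code: the function name `metric_names` is hypothetical, and replacing hyphens with underscores is an assumption made so that a hyphenated key would still yield a name Prometheus accepts.

```python
import re


def metric_names(service_key: str) -> dict[str, str]:
    """Build the three counter names for a service key (hypothetical helper).

    Assumption: characters outside [a-zA-Z0-9_] are replaced with "_",
    because Prometheus metric names may not contain hyphens.
    """
    slug = re.sub(r"[^a-zA-Z0-9_]", "_", service_key)
    return {
        "requests": f"{slug}_requests_total",
        "success": f"{slug}_success_total",
        "error": f"{slug}_error_total",
    }


if __name__ == "__main__":
    # "extract_csv" is the service key used as an example in the text above.
    for kind, name in metric_names("extract_csv").items():
        print(kind, name)
```

Keeping the suffixes fixed is what lets a single regex selector such as `.*_requests_total` aggregate across every service.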
### Network

Single bridge network `etl-network`. Services reference each other by container name (e.g., `http://clean-nan-service:5002`).
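
As a rough illustration of container-name addressing combined with `X-Correlation-ID` propagation, the sketch below builds a service URL and the header to forward on an outgoing call. The helper names `service_url` and `outgoing_headers`, the `/health` path, and the fallback of generating a fresh UUID when no ID arrives are all assumptions for illustration, not confirmed platform behavior.

```python
import uuid

CORRELATION_HEADER = "X-Correlation-ID"


def service_url(container_name: str, port: int, path: str = "") -> str:
    """Build a URL that addresses a service by its container name."""
    base = f"http://{container_name}:{port}"
    if not path:
        return base
    return f"{base}/{path.lstrip('/')}"


def outgoing_headers(incoming: dict[str, str]) -> dict[str, str]:
    """Reuse the caller's correlation ID, or create one (assumed fallback).

    Note: this plain-dict lookup is case-sensitive; a real HTTP framework's
    header object usually is not.
    """
    correlation_id = incoming.get(CORRELATION_HEADER) or str(uuid.uuid4())
    return {CORRELATION_HEADER: correlation_id}


if __name__ == "__main__":
    # "/health" is a hypothetical endpoint used only for this example.
    print(service_url("clean-nan-service", 5002, "/health"))
    print(outgoing_headers({CORRELATION_HEADER: "abc-123"}))
```

These hostnames resolve only from containers attached to `etl-network`; from the host machine, the published ports would be used instead.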
@@ -641,13 +660,18 @@ These are hard-won insights from building and debugging the platform. They shoul
| Task | Command / File |
|---|---|
| Start all services | `make up` or `docker compose up -d` |