Flask REST API — End-to-End DevOps Project

Production-grade DevOps reference architecture built around a Flask + PostgreSQL REST API. Covers the full lifecycle from local development to production-style Kubernetes orchestration with GitOps and observability.

Audience: This documentation is written for engineers preparing for DevOps / SRE interviews at the 3-5 years of experience level. Each module covers the implementation, the concepts behind it, troubleshooting notes from real issues hit while building it, interview Q&A, STAR stories, and the equivalent cloud services.


High-Level Architecture

                                ┌──────────────────────┐
                                │   Developer pushes   │
                                │   code to GitHub     │
                                └──────────┬───────────┘
                                           │
                                           ▼
   ┌────────────────────── CI Pipeline (GitHub Actions) ────────────────────────┐
   │                                                                            │
   │  build job:                                                                │
   │   • Run unit tests (pytest)                                                │
   │   • Build Docker image                                                     │
   │   • Push to DockerHub (tagged with commit SHA)                             │
   │                                                                            │
   │  update-helm job:                                                          │
   │   • sed updates helm/application/values.yaml with new image tag            │
   │   • Commits & pushes to main branch                                        │
   │                                                                            │
   └────────────────────────────────────┬───────────────────────────────────────┘
                                        │ git push main
                                        ▼
   ┌─────────────────── ArgoCD (GitOps Controller in K8s) ──────────────────────┐
   │                                                                            │
   │  Detects values.yaml diff → renders Helm chart → applies new manifests     │
   │  → Kubernetes does rolling update                                          │
   │                                                                            │
   └────────────────────────────────────┬───────────────────────────────────────┘
                                        │
                                        ▼
   ┌──────────────────── 3-Node Minikube Cluster (Production-like) ───────────┐
   │                                                                          │
   │  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────────────────┐  │
   │  │ App Tier        │  │ Database Tier   │  │ Dependent Services Tier  │  │
   │  │ (minikube)      │  │ (minikube-m02)  │  │ (minikube-m03)           │  │
   │  │                 │  │                 │  │                          │  │
   │  │ • Flask API ×3  │  │ • Postgres      │  │ • Vault                  │  │
   │  │                 │  │                 │  │ • External Secrets Op    │  │
   │  │                 │  │                 │  │ • Prometheus + AM        │  │
   │  │                 │  │                 │  │ • Grafana                │  │
   │  │                 │  │                 │  │ • Loki                   │  │
   │  │                 │  │                 │  │ • Promtail (DS)          │  │
   │  │                 │  │                 │  │ • Postgres exporter      │  │
   │  │                 │  │                 │  │ • Blackbox exporter      │  │
   │  └─────────────────┘  └─────────────────┘  └──────────────────────────┘  │
   │                                                                          │
   └────────────────────────────────────┬─────────────────────────────────────┘
                                        │ Slack alerts
                                        ▼
                              ┌───────────────────┐
                              │   #alerts channel │
                              └───────────────────┘

Tech Stack

Layer             Tools
Application       Flask 3 + SQLAlchemy + Flask-Migrate + Gunicorn + PostgreSQL 15
Testing           pytest (unit) + Locust (load)
Containerization  Docker (multi-stage build) + Docker Compose (local stack) + nginx (reverse proxy)
CI                GitHub Actions on a self-hosted runner; SHA-based image tagging; auto-update Helm values
IaC               Terraform (AWS VPC, EC2, ALB) + Ansible (system bootstrapping); written, not deployed
Orchestration     Kubernetes via Minikube (3-node cluster mimicking multi-AZ)
Secrets           HashiCorp Vault + External Secrets Operator (ESO)
Packaging         Helm 3 charts for every component
GitOps            ArgoCD with App-of-Apps pattern + multi-source pattern for upstream charts
Observability     Prometheus + Grafana + Loki + Promtail + Alertmanager + exporters
Alerting          Alertmanager → Slack via Incoming Webhooks

Module Index

The documentation is structured as a curriculum. Read in order for the full picture, or jump to the topic you need.

Goal: Get the Flask API running locally with venv + Postgres + migrations + seed data.

  • Tech stack & architecture
  • Step-by-step walkthrough with the why for each step
  • 12 interview Q&A on Python venvs, WSGI, migrations, secrets, connection pooling
  • 2 STAR stories — moving the project broke the venv, AirPlay port conflict
  • Production hardening + AWS mapping

Goal: Unit tests with pytest + in-memory SQLite, load tests with Locust.

  • The test pyramid + why in-memory SQLite for unit tests
  • pytest fixture pattern + setup/teardown
  • Locust scenarios + headless CI mode
  • 14 interview Q&A on test pyramid, RED method, contract testing, load test interpretation
  • 2 STAR stories — duplicated Prometheus registry breaking tests, finding the throughput limit
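
The fixture pattern with in-memory SQLite can be sketched with the standard library alone. This is an illustration under assumptions, not the project's test code: the students table and its columns are hypothetical, and a context manager stands in for what would normally be a pytest fixture that yields the connection.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def in_memory_db():
    """Setup: fresh in-memory database with a schema. Teardown: close it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, used only for this example.
    conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    try:
        yield conn      # the test body runs here
    finally:
        conn.close()    # teardown runs even if the test fails


def test_insert_and_read():
    with in_memory_db() as db:
        db.execute("INSERT INTO students (name) VALUES (?)", ("Ada",))
        row = db.execute("SELECT id, name FROM students").fetchone()
        assert row == (1, "Ada")


def test_starts_empty():
    # A new database per test means no state leaks between tests.
    with in_memory_db() as db:
        count = db.execute("SELECT COUNT(*) FROM students").fetchone()[0]
        assert count == 0


if __name__ == "__main__":
    test_insert_and_read()
    test_starts_empty()
    print("ok")
```

Because each test gets its own database, the tests can run in any order and need no running Postgres instance, which is the main reason to use in-memory SQLite at the unit level.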

Goal: Package the app as a Docker image; orchestrate the multi-service stack with Compose.

  • Multi-stage Dockerfile (build vs main; image size 80 MB vs 400 MB)
  • Layer caching, EXPOSE vs port mapping, CMD vs ENTRYPOINT
  • Compose deep dive: networking, healthchecks, depends_on, volumes
  • 14 troubleshooting issues — including the famous 127.0.0.1 Gunicorn binding bug
  • 20 interview Q&A — containers vs VMs, layers, distroless, signal handling
  • 3 STAR stories — debugging container networking, image optimization, port conflicts
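
The 127.0.0.1 binding issue comes down to this: Gunicorn's default bind address is 127.0.0.1:8000, and inside a container the loopback interface belongs to the container's own network namespace, so traffic arriving through a published port never reaches it. The sketch below is a small helper that flags loopback-only bind strings; the function name is_loopback_bind is an assumption for illustration and is not part of Gunicorn or of this repository.

```python
import ipaddress


def is_loopback_bind(bind: str) -> bool:
    """Return True when a HOST:PORT bind string only listens on loopback."""
    host, _, _port = bind.rpartition(":")
    host = host.strip("[]")          # allow bracketed IPv6 such as [::1]:8000
    if host == "localhost":
        return True
    return ipaddress.ip_address(host).is_loopback


if __name__ == "__main__":
    for bind in ("127.0.0.1:8000", "0.0.0.0:8000", "[::1]:8000"):
        status = "loopback only" if is_loopback_bind(bind) else "reachable via published port"
        print(f"{bind}: {status}")
```

A check like this could run as a unit test against the container's start command, so a loopback bind is caught before the image is built.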

Goal: On every push, run tests → build image → push to DockerHub → update Helm values in main.

  • Self-hosted vs GitHub-hosted runners (when to use which)
  • Pipeline walkthrough — build and update-helm jobs
  • Cross-platform sed, GH_PAT scopes, secret management
  • The CI → GitOps handoff
  • 14 troubleshooting issues — setup-python permission errors, push protection, branch confusion
  • 20 interview Q&A — CI vs CD, OIDC, matrix builds, blue/green
  • 3 STAR stories — setup-python mac issue, Slack webhook leak, CI pushing to wrong branch
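
The update-helm job rewrites the image tag with sed. The same substitution is sketched below in Python to show what the step does, under the assumption that values.yaml contains a single indented tag: key. The real file layout is not shown in this README, so the sample content and the function name set_image_tag are hypothetical.

```python
import re

# Hypothetical values.yaml content, used only for this example.
SAMPLE_VALUES = """\
image:
  repository: example/flask-api
  tag: "oldsha"
replicaCount: 3
"""


def set_image_tag(values_yaml: str, new_tag: str) -> str:
    """Replace the value of the first `tag:` line, keeping its indentation."""
    pattern = re.compile(r"^([ \t]*tag:[ \t]*).*$", re.MULTILINE)
    updated, count = pattern.subn(
        lambda m: f'{m.group(1)}"{new_tag}"', values_yaml, count=1
    )
    if count != 1:
        # Fail loudly rather than commit an unchanged file.
        raise ValueError("no 'tag:' line found in values file")
    return updated


if __name__ == "__main__":
    print(set_image_tag(SAMPLE_VALUES, "abc1234"))
```

Raising when nothing matches is the part sed does not do by default: a sed expression that matches nothing exits successfully, and the pipeline would then commit no change without any error.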

Goal: Provision AWS infra (VPC, subnets, NAT, ALB, EC2) with Terraform; configure machines with Ansible.

  • for_each vs count (with the index-shifting trap)
  • State management — local vs S3 + DynamoDB locking
  • Drift detection (apply -refresh-only vs apply)
  • Modules, workspaces, backends
  • 24 deep Terraform troubleshooting scenarios — state lock recovery, drift, RDS replacement traps, EIP costs, rate limits
  • 4 production scenario deep-dives — manually deleted IAM role, CloudFormation migration, leaked tfstate, concurrent applies
  • 32 interview Q&A across Terraform + Ansible
  • 3 STAR stories — state recovery via S3 versioning, RDS rename trap, $4K/mo cost cleanup
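
The index-shifting trap with count can be simulated without Terraform. The sketch below models resource addresses as dictionary keys: count addresses resources by list position, for_each by a stable key. The resource address aws_subnet.this and the subnet names are assumptions for illustration, and plan is a rough stand-in for a diff, not Terraform's actual planner.

```python
def count_addresses(names):
    """count-style: each resource is addressed by its list position."""
    return {f"aws_subnet.this[{i}]": name for i, name in enumerate(names)}


def for_each_addresses(names):
    """for_each-style: each resource is addressed by a stable key."""
    return {f'aws_subnet.this["{name}"]': name for name in names}


def plan(before, after):
    """Rough diff of two address maps: what changes, disappears, or appears."""
    changed = sorted(a for a in before.keys() & after.keys() if before[a] != after[a])
    removed = sorted(before.keys() - after.keys())
    added = sorted(after.keys() - before.keys())
    return {"changed": changed, "removed": removed, "added": added}


if __name__ == "__main__":
    old, new = ["a", "b", "c"], ["a", "c"]   # remove the middle element
    print("count:   ", plan(count_addresses(old), count_addresses(new)))
    print("for_each:", plan(for_each_addresses(old), for_each_addresses(new)))
```

With count, removing "b" makes index 1 point at "c" and drops index 2, so a resource that was never touched in the config is affected. With for_each, only the "b" address disappears.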

Goal: Deploy Vault, ESO, Postgres, Flask onto a 3-node minikube cluster.

  • 3-node architecture with workload-to-node placement (type=application/database/dependent_services)
  • Vault deployment, init/unseal flow, KV-v2
  • ESO architecture + setup + force-sync pattern
  • Deep concepts (the bulk of the doc):
    • Networking & CoreDNS — full query flow, ndots:5, Service types, kube-proxy modes
    • Storage — PV/PVC/StorageClass, access modes, reclaim policies
    • Workloads — Deployment vs StatefulSet vs DaemonSet
    • Probes — liveness vs readiness vs startup
    • Rollouts & rollbacks — RollingUpdate vs Recreate, maxSurge math
    • Autoscaling — HPA + VPA + Cluster Autoscaler + KEDA with full YAMLs
    • NetworkPolicies (with DNS gotcha)
    • RBAC — Role vs ClusterRole
    • Operators & CRDs — ESO walkthrough as the canonical example
  • ~50 interview Q&A across architecture / networking / storage / workloads / probes / autoscaling / secrets / operators / scenarios
  • 4 STAR stories — pod-to-pod debug, stuck namespace, PVC permissions, HPA implementation
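
The maxSurge math mentioned above can be worked through in a few lines. In a RollingUpdate, a percentage maxSurge is rounded up and a percentage maxUnavailable is rounded down against the replica count. The sketch below assumes the Kubernetes defaults of 25% for both; the function names are hypothetical, and the fallback for the case where both resolve to zero reflects my understanding of the Deployment controller rather than anything stated in this README.

```python
def resolve(value, replicas, round_up):
    """Resolve an absolute number or a percentage string such as '25%'."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer ceil/floor avoids floating-point surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rolling_update_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return the pod-count limits that hold during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        # Assumption: with both at zero the rollout could not progress,
        # so one pod is allowed to be unavailable.
        unavailable = 1
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


if __name__ == "__main__":
    print(rolling_update_bounds(3))    # the three Flask replicas in this setup
    print(rolling_update_bounds(10))
```

For the three Flask replicas this gives at most 4 pods and at least 3 available, which means the rollout adds one new pod before removing an old one.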

Goal: Package K8s manifests as Helm charts; deploy via ArgoCD using the App-of-Apps pattern.

  • Why GitOps (push vs pull)
  • Helm deep dive — Chart.yaml, templates, hooks, helpers, sub-charts
  • ArgoCD deep dive — Application CRD, sync policies, App-of-Apps, multi-source pattern, sync waves
  • The full CI → GitOps → Deploy loop
  • 14 troubleshooting issues — CRD version mismatches, ConfigMap-doesn't-restart, sync errors
  • 35 interview Q&A — GitOps principles, Helm internals, ArgoCD architecture, AppProjects, ApplicationSet
  • 4 STAR stories — adopting GitOps, ConfigMap checksum trick, CRD version mismatch, selfHeal saving the day
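
The ConfigMap checksum trick relies on one idea: a Deployment only rolls its pods when the pod template changes, so a hash of the ConfigMap content is written into a pod template annotation. In a chart this is usually done with Helm's sha256sum template function; the Python sketch below only illustrates the mechanism. The annotation key checksum/config is a common convention, and the function names and sample config content are assumptions.

```python
import hashlib


def config_checksum(rendered_configmap: str) -> str:
    """SHA-256 hex digest of the rendered ConfigMap text."""
    return hashlib.sha256(rendered_configmap.encode("utf-8")).hexdigest()


def pod_template_annotations(rendered_configmap: str) -> dict:
    """Annotations to place on the pod template (not on the Deployment itself)."""
    return {"checksum/config": config_checksum(rendered_configmap)}


if __name__ == "__main__":
    # Hypothetical config content for the example.
    v1 = "LOG_LEVEL: info\n"
    v2 = "LOG_LEVEL: debug\n"
    before = pod_template_annotations(v1)
    after = pod_template_annotations(v2)
    # A different annotation value changes the pod template, which triggers a rollout.
    print("rollout triggered:", before != after)
```

If the config is unchanged the digest is identical, so routine syncs do not restart pods.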

Goal: Build a full observability layer with metrics, logs, dashboards, and Slack alerts.

  • Three pillars (metrics, logs, traces); USE & RED methods
  • Component-by-component setup
  • Application instrumentation (prometheus-flask-exporter)
  • 10 alert rules with severity + USE/RED classification
  • Pre-loaded Grafana dashboards (5 community dashboards)
  • Slack integration — using slack_api_url_file to keep webhook out of Git
  • 18 troubleshooting issues — PVC permission fixes, schema mismatches, cardinality issues
  • 30+ interview Q&A across SLI/SLO/SLA, USE/RED, Prometheus internals, Loki vs ELK, Grafana, real scenarios
  • 4 STAR stories — Prometheus permission debug, Slack webhook leak, true-positive alert, observability from scratch
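
The RED method (Rate, Errors, Duration) can be made concrete with a small calculation over raw request records. This is a plain-Python illustration, not how Prometheus computes these values (it works from counters and histograms); the Request class, the red_summary function, and the sample numbers are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Request:
    status: int         # HTTP status code
    duration_s: float   # time taken to serve the request, in seconds


def red_summary(requests, window_s):
    """Summarize a window of requests as Rate, Errors, and Duration (p95)."""
    total = len(requests)
    if total == 0:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "p95_duration_s": 0.0}
    errors = sum(1 for r in requests if r.status >= 500)
    durations = sorted(r.duration_s for r in requests)
    rank = -(-95 * total // 100)             # nearest-rank p95, integer ceil
    return {
        "rate_per_s": total / window_s,      # Rate
        "error_ratio": errors / total,       # Errors
        "p95_duration_s": durations[rank - 1],  # Duration
    }


if __name__ == "__main__":
    sample = [Request(200, 0.1)] * 18 + [Request(500, 1.0)] * 2
    print(red_summary(sample, window_s=10))
```

An alert rule in this style would fire on the error ratio or the p95 duration crossing a threshold for a sustained period, rather than on any single slow request.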

Why This Order Matters

1. Local setup           → understand the app
   ↓
2. Containerize          → make it portable
   ↓
3. CI pipeline           → automate build + test + push
   ↓
4. IaC                   → provision infra reproducibly
   ↓
5. Kubernetes            → run it at scale
   ↓
6. GitOps                → declarative, audited deployments
   ↓
7. Observability         → see what's happening in production

Each layer depends on the previous. The CI pipeline (3) makes sense because we can build a container (2) of the app (1). Kubernetes (5) is meaningful because we have a CI artifact (3). GitOps (6) governs Kubernetes (5). Observability (7) closes the loop — you can finally see what your fully-automated, fully-orchestrated system is doing in real time.


Quick Start

Prerequisites: macOS / Linux, Python 3.10+, Docker Desktop, kubectl, helm, minikube, brew (for installs).

1. Local app

git clone https://github.com/akhil27051999/Flask-REST-API.git
cd Flask-REST-API
python3 -m venv venv && source venv/bin/activate
pip install -r app/requirements.txt
# Configure .env (see Module 1) and run:
flask db upgrade
python app/seed.py
flask run

2. Containerized stack (Docker Compose)

export ENV_FILE=.env
docker compose up -d --build
docker exec flask-app-container flask db upgrade --directory app/migrations
docker exec -e PYTHONPATH=/api flask-app-container python /api/app/seed.py
curl http://localhost/students/3

3. Kubernetes stack

# Cluster
minikube start --nodes=3 --driver=docker --cpus=2 --memory=2048
kubectl label node minikube       type=application --overwrite
kubectl label node minikube-m02   type=database --overwrite
kubectl label node minikube-m03   type=dependent_services --overwrite

# Install ArgoCD
helm repo add argo https://argoproj.github.io/argo-helm
helm install argocd argo/argo-cd -n argocd --create-namespace

# Bootstrap everything via App-of-Apps
kubectl apply -f argocd/root-app.yaml

# Manual bootstrap steps (Vault unseal, vault-token secret) — see Module 5

4. Trigger the GitOps loop

Edit helm/application/values.yaml (e.g., bump replicas), commit, push:

git add helm/application/values.yaml
git commit -m "scale flask-api to 3"
git push origin main
# ArgoCD picks it up within 3 min — or trigger immediate sync:
kubectl patch application flask-api -n argocd --type merge \
  -p '{"operation":{"sync":{"revision":"main"}}}'

Repository Layout

Flask-REST-API/
├── app/                    # Flask source code + Dockerfile + requirements.txt + migrations
├── tests/                  # pytest unit tests + Locust load tests
├── nginx/                  # nginx reverse proxy config + Dockerfile
├── docker-compose.yaml     # Local multi-service stack
├── .github/workflows/      # CI pipeline
├── terraform/              # AWS infrastructure (VPC, EC2, ALB, etc.)
├── ansible/                # Configuration management for VMs
├── k8s/                    # Raw K8s manifests (legacy/reference; see helm/ for current)
├── helm/                   # Helm charts for every component
│   ├── application/        # Flask app
│   ├── vault/              # HashiCorp Vault
│   ├── external-secrets/   # ESO + custom resources
│   ├── database/           # PostgreSQL
│   ├── prometheus/         # Prometheus + Alertmanager
│   ├── grafana/            # Grafana
│   ├── loki/               # Loki
│   ├── promtail/           # Promtail
│   ├── postgres-exporter/  # Postgres metrics exporter
│   └── blackbox-exporter/  # HTTP probe exporter
├── argocd/                 # ArgoCD Applications
│   ├── root-app.yaml       # The App-of-Apps that manages everything
│   ├── vault.yaml
│   ├── external-secrets.yaml
│   ├── database.yaml
│   ├── application.yaml
│   └── observability-*.yaml
└── docs/                   # This documentation (modules + images)

Skills Demonstrated

Building this project end-to-end exercises the core tools expected in a DevOps/SRE role at the 3-5 year level:

  • Python web app with proper structure, migrations, testing
  • Multi-stage Dockerfile with layer caching, alpine base, non-root patterns
  • Docker Compose for local multi-service development
  • CI/CD with GitHub Actions (self-hosted runner) → DockerHub → automatic Helm updates
  • Terraform for AWS provisioning (VPC, subnets, NAT, SGs, ALB, EC2) with state, modules, lifecycle
  • Ansible for configuration management (idempotent, role-based pattern)
  • Kubernetes — multi-node cluster, node labels, all major workload types, networking, storage, RBAC, autoscaling
  • HashiCorp Vault — initialization, unsealing, KV secrets engine
  • External Secrets Operator — bridging Vault and K8s native Secrets
  • Helm — chart structure, templating, hooks, releases
  • ArgoCD — Applications, App-of-Apps, multi-source, sync policies, selfHeal
  • Observability stack — Prometheus, Grafana, Loki, Promtail, Alertmanager, exporters
  • PromQL + LogQL for queries
  • Slack alerting with proper secret handling
  • GitOps workflows — pull-based deploys, drift detection, rollbacks via git revert
  • Real production troubleshooting — Gunicorn binding, fsGroup permissions, push protection, sync errors, state locking

Interview Prep Checklist

For each module, you should be able to:

  • Explain the architecture — what it does and why it's structured that way
  • Walk through one debugging story (use the STAR stories as templates)
  • Answer 5+ deep questions on the topic from memory
  • Sketch the data flow on a whiteboard
  • Discuss production hardening — what would change at scale
  • Map to cloud equivalents (AWS / GCP / Azure)

Contributing / Extending

This project is a learning + interview prep artifact. Suggested extensions to deepen further:

  • Add distributed tracing (Jaeger / Tempo + OpenTelemetry) for the third pillar
  • Add service mesh (Istio / Linkerd) for mTLS + traffic policies
  • Add chaos engineering (Litmus / Chaos Mesh) — kill pods during load tests
  • Migrate Postgres from Deployment → StatefulSet with HA replication
  • Add policy as code with OPA Gatekeeper / Kyverno
  • Add cert-manager + Ingress with TLS
  • Implement canary deployments with Argo Rollouts

License

MIT (or your license of choice).

Author

Akhil Thyadi — built as a hands-on portfolio project for DevOps / SRE roles.

About

A production-ready Flask REST API with a complete DevOps implementation. Features Docker containerization, Kubernetes orchestration, Helm charts, GitHub Actions CI/CD, ArgoCD GitOps, and a full observability stack with Prometheus, Grafana, and Loki. Infrastructure as Code via Terraform and Ansible for automated provisioning (written, not deployed).
