Add Docker + Kubernetes deployment stack with autoscaling and worker isolation #78
Merged: fuzziecoder merged 1 commit into `codex/fix-remaining-issues-and-raise-pr` from `codex/implement-containerization-and-orchestration` on Feb 24, 2026
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| .git | ||
| .gitignore | ||
| node_modules | ||
| dist | ||
| __pycache__ | ||
| *.pyc | ||
| .env | ||
| .env.* | ||
| *.db | ||
| pipeline/airflow/logs |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| FROM python:3.11-slim | ||
| | ||
| ENV PYTHONDONTWRITEBYTECODE=1 \ | ||
|     PYTHONUNBUFFERED=1 | ||
| | ||
| WORKDIR /app | ||
| | ||
| RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/* | ||
| | ||
| COPY backend/requirements.txt /tmp/requirements.txt | ||
| RUN pip install --no-cache-dir -r /tmp/requirements.txt | ||
| | ||
| COPY backend /app/backend | ||
| | ||
| EXPOSE 8000 | ||
| | ||
| CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| FROM node:20-alpine AS build | ||
| WORKDIR /app | ||
| | ||
| COPY package*.json ./ | ||
| RUN npm ci | ||
| | ||
| COPY . . | ||
| RUN npm run build | ||
| | ||
| FROM nginx:1.27-alpine | ||
| COPY --from=build /app/dist /usr/share/nginx/html | ||
| EXPOSE 80 | ||
| CMD ["nginx", "-g", "daemon off;"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| """Simple isolated worker process for Kubernetes worker pool deployments.""" | ||
| | ||
| from __future__ import annotations | ||
| | ||
| import signal | ||
| import time | ||
| | ||
| RUNNING = True | ||
| | ||
| | ||
| def _shutdown_handler(signum, frame): | ||
|     global RUNNING | ||
|     RUNNING = False | ||
| | ||
| | ||
| def main() -> None: | ||
|     signal.signal(signal.SIGTERM, _shutdown_handler) | ||
|     signal.signal(signal.SIGINT, _shutdown_handler) | ||
| | ||
|     print("FlexiRoaster worker started. Waiting for tasks...") | ||
|     while RUNNING: | ||
|         # Placeholder for queue-based execution workers. | ||
|         # This keeps the worker pool isolated from API pods. | ||
|         print("worker-heartbeat") | ||
|         time.sleep(15) | ||
| | ||
|     print("FlexiRoaster worker shutting down gracefully.") | ||
| | ||
| | ||
| if __name__ == "__main__": | ||
|     main() | ||
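
The loop above only re-checks `RUNNING` between sleeps, so it matters how `time.sleep()` behaves when a signal handler runs in the middle of a sleep. The following is a minimal sketch for measuring that, assuming a POSIX system and execution on the main thread. The helper name `measure_sleep_with_signal` and the use of `SIGALRM` as a stand-in for SIGTERM are illustrative and not part of the PR.

```python
import signal
import time


def measure_sleep_with_signal(total: float, signal_after: float) -> tuple[float, bool]:
    """Sleep for `total` seconds while SIGALRM fires after `signal_after` seconds.

    Returns (elapsed_seconds, handler_ran). POSIX only; must run on the main thread.
    """
    handler_ran = False

    def _handler(signum, frame):
        # Like the worker's handler: record the signal and return without raising.
        nonlocal handler_ran
        handler_ran = True

    previous = signal.signal(signal.SIGALRM, _handler)
    try:
        signal.setitimer(signal.ITIMER_REAL, signal_after)  # one-shot timer
        start = time.monotonic()
        time.sleep(total)  # resumes for the remaining time after the handler returns
        elapsed = time.monotonic() - start
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending timer
        # Restore whatever handler was installed before the measurement.
        signal.signal(signal.SIGALRM, previous if previous is not None else signal.SIG_DFL)
    return elapsed, handler_ran
```

On Python 3.5 or later, a call such as `measure_sleep_with_signal(5, 1)` should report roughly the full five seconds with the handler having run, because the interrupted sleep is resumed when the handler returns normally.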
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| # Containerization & Orchestration | ||
| | ||
| This repository now includes container and Kubernetes deployment assets that support: | ||
| | ||
| - Docker-based packaging | ||
| - Kubernetes orchestration | ||
| - Horizontal auto-scaling (HPA) | ||
| - Rolling updates | ||
| - Self-healing pods (liveness/readiness probes) | ||
| - Worker isolation (dedicated worker deployment + node scheduling hints) | ||
| | ||
| ## Docker | ||
| | ||
| ### Build images | ||
| | ||
| ```bash | ||
| docker build -f Dockerfile.backend -t flexiroaster-backend:local . | ||
| docker build -f Dockerfile.frontend -t flexiroaster-frontend:local . | ||
| ``` | ||
| | ||
| ### Run locally with Docker Compose | ||
| | ||
| ```bash | ||
| docker compose up --build | ||
| ``` | ||
| | ||
| Services: | ||
| - Frontend: `http://localhost:8080` | ||
| - Backend: `http://localhost:8000` | ||
| - Worker: isolated background worker process | ||
| | ||
| ## Kubernetes | ||
| | ||
| Kubernetes manifests are under `deploy/k8s` and can be applied with Kustomize: | ||
| | ||
| ```bash | ||
| kubectl apply -k deploy/k8s | ||
| ``` | ||
| | ||
| ### What is included | ||
| | ||
| - `backend.yaml`: API deployment/service with rolling update strategy and probes. | ||
| - `frontend.yaml`: web deployment/service with rolling update strategy and probes. | ||
| - `worker.yaml`: isolated worker deployment with node selector/tolerations. | ||
| - `autoscaling.yaml`: HPAs for backend and worker. | ||
| - `namespace.yaml`: dedicated namespace. | ||
| | ||
| ## Managed Kubernetes options | ||
| | ||
| These manifests are cloud-agnostic and can be deployed to: | ||
| | ||
| - **AWS EKS** | ||
| - **Google GKE** | ||
| - **Azure AKS** | ||
| | ||
| ### Recommended managed-cluster setup | ||
| | ||
| 1. Create separate node pools for API/web and workers. | ||
| 2. Label/taint worker nodes to enforce isolation: | ||
|    - Label: `workload=worker` | ||
|    - Taint: `dedicated=worker:NoSchedule` | ||
| 3. Install Metrics Server (or provider equivalent) for HPA. | ||
| 4. Use a cloud load balancer + Ingress controller for public access. | ||
| 5. Push images to a cloud registry (ECR/GAR/ACR) and update image references. | ||
| | ||
| ## Notes | ||
| | ||
| - Replace placeholder image names (`ghcr.io/your-org/...`) before deployment. | ||
| - Consider adding PodDisruptionBudgets, NetworkPolicies, and secrets management for production hardening. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| apiVersion: autoscaling/v2 | ||
| kind: HorizontalPodAutoscaler | ||
| metadata: | ||
|   name: flexiroaster-backend-hpa | ||
|   namespace: flexiroaster | ||
| spec: | ||
|   scaleTargetRef: | ||
|     apiVersion: apps/v1 | ||
|     kind: Deployment | ||
|     name: flexiroaster-backend | ||
|   minReplicas: 2 | ||
|   maxReplicas: 10 | ||
|   metrics: | ||
|     - type: Resource | ||
|       resource: | ||
|         name: cpu | ||
|         target: | ||
|           type: Utilization | ||
|           averageUtilization: 70 | ||
| --- | ||
| apiVersion: autoscaling/v2 | ||
| kind: HorizontalPodAutoscaler | ||
| metadata: | ||
|   name: flexiroaster-worker-hpa | ||
|   namespace: flexiroaster | ||
| spec: | ||
|   scaleTargetRef: | ||
|     apiVersion: apps/v1 | ||
|     kind: Deployment | ||
|     name: flexiroaster-worker | ||
|   minReplicas: 2 | ||
|   maxReplicas: 10 | ||
|   metrics: | ||
|     - type: Resource | ||
|       resource: | ||
|         name: cpu | ||
|         target: | ||
|           type: Utilization | ||
|           averageUtilization: 70 |
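
Both HPAs target 70% average CPU utilization, which is measured relative to each pod's CPU request. As a rough illustration of how that target becomes a replica count, here is a sketch of the scaling rule described in the Kubernetes documentation: `desired = ceil(current * currentMetric / targetMetric)`, skipped when the ratio is within a tolerance, then clamped to the min/max bounds. The function name and parameter names are assumptions for illustration. The real controller also handles missing metrics, unready pods, and stabilization windows, all of which are omitted here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float = 70.0,
    min_replicas: int = 2,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Approximate the HPA replica calculation for a single utilization metric."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas/maxReplicas bounds from the manifest.
    return max(min_replicas, min(max_replicas, desired))
```

With the manifest's values, `desired_replicas(2, 140)` returns `4`, `desired_replicas(2, 72)` stays at `2` because it falls inside the tolerance band, and `desired_replicas(10, 200)` is capped at `10`.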
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| apiVersion: apps/v1 | ||
| kind: Deployment | ||
| metadata: | ||
|   name: flexiroaster-backend | ||
|   namespace: flexiroaster | ||
| spec: | ||
|   replicas: 2 | ||
|   strategy: | ||
|     type: RollingUpdate | ||
|     rollingUpdate: | ||
|       maxUnavailable: 0 | ||
|       maxSurge: 1 | ||
|   selector: | ||
|     matchLabels: | ||
|       app: flexiroaster-backend | ||
|   template: | ||
|     metadata: | ||
|       labels: | ||
|         app: flexiroaster-backend | ||
|     spec: | ||
|       containers: | ||
|         - name: backend | ||
|           image: ghcr.io/your-org/flexiroaster-backend:latest | ||
|           imagePullPolicy: IfNotPresent | ||
|           ports: | ||
|             - containerPort: 8000 | ||
|           readinessProbe: | ||
|             httpGet: | ||
|               path: /health | ||
|               port: 8000 | ||
|             initialDelaySeconds: 10 | ||
|             periodSeconds: 10 | ||
|           livenessProbe: | ||
|             httpGet: | ||
|               path: /health | ||
|               port: 8000 | ||
|             initialDelaySeconds: 20 | ||
|             periodSeconds: 20 | ||
|           resources: | ||
|             requests: | ||
|               cpu: "250m" | ||
|               memory: "256Mi" | ||
|             limits: | ||
|               cpu: "1000m" | ||
|               memory: "1Gi" | ||
| --- | ||
| apiVersion: v1 | ||
| kind: Service | ||
| metadata: | ||
|   name: flexiroaster-backend | ||
|   namespace: flexiroaster | ||
| spec: | ||
|   selector: | ||
|     app: flexiroaster-backend | ||
|   ports: | ||
|     - name: http | ||
|       port: 8000 | ||
|       targetPort: 8000 |
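
Both probes request `GET /health` on port 8000, so the application served as `backend.main:app` has to answer that path with a success status (Kubernetes treats HTTP codes from 200 up to, but not including, 400 as success). The PR does not show that module. The following is a hypothetical, dependency-free ASGI stand-in: the name `app` matches the uvicorn target, and everything else is assumed. It shows the minimum a probe target has to do.

```python
import json


async def app(scope, receive, send):
    """Minimal ASGI callable: 200 on GET /health, 404 for anything else."""
    if scope["type"] != "http":
        return  # non-HTTP scopes (lifespan, websocket) are ignored in this sketch
    healthy = scope["method"] == "GET" and scope["path"] == "/health"
    status = 200 if healthy else 404
    body = json.dumps({"status": "ok" if healthy else "not found"}).encode()
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [
            (b"content-type", b"application/json"),
            (b"content-length", str(len(body)).encode()),
        ],
    })
    await send({"type": "http.response.body", "body": body})
```

A real backend would more likely expose this route through its web framework. The point is only that the handler should be cheap and should not depend on slow downstream calls, since the liveness probe restarts the container when it fails repeatedly.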
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| apiVersion: apps/v1 | ||
| kind: Deployment | ||
| metadata: | ||
|   name: flexiroaster-frontend | ||
|   namespace: flexiroaster | ||
| spec: | ||
|   replicas: 2 | ||
|   strategy: | ||
|     type: RollingUpdate | ||
|     rollingUpdate: | ||
|       maxUnavailable: 0 | ||
|       maxSurge: 1 | ||
|   selector: | ||
|     matchLabels: | ||
|       app: flexiroaster-frontend | ||
|   template: | ||
|     metadata: | ||
|       labels: | ||
|         app: flexiroaster-frontend | ||
|     spec: | ||
|       containers: | ||
|         - name: frontend | ||
|           image: ghcr.io/your-org/flexiroaster-frontend:latest | ||
|           imagePullPolicy: IfNotPresent | ||
|           ports: | ||
|             - containerPort: 80 | ||
|           readinessProbe: | ||
|             httpGet: | ||
|               path: / | ||
|               port: 80 | ||
|             initialDelaySeconds: 5 | ||
|             periodSeconds: 10 | ||
|           livenessProbe: | ||
|             httpGet: | ||
|               path: / | ||
|               port: 80 | ||
|             initialDelaySeconds: 15 | ||
|             periodSeconds: 20 | ||
|           resources: | ||
|             requests: | ||
|               cpu: "100m" | ||
|               memory: "128Mi" | ||
|             limits: | ||
|               cpu: "500m" | ||
|               memory: "512Mi" | ||
| --- | ||
| apiVersion: v1 | ||
| kind: Service | ||
| metadata: | ||
|   name: flexiroaster-frontend | ||
|   namespace: flexiroaster | ||
| spec: | ||
|   selector: | ||
|     app: flexiroaster-frontend | ||
|   ports: | ||
|     - name: http | ||
|       port: 80 | ||
|       targetPort: 80 |
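
The Deployments in this PR use `maxUnavailable: 0` and `maxSurge: 1`. Here is a small sketch of what those two fields bound during a rollout, assuming the rounding rules from the Kubernetes Deployment documentation (percentages round up for `maxSurge` and down for `maxUnavailable`). `rollout_bounds` is an illustrative helper, not a Kubernetes API.

```python
import math
from typing import Union

IntOrPercent = Union[int, str]


def _resolve(value: IntOrPercent, replicas: int, round_up: bool) -> int:
    """Turn an absolute count or an "N%" string into a pod count."""
    if isinstance(value, int):
        return value
    fraction = replicas * int(value.rstrip("%")) / 100
    return math.ceil(fraction) if round_up else math.floor(fraction)


def rollout_bounds(
    replicas: int, max_surge: IntOrPercent, max_unavailable: IntOrPercent
) -> tuple[int, int]:
    """Return (minimum available pods, maximum total pods) during a rolling update."""
    surge = _resolve(max_surge, replicas, round_up=True)
    unavailable = _resolve(max_unavailable, replicas, round_up=False)
    return replicas - unavailable, replicas + surge
```

With the values above, `rollout_bounds(2, 1, 0)` gives `(2, 3)`: the rollout never drops below two available pods and may briefly run three, so the cluster needs headroom for one extra pod per Deployment while an update is in progress.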
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| apiVersion: kustomize.config.k8s.io/v1beta1 | ||
| kind: Kustomization | ||
| resources: | ||
| - namespace.yaml | ||
| - backend.yaml | ||
| - frontend.yaml | ||
| - worker.yaml | ||
| - autoscaling.yaml |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| apiVersion: v1 | ||
| kind: Namespace | ||
| metadata: | ||
|   name: flexiroaster |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| apiVersion: apps/v1 | ||
| kind: Deployment | ||
| metadata: | ||
|   name: flexiroaster-worker | ||
|   namespace: flexiroaster | ||
| spec: | ||
|   replicas: 2 | ||
|   strategy: | ||
|     type: RollingUpdate | ||
|     rollingUpdate: | ||
|       maxUnavailable: 0 | ||
|       maxSurge: 1 | ||
|   selector: | ||
|     matchLabels: | ||
|       app: flexiroaster-worker | ||
|   template: | ||
|     metadata: | ||
|       labels: | ||
|         app: flexiroaster-worker | ||
|     spec: | ||
|       nodeSelector: | ||
|         workload: worker | ||
|       tolerations: | ||
|         - key: "dedicated" | ||
|           operator: "Equal" | ||
|           value: "worker" | ||
|           effect: "NoSchedule" | ||
|       containers: | ||
|         - name: worker | ||
|           image: ghcr.io/your-org/flexiroaster-backend:latest | ||
|           imagePullPolicy: IfNotPresent | ||
|           command: ["python", "-m", "backend.worker"] | ||
|           resources: | ||
|             requests: | ||
|               cpu: "200m" | ||
|               memory: "256Mi" | ||
|             limits: | ||
|               cpu: "1000m" | ||
|               memory: "1Gi" |
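
The `nodeSelector` and `tolerations` above do different jobs. The selector restricts worker pods to nodes labelled `workload=worker`, while the toleration merely permits them onto nodes tainted `dedicated=worker:NoSchedule`; the taint itself is what keeps other pods off those nodes. Below is a simplified sketch of that matching logic, covering only `NoSchedule` taints and the `Equal`/`Exists` operators. The function names and dict shapes are illustrative, and the real scheduler considers much more.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """True if one toleration matches one taint (Equal/Exists operators only)."""
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # An empty key combined with Exists matches every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (
        toleration.get("key") == taint["key"]
        and toleration.get("value", "") == taint.get("value", "")
    )


def can_schedule(node: dict, pod: dict) -> bool:
    """Check nodeSelector labels, then require every NoSchedule taint to be tolerated."""
    labels = node.get("labels", {})
    if any(labels.get(k) != v for k, v in pod.get("nodeSelector", {}).items()):
        return False
    return all(
        any(tolerates(tol, taint) for tol in pod.get("tolerations", []))
        for taint in node.get("taints", [])
        if taint["effect"] == "NoSchedule"
    )


# Hypothetical nodes and pods mirroring the labels, taint, and toleration in this PR.
WORKER_NODE = {
    "labels": {"workload": "worker"},
    "taints": [{"key": "dedicated", "value": "worker", "effect": "NoSchedule"}],
}
API_NODE = {"labels": {}, "taints": []}
WORKER_POD = {
    "nodeSelector": {"workload": "worker"},
    "tolerations": [
        {"key": "dedicated", "operator": "Equal", "value": "worker", "effect": "NoSchedule"}
    ],
}
API_POD = {}  # no selector and no tolerations, like the backend and frontend pods
```

Under these assumptions the worker pod fits only the worker node, and a pod without the toleration is kept off it. This is why the README asks for both the label and the taint on worker nodes.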
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| version: "3.9" | ||
| | ||
| services: | ||
|   backend: | ||
|     build: | ||
|       context: . | ||
|       dockerfile: Dockerfile.backend | ||
|     ports: | ||
|       - "8000:8000" | ||
|     restart: unless-stopped | ||
|     healthcheck: | ||
|       test: ["CMD", "curl", "-f", "http://localhost:8000/health"] | ||
|       interval: 30s | ||
|       timeout: 5s | ||
|       retries: 3 | ||
|       start_period: 20s | ||
| | ||
|   frontend: | ||
|     build: | ||
|       context: . | ||
|       dockerfile: Dockerfile.frontend | ||
|     ports: | ||
|       - "8080:80" | ||
|     depends_on: | ||
|       backend: | ||
|         condition: service_healthy | ||
|     restart: unless-stopped | ||
| | ||
|   worker: | ||
|     build: | ||
|       context: . | ||
|       dockerfile: Dockerfile.backend | ||
|     command: ["python", "-m", "backend.worker"] | ||
|     depends_on: | ||
|       backend: | ||
|         condition: service_healthy | ||
|     restart: unless-stopped |
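
The backend healthcheck above retries `curl -f http://localhost:8000/health`, and the other two services wait for `service_healthy`. For checking the stack from the host after `docker compose up --build`, a small polling helper can play a similar role. This is a sketch using only the standard library. `http_ok`, `wait_until_healthy`, and the default retry values are illustrative choices, not part of the PR.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if GET `url` answers with a 2xx status within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    retries: int = 3,
    interval: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `retries` times, pausing `interval` seconds after each failure."""
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            sleep(interval)
    return False
```

For example, `wait_until_healthy(lambda: http_ok("http://localhost:8000/health"), interval=5.0)` polls the backend, and the same call against `http://localhost:8080` checks the frontend. Passing the probe as a callable keeps the retry loop testable without a running container.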
🟡 **Graceful shutdown delayed up to 15 seconds because `time.sleep()` auto-retries after signal (PEP 475)**

The worker's graceful shutdown mechanism doesn't work promptly. When SIGTERM/SIGINT is received during `time.sleep(15)`, the signal handler sets `RUNNING = False`, but due to PEP 475 (Python 3.5+), `time.sleep()` automatically retries for the remaining duration after the signal handler returns. The `while RUNNING` condition is not re-checked until the full 15-second sleep completes.

**Root Cause and Verification**

PEP 475 modified the standard library to automatically retry system calls that are interrupted by signals (EINTR). This means `time.sleep(15)` will resume sleeping for the remaining time after `_shutdown_handler` sets `RUNNING = False`.

Verified empirically: a `time.sleep(5)` interrupted by a signal after 1 second still sleeps the full 5 seconds, even though the signal handler ran at the 1-second mark.

**Actual behavior:** Worker takes up to 15 seconds to shut down after receiving SIGTERM, because `time.sleep(15)` at `backend/worker.py:25` resumes after the signal handler completes.

**Expected behavior:** Worker should exit promptly (within milliseconds) after receiving SIGTERM.

**Impact:** In Kubernetes, this means pod termination is delayed by up to 15 seconds on every rolling update or scale-down. While this is within the default 30-second `terminationGracePeriodSeconds`, it unnecessarily slows deployments and wastes resources. If the sleep interval were increased (e.g., to 60 seconds), it could exceed the grace period and cause forced kills (SIGKILL).

**Fix:** Use `threading.Event.wait()` instead of `time.sleep()`, which can be interrupted immediately.
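
A minimal sketch of that fix, assuming the rest of `backend/worker.py` stays as in the PR. `STOP`, `run_worker`, and the `interval` parameter are names introduced here for illustration, and this is not necessarily the exact change the reviewer proposed.

```python
import signal
import threading

STOP = threading.Event()


def _shutdown_handler(signum, frame):
    # Setting the event wakes any pending STOP.wait() call.
    STOP.set()


def run_worker(stop: threading.Event, interval: float = 15.0) -> int:
    """Emit heartbeats until `stop` is set; returns the number of heartbeats."""
    beats = 0
    while not stop.is_set():
        # Placeholder for queue-based execution, as in the original loop.
        print("worker-heartbeat")
        beats += 1
        stop.wait(interval)  # returns early once stop.set() has been called
    return beats


def main() -> None:
    signal.signal(signal.SIGTERM, _shutdown_handler)
    signal.signal(signal.SIGINT, _shutdown_handler)
    print("FlexiRoaster worker started. Waiting for tasks...")
    run_worker(STOP)
    print("FlexiRoaster worker shutting down gracefully.")
```

Calling `main()` under the usual `if __name__ == "__main__":` guard keeps the PR's entry point. One caveat worth checking before adopting this: `Event.set()` takes an internal lock, so a handler that calls it can in rare cases contend with the waiting thread. Some projects prefer a handler that only writes to a pipe or raises an exception.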