---
layout: default
title: "Chapter 8: Production Deployment"
parent: Firecrawl Tutorial
nav_order: 8
---
Welcome to Chapter 8: Production Deployment. In this part of Firecrawl Tutorial: Building LLM-Ready Web Scraping and Data Extraction Systems, you will build an intuitive mental model first, then move into concrete implementation details and practical production tradeoffs.
You have built a scraping and RAG pipeline that works on your laptop. Now it is time to deploy it to production -- where it must run 24/7 with high reliability, proper security, automated recovery, and full observability. This chapter covers everything from containerization to Kubernetes deployment, from secrets management to disaster recovery.
Production deployment is where the rubber meets the road. A system that works in development but fails in production provides zero value. This chapter ensures your Firecrawl-based pipeline survives the real world.
| Skill | Description |
|---|---|
| Containerization | Package workers and services in Docker containers |
| Orchestration | Deploy with Kubernetes and autoscaling |
| Secrets management | Secure API keys and credentials |
| CI/CD pipelines | Automate testing, building, and deployment |
| Observability | Monitor logs, metrics, and traces in production |
| Backup and recovery | Protect data and recover from failures |
| Operational runbooks | Respond to incidents systematically |
A production deployment separates concerns into independently deployable services, each with its own scaling policy and failure domain.
```mermaid
flowchart TD
    subgraph Ingress["Ingress Layer"]
        A[Load Balancer] --> B[API Gateway]
    end
    subgraph App["Application Layer"]
        B --> C[Scheduler Service]
        C --> D[(Redis Queue)]
        D --> E[Worker Pool]
        E --> F[Firecrawl API]
    end
    subgraph Data["Data Layer"]
        E --> G[(PostgreSQL)]
        E --> H[(Redis Cache)]
        E --> I[(Vector DB)]
    end
    subgraph Ops["Operations"]
        J[Prometheus] --> K[Grafana]
        L[Loki] --> K
        E --> J
        E --> L
        M[Alertmanager] --> N[PagerDuty/Slack]
        J --> M
    end
    classDef ingress fill:#e1f5fe,stroke:#01579b
    classDef app fill:#f3e5f5,stroke:#4a148c
    classDef data fill:#e8f5e8,stroke:#1b5e20
    classDef ops fill:#fff3e0,stroke:#e65100
    class A,B ingress
    class C,D,E,F app
    class G,H,I data
    class J,K,L,M,N ops
```
```dockerfile
# Multi-stage build for a smaller image
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

FROM python:3.11-slim
# Create a non-root user
RUN useradd --create-home --shell /bin/bash worker
WORKDIR /app
# Copy installed packages from the builder stage
COPY --from=builder /root/.local /home/worker/.local
ENV PATH=/home/worker/.local/bin:$PATH
# Copy application code
COPY src/ ./src/
COPY config/ ./config/
# Switch to the non-root user
USER worker
# Health check (raise_for_status ensures non-200 responses count as failures)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import requests; requests.get('http://localhost:8080/health', timeout=5).raise_for_status()" || exit 1
# Run the worker
CMD ["python", "src/worker.py"]
```
```yaml
# docker-compose.yml
version: "3.8"

services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: firecrawl_pipeline
      POSTGRES_USER: pipeline
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    ports:
      - "5432:5432"
    volumes:
      - pg_data:/var/lib/postgresql/data

  worker:
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      FIRECRAWL_API_KEY: ${FIRECRAWL_API_KEY}
      REDIS_URL: redis://redis:6379
      DATABASE_URL: postgresql://pipeline:${POSTGRES_PASSWORD}@postgres:5432/firecrawl_pipeline
      WORKER_CONCURRENCY: "5"
      LOG_LEVEL: info
    depends_on:
      - redis
      - postgres
    deploy:
      replicas: 3
    restart: unless-stopped

  scheduler:
    build:
      context: .
      dockerfile: Dockerfile
    command: python src/scheduler.py
    environment:
      REDIS_URL: redis://redis:6379
      DATABASE_URL: postgresql://pipeline:${POSTGRES_PASSWORD}@postgres:5432/firecrawl_pipeline
    depends_on:
      - redis
      - postgres
    restart: unless-stopped

volumes:
  redis_data:
  pg_data:
```
```python
# src/worker.py
import logging
import os
import signal
import sys
import time

from firecrawl import FirecrawlApp

# ScrapeJobQueue is the Redis-backed job queue built earlier in this
# tutorial; adjust this import to match your own module layout.
from scrape_queue import ScrapeJobQueue

logging.basicConfig(
    level=os.getenv("LOG_LEVEL", "INFO").upper(),
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("worker")

# Graceful shutdown flag, flipped by SIGTERM/SIGINT
shutdown_requested = False

def handle_shutdown(signum, frame):
    global shutdown_requested
    logger.info("Shutdown signal received, finishing current job...")
    shutdown_requested = True

signal.signal(signal.SIGTERM, handle_shutdown)
signal.signal(signal.SIGINT, handle_shutdown)

def main():
    # Load configuration from the environment
    api_key = os.environ["FIRECRAWL_API_KEY"]
    redis_url = os.environ["REDIS_URL"]
    concurrency = int(os.getenv("WORKER_CONCURRENCY", "5"))

    app = FirecrawlApp(api_key=api_key)
    queue = ScrapeJobQueue(redis_url=redis_url)
    logger.info(f"Worker started with concurrency={concurrency}")

    while not shutdown_requested:
        job = queue.dequeue()
        if job is None:
            time.sleep(1)
            continue

        url = job["url"]
        logger.info(f"Processing: {url}")
        try:
            result = app.scrape_url(url, params={
                "formats": ["markdown"],
                "onlyMainContent": True,
                "timeout": 30000,
            })
            queue.complete(job, {
                "content_length": len(result.get("markdown", "")),
                "title": result.get("metadata", {}).get("title", ""),
            })
            logger.info(f"Completed: {url}")
        except Exception as exc:
            queue.fail(job, str(exc))
            logger.error(f"Failed: {url}: {exc}")

    logger.info("Worker shutdown complete")
    sys.exit(0)

if __name__ == "__main__":
    main()
```
```yaml
# k8s/worker-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scrape-worker
  labels:
    app: scrape-worker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: scrape-worker
  template:
    metadata:
      labels:
        app: scrape-worker
    spec:
      containers:
        - name: worker
          image: your-registry/scrape-worker:latest
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "1000m"
              memory: "1Gi"
          env:
            - name: FIRECRAWL_API_KEY
              valueFrom:
                secretKeyRef:
                  name: firecrawl-secrets
                  key: api-key
            - name: REDIS_URL
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: redis-url
            - name: WORKER_CONCURRENCY
              value: "5"
            - name: LOG_LEVEL
              value: "info"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
      restartPolicy: Always
```
```yaml
# k8s/worker-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: scrape-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: scrape-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: External
      external:
        metric:
          name: redis_queue_depth
          selector:
            matchLabels:
              queue: scrape-pending
        target:
          type: AverageValue
          averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 4
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 25
          periodSeconds: 120
```
```yaml
# k8s/secrets.yaml (encrypt with sealed-secrets or use external-secrets)
apiVersion: v1
kind: Secret
metadata:
  name: firecrawl-secrets
type: Opaque
stringData:
  api-key: "fc-your-api-key-here"
  openai-key: "sk-your-openai-key-here"
  postgres-password: "your-secure-password"
---
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  redis-url: "redis://redis-master.default.svc.cluster.local:6379"
  database-url: "postgresql://pipeline@postgres.default.svc.cluster.local:5432/firecrawl_pipeline"
  worker-concurrency: "5"
  log-level: "info"
  cache-ttl-hours: "24"
```
```yaml
# k8s/pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: scrape-worker-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: scrape-worker
```

Never store secrets in code, config files, or container images. Use a dedicated secrets manager.
```mermaid
flowchart TD
    A[Application] --> B{Secret Source}
    B -- Development --> C[.env File]
    B -- Staging --> D[Kubernetes Secrets]
    B -- Production --> E[External Secrets Manager]
    E --> F[AWS Secrets Manager]
    E --> G[HashiCorp Vault]
    E --> H[GCP Secret Manager]
    F --> I[Auto-rotation]
    G --> I
    H --> I
    classDef dev fill:#e8f5e8,stroke:#1b5e20
    classDef staging fill:#fff3e0,stroke:#e65100
    classDef prod fill:#e1f5fe,stroke:#01579b
    class C dev
    class D staging
    class E,F,G,H,I prod
```
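Whichever backend you choose, secrets typically reach a Kubernetes pod either as environment variables or as files mounted into the container. A minimal helper that checks both sources might look like this sketch; the `/etc/secrets` mount path and the env-to-filename convention are assumptions, not something configured elsewhere in this chapter:

```python
import os
from pathlib import Path

def read_secret(name: str, mount_dir: str = "/etc/secrets") -> str:
    """Read a secret from an env var, falling back to a mounted file.

    mount_dir is illustrative; match it to the volumeMount path in
    your pod spec. File names are assumed to be lowercase-hyphenated
    versions of the env var name (FIRECRAWL_API_KEY -> firecrawl-api-key).
    """
    value = os.environ.get(name)
    if value:
        return value
    secret_file = Path(mount_dir) / name.lower().replace("_", "-")
    if secret_file.exists():
        return secret_file.read_text().strip()
    raise KeyError(f"Secret {name} not found in env or {mount_dir}")
```

Preferring the env var keeps local development simple, while the file fallback supports mounted Secrets that can be rotated without restarting the pod.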
```python
import os
from dataclasses import dataclass

@dataclass
class AppConfig:
    """Application configuration loaded from the environment."""
    firecrawl_api_key: str
    redis_url: str
    database_url: str
    openai_api_key: str = ""
    worker_concurrency: int = 5
    log_level: str = "info"
    cache_ttl_hours: int = 24

    @classmethod
    def from_env(cls) -> "AppConfig":
        """Load configuration from environment variables."""
        required = ["FIRECRAWL_API_KEY", "REDIS_URL", "DATABASE_URL"]
        missing = [var for var in required if var not in os.environ]
        if missing:
            raise EnvironmentError(f"Missing required env vars: {', '.join(missing)}")
        return cls(
            firecrawl_api_key=os.environ["FIRECRAWL_API_KEY"],
            redis_url=os.environ["REDIS_URL"],
            database_url=os.environ["DATABASE_URL"],
            openai_api_key=os.getenv("OPENAI_API_KEY", ""),
            worker_concurrency=int(os.getenv("WORKER_CONCURRENCY", "5")),
            log_level=os.getenv("LOG_LEVEL", "info"),
            cache_ttl_hours=int(os.getenv("CACHE_TTL_HOURS", "24")),
        )

# Usage
config = AppConfig.from_env()
```

Automate testing, building, and deployment to ensure consistent, reliable releases.
```yaml
# .github/workflows/deploy.yml
name: Build and Deploy

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt -r requirements-dev.txt
      - run: pytest tests/ -v --cov=src/

  build:
    needs: test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          push: true
          tags: ghcr.io/${{ github.repository }}/worker:${{ github.sha }}

  deploy:
    needs: build
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: |
          kubectl set image deployment/scrape-worker \
            worker=ghcr.io/${{ github.repository }}/worker:${{ github.sha }}
          kubectl rollout status deployment/scrape-worker --timeout=300s
```
```mermaid
flowchart LR
    A[Git Push] --> B[Run Tests]
    B --> C{Tests Pass?}
    C -- Yes --> D[Build Docker Image]
    C -- No --> E[Notify: Failed]
    D --> F[Push to Registry]
    F --> G[Deploy to Staging]
    G --> H[Smoke Tests]
    H --> I{Pass?}
    I -- Yes --> J[Deploy to Production]
    I -- No --> K[Rollback]
    J --> L[Monitor for 30min]
    L --> M{Healthy?}
    M -- Yes --> N[Done]
    M -- No --> K
    classDef pass fill:#e8f5e8,stroke:#1b5e20
    classDef fail fill:#ffebee,stroke:#b71c1c
    class N pass
    class E,K fail
```
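The smoke-test stage in this pipeline can be as simple as polling the health endpoint of the freshly deployed service until it answers. A minimal sketch, assuming the service exposes the `/health` endpoint built later in this chapter (the staging URL you pass in is a placeholder for your own environment):

```python
import time
import urllib.error
import urllib.request

def smoke_test(base_url: str, attempts: int = 10, delay: float = 3.0) -> bool:
    """Poll {base_url}/health until it returns HTTP 200 or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not up yet; retry after a short delay
        time.sleep(delay)
    return False

# Usage (placeholder URL): exit non-zero in CI if the check fails
# ok = smoke_test("http://staging.internal:8080")
```

Wiring this into the `deploy` job as a follow-up step gives the "Smoke Tests" gate shown in the flowchart.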
```python
import json
import logging
import sys
from datetime import datetime, timezone

class JSONFormatter(logging.Formatter):
    """Structured JSON log formatter for production."""

    def format(self, record):
        log_entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
        }
        # Add extra fields attached via logger.info(..., extra={...})
        if hasattr(record, "url"):
            log_entry["url"] = record.url
        if hasattr(record, "duration_ms"):
            log_entry["duration_ms"] = record.duration_ms
        if hasattr(record, "status"):
            log_entry["status"] = record.status
        # Add exception info
        if record.exc_info:
            log_entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(log_entry)

# Configure structured logging
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JSONFormatter())
logger = logging.getLogger("scraper")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Usage
logger.info("Scrape completed", extra={
    "url": "https://example.com",
    "duration_ms": 1250,
    "status": "success",
})
```
```python
import time
from urllib.parse import urlparse

from prometheus_client import Counter, Histogram, Gauge, start_http_server

# Define metrics
SCRAPE_TOTAL = Counter(
    "scrape_requests_total",
    "Total scrape requests",
    ["status", "domain"],
)
SCRAPE_DURATION = Histogram(
    "scrape_duration_seconds",
    "Scrape request duration",
    ["domain"],
    buckets=[0.5, 1, 2, 5, 10, 30, 60],
)
QUEUE_DEPTH = Gauge(
    "scrape_queue_depth",
    "Number of pending scrape jobs",
)
CACHE_HIT_RATE = Gauge(
    "scrape_cache_hit_rate",
    "Cache hit rate percentage",
)

# Start the metrics server
start_http_server(8080)

def instrumented_scrape(app, url):
    """Scrape with Prometheus instrumentation."""
    domain = urlparse(url).netloc
    start = time.time()
    try:
        result = app.scrape_url(url, params={"formats": ["markdown"]})
        duration = time.time() - start
        SCRAPE_TOTAL.labels(status="success", domain=domain).inc()
        SCRAPE_DURATION.labels(domain=domain).observe(duration)
        return result
    except Exception:
        SCRAPE_TOTAL.labels(status="error", domain=domain).inc()
        raise
```
```python
import json
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

# check_redis() and check_database() are assumed to be defined alongside
# this module, e.g. a Redis PING and a SELECT 1, each returning a bool.

class HealthHandler(BaseHTTPRequestHandler):
    """Simple health and readiness endpoints."""

    def do_GET(self):
        if self.path == "/health":
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(json.dumps({"status": "healthy"}).encode())
        elif self.path == "/ready":
            # Check each dependency once and reuse the results
            redis_ok = check_redis()
            db_ok = check_database()
            ready = redis_ok and db_ok
            self.send_response(200 if ready else 503)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(json.dumps({
                "ready": ready,
                "redis": redis_ok,
                "database": db_ok,
            }).encode())
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, format, *args):
        pass  # Suppress default request logging

def start_health_server(port=8080):
    """Start the health check server in a background thread."""
    server = HTTPServer(("0.0.0.0", port), HealthHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```
```python
import subprocess
from datetime import datetime, timezone

def backup_chroma(chroma_path: str, backup_dir: str) -> str:
    """Back up ChromaDB to a compressed archive."""
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    backup_file = f"{backup_dir}/chroma_backup_{timestamp}.tar.gz"
    subprocess.run(
        ["tar", "-czf", backup_file, "-C", chroma_path, "."],
        check=True,
    )
    print(f"Backup created: {backup_file}")
    return backup_file

def backup_to_s3(local_path: str, s3_bucket: str, s3_prefix: str) -> None:
    """Upload a backup archive to S3."""
    import boto3

    s3 = boto3.client("s3")
    filename = local_path.split("/")[-1]
    s3_key = f"{s3_prefix}/{filename}"
    s3.upload_file(local_path, s3_bucket, s3_key)
    print(f"Uploaded to s3://{s3_bucket}/{s3_key}")
```

| Component | Frequency | Retention | Method |
|---|---|---|---|
| Vector database | Daily | 30 days | Snapshot + S3 |
| PostgreSQL | Every 6 hours | 14 days | pg_dump + S3 |
| Redis (queue state) | Hourly | 7 days | RDB snapshot |
| Application config | On every deploy | Forever | Git repository |
| Scraped content | On collection | Until re-scraped | S3 lifecycle rules |
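A backup schedule is only half the story; the restore path needs to exist and be exercised too. A minimal restore counterpart to `backup_chroma` above, assuming the same tar.gz archive layout:

```python
import subprocess
from pathlib import Path

def restore_chroma(backup_file: str, restore_path: str) -> None:
    """Extract a chroma_backup_*.tar.gz archive into restore_path.

    Stop any process using the ChromaDB directory before restoring,
    then restart it pointed at restore_path.
    """
    Path(restore_path).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["tar", "-xzf", backup_file, "-C", restore_path],
        check=True,
    )
    print(f"Restored {backup_file} -> {restore_path}")
```

Running this against last night's archive in a scratch directory, on a schedule, is a cheap way to prove your backups are actually restorable.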
```mermaid
flowchart TD
    A[Incident Detected] --> B{Severity}
    B -- "P1: Complete outage" --> C[Page on-call]
    B -- "P2: Degraded" --> D[Slack alert]
    B -- "P3: Minor" --> E[Create ticket]
    C --> F[Assess impact]
    D --> F
    F --> G{Root Cause}
    G -- "Queue backlog" --> H[Scale workers]
    G -- "API key expired" --> I[Rotate key]
    G -- "Database down" --> J[Failover DB]
    G -- "Vector DB corrupt" --> K[Restore from backup]
    H --> L[Monitor recovery]
    I --> L
    J --> L
    K --> L
    L --> M[Post-mortem]
    classDef critical fill:#ffebee,stroke:#b71c1c
    classDef warn fill:#fff3e0,stroke:#e65100
    classDef info fill:#e1f5fe,stroke:#01579b
    class C critical
    class D warn
    class E info
```
Use this checklist before deploying to production.
- All containers build and pass tests in CI
- Kubernetes deployments, HPAs, and PDBs configured
- Resource requests and limits set for all containers
- Persistent volumes provisioned for databases
- All secrets stored in secrets manager (never in code or config)
- API keys scoped to minimum required permissions
- Non-root users in all containers
- Network policies restrict inter-service communication
- HTTPS enforced on all external endpoints
- Structured JSON logging configured
- Prometheus metrics exposed and scraped
- Grafana dashboards created for key metrics
- Alerts configured for: error rate > 10%, latency > 30s, queue depth > 1000
- Log aggregation set up (Loki, CloudWatch, or equivalent)
- Health check and readiness endpoints implemented
- Graceful shutdown handles SIGTERM
- Retry logic with exponential backoff for all external calls
- Circuit breaker for Firecrawl API calls
- Backup schedule configured and tested
- Disaster recovery runbook documented and accessible
- Load test completed at 2x expected throughput
- Runbook documented for common failure scenarios
- On-call rotation defined
- Rollback procedure tested
- Cost monitoring and budget alerts configured
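The checklist's "retry logic with exponential backoff" item can be implemented as a small decorator around any external call; the attempt count, base delay, and jitter fraction below are illustrative defaults, not values mandated by Firecrawl:

```python
import random
import time
from functools import wraps

def retry_with_backoff(max_attempts: int = 5, base_delay: float = 1.0,
                       max_delay: float = 30.0):
    """Retry a function with exponential backoff plus a little jitter."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the error
                    # 1s, 2s, 4s, ... capped at max_delay, with ~10% jitter
                    delay = min(base_delay * 2 ** (attempt - 1), max_delay)
                    time.sleep(delay + random.uniform(0, delay * 0.1))
        return wrapper
    return decorator

@retry_with_backoff(max_attempts=3, base_delay=0.5)
def scrape_with_retries(app, url):
    return app.scrape_url(url, params={"formats": ["markdown"]})
```

Jitter matters at scale: without it, a fleet of workers that failed together retries together, hammering the API in synchronized waves.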
| Problem | Possible Cause | Solution |
|---|---|---|
| Pods crashing on start | Missing env vars or secrets | Check `kubectl describe pod` for error messages |
| OOM kills | Memory limit too low | Increase memory limit; profile memory usage |
| Slow autoscaling | HPA stabilization window too long | Reduce stabilization window; adjust metrics |
| Stale queue jobs | Workers crashed mid-processing | Add job timeout; implement dead-letter queue |
| High error rate after deploy | Code regression | Roll back with `kubectl rollout undo` |
| Backup restore fails | Incompatible versions | Test restores regularly; version-tag backups |
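The "stale queue jobs" fix above, retry counting plus a dead-letter queue, can be sketched with plain Redis lists. The key names and retry limit here are illustrative, and `client` can be a real `redis.Redis` instance or anything with an `lpush` method:

```python
import json
import time

class DeadLetterQueue:
    """Requeue failed jobs, or bury them after too many retries.

    Jobs are JSON-encoded dicts; buried jobs keep their last error
    and failure timestamp so an operator can inspect and replay them.
    """

    def __init__(self, client, pending_key="scrape:pending",
                 dead_key="scrape:dead", max_retries=3):
        self.client = client
        self.pending_key = pending_key
        self.dead_key = dead_key
        self.max_retries = max_retries

    def requeue_or_bury(self, job: dict, error: str) -> str:
        """Return "requeued" or "dead" depending on the retry count."""
        job = dict(job, retries=job.get("retries", 0) + 1,
                   last_error=error, failed_at=time.time())
        if job["retries"] >= self.max_retries:
            self.client.lpush(self.dead_key, json.dumps(job))
            return "dead"
        self.client.lpush(self.pending_key, json.dumps(job))
        return "requeued"
```

Pairing this with a per-job visibility timeout ensures jobs from crashed workers are requeued instead of sitting in limbo forever.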
- Use rolling deployments with `maxSurge: 1` and `maxUnavailable: 0` for zero-downtime deploys.
- Set resource requests accurately -- under-requesting causes noisy-neighbor issues; over-requesting wastes money.
- Use node affinity to place scraping workers on nodes with good network connectivity.
- Implement circuit breakers to stop calling failing APIs and let them recover.
- Profile in staging before production -- use realistic URL lists and load patterns.
- Rotate API keys regularly -- automate rotation with your secrets manager.
- Enable audit logging -- log who accessed what data and when.
- Scan container images for vulnerabilities with Trivy or Snyk.
- Enforce network policies -- workers should only reach Redis, the database, and Firecrawl's API.
- Never log sensitive data -- filter API keys, passwords, and PII from log output.
- Use RBAC in Kubernetes -- each service account should have minimal permissions.
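The circuit-breaker advice above can be sketched as a small wrapper around Firecrawl calls. The failure threshold and cool-down period are illustrative; tune them to your traffic:

```python
import time

class CircuitBreaker:
    """Open after N consecutive failures; probe again after a cool-down.

    While open, calls fail fast instead of hitting the struggling API,
    giving it room to recover.
    """

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 60.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when circuit opened

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: skipping call")
            self.opened_at = None  # half-open: allow one probe call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # success closes the circuit
        return result

# Usage: breaker.call(app.scrape_url, url) instead of calling the API directly.
```

In the worker loop, wrap `app.scrape_url` with `breaker.call` and treat the fast `RuntimeError` as a signal to requeue the job for later rather than a hard failure.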
Production deployment transforms a development prototype into a reliable, scalable system. By containerizing your application, deploying on Kubernetes with autoscaling, securing credentials, implementing CI/CD, adding comprehensive observability, and preparing disaster recovery procedures, you build a system that can operate unattended and recover gracefully from failures.
- Containerize everything -- Docker containers make deployments reproducible and portable across environments.
- Use Kubernetes for orchestration -- HPAs, PDBs, and rolling deployments provide the reliability primitives you need.
- Never store secrets in code -- use environment variables backed by a secrets manager with automated rotation.
- Observability is not optional -- structured logs, Prometheus metrics, and alerting are essential for operating a production system.
- Test your recovery procedures -- a backup that has never been restored is not a backup.
Congratulations -- you have completed the entire Firecrawl tutorial. You now have the knowledge to:
- Scrape any website (static or JavaScript-rendered) and extract clean, structured data
- Build RAG pipelines that give your LLMs access to up-to-date web content
- Scale to thousands of pages with distributed workers, caching, and rate limiting
- Deploy to production with containers, Kubernetes, CI/CD, and full observability
From here, consider these advanced topics:
- Fine-tuning LLMs on your scraped corpus for domain-specific performance
- Building agentic systems that autonomously discover and scrape new sources
- Real-time scraping with webhooks and event-driven architectures
- Multi-modal extraction including images, PDFs, and video transcripts
Built with insights from the Firecrawl project.
Most teams struggle here because the hard part is not writing more code, but drawing clear boundaries between the scheduler, the workers, and the data stores so behavior stays predictable as complexity grows.
In practical terms, this chapter helps you avoid three common failures:
- coupling core logic too tightly to one implementation path
- missing the handoff boundaries between setup, execution, and validation
- shipping changes without clear rollback or observability strategy
After working through this chapter, you should be able to reason about Chapter 8: Production Deployment as an operating subsystem inside Firecrawl Tutorial: Building LLM-Ready Web Scraping and Data Extraction Systems, with explicit contracts for inputs, state transitions, and outputs.
Use the implementation notes around logging, scraping, and job bookkeeping as your checklist when adapting these patterns to your own repository.
Under the hood, a production deployment like this one usually follows a repeatable control path:
- Context bootstrap: initialize runtime config and prerequisites for each worker.
- Input normalization: shape incoming jobs so the queue exposes stable contracts.
- Core execution: run the main scrape logic and propagate intermediate state through the worker loop.
- Policy and safety checks: enforce limits, auth scopes, and failure boundaries.
- Output composition: return canonical result payloads for downstream consumers.
- Operational telemetry: emit the logs and metrics needed for debugging and performance tuning.
When debugging, walk this sequence in order and confirm each stage has explicit success/failure conditions.
To verify implementation details while reading this chapter, consult the upstream Firecrawl repository on GitHub.
Suggested trace strategy:
- search the upstream code for the worker and queue implementations to map concrete code paths
- compare documentation claims against the actual runtime and config code before reusing patterns in production