This directory contains Docker configuration files for deploying the AWS RAG (Retrieval-Augmented Generation) application.
- Docker (version 20.10 or later)
- Docker Compose (version 2.0 or later)
- AWS credentials configured (via AWS CLI, environment variables, or IAM roles)
- Pinecone API key and index
- Create environment file:
    # From the project root
    cp env.example .env
    # Edit .env with your actual credentials
- Start the application:
    cd deployment/docker
    docker-compose up -d
- Check the application:
    - Health check: http://localhost:8000/health
    - API documentation: http://localhost:8000/docs
    - Root endpoint: http://localhost:8000/
The application requires the following environment variables:
- PINECONE_API_KEY: Your Pinecone API key
- AWS_REGION: AWS region (default: us-east-1)
- PINECONE_ENVIRONMENT: Pinecone environment (default: us-east-1-aws)
- PINECONE_INDEX_NAME: Pinecone index name (default: rag-documents)
- BEDROCK_EMBED_MODEL_ID: Embedding model (default: amazon.titan-embed-text-v2:0)
- BEDROCK_LLM_MODEL_ID: LLM model (default: us.anthropic.claude-sonnet-4-20250514-v1:0)
- LOG_LEVEL: Logging level (default: INFO)
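Before starting the containers, it can help to confirm that the variables without a default are actually present in the env file. The following is a minimal sketch, not part of this project: `check_env_file` is a hypothetical helper name, and it assumes the file uses plain `NAME=value` lines.

```shell
# Report any variables from a required list that are missing or empty
# in an env file made of plain NAME=value lines.
# Usage: check_env_file <env-file> <VAR>...
check_env_file() {
  local env_file="$1"
  shift
  local missing=0
  local name value
  if [ ! -f "$env_file" ]; then
    echo "env file not found: $env_file" >&2
    return 2
  fi
  for name in "$@"; do
    # Keep the value of the last NAME=... line; commented-out lines do not match
    value="$(sed -n "s/^${name}=//p" "$env_file" | tail -n 1)"
    if [ -z "$value" ]; then
      echo "missing or empty: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_env_file .env PINECONE_API_KEY` run from the project root returns a non-zero status if the key is absent or blank. It does not detect placeholder values that were never edited.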
You can provide AWS credentials in several ways:
- Environment variables (in .env file):
    AWS_ACCESS_KEY_ID=your_access_key
    AWS_SECRET_ACCESS_KEY=your_secret_key
- AWS CLI configuration (recommended):
    aws configure
- IAM roles (for EC2/ECS deployment)
- Instance profiles (for EC2 deployment)
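When several of these sources are configured at once, it is easy to lose track of which one is in effect. The sketch below is a rough local check, not the SDK's actual resolution logic: `credential_source` is a hypothetical helper, and it assumes the usual precedence of environment variables first, then the shared credentials file (`~/.aws/credentials`, or the path in `AWS_SHARED_CREDENTIALS_FILE`).

```shell
# Guess which of the credential sources listed above would likely be used,
# based only on what is visible in the current shell.
credential_source() {
  local shared_file="${AWS_SHARED_CREDENTIALS_FILE:-${HOME:-}/.aws/credentials}"
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "environment variables"
  elif [ -f "$shared_file" ]; then
    echo "shared credentials file: $shared_file"
  else
    # Roles and instance profiles cannot be detected from local files
    echo "none found locally (an IAM role or instance profile may still apply)"
  fi
}
```

This only inspects the host shell; whether the container sees the same credentials depends on how docker-compose.yml passes them through. Running `aws sts get-caller-identity` is a more direct way to confirm which identity the AWS CLI resolves to.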
- Dockerfile: Multi-stage Docker build configuration
- docker-compose.yml: Service orchestration configuration
- README.md: This documentation
# Build and start services
docker-compose up -d
# Build without cache
docker-compose build --no-cache
# Start with logs
docker-compose up

# Stop services
docker-compose down
# Restart services
docker-compose restart
# View logs
docker-compose logs -f rag-api
# Check status
docker-compose ps

# Start with auto-reload (for development)
docker-compose up -d
# Execute commands in container
docker-compose exec rag-api bash
# View real-time logs
docker-compose logs -f rag-api

After starting the services, you can test the deployment manually:
- Check service health:
    curl http://localhost:8000/health
- Test a sample query:
    curl -X POST http://localhost:8000/query \
      -H "Content-Type: application/json" \
      -d '{"query": "What are AWS security best practices?"}'
- View API documentation:
    open http://localhost:8000/docs
- Check container status:
    docker-compose ps
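The container may need a few seconds after `docker-compose up -d` before the health endpoint responds, so a single curl can fail spuriously. A minimal polling sketch follows; `wait_for_health` is a hypothetical helper, and the default attempt count and delay are arbitrary assumptions rather than values taken from this project.

```shell
# Poll a health URL until it responds successfully or the attempts run out.
# Usage: wait_for_health <url> [attempts] [delay-seconds]
wait_for_health() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s): $url" >&2
  return 1
}
```

For example, `wait_for_health http://localhost:8000/health && curl http://localhost:8000/` only calls the root endpoint once the service reports healthy.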
- Port 8000 already in use
    # Find and kill the process using port 8000
    lsof -i :8000
    kill -9 <PID>
- Docker build fails
    # Clean Docker cache and rebuild
    docker system prune -f
    docker-compose build --no-cache
- Service fails to start
    # Check logs for errors
    docker-compose logs rag-api
    # Check container status
    docker-compose ps
- Health check fails
    # Check if the service is responding
    curl -v http://localhost:8000/health
    # Check application logs
    docker-compose logs rag-api
- AWS/Pinecone connection issues
    - Verify your credentials in the .env file
    - Check AWS region and Pinecone environment settings
    - Ensure Bedrock models are enabled in your AWS account
    - Verify Pinecone index exists and is accessible
To run in debug mode with more verbose logging:
- Update docker-compose.yml:
    environment:
      - LOG_LEVEL=DEBUG
- Recreate the service so the change takes effect (docker-compose restart does not pick up changes to docker-compose.yml):
    docker-compose up -d
For production deployment, consider:
- Resource limits in docker-compose.yml:
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'
- Health check intervals:
    healthcheck:
      interval: 60s
      timeout: 30s
      retries: 3
- Logging configuration:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
- Never commit .env files with real credentials
- Use secrets management for production (Docker Secrets, AWS Secrets Manager)
- Run as non-root user (already configured in Dockerfile)
- Use specific image tags instead of latest for production
- Regularly update base images for security patches
For production deployment:
- Use a reverse proxy (nginx, traefik)
- Enable HTTPS/TLS
- Configure proper logging and monitoring
- Use container orchestration (Docker Swarm, Kubernetes)
- Implement proper backup strategies
- Set up CI/CD pipelines
If you encounter issues:
- Check the logs:
    docker-compose logs rag-api
- Verify your environment configuration
- Ensure all prerequisites are met
- Test the endpoints manually using the curl commands above