This section explains the CI/CD pipeline setup for our MLOps Zoomcamp project "Store-Sales-Forecasting" (06-best-practices module), using GitHub Actions to automate testing, infrastructure provisioning, Docker image creation, and deployment.
CI/CD stands for:
- CI (Continuous Integration): automatically builds, tests, and validates code when changes are made. It helps detect errors early in the development lifecycle.
- CD (Continuous Delivery): automates the delivery of application and infrastructure changes. It ensures that new versions are deployed reliably and safely.
Together, CI/CD is a critical DevOps practice to shorten the software development lifecycle and improve code quality through automation.
The purpose of this pipeline is to:
- Automatically run tests on new code and infrastructure changes.
- Define infrastructure using Terraform.
- Build and push Docker images for a Lambda service.
- Update AWS Lambda to use the new container image.
- Repeat all of this on every commit or PR merge, without manual steps.
We orchestrate this process with GitHub Actions, which provides pre-configured virtual machines for running our CI/CD jobs.
We split our automation into two workflows:
- CI workflow
  - Trigger: runs on pull requests created from feature branches.
  - Jobs in CI workflow:
    - Run Unit Tests: ensure Python functions work correctly using pytest or similar.
    - Run Integration Tests: test how different parts of the system work together.
    - Terraform Plan: validate any infrastructure changes before applying them, by running `terraform plan` on the Terraform code.
- CD workflow
  - Trigger: runs after a pull request is merged into the `main` or `develop` branch.
  - Jobs in CD workflow:
    - Define Infrastructure
      - Use `terraform apply` to provision or update infrastructure such as S3 buckets, Lambda functions, IAM roles, etc.
    - Build and Push Docker Image
      - Package the Lambda function as a Docker image.
      - Tag and push the image to Amazon ECR (Elastic Container Registry).
    - Deploy
      - Update the Lambda function configuration to use the new image version.
      - Enable multi-environment support (e.g., dev, staging, prod) via environment variables or separate infrastructure definitions.
We chose GitHub Actions because it:
- Is natively integrated with GitHub.
- Provides ready-to-use VMs for automation.
- Has a wide ecosystem of actions for Terraform, Docker, AWS, etc.
- Supports multi-job workflows and environment-specific secrets.
Workflows must be defined inside the .github/workflows/ directory at the root of the repository.
In our project:
- `ci-tests.yml` → Continuous Integration (CI) workflow
- `cd-deploy.yml` → Continuous Deployment (CD) workflow
Configure GitHub Actions to trigger the CI workflow when a pull request targeting the `main` branch is opened or updated, and only if it changes files inside the MLOps code folder.
```yaml
on:
  pull_request:
    branches:
      - 'main'
    paths:
      - '06-best-practices/code/**'
```
This ensures the workflow runs only when meaningful changes are introduced.
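To preview locally whether a branch would trip this path filter, the changed file names can be checked against the same folder prefix. The sketch below is only an approximation: the function name `touches_code_folder` is hypothetical, and a plain prefix match stands in for GitHub's `**` glob matching.

```shell
# Hypothetical helper: reads changed file paths (one per line) on stdin and
# succeeds if at least one lies under the folder watched by the CI workflow.
# A prefix match approximates the '06-best-practices/code/**' glob.
touches_code_folder() {
  grep -q '^06-best-practices/code/'
}

# Example with a hard-coded file list; in a real checkout the list could come
# from: git diff --name-only origin/main...HEAD
if printf '%s\n' 'README.md' '06-best-practices/code/model.py' | touches_code_folder; then
  echo "CI workflow would run"
else
  echo "CI workflow would be skipped"
fi
```

A pull request that only edits files outside `06-best-practices/code/` would take the second branch, which matches the intent of the `paths` filter.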
We define the AWS region and credentials in the workflow's `env` block, reading the credentials from GitHub Secrets:
```yaml
env:
  AWS_DEFAULT_REGION: 'your_aws_default_region'
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
```
These secrets must be configured under Settings → Secrets and variables → Actions in your GitHub repository.
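GitHub Actions substitutes an empty string for a secret that has not been configured, so a missing secret tends to surface late as an authentication error. A workflow step could fail fast instead. The sketch below assumes a hypothetical helper named `require_env`; it is not part of the project's workflows.

```shell
# Hypothetical fail-fast check: verifies that each named environment variable
# is exported and non-empty. An empty value usually means the corresponding
# repository secret has not been configured.
require_env() {
  local missing=0 name
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

In a `run:` step this could be called as `require_env AWS_DEFAULT_REGION AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY` before any AWS command is issued.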
Our ci-tests.yml contains two jobs:
- `test`: runs unit tests, linter, and integration tests
- `tf-plan`: validates Terraform infrastructure changes
The `test` job checks out the code, installs dependencies, runs the unit tests and the linter, and then runs the integration tests.
```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python 3.9
        uses: actions/setup-python@v2
        with:
          python-version: 3.9.12
      - name: Install dependencies
        working-directory: "06-best-practices/code"
        run: pip install pipenv && pipenv install --dev
      - name: Run Unit tests
        working-directory: "06-best-practices/code"
        run: pipenv run pytest tests/
      - name: Lint
        working-directory: "06-best-practices/code"
        run: pipenv run pylint --recursive=y .
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ env.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ env.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_DEFAULT_REGION }}
      - name: Install Docker Compose Plugin
        run: |
          sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
          sudo chmod +x /usr/local/bin/docker-compose
      - name: Integration Test
        working-directory: '06-best-practices/code/integration-test'
        run: |
          . run.sh
```
The `tf-plan` job validates infrastructure changes using Terraform:
```yaml
  tf-plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ env.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ env.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_DEFAULT_REGION }}
      - uses: hashicorp/setup-terraform@v2
      - name: TF plan
        id: plan
        working-directory: '06-best-practices/code/infrastructure'
        run: |
          terraform init -backend-config="key=mlops-zoomcamp-prod.tfstate" --reconfigure && terraform plan --var-file vars/prod.tfvars
```
This setup ensures every pull request goes through automated testing and Terraform validation before merging to main, improving code quality and infrastructure stability.
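The unit-test and lint steps of the `test` job can also be run locally before opening a pull request. The following is a minimal sketch, not part of the project: `run_step`, `run_ci_checks`, and the `DRY_RUN` switch are hypothetical names, and the commands are copied from the workflow above.

```shell
# Sketch of a local pre-PR check mirroring three steps of the CI `test` job.
# With DRY_RUN=1 the commands are only printed, not executed.
CODE_DIR="06-best-practices/code"

run_step() {
  # $1 = working directory, remaining arguments = command to run there
  local dir="$1"
  shift
  echo "[$dir] $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    (cd "$dir" && "$@")
  fi
}

run_ci_checks() {
  # Stop at the first failing step, as the CI job would.
  run_step "$CODE_DIR" pipenv install --dev &&
    run_step "$CODE_DIR" pipenv run pytest tests/ &&
    run_step "$CODE_DIR" pipenv run pylint --recursive=y .
}
```

Calling `run_ci_checks` from the repository root runs the three commands in order; setting `DRY_RUN=1` first only lists them. The integration test and `terraform plan` are left out here because they need Docker and AWS credentials.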
This part covers the Continuous Deployment (CD) pipeline used in the MLOps Zoomcamp project. It automates Terraform deployment, Docker image build and push, and the Lambda function update on merges to the `main` branch.
We configure GitHub Actions to trigger the CD workflow automatically on every push to the main branch. This ensures that whenever code is merged into production, the latest infrastructure changes and model updates are deployed.
```yaml
name: CD-Deploy
on:
  push:
    branches:
      - 'main'
#    paths:
#      - '06-best-practices/code/**'
```
Our `cd-deploy.yml` has only one job, `build-push-deploy`, which runs all deployment steps sequentially on `ubuntu-latest`:
```yaml
- name: Check out repo
  uses: actions/checkout@v3
- name: Configure AWS Credentials
  uses: aws-actions/configure-aws-credentials@v1
  with:
    aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    aws-region: "ap-south-1"
```
The following steps define and apply the infrastructure with auto-approval:
```yaml
- uses: hashicorp/setup-terraform@v2
  with:
    terraform_wrapper: false
- name: TF plan
  id: tf-plan
  working-directory: '06-best-practices/code/infrastructure'
  run: terraform init -backend-config="key=mlops-zoomcamp-prod.tfstate" -reconfigure && terraform plan -var-file=vars/prod.tfvars
- name: TF Apply
  id: tf-apply
  working-directory: '06-best-practices/code/infrastructure'
  # The comparison must sit inside the expression; written outside it, the
  # condition becomes a non-empty string and is always true.
  if: ${{ steps.tf-plan.outcome == 'success' }}
  run: |
    terraform apply -auto-approve -var-file=vars/prod.tfvars
    echo "::set-output name=ecr_repo::$(terraform output ecr_repo | xargs)"
    echo "::set-output name=predictions_stream_name::$(terraform output predictions_stream_name | xargs)"
    echo "::set-output name=model_bucket::$(terraform output model_bucket | xargs)"
    echo "::set-output name=lambda_function::$(terraform output lambda_function | xargs)"
```
The job then logs in to Amazon ECR:
```yaml
- name: Login to Amazon ECR
  id: login-ecr
  uses: aws-actions/amazon-ecr-login@v1
```
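The `::set-output` workflow command used in the `TF Apply` step has been deprecated by GitHub in favor of appending `name=value` lines to the file named by the `GITHUB_OUTPUT` environment variable. The sketch below shows that newer form under stated assumptions: the helper name `set_step_output` is hypothetical, and a placeholder value stands in for the real Terraform output so the snippet runs without Terraform.

```shell
# Hypothetical helper: records a step output by appending "name=value" to the
# file GitHub Actions exposes through the GITHUB_OUTPUT environment variable.
set_step_output() {
  printf '%s=%s\n' "$1" "$2" >> "$GITHUB_OUTPUT"
}

# Outside of a workflow run GITHUB_OUTPUT is unset, so fall back to a temp file.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"

# In the TF Apply step the value would come from Terraform, for example:
#   set_step_output ecr_repo "$(terraform output -raw ecr_repo)"
# A placeholder value is used here instead.
set_step_output ecr_repo "example-repo_example-project"
cat "$GITHUB_OUTPUT"
```

Later steps would read the value exactly as before, through `${{ steps.tf-apply.outputs.ecr_repo }}`.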
The image is then built, tagged as `latest`, and pushed to Amazon ECR:
```yaml
- name: Build, tag, and push image to Amazon ECR
  id: build-image-step
  working-directory: "06-best-practices/code"
  env:
    ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
    ECR_REPOSITORY: ${{ steps.tf-apply.outputs.ecr_repo }}
    IMAGE_TAG: "latest"
  run: |
    docker build -t ${ECR_REGISTRY}/${ECR_REPOSITORY}:${IMAGE_TAG} .
    docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
    echo "::set-output name=image_uri::$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG"
```
Next, the job looks up the run ID of the most recently modified object in the dev model bucket and syncs the model artifacts to the prod bucket:
```yaml
- name: Get model artifacts
  id: get-model-artifacts
  working-directory: "06-best-practices/code"
  env:
    MODEL_BUCKET_DEV: "mlflow-models-alexey"
    MODEL_BUCKET_PROD: ${{ steps.tf-apply.outputs.model_bucket }}
  run: |
    export RUN_ID=$(aws s3api list-objects-v2 --bucket ${MODEL_BUCKET_DEV} --query 'sort_by(Contents, &LastModified)[-1].Key' --output=text | cut -f2 -d/)
    aws s3 sync s3://${MODEL_BUCKET_DEV} s3://${MODEL_BUCKET_PROD}
    echo "::set-output name=run_id::${RUN_ID}"
```
Finally, it waits until any in-progress Lambda update has finished and then updates the function's environment variables:
```yaml
- name: Update Lambda
  env:
    LAMBDA_FUNCTION: ${{ steps.tf-apply.outputs.lambda_function }}
    PREDICTIONS_STREAM_NAME: ${{ steps.tf-apply.outputs.predictions_stream_name }}
    MODEL_BUCKET: ${{ steps.tf-apply.outputs.model_bucket }}
    RUN_ID: ${{ steps.get-model-artifacts.outputs.run_id }}
  run: |
    variables="{PREDICTIONS_STREAM_NAME=$PREDICTIONS_STREAM_NAME, MODEL_BUCKET=$MODEL_BUCKET, RUN_ID=$RUN_ID}"
    STATE=$(aws lambda get-function --function-name $LAMBDA_FUNCTION --region "ap-south-1" --query 'Configuration.LastUpdateStatus' --output text)
    while [[ "$STATE" == "InProgress" ]]
    do
      echo "sleep 5sec ...."
      sleep 5s
      STATE=$(aws lambda get-function --function-name $LAMBDA_FUNCTION --region "ap-south-1" --query 'Configuration.LastUpdateStatus' --output text)
      echo $STATE
    done
    aws lambda update-function-configuration --function-name $LAMBDA_FUNCTION --environment "Variables=${variables}"
```
We must define these outputs in the `main.tf` file at the root of the infrastructure directory:
```hcl
output "lambda_function" {
  value = "${var.lambda_function_name}_${var.project_id}"
}
output "model_bucket" {
  value = module.s3_bucket.name
}
output "predictions_stream_name" {
  value = "${var.output_stream_name}-${var.project_id}"
}
output "ecr_repo" {
  value = "${var.ecr_repo_name}_${var.project_id}"
}
```
This completes the Continuous Deployment process: automating infrastructure setup, Docker image management, and model delivery into production.
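The polling loop in the `Update Lambda` step runs for as long as the status stays `InProgress`. A bounded variant could look like the sketch below. The function name `wait_for_lambda_update` and its retry parameters are assumptions for illustration; only the `aws lambda get-function` call is taken from the workflow.

```shell
# Hypothetical helper: polls a Lambda function's LastUpdateStatus until it is
# no longer "InProgress", giving up after max_tries polls.
# Prints the final status and returns 0; returns 1 on timeout and 2 if the
# AWS CLI call itself fails.
wait_for_lambda_update() {
  local fn="$1" max_tries="${2:-60}" delay="${3:-5}"
  local tries=0 state
  while [ "$tries" -lt "$max_tries" ]; do
    state=$(aws lambda get-function --function-name "$fn" \
      --query 'Configuration.LastUpdateStatus' --output text) || return 2
    if [ "$state" != "InProgress" ]; then
      echo "$state"
      return 0
    fi
    tries=$((tries + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $fn" >&2
  return 1
}
```

In the workflow this could be used as `wait_for_lambda_update "$LAMBDA_FUNCTION"` before calling `aws lambda update-function-configuration`. The AWS CLI also ships a built-in waiter, `aws lambda wait function-updated`, which may be worth considering instead of a hand-written loop.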
By implementing CI/CD with GitHub Actions in this MLOps Zoomcamp project, we ensure a robust and automated workflow:
- CI (Continuous Integration):
  - Automatically runs on pull requests to the `main` branch.
  - Executes unit tests, lints code, and performs integration tests using Docker and Kinesis.
  - Validates Terraform configuration without applying changes.
- CD (Continuous Deployment):
  - Triggers on every push to the `main` branch.
  - Applies Terraform infrastructure changes in AWS.
  - Builds and pushes Docker images to Amazon ECR.
  - Copies ML model artifacts and updates the AWS Lambda function environment.
Together, these pipelines ensure:
- Code quality is enforced before merging.
- Infrastructure is always validated and versioned.
- Deployments are consistent, automated, and traceable.
- Everything reacts seamlessly to GitHub events like PRs and merges.
**Screenshots:**



