This project demonstrates a complete MLOps pipeline using the Iris dataset for flower species classification.
- Data: Iris dataset (classification)
- ML Framework: scikit-learn
- Experiment Tracking: MLflow
- API: FastAPI
- Containerization: Docker
- CI/CD: GitHub Actions
- Monitoring: Custom logging with SQLite
- Validation: Pydantic
├── data/ # Dataset storage
├── src/
│ ├── data/ # Data processing modules
│ ├── models/ # Model training modules
│ ├── api/ # FastAPI application
│ └── monitoring/ # Logging and monitoring
├── notebooks/ # Jupyter notebooks for exploration
├── tests/ # Unit tests
├── docker/ # Docker configuration
├── .github/workflows/ # GitHub Actions CI/CD
├── mlruns/ # MLflow tracking data
├── requirements.txt # Python dependencies
├── Dockerfile # Docker configuration
└── README.md # This file
- Setup Environment: `pip install -r requirements.txt`
- Train Models: `python src/models/train.py`
- Start API: `uvicorn src.api.main:app --reload`
- Build Docker Image: `docker build -t iris-mlops-pipeline .`
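
The training step is not shown in this README, so the following is only a minimal sketch of what comparing the two scikit-learn models on Iris could look like. The 80/20 stratified split, the `random_state` values, the hyperparameters, and the weighted averaging of precision, recall, and F1 are all assumptions, and the sketch leaves out MLflow tracking entirely. It is not the contents of `src/models/train.py`, and it is not expected to reproduce the reported numbers exactly.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split

# Load the Iris features and labels bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Assumption: an 80/20 stratified split; the project's real split may differ.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Assumption: near-default hyperparameters for both candidate models.
candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Assumption: weighted averaging across the three classes.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="weighted", zero_division=0
    )
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

for name, scores in results.items():
    print(name, {metric: round(value, 4) for metric, value in scores.items()})
```
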
| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Logistic Regression | 93.33% | 93.33% | 93.33% | 93.33% |
| Random Forest | 90.00% | 90.24% | 90.00% | 89.97% |
Best model: Logistic Regression, selected automatically based on performance.
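
The README does not say which metric drives the automatic selection. As a hedged illustration, the sketch below picks the model with the highest accuracy from the figures in the table above. The `select_best_model` helper and the choice of accuracy as the default criterion are hypothetical.

```python
# Metrics copied from the results table, expressed as fractions.
metrics = {
    "Logistic Regression": {
        "accuracy": 0.9333, "precision": 0.9333, "recall": 0.9333, "f1": 0.9333,
    },
    "Random Forest": {
        "accuracy": 0.9000, "precision": 0.9024, "recall": 0.9000, "f1": 0.8997,
    },
}


def select_best_model(metrics, criterion="accuracy"):
    """Return the name of the model scoring highest on `criterion`.

    Hypothetical helper: the project's actual selection rule is not documented.
    """
    return max(metrics, key=lambda name: metrics[name][criterion])


best = select_best_model(metrics)
print(best)  # Logistic Regression
```

With these figures, Logistic Regression wins on all four metrics, so the choice of criterion does not change the outcome here.
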
- Prediction requests are logged to a SQLite database
- Metrics endpoint available at `/metrics`
- Request/response logging for audit trail
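
To make the monitoring points above concrete, here is a minimal sketch of logging predictions to SQLite and aggregating them into the kind of summary a `/metrics` endpoint might return. The table name `prediction_logs`, its columns, the feature names, and the helper functions are all assumptions for illustration. The project's actual schema and response format are not shown in this README.

```python
import json
import sqlite3
from datetime import datetime, timezone


def init_db(conn):
    """Create the (hypothetical) log table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS prediction_logs (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               timestamp TEXT NOT NULL,
               request TEXT NOT NULL,
               response TEXT NOT NULL
           )"""
    )
    conn.commit()


def log_prediction(conn, features, prediction):
    """Store one request/response pair as JSON for the audit trail."""
    conn.execute(
        "INSERT INTO prediction_logs (timestamp, request, response) VALUES (?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            json.dumps(features),
            json.dumps({"prediction": prediction}),
        ),
    )
    conn.commit()


def get_metrics(conn):
    """Summarise logged predictions: total count and count per predicted class."""
    counts = {}
    for (response,) in conn.execute("SELECT response FROM prediction_logs"):
        label = json.loads(response)["prediction"]
        counts[label] = counts.get(label, 0) + 1
    return {"total_predictions": sum(counts.values()), "predictions_by_class": counts}


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
init_db(conn)
log_prediction(
    conn,
    {"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2},
    "setosa",
)
log_prediction(
    conn,
    {"sepal_length": 6.3, "sepal_width": 3.3, "petal_length": 6.0, "petal_width": 2.5},
    "virginica",
)
print(get_metrics(conn))
```

In a FastAPI app, a function like `get_metrics` could back the `/metrics` route, with `log_prediction` called from the prediction handler.
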
- Run tests: `pytest tests/`
- Launch the MLflow UI: `mlflow ui`
- Run the Docker container: `docker run -p 8000:8000 iris-mlops-pipeline`

Once running, visit: http://localhost:8000/docs