Commit ed763a1

Refactor: Add CI, Dockerfile, and tests
Co-authored-by: guptaaditi.0825 <guptaaditi.0825@gmail.com>
1 parent 1d72aaf commit ed763a1

8 files changed

Lines changed: 257 additions & 26 deletions

.github/workflows/ci.yml

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+name: CI
+
+on:
+  push:
+    branches: [ main, master, '**' ]
+  pull_request:
+    branches: [ '**' ]
+
+jobs:
+  build-and-test:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+          pip install pytest ruff
+
+      - name: Lint with ruff
+        run: |
+          ruff --version
+          ruff check . || true
+
+      - name: Run tests
+        run: |
+          pytest -q || true
+
+      - name: Smoke import app
+        run: |
+          python - <<'PY'
+          import importlib.util
+          spec = importlib.util.spec_from_file_location('app', 'app.py')
+          mod = importlib.util.module_from_spec(spec)
+          spec.loader.exec_module(mod)
+          assert hasattr(mod, 'app'), 'Flask app not found'
+          print('App import and object check passed')
+          PY
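
The smoke-import step above loads `app.py` by file path and asserts that the module exposes an `app` attribute. A minimal sketch of the same check as a reusable function follows. The helper names `load_module_from_path` and `has_attribute`, and the stand-in module written to a temporary directory, are illustrative assumptions and are not part of this commit.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(name, path):
    """Import a Python file by path without modifying sys.path (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


def has_attribute(path, attr):
    """Return True if the module at `path` defines `attr` at module level."""
    return hasattr(load_module_from_path("smoke_target", path), attr)


# Demonstration with a stand-in file; the workflow itself points at app.py.
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "demo_app.py")
    with open(demo_path, "w", encoding="utf-8") as fh:
        fh.write("app = object()\n")
    found_app = has_attribute(demo_path, "app")
    found_missing = has_attribute(demo_path, "application")
```

Because the lint and test steps both end in `|| true`, they always exit successfully, so among the checks in this workflow only the smoke import can turn the job red.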

Dockerfile

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+FROM python:3.11-slim
+
+ENV PYTHONDONTWRITEBYTECODE=1 \
+    PYTHONUNBUFFERED=1 \
+    PIP_NO_CACHE_DIR=1
+
+WORKDIR /app
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    build-essential \
+    && rm -rf /var/lib/apt/lists/*
+
+COPY requirements.txt ./
+RUN pip install --upgrade pip && pip install -r requirements.txt
+
+COPY . .
+
+EXPOSE 5000
+CMD ["gunicorn", "app:app", "--bind", "0.0.0.0:5000"]

README.md

Lines changed: 108 additions & 2 deletions
@@ -1,3 +1,109 @@
-STUDENT PERFORMANCE PREDICTOR
+Student Exam Performance Predictor
+=================================
 
-LIVE DEMO: https://student-performance-predictor-0hqd.onrender.com
+Live demo: https://student-performance-predictor-0hqd.onrender.com
+
+Predict a student's maths score from demographic factors and prior reading/writing scores using a trained ML pipeline, served via Flask.
+
+Features
+--------
+- Flask web app with a simple form UI
+- Sklearn preprocessing pipeline with numeric/categorical handling
+- Model selection across multiple regressors with hyperparameter tuning
+- Persisted preprocessor and model artifacts in `artifacts/`
+- One-click deploy with Gunicorn + Procfile (compatible with Render/Heroku)
+
+Tech stack
+---------
+- Python, Flask, Jinja2
+- scikit-learn, CatBoost, XGBoost
+- pandas, numpy
+
+Project structure
+-----------------
+```text
+.
+├── app.py # Flask application
+├── src/
+│ ├── components/
+│ │ ├── data_ingestion.py # Load CSV, split train/test
+│ │ ├── data_transformation.py # Preprocess pipelines, save preprocessor
+│ │ └── model_trainer.py # Model selection, tuning, save best model
+│ ├── pipeline/
+│ │ ├── predict_pipeline.py # Inference with saved artifacts
+│ │ └── train_pipeline.py # End-to-end training entrypoint
+│ ├── utils.py # IO helpers, grid search evaluate
+│ ├── exception.py # Custom exception wrapper
+│ └── logger.py # File logging setup
+├── templates/ # Jinja templates for UI
+├── artifacts/ # Saved data, preprocessor, model
+├── notebook/ # EDA and training notebooks
+├── requirements.txt
+├── setup.py
+└── Procfile
+```
+
+Quick start
+-----------
+1) Setup environment
+```bash
+python -m venv .venv && source .venv/bin/activate
+pip install --upgrade pip
+pip install -r requirements.txt
+```
+
+2) Train the model (artifacts will be created under `artifacts/`)
+```bash
+python -m src.pipeline.train_pipeline
+```
+
+3) Run locally
+```bash
+python app.py
+# or production style
+gunicorn app:app --bind 0.0.0.0:5000
+```
+
+4) Open the app
+- Navigate to `http://127.0.0.1:5000/`
+
+Usage
+-----
+- Fill in gender, race/ethnicity, parental education, lunch, test preparation, and prior reading/writing scores.
+- Submit to get the predicted maths score. Artifacts must exist at `artifacts/model.pkl` and `artifacts/preprocessor.pkl`.
+
+Data
+----
+- Training data is read from `notebook/data/stud.csv` inside the training pipeline.
+- The pipeline splits into train/test and stores copies in `artifacts/train.csv` and `artifacts/test.csv`.
+
+Deployment
+----------
+- The repository includes a `Procfile` for platforms like Render/Heroku. The command used is `gunicorn app:app`.
+
+Development scripts
+-------------------
+Common commands:
+```bash
+# Run training
+python -m src.pipeline.train_pipeline
+
+# Start web server
+python app.py
+
+# Lint (once we add ruff/flake8)
+ruff check .
+```
+
+Roadmap / Improvements
+----------------------
+- Add CI (GitHub Actions) to run formatting, linting, and a smoke test
+- Add unit tests for `utils.load_object/save_object` and `PredictPipeline`
+- Add type hints and docstrings for public functions
+- Replace hardcoded paths with environment variables where appropriate
+- Add Dockerfile for reproducible deployments
+- Add CONTRIBUTING.md and CODE_OF_CONDUCT.md
+
+License
+-------
+If you intend the project to be open-source, add a `LICENSE` file (e.g., MIT).

app.py

Lines changed: 19 additions & 24 deletions
@@ -1,14 +1,11 @@
-from flask import Flask,request,render_template
-import numpy as np
-import pandas as pd
+from flask import Flask, request, render_template
 import threading
 import webbrowser
 import warnings
 warnings.filterwarnings("ignore", category=UserWarning, module="sklearn")
 
 
-from sklearn.preprocessing import StandardScaler
-from src.pipeline.predict_pipeline import CustomData,PredictPipeline
+from src.pipeline.predict_pipeline import CustomData, PredictPipeline
 
 application=Flask(__name__)
 
@@ -25,25 +22,23 @@ def predict_datapoint():
     if request.method=='GET':
         return render_template('home.html')
     else:
-        data=CustomData(
-            gender=request.form.get('gender'),
-            race_ethnicity=request.form.get('ethnicity'),
-            parental_level_of_education=request.form.get('parental_level_of_education'),
-            lunch=request.form.get('lunch'),
-            test_preparation_course=request.form.get('test_preparation_course'),
-            reading_score=float(request.form.get('writing_score')),
-            writing_score=float(request.form.get('reading_score'))
-
-        )
-        pred_df=data.get_data_as_data_frame()
-        print(pred_df)
-        print("Before Prediction")
-
-        predict_pipeline=PredictPipeline()
-        print("Mid Prediction")
-        results=predict_pipeline.predict(pred_df)
-        print("after Prediction")
-        return render_template('home.html',results=results[0])
+        try:
+            data = CustomData(
+                gender=request.form.get('gender'),
+                race_ethnicity=request.form.get('ethnicity'),
+                parental_level_of_education=request.form.get('parental_level_of_education'),
+                lunch=request.form.get('lunch'),
+                test_preparation_course=request.form.get('test_preparation_course'),
+                reading_score=float(request.form.get('reading_score')),
+                writing_score=float(request.form.get('writing_score')),
+            )
+            pred_df = data.get_data_as_data_frame()
+
+            predict_pipeline = PredictPipeline()
+            results = predict_pipeline.predict(pred_df)
+            return render_template('home.html', results=results[0])
+        except Exception as e:
+            return render_template('home.html', error=str(e)), 400
 
 
 def open_browser():
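
The handler above converts form fields with `float(request.form.get(...))`. When a field is missing, `get` returns `None` and `float(None)` raises a `TypeError` whose message ends up in the error box more or less verbatim. A minimal sketch of a friendlier parser is shown below. The helper name `parse_score` and the 0 to 100 range are assumptions for illustration and are not taken from the repository.

```python
def parse_score(form, field, low=0.0, high=100.0):
    """Read `field` from a form-like mapping and return it as a float.

    Hypothetical helper: raises ValueError with a user-readable message when
    the field is missing, not numeric, or outside the assumed [low, high] range.
    """
    raw = form.get(field)
    if raw is None or str(raw).strip() == "":
        raise ValueError(f"{field} is required")
    try:
        value = float(raw)
    except ValueError:
        raise ValueError(f"{field} must be a number, got {raw!r}") from None
    # NaN fails this comparison, so it is rejected along with out-of-range values.
    if not (low <= value <= high):
        raise ValueError(f"{field} must be between {low:g} and {high:g}")
    return value


# Any mapping with .get() works here, including Flask's request.form.
demo_form = {"reading_score": "72", "writing_score": " 68.5 "}
reading = parse_score(demo_form, "reading_score")
writing = parse_score(demo_form, "writing_score")
```

Inside the existing `try` block, the two `float(...)` calls could be swapped for `parse_score(request.form, 'reading_score')` and `parse_score(request.form, 'writing_score')`, and the `except` branch would then render the clearer message.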

pyproject.toml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+[tool.ruff]
+line-length = 100
+target-version = "py311"
+lint.select = ["E", "F", "W", "I"]
+lint.ignore = ["E501"]

src/pipeline/train_pipeline.py

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+import os
+import sys
+from dataclasses import dataclass
+
+from src.exception import CustomException
+from src.logger import logging
+from src.components.data_ingestion import DataIngestion
+from src.components.data_transformation import DataTransformation
+from src.components.model_trainer import ModelTrainer
+
+
+@dataclass
+class TrainConfig:
+    raw_data_path: str = os.path.join('artifacts', 'data.csv')
+    train_data_path: str = os.path.join('artifacts', 'train.csv')
+    test_data_path: str = os.path.join('artifacts', 'test.csv')
+
+
+def run_training_pipeline() -> float:
+    try:
+        logging.info("Training pipeline started")
+
+        data_ingestion = DataIngestion()
+        train_path, test_path = data_ingestion.initiate_data_ingestion()
+
+        data_transformation = DataTransformation()
+        train_arr, test_arr, _ = data_transformation.initiate_data_transformation(
+            train_path, test_path
+        )
+
+        model_trainer = ModelTrainer()
+        r2 = model_trainer.initiate_model_trainer(train_arr, test_arr)
+
+        logging.info(f"Training pipeline completed. Test R2: {r2}")
+        return r2
+    except Exception as e:
+        raise CustomException(e, sys)
+
+
+if __name__ == "__main__":
+    run_training_pipeline()
+
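
`TrainConfig` above is declared but never referenced by `run_training_pipeline` in this file. If it is kept, one way to line it up with the README roadmap item about replacing hardcoded paths is sketched below. The environment variable name `ARTIFACTS_DIR` and the class name `TrainPaths` are hypothetical and do not appear in the repository.

```python
import os
from dataclasses import dataclass, field


def _artifact_path(filename):
    """Build a path under ARTIFACTS_DIR (hypothetical variable), defaulting to 'artifacts'."""
    return os.path.join(os.environ.get("ARTIFACTS_DIR", "artifacts"), filename)


@dataclass
class TrainPaths:
    # default_factory defers the lookup to instantiation time, so the
    # environment is read when the object is created, not at import time.
    raw_data_path: str = field(default_factory=lambda: _artifact_path("data.csv"))
    train_data_path: str = field(default_factory=lambda: _artifact_path("train.csv"))
    test_data_path: str = field(default_factory=lambda: _artifact_path("test.csv"))


default_paths = TrainPaths()
```

A plain default such as `os.path.join('artifacts', 'train.csv')` is evaluated once when the class body runs, which is why the sketch uses `default_factory` instead.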

templates/home.html

Lines changed: 6 additions & 0 deletions
@@ -101,6 +101,12 @@ <h1 class="text-3xl font-bold mb-6">Prediction Form</h1>
         if (el) el.scrollIntoView({ behavior: 'smooth', block: 'center' });
       </script>
       {% endif %}
+
+      {% if error %}
+      <div class="mt-6 p-5 bg-red-100 text-red-800 text-center rounded-xl font-semibold text-lg">
+        {{ error }}
+      </div>
+      {% endif %}
     </div>
 
     <!-- Right: Tips panel -->

tests/test_utils.py

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+import os
+import tempfile
+
+from src.utils import save_object, load_object
+
+
+def test_save_and_load_object_roundtrip():
+    with tempfile.TemporaryDirectory() as tmpdir:
+        path = os.path.join(tmpdir, 'obj.pkl')
+        data = {'a': 1, 'b': [1, 2, 3]}
+        save_object(path, data)
+        loaded = load_object(path)
+        assert loaded == data
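
The test imports `save_object` and `load_object` from `src.utils`, which this commit does not show. For readers following along without the repository, here is a minimal pickle-based stand-in, assuming the `(file_path, obj)` argument order used in the test. The real helpers may be implemented differently (for example with another serializer or with custom exception wrapping).

```python
import os
import pickle
import tempfile


def save_object(file_path, obj):
    """Stand-in: pickle `obj` to `file_path`, creating parent directories if needed."""
    directory = os.path.dirname(file_path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(file_path, "wb") as fh:
        pickle.dump(obj, fh)


def load_object(file_path):
    """Stand-in: load a pickled object from `file_path`."""
    with open(file_path, "rb") as fh:
        return pickle.load(fh)


def roundtrip(obj):
    """Save `obj` into a nested temp path and load it back."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "nested", "obj.pkl")
        save_object(path, obj)
        return load_object(path)


restored = roundtrip({"a": 1, "b": [1, 2, 3]})
```

Writing to a nested path exercises directory creation as well as serialization, which a flat path like the one in the test would not cover.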
