This project provides a **production-style machine learning inference API**
4
-
for detecting fraudulent credit card transactions.
3
+
A production-style machine learning inference API for detecting fraudulent credit card transactions.
4
+
The API serves a trained XGBoost fraud detection model via FastAPI, containerized with Docker and automatically deployed via a GitHub Actions CI/CD pipeline.
5
5
6
-
The API serves a trained **XGBoost fraud detection model** via **FastAPI**,
7
-
with strict input validation, feature alignment, and safe inference handling.
8
-
9
-
The goal of this project is to demonstrate how a trained ML model is
10
-
**exposed, validated, and consumed** in a real-world system — not just trained.
6
+
---
11
7
12
8
## Problem Context
13
9
14
-
Credit card fraud detection is a **highly imbalanced classification problem**,
15
-
where missing a fraudulent transaction is often more costly than flagging
16
-
a legitimate one.
10
+
Credit card fraud detection is a highly imbalanced classification problem, where missing a fraudulent transaction is often more costly than flagging a legitimate one.
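The cost asymmetry described above can be made concrete with a small sketch. It assumes the model outputs calibrated probabilities and that each error type has a fixed cost; the function names and the cost values are illustrative and do not come from this project. Under those assumptions, flagging a transaction is cheaper in expectation whenever `p * cost_fn >= (1 - p) * cost_fp`, which rearranges to the threshold below.

```python
def cost_based_threshold(cost_false_positive: float, cost_false_negative: float) -> float:
    """Return the probability threshold that minimizes expected cost.

    Assumes calibrated probabilities and fixed, positive per-error costs
    (both are assumptions for illustration, not project facts).
    """
    if cost_false_positive <= 0 or cost_false_negative <= 0:
        raise ValueError("costs must be positive")
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def decide(fraud_probability: float, threshold: float) -> bool:
    """Flag the transaction as fraud when the probability reaches the threshold."""
    return fraud_probability >= threshold


# Hypothetical costs: a missed fraud is nine times as costly as a false alarm.
threshold = cost_based_threshold(cost_false_positive=1.0, cost_false_negative=9.0)
is_fraud = decide(0.2, threshold)
```

The more expensive a missed fraud is relative to a false alarm, the lower the threshold falls, so more transactions get flagged.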
17
11
18
-
In real production systems, models are rarely used directly.
19
-
They must be:
12
+
In real production systems, models are rarely used directly. They must be:
20
13
- Validated
21
14
- Properly preprocessed
22
15
- Safely deployed behind an API
23
16
24
17
This project focuses on the **serving and inference layer** of a fraud detection system.
25
18
19
+
---
26
20
27
21
## Key Features
28
22
@@ -34,47 +28,49 @@ This project focuses on the **serving and inference layer** of a fraud detection
34
28
- Clear error handling with meaningful HTTP responses
35
29
- Model metadata endpoint for observability
36
30
- JSON-safe prediction outputs
31
+
- Dockerized for consistent deployment anywhere
32
+
- CI/CD pipeline via GitHub Actions → auto-builds and pushes to Docker Hub on every push
37
33
34
+
---
38
35
39
36
## Project Structure
40
37
41
-
```text
38
+
```
42
39
fraud-detection-api/
43
40
├── app/
44
41
│ ├── main.py # API endpoints
45
42
│ ├── schemas.py # Request validation schemas
46
-
│ ├── inference.py # Model inference logic
47
-
│ ├── config.py # Centralized configuration
43
+
│ ├── inference.py # Model inference logic
44
+
│ ├── config.py # Centralized configuration
48
45
│ └── __init__.py
49
46
├── models/
50
47
│ └── xgboost.pkl
51
48
├── artifacts/
52
49
│ └── standard_scaler.pkl
50
+
├── .github/
51
+
│ └── workflows/
52
+
│ └── deploy.yml # CI/CD pipeline
53
+
├── Dockerfile
53
54
└── .gitignore
54
55
```
55
56
56
57
---
57
58
58
-
## 📌 API Endpoints
59
+
## API Endpoints
59
60
60
-
```markdown
61
-
### Health Check :
61
+
### `GET /health`
62
62
Returns API health status.
63
63
64
-
### Model Information :
65
-
Returns model metadata such as:
66
-
- Model name
67
-
- Version
68
-
- Threshold
64
+
### `GET /model-info`
65
+
Returns model metadata:
66
+
- Model name and version
67
+
- Decision threshold
69
68
- Number of features
70
69
71
-
### Fraud Prediction
72
-
Accepts a full feature vector (Time, Amount, V1–V28) and returns
73
-
a fraud probability and decision.
74
-
```
75
-
76
-
## Example Prediction Request
70
+
### `POST /predict`
71
+
Accepts a full feature vector (Time, Amount, V1–V28) and returns a fraud probability and decision.
77
72
73
+
**Example Request:**
78
74
```json
79
75
{
80
76
"Time": 50000,
@@ -109,30 +105,72 @@ a fraud probability and decision.
109
105
"V28": -0.04
110
106
}
111
107
```
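As a rough illustration of how a client might send such a request, here is a minimal sketch using only the Python standard library. It assumes the API is reachable at `http://localhost:8000`; the helper names (`build_payload`, `build_request`, `predict`) are hypothetical, and the response is returned as parsed JSON without assuming any particular field names.

```python
import json
from typing import Any, Dict, List
from urllib import request


def build_payload(time: float, amount: float, v: List[float]) -> Dict[str, float]:
    """Assemble the 30-field body: Time, Amount, and V1..V28."""
    if len(v) != 28:
        raise ValueError(f"expected 28 PCA features, got {len(v)}")
    payload: Dict[str, float] = {"Time": time, "Amount": amount}
    payload.update({f"V{i}": value for i, value in enumerate(v, start=1)})
    return payload


def build_request(payload: Dict[str, float],
                  base_url: str = "http://localhost:8000") -> request.Request:
    """Create a JSON POST request for the /predict endpoint."""
    return request.Request(
        f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload: Dict[str, float],
            base_url: str = "http://localhost:8000") -> Any:
    """Send the request and return the decoded JSON response as-is."""
    with request.urlopen(build_request(payload, base_url), timeout=10) as resp:
        return json.load(resp)
```

Calling `predict(build_payload(50000, 120.5, [0.0] * 28))` requires the API to be running locally; the amount and the zero-valued V features here are placeholders, not real transaction data.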
108
+
109
+
---
110
+
111
+
## Running with Docker
112
+
113
+
Pull and run directly from Docker Hub — no setup required:
114
+
115
+
```bash
116
+
docker pull saaddot/fraud-detection-api
117
+
docker run -p 8000:8000 saaddot/fraud-detection-api
118
+
```
119
+
120
+
Then open: `http://localhost:8000/health`
121
+
122
+
Or build locally:
123
+
```bash
124
+
docker build -t fraud-detection-api .
125
+
docker run -p 8000:8000 fraud-detection-api
126
+
```
127
+
112
128
---
113
129
114
-
## 📌 Design Decisions
130
+
## CI/CD Pipeline
131
+
132
+
Every push to `main` triggers a GitHub Actions workflow that:
133
+
134
+
1. Checks out the code on a fresh Ubuntu runner
135
+
2. Logs into Docker Hub using repository secrets
136
+
3. Builds the Docker image
137
+
4. Pushes it to Docker Hub as `saaddot/fraud-detection-api:latest`
115
138
116
-
```markdown
117
-
- The API expects the **same feature schema used during training**
118
-
to avoid inference drift.
119
-
- PCA features (V1–V28) are assumed to be computed upstream.
120
-
- Only Time and Amount are scaled during inference.
121
-
- Feature order is explicitly enforced before prediction.
122
-
- Inference logic is separated from API routing and configuration.
123
139
```
140
+
Push to main
141
+
↓
142
+
GitHub Actions (Ubuntu)
143
+
↓
144
+
Build Docker image
145
+
↓
146
+
Push to Docker Hub ← anyone can pull and run
147
+
```
148
+
149
+
---
150
+
151
+
## Design Decisions
152
+
153
+
- The API expects the **same feature schema used during training** to avoid inference drift
154
+
- PCA features (V1–V28) are assumed to be computed upstream
155
+
- Only Time and Amount are scaled during inference
156
+
- Feature order is explicitly enforced before prediction
157
+
- Inference logic is separated from API routing and configuration
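
The decisions above (fixed schema, enforced feature order, scaling only Time and Amount) could be sketched roughly as follows. This is not the project's `inference.py`: the column order, the scaler statistics, and the function name are assumptions for illustration. In the real service, these values would come from the training pipeline and from `artifacts/standard_scaler.pkl`.

```python
from typing import Dict, List, Tuple

# Assumed training-time column order (hypothetical; must match the trained model).
FEATURE_ORDER: List[str] = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount"]

# Hypothetical (mean, std) pairs standing in for the fitted scaler artifact.
SCALER_STATS: Dict[str, Tuple[float, float]] = {
    "Time": (100.0, 50.0),
    "Amount": (10.0, 5.0),
}


def align_and_scale(payload: Dict[str, float]) -> List[float]:
    """Validate the schema, reorder features, and standardize Time and Amount."""
    missing = [name for name in FEATURE_ORDER if name not in payload]
    unexpected = [name for name in payload if name not in FEATURE_ORDER]
    if missing or unexpected:
        raise ValueError(f"schema mismatch: missing={missing}, unexpected={unexpected}")

    row: List[float] = []
    for name in FEATURE_ORDER:  # iteration order enforces the training order
        value = float(payload[name])
        if name in SCALER_STATS:  # V1..V28 pass through unscaled
            mean, std = SCALER_STATS[name]
            value = (value - mean) / std
        row.append(value)
    return row
```

Because the row is built by iterating over `FEATURE_ORDER` rather than over the incoming dictionary, the key order of the client's JSON has no effect on what the model sees.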