codezelaca/ai-ml-k-means-cluster-project01


Customer Segmentation ML Application

A production-ready FastAPI web application for customer segmentation using machine learning (KMeans clustering). It is built on an MVC architecture and follows industry best practices.

Features

  • Machine Learning: KMeans clustering for customer segmentation
  • FastAPI Backend: High-performance async API
  • Modern UI: Beautiful, responsive HTML interface
  • Real-time Predictions: Instant customer segment classification
  • MVC Architecture: Clean separation of concerns
  • Business Intelligence: Marketing strategy recommendations
  • Input Validation: Pydantic schemas for data validation
  • Auto Documentation: Swagger UI and ReDoc
  • Best Practices: Following industry standards

Project Structure

practice/
├── app/                          # Main application package
│   ├── __init__.py               # Package initialization
│   ├── core/                     # Core configuration
│   │   ├── __init__.py
│   │   └── config.py             # Application settings
│   ├── models/                   # ML Models (Model layer)
│   │   ├── __init__.py
│   │   └── ml_model.py           # KMeans model handler
│   ├── schemas/                  # Data validation schemas
│   │   ├── __init__.py
│   │   └── customer.py           # Pydantic models
│   ├── services/                 # Business logic (Service layer)
│   │   ├── __init__.py
│   │   └── prediction_service.py
│   ├── controllers/              # Request handlers (Controller layer)
│   │   ├── __init__.py
│   │   ├── api_controller.py     # REST API endpoints
│   │   └── view_controller.py    # HTML views
│   ├── templates/                # Jinja2 templates (View layer)
│   │   ├── index.html            # Home page
│   │   └── about.html            # About page
│   ├── static/                   # Static assets
│   │   ├── css/
│   │   │   └── style.css         # Styles
│   │   └── js/
│   │       └── app.js            # Frontend JavaScript
│   └── utils/                    # Utility functions
│       ├── __init__.py
│       └── helpers.py
├── data/                         # Data directory
│   ├── raw/                      # Raw data
│   │   └── mall_customers.csv
│   └── processed/                # Processed data
│       └── mall_customers_processed.csv
├── models_artifacts/             # Saved ML models
│   ├── kmeans_model.pkl          # Trained KMeans model
│   └── scaler.pkl                # Feature scaler
├── notebooks/                    # Jupyter notebooks
│   ├── 01_eda_preprocessing.ipynb
│   └── 02_modeling_evaluation.ipynb
├── main.py                       # Application entry point
├── train_model.py                # Model training script
├── requirements.txt              # Python dependencies
└── README.md                     # This file

Installation

Prerequisites

  • Python 3.10 or higher
  • pip package manager

Step 1: Clone or Navigate to Project

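# Clone the repository, or change into your existing local copy
# (the path below is an example; substitute your own)
cd d:\AIML\practice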

Step 2: Create Virtual Environment (Recommended)

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate

# On macOS/Linux:
source venv/bin/activate

Step 3: Install Dependencies

pip install -r requirements.txt

Usage

Step 1: Train the Model

First, you need to train the KMeans model and save the artifacts:

Option A: Using Jupyter Notebooks (Recommended for first-time users)

# Launch Jupyter
jupyter notebook

# Open and run these notebooks in order:
# 1. notebooks/01_eda_preprocessing.ipynb
# 2. notebooks/02_modeling_evaluation.ipynb

Option B: Using Training Script

# This requires processed data from the notebooks
python train_model.py

The training script will:

  • Load processed customer data
  • Train KMeans clustering model
  • Save model artifacts to models_artifacts/
  • Generate customer segment labels
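
The steps listed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the contents of the repository's train_model.py: the function name train_and_save, the use of pickle rather than joblib, n_init=10, and the fixed random_state are all illustrative choices. Only the artifact file names, the five clusters, and the k-means++ initialization come from this README.

```python
import pickle
from pathlib import Path

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def train_and_save(features, out_dir="models_artifacts", n_clusters=5):
    """Fit a scaler and a KMeans model, then pickle both into out_dir.

    `features` is an (n_samples, 2) array of [annual_income, spending_score].
    Returns the cluster label assigned to each row.
    """
    X = np.asarray(features, dtype=float)

    # Standardize so income and spending score contribute on the same scale
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)

    # k-means++ initialization; the seed is fixed so labels are reproducible
    model = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10, random_state=42)
    labels = model.fit_predict(X_scaled)

    # Persist both artifacts: the scaler is needed again at prediction time
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with open(out / "kmeans_model.pkl", "wb") as fh:
        pickle.dump(model, fh)
    with open(out / "scaler.pkl", "wb") as fh:
        pickle.dump(scaler, fh)

    return labels
```

Saving the scaler alongside the model matters because new inputs must be transformed with the same mean and standard deviation that were used during training.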

Step 2: Run the Application

# Start the FastAPI server
python main.py

# Or using uvicorn directly:
uvicorn main:app --reload --host 0.0.0.0 --port 8000

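Step 3: Access the Application

  • Web interface: http://localhost:8000
  • API docs (Swagger UI): http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc (FastAPI's default path, assuming it has not been changed)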

API Documentation

REST API Endpoints

1. Predict Customer Segment

POST /api/v1/predict
Content-Type: application/json

{
  "annual_income": 70.0,
  "spending_score": 75
}

Response:

{
  "cluster_id": 1,
  "cluster_name": "VIP / Whale",
  "annual_income": 70.0,
  "spending_score": 75,
  "description": "High income, high spending customers",
  "marketing_strategy": "Premium products, exclusive offers..."
}

2. Get Cluster Statistics

GET /api/v1/clusters

3. Get Cluster Information

GET /api/v1/clusters/info

4. Get Model Information

GET /api/v1/model/info

5. Health Check

GET /api/v1/health

Using Python Requests

import requests

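# Predict customer segment
response = requests.post(
    "http://localhost:8000/api/v1/predict",
    json={
        "annual_income": 90.0,
        "spending_score": 85
    },
    timeout=10,  # avoid hanging if the server is not running
)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
result = response.json()
print(f"Segment: {result['cluster_name']}")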

Using cURL

curl -X POST "http://localhost:8000/api/v1/predict" \
     -H "Content-Type: application/json" \
     -d '{"annual_income": 90.0, "spending_score": 85}'

Architecture

This application follows the MVC (Model-View-Controller) architectural pattern:

Model Layer (app/models/)

  • Handles ML model loading, predictions, and persistence
  • Manages KMeans clustering model and StandardScaler
  • Provides model metadata and cluster centroids
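
To make the model layer's prediction step concrete, here is a dependency-free sketch of what a fitted StandardScaler followed by KMeans prediction amounts to: scale the input, then pick the nearest centroid. The function names, scaler statistics, and centroid values below are hypothetical and chosen only for illustration; the real values live in the pickled artifacts.

```python
import math


def standardize(point, means, stds):
    """Apply the same (x - mean) / std scaling that StandardScaler performs."""
    return [(x - m) / s for x, m, s in zip(point, means, stds)]


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


# Hypothetical scaler statistics and centroids (in scaled space), illustration only
MEANS = [60.0, 50.0]
STDS = [26.0, 25.0]
CENTROIDS = [
    [0.0, 0.0],    # moderate income, moderate spending
    [1.0, 1.2],    # high income, high spending
    [-1.3, 1.1],   # low income, high spending
    [1.0, -1.3],   # high income, low spending
    [-1.3, -1.1],  # low income, low spending
]

scaled = standardize([90.0, 85.0], MEANS, STDS)
cluster_id = nearest_centroid(scaled, CENTROIDS)
print(cluster_id)  # prints 1 with the hypothetical values above
```

In the actual application, the loaded scikit-learn objects would do this work through their transform and predict methods.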

View Layer (app/templates/, app/static/)

  • Jinja2 templates for HTML rendering
  • CSS for styling
  • JavaScript for interactivity and API calls

Controller Layer (app/controllers/)

  • API Controller: REST API endpoints
  • View Controller: HTML page rendering
  • Request/response handling
  • Input validation

Service Layer (app/services/)

  • Business logic and rules
  • Prediction processing
  • Cluster descriptions and marketing strategies
  • Statistics calculation
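
One way the service layer might attach business meaning to a raw cluster ID is a simple lookup table. The sketch below is an assumption about the approach, not the code in prediction_service.py: SegmentInfo, SEGMENTS, and build_prediction are hypothetical names, and the ID-to-segment mapping is illustrative apart from cluster 1, which the example API response labels "VIP / Whale". The segment names, descriptions, and strategies are taken from the Customer Segments table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentInfo:
    name: str
    description: str
    marketing_strategy: str


# Hypothetical mapping: KMeans cluster IDs are arbitrary and depend on the trained model
SEGMENTS = {
    0: SegmentInfo("Average Customer", "Balanced shoppers",
                   "Standard promotions, loyalty programs"),
    1: SegmentInfo("VIP / Whale", "Premium customers",
                   "Exclusive offers, VIP experiences"),
    2: SegmentInfo("Young Trendsetter", "Fashion-conscious",
                   "Trendy products, social media marketing"),
    3: SegmentInfo("High Earner Saver", "Conservative spenders",
                   "Quality products, investment opportunities"),
    4: SegmentInfo("Budget Conscious", "Price-sensitive",
                   "Discounts, clearance sales"),
}


def build_prediction(cluster_id, annual_income, spending_score):
    """Combine a raw cluster ID with its segment description and strategy."""
    segment = SEGMENTS[cluster_id]
    return {
        "cluster_id": cluster_id,
        "cluster_name": segment.name,
        "annual_income": annual_income,
        "spending_score": spending_score,
        "description": segment.description,
        "marketing_strategy": segment.marketing_strategy,
    }
```

Keeping this mapping out of the model layer means the marketing copy can change without retraining or touching the ML code.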

Core Configuration (app/core/)

  • Application settings
  • Environment variables
  • Path configurations

Schemas (app/schemas/)

  • Pydantic models for validation
  • Request/response schemas
  • Type safety
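
As an illustration of how such schemas might be declared, the sketch below mirrors the fields of the /api/v1/predict request and response shown earlier. The class names and the numeric bounds are assumptions, not taken from app/schemas/customer.py: a positive income and a 1 to 100 spending score are plausible constraints, but the project may validate differently.

```python
from pydantic import BaseModel, Field, ValidationError


class CustomerInput(BaseModel):
    # Bounds below are assumed for illustration
    annual_income: float = Field(..., gt=0, description="Annual income")
    spending_score: int = Field(..., ge=1, le=100, description="Spending score")


class PredictionResponse(BaseModel):
    cluster_id: int
    cluster_name: str
    annual_income: float
    spending_score: int
    description: str
    marketing_strategy: str


# An out-of-range value is rejected before it ever reaches the model
try:
    CustomerInput(annual_income=70.0, spending_score=150)
except ValidationError as exc:
    print(f"Rejected: {len(exc.errors())} validation error(s)")
```

When a model like CustomerInput is used as a FastAPI request body, invalid payloads are answered with a 422 response automatically.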

Customer Segments

The model identifies 5 distinct customer segments:

| Segment | Income | Spending | Description | Marketing Strategy |
|---|---|---|---|---|
| Average Customer | Moderate | Moderate | Balanced shoppers | Standard promotions, loyalty programs |
| VIP / Whale | High | High | Premium customers | Exclusive offers, VIP experiences |
| Young Trendsetter | Low-Moderate | High | Fashion-conscious | Trendy products, social media marketing |
| High Earner Saver | High | Low | Conservative spenders | Quality products, investment opportunities |
| Budget Conscious | Low | Low | Price-sensitive | Discounts, clearance sales |

Technologies

Backend

  • FastAPI - Modern, fast web framework
  • Uvicorn - ASGI server
  • Pydantic - Data validation
  • scikit-learn - Machine Learning
  • pandas - Data manipulation
  • NumPy - Numerical computing

Frontend

  • HTML5 - Semantic markup
  • CSS3 - Modern styling with gradients and animations
  • JavaScript - Async API calls and DOM manipulation
  • Jinja2 - Template engine

ML/Data Science

  • KMeans Clustering - Customer segmentation algorithm
  • StandardScaler - Feature normalization
  • matplotlib - Visualization (notebooks)
  • seaborn - Statistical visualization (notebooks)

Configuration

Application settings can be configured in app/core/config.py:

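# BaseSettings lives in pydantic_settings for Pydantic v2 (in pydantic for v1)
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    APP_NAME: str = "Customer Segmentation API"
    APP_VERSION: str = "1.0.0"
    DEBUG: bool = True  # set to False outside development
    API_PREFIX: str = "/api/v1"
    # ... more settings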

Model Performance

  • Algorithm: KMeans with k-means++ initialization
  • Number of Clusters: 5 (chosen using the Elbow Method)
  • Features: Annual Income, Spending Score
  • Preprocessing: StandardScaler normalization
  • Validation: Silhouette Score
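
The Elbow Method and Silhouette Score mentioned above can be computed together by fitting KMeans for a range of cluster counts. The following is a hedged sketch rather than the notebook's actual code: evaluate_k, the candidate range, n_init=10, and the seed are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def evaluate_k(features, k_values=range(2, 11), seed=42):
    """Return {k: (inertia, silhouette)} for each candidate cluster count."""
    X = StandardScaler().fit_transform(np.asarray(features, dtype=float))
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=seed)
        labels = model.fit_predict(X)
        # Inertia (within-cluster sum of squares) is what the elbow plot shows;
        # silhouette measures cluster separation and lies in [-1, 1]
        results[k] = (model.inertia_, silhouette_score(X, labels))
    return results
```

Plotting inertia against k and looking for the point where the curve flattens gives the elbow; a higher silhouette score at that k supports the choice.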

🚦 Testing

Manual Testing

  1. Test via Web Interface: Navigate to http://localhost:8000
  2. Test via API Docs: Go to http://localhost:8000/docs
  3. Test via Command Line:
# Health check
curl http://localhost:8000/api/v1/health

# Prediction
curl -X POST http://localhost:8000/api/v1/predict \
  -H "Content-Type: application/json" \
  -d '{"annual_income": 90, "spending_score": 85}'

License

This project is created for educational purposes.
