Customer-Churn-Prediction-Personal-Project

Goal: Train a supervised learning model to predict customer churn for the fictional company Fabapalooza, and demonstrate how the model can guide targeted retention efforts.

1. Project Overview

This project develops a supervised learning model to predict customer churn for a fictional company called Fabapalooza. The goal is to help the company decide which customers to prioritize in a retention plan in which each intervention costs €20,000.

Your team must deliver ML models to:

- Predict which customers are likely to churn.

- Estimate expected lost revenue for each customer.

- Identify a profitable subset of customers to target with retention efforts.

- Communicate findings through a recorded presentation and reproducible R code.

Project deliverables:

- A reproducible R Markdown file implementing the full analysis.

- A reproducible Jupyter notebook implementing the full analysis in Python.

The project consists of data understanding, model training and tuning, model assessment, and a demonstration of how predictions inform retention decisions.

2. Business Case Description

Fabapalooza provides 3D printer hardware for business clients. Its customer base consists largely of small companies and startups, leading to high churn rates. Despite the volatility, long-term customers provide significant value through referrals and stable revenue.

The company plans to:

- Predict each customer’s probability of churn.

- Estimate expected revenue lost if the customer churns.

- Multiply these to calculate expected loss: $\text{Expected Loss} = E[\text{Lost Revenue} \mid \text{Churn}] \times \Pr(\text{Churn})$

- Prioritize customers with the highest expected loss, as long as the expected benefit exceeds the €20,000 intervention cost.
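The expected-loss ranking above can be sketched in a few lines of plain Python. This is only an illustration: the customer IDs, probabilities, and revenue figures are made up, and the sketch assumes that the expected benefit of an intervention equals the customer's expected loss (that is, an intervention fully prevents the churn), which the business case does not state.

```python
INTERVENTION_COST = 20_000  # euros per intervention, from the business case


def expected_loss(p_churn, lost_revenue_if_churn):
    """Expected loss = E[lost revenue | churn] * Pr(churn)."""
    if not 0.0 <= p_churn <= 1.0:
        raise ValueError("p_churn must be a probability in [0, 1]")
    return lost_revenue_if_churn * p_churn


# Hypothetical customers: (id, Pr(churn), E[lost revenue | churn] in euros)
customers = [
    ("A", 0.50, 100_000),
    ("B", 0.10, 150_000),
    ("C", 0.80, 20_000),
]

losses = {cid: expected_loss(p, rev) for cid, p, rev in customers}

# Rank customers by expected loss, highest first
ranked = sorted(losses, key=losses.get, reverse=True)

# ASSUMPTION: expected benefit == expected loss, so a customer is worth
# targeting only if the expected loss exceeds the intervention cost
worth_targeting = [cid for cid in ranked if losses[cid] > INTERVENTION_COST]
```

Note that customer C has the highest churn probability but a low expected loss, so ranking by probability alone would prioritize differently than ranking by expected loss.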

Your task is to build the predictive model supporting this plan.

3. Data Description

customers.Rdata: the customer-year dataset.

analysis_template.R: Parker’s partially completed analysis file.

Each row represents a customer at the end of a specific year. Simplifying assumptions include fixed customer attributes across years, churn only at year-end, and no returning customers.

Candidate models

- LASSO logistic regression

- Random forest classifier

Use tidymodels for:

- Preprocessing

- Model tuning

- Cross-validation (5 folds, 4 repeats provided)
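To make the resampling scheme concrete, here is a minimal plain-Python sketch of what "5 folds, 4 repeats" means: 20 train/validation splits in total. It is not the project's code (the folds are provided and used through tidymodels); the function name `repeated_kfold` and the row count are hypothetical.

```python
import random


def repeated_kfold(n_rows, n_folds=5, n_repeats=4, seed=0):
    """Yield (repeat, fold, train_idx, validation_idx) for each resample."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_rows))
        rng.shuffle(indices)  # fresh random partition for every repeat
        # Deal the shuffled rows into n_folds nearly equal groups
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for fold, validation_idx in enumerate(folds):
            held_out = set(validation_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield repeat, fold, train_idx, validation_idx


# Hypothetical dataset with 23 rows: 5 folds x 4 repeats = 20 resamples
splits = list(repeated_kfold(n_rows=23))
```

Within each repeat every row is held out exactly once, so each candidate model is scored on 20 validation sets and the averaged metric is less sensitive to any single partition.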

Final Model Demonstration

- Retrain the selected model on the full analysis set (pre-2024)

- Generate soft predictions (churn probabilities) on the assessment set (2024 customers)

- Estimate expected lost revenue for each customer

- Compute expected loss using predicted churn probabilities

- Determine and recommend how many customers should be targeted to maximize expected net value
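The last step can be sketched as follows: rank customers by expected loss, accumulate the net value of targeting the top k, and pick the k with the highest total. This is a sketch under a stated assumption, not the project's method: it treats an intervention as fully recovering the customer's expected loss, and the function name `best_target_count` and the example figures are hypothetical.

```python
from itertools import accumulate

INTERVENTION_COST = 20_000  # euros per intervention, from the business case


def best_target_count(expected_losses, cost=INTERVENTION_COST):
    """Return (k, net_value): how many top-ranked customers to target."""
    ranked = sorted(expected_losses, reverse=True)
    # ASSUMPTION: targeting a customer recovers their full expected loss,
    # so the marginal net value of one intervention is (loss - cost)
    cumulative = list(accumulate(loss - cost for loss in ranked))
    best_k, best_net = 0, 0.0  # targeting nobody has net value 0
    for k, net in enumerate(cumulative, start=1):
        if net > best_net:
            best_k, best_net = k, net
    return best_k, best_net


# Hypothetical expected losses in euros for five assessment-set customers
example_losses = [50_000, 15_000, 16_000, 32_000, 21_000]
k, net = best_target_count(example_losses)
```

Because the customers are sorted by expected loss, the cumulative net value rises while the marginal customer's expected loss exceeds €20,000 and falls afterwards, so the recommended count is the number of customers above that threshold.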
