Bank Customer Complaints - NLP Classification

Overview

This project uses Natural Language Processing (NLP) and Machine Learning to classify consumer complaint narratives into banking product categories.

The goal is to help financial institutions and regulatory teams organize customer complaints more efficiently, identify common issue areas, and support faster response and resolution.

Business Objective

Financial institutions receive large volumes of customer complaints in free-text format. Manually reviewing and categorizing these complaints can be time-consuming and inconsistent.

This project automates complaint classification by predicting the product category based on the complaint narrative.

Input: Consumer complaint narrative
Output: Predicted banking product category
Example categories: Mortgage, Credit Card, Student Loan, Credit Reporting

Dataset

The dataset is based on the public Consumer Financial Protection Bureau (CFPB) Consumer Complaint Database.

Main columns used:

consumer_complaint_narrative: Complaint text
product: Target product category

Methodology

1. Data Cleaning and Preprocessing

Removed missing complaint narratives
Converted text to lowercase
Removed punctuation and special characters
Removed stopwords
Applied lemmatization
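
The cleaning steps above can be sketched roughly as follows. This is a minimal standard-library illustration, not the notebook's actual code: the `STOPWORDS` set is a tiny hypothetical stand-in for NLTK's English stopword list, and `lemmatize` is a pluggable placeholder (identity by default) where a real lemmatizer such as NLTK's `WordNetLemmatizer` would go.

```python
import re

# Tiny illustrative stopword set (an assumption); NLTK's English list is much larger.
STOPWORDS = {"a", "an", "and", "the", "is", "my", "to", "of", "on", "for", "i", "was", "in"}


def drop_missing(narratives):
    """Keep only narratives that are present and non-blank."""
    return [n for n in narratives if n is not None and n.strip()]


def clean_text(text, lemmatize=lambda token: token):
    """Lowercase, strip punctuation/special characters, drop stopwords, lemmatize."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # replace anything but letters, digits, whitespace
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(lemmatize(t) for t in tokens)


# Invented example narrative, for illustration only.
raw = ["My mortgage payment was applied to the WRONG account!!!", None, "   "]
cleaned = [clean_text(n) for n in drop_missing(raw)]
print(cleaned)  # ['mortgage payment applied wrong account']
```

Passing the lemmatizer in as a function keeps the sketch runnable without downloading NLTK corpora.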

2. Feature Engineering

Applied TF-IDF vectorization to convert text into numerical features
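
To make the TF-IDF step concrete, here is a small pure-Python sketch of the weighting. It uses the smoothed IDF, `ln((1 + n) / (1 + df)) + 1`, and the L2 row normalization that scikit-learn's `TfidfVectorizer` applies by default. The whitespace tokenization and the toy documents are assumptions for illustration; in practice the vectorizer does all of this in one call.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, rows) using smoothed IDF and L2-normalized rows."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        weights = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in weights)) or 1.0
        rows.append([w / norm for w in weights])
    return vocab, rows


# Invented toy documents, already cleaned.
docs = ["mortgage payment late", "credit card payment", "credit report error"]
vocab, rows = tfidf(docs)
first = dict(zip(vocab, rows[0]))
# "mortgage" appears in one document and "payment" in two, so "mortgage" gets more weight.
print(first["mortgage"] > first["payment"])  # True
```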

3. Modeling

The following machine learning models were tested:

Logistic Regression
Random Forest Classifier
XGBoost Classifier
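
As a rough sketch of how one of these models might be trained on TF-IDF features, the example below chains `TfidfVectorizer` and `LogisticRegression` in a scikit-learn pipeline. The toy narratives, labels, and the `max_iter` setting are assumptions for illustration, not the notebook's data or hyperparameters.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy narratives for illustration only; not taken from the CFPB data.
texts = [
    "mortgage escrow payment misapplied",
    "mortgage loan modification denied",
    "credit card late fee charged",
    "credit card unauthorized charge",
    "student loan servicer lost payment",
    "student loan deferment rejected",
]
labels = [
    "Mortgage", "Mortgage",
    "Credit Card", "Credit Card",
    "Student Loan", "Student Loan",
]

# The pipeline fits the vectorizer and the classifier together on the training text.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(texts, labels)

prediction = model.predict(["escrow shortage on my mortgage"])[0]
print(prediction)
```

Swapping in `RandomForestClassifier` only changes the last pipeline step. XGBoost's classifier may additionally need the product labels encoded as integers first.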

4. Evaluation

Models were evaluated using:

Accuracy
Precision
Recall
F1-score
Confusion Matrix
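
The metrics above all derive from confusion counts. The sketch below computes them by hand on invented, deliberately imbalanced labels, which also shows how accuracy and macro F1 can diverge when a small category is predicted poorly. In practice, scikit-learn's `classification_report` and `confusion_matrix` provide the same figures.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} computed from confusion counts."""
    confusion = Counter(zip(y_true, y_pred))  # (actual, predicted) -> count
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = confusion[(label, label)]
        fp = sum(confusion[(other, label)] for other in labels if other != label)
        fn = sum(confusion[(label, other)] for other in labels if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Invented toy labels: 8 Mortgage complaints and 2 Student Loan complaints.
y_true = ["Mortgage"] * 8 + ["Student Loan"] * 2
y_pred = ["Mortgage"] * 8 + ["Mortgage", "Student Loan"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
scores = per_class_scores(y_true, y_pred)
macro_f1 = sum(f1 for _, _, f1 in scores.values()) / len(scores)
print(f"accuracy={accuracy:.2f} macro_f1={macro_f1:.2f}")  # accuracy=0.90 macro_f1=0.80
```

Macro F1 averages the per-category scores without weighting by sample count, so one weak minority category pulls it down even when overall accuracy stays high.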

Results

The best-performing model was the XGBoost classifier.

Key results:

Accuracy: 0.90
Macro F1-score: 0.74
Strong performance for categories such as Mortgage and Credit Reporting
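Lower performance for underrepresented categories with fewer samples, which is consistent with the macro F1-score (an unweighted average across categories) falling below overall accuracy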

Tools and Technologies

Python
Pandas
NumPy
Scikit-learn
XGBoost
NLTK
Matplotlib
Seaborn
Jupyter Notebook

How to Run

Clone the repository:

git clone https://github.com/bahar-data/bank-customer-complaints-nlp.git
cd bank-customer-complaints-nlp
