Spam Detection System

NLP-Powered SMS Spam Classifier with Real-Time Confidence Scoring

Intelligent SMS filtering using NLP and Deep Learning

Overview

A machine learning-powered application that identifies spam messages using Natural Language Processing (NLP). The system:

Analyzes SMS/text messages in real-time
Classifies text as Spam or Ham (legitimate)
Provides confidence scores for predictions
Processes messages instantly with <100ms latency
Features an intuitive web interface
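
As a rough illustration of how a confidence score can be derived from a two-class output, the sketch below applies softmax to a pair of raw scores and reports the winning label with its probability. The label order in `LABELS`, the helper names, and the input scores are assumptions made for this example; they are not taken from the project's code.

```python
import math

LABELS = ("ham", "spam")  # assumed class order; the real model's order may differ


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(probabilities, labels=LABELS):
    """Return the most likely label and its probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best], probabilities[best]


probs = softmax([0.2, 4.5])  # made-up scores for illustration
label, confidence = label_with_confidence(probs)
print(f"{label.upper()}: {confidence:.1%}")
```

Reporting the probability next to the label lets a caller apply its own threshold, for example treating only high-confidence spam predictions as grounds for filtering.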

Why This Matters

With over 45% of SMS messages being spam globally, this tool helps:

Protect users from phishing attempts
Prevent financial scams
Filter malicious links and content
Save time by auto-filtering unwanted messages

Features

Core Capabilities

NLP-Based Classification - Advanced text processing
TF-IDF Vectorization - Smart feature extraction
Deep Learning Model - TensorFlow/Keras neural network
Real-Time Prediction - Instant message analysis
Confidence Scoring - Probability-based results
Interactive Dashboard - Streamlit web interface
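
To make the TF-IDF step more concrete, here is a minimal pure-Python sketch. It uses raw term counts, a smoothed inverse document frequency of ln((1 + n) / (1 + df)) + 1, and L2 normalization, which is how scikit-learn documents the defaults of its `TfidfVectorizer`. The whitespace tokenizer and the sample messages are simplifying assumptions, so the numbers will not match the project's fitted vectorizer.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term count multiplied by the smoothed inverse document frequency.
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors


messages = [  # made-up examples
    "free prize claim your free prize now",
    "are we still meeting for lunch",
    "claim your reward now",
]
vectors = tfidf(messages)
```

Terms that are frequent within one message but rare across the collection, such as "free" in the first message, receive higher weights than terms shared by several messages.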

Advanced Features

Pattern Recognition - Identify common spam patterns
URL Detection - Flag suspicious links
Phone Number Extraction - Identify spam sender patterns
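
One possible way to approach URL detection and phone number extraction is with regular expressions, sketched below. The patterns and the `extract_signals` helper are hypothetical and deliberately simple; they are not the project's implementation and will miss many real-world formats.

```python
import re

# Deliberately simple patterns for illustration; real-world URL and phone
# formats are far more varied than these cover.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def extract_signals(message):
    """Collect URL-like and phone-like substrings from a message."""
    urls = URL_PATTERN.findall(message)
    phones = [p.strip() for p in PHONE_PATTERN.findall(message)]
    return {"urls": urls, "phones": phones}


signals = extract_signals(
    "WIN cash now! Visit http://example.com/win or call +1 555 010 9999"
)
print(signals)
```

Signals like these could be shown alongside the model's prediction or fed in as extra features; which of the two the project does is not stated here.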

Demo

Web Interface

# Launch the Streamlit app
streamlit run app.py

How It Works

Processing Pipeline

┌──────────────┐
│ Input Text   │
│ "FREE PRIZE" │
└──────┬───────┘
       │
       ▼
┌──────────────────┐
│ Text Cleaning    │
│ • Lowercase      │
│ • Remove punct.  │
│ • Tokenization   │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Feature          │
│ Extraction       │
│ • TF-IDF         │
│ • N-grams        │
│ • Word vectors   │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Neural Network   │
│ Classification   │
│ • Dense layers   │
│ • Dropout        │
│ • Softmax output │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Prediction       │
│ SPAM: 98.7%      │
│ HAM:  1.3%       │
└──────────────────┘
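
The first two stages of the pipeline above can be sketched with the standard library alone. The function names are illustrative assumptions, and the project's actual cleaning (which lists NLTK in its stack) may tokenize differently.

```python
import string

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_and_tokenize(text):
    """Lowercase, strip ASCII punctuation, and split on whitespace."""
    return text.lower().translate(PUNCT_TABLE).split()


def ngrams(tokens, n):
    """Return contiguous n-token sequences joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = clean_and_tokenize("FREE PRIZE!!! Claim now.")
bigrams = ngrams(tokens, 2)
print(tokens)
print(bigrams)
```

Bigrams such as "free prize" preserve short phrases that single tokens lose, which is the usual motivation for adding n-grams to TF-IDF features.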

Tech Stack

Frontend: Streamlit
Machine Learning: TensorFlow / Keras
NLP: NLTK, Scikit-learn (TF-IDF Vectorization)
Data Handling: Joblib, NumPy, Pandas

Installation

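# 1. Clone the repository, then create and activate a virtual environment
python -m venv .venv
# Windows:
.\.venv\Scripts\activate
# macOS/Linux:
# source .venv/bin/activate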

# 2. Install dependencies
pip install -r requirements.txt

# 3. Run the app
streamlit run app.py

Model Architecture

Neural Network Structure

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

model = Sequential([
    # Input layer
    Dense(128, activation='relu', input_shape=(5000,)),
    Dropout(0.5),
    
    # Hidden layers
    Dense(64, activation='relu'),
    Dropout(0.4),
    
    Dense(32, activation='relu'),
    Dropout(0.3),
    
    # Output layer
    Dense(2, activation='softmax')  # Binary classification
])
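
As a quick sanity check on the size of this network, the trainable parameter count can be worked out by hand: each Dense layer has inputs × units weights plus one bias per unit, and Dropout layers add none. The helper name below is an assumption for illustration.

```python
def dense_params(inputs, units):
    """Weights plus biases for one fully connected layer."""
    return inputs * units + units


# Input width followed by the width of each Dense layer in the model above.
layer_sizes = [5000, 128, 64, 32, 2]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)
```

The total comes to 650,530 parameters, of which 640,128 sit in the first layer, so the 5000-feature input width dominates the model size.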

Performance

Metric	Score
Accuracy	98.2%
Precision	97.8%
Recall	96.5%
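
For reference, the three metrics above are defined from confusion-matrix counts as shown in this sketch, treating spam as the positive class. The counts used here are made up purely to demonstrate the formulas; they are not the project's evaluation data and do not reproduce the scores in the table.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up counts for illustration only; not the project's evaluation data.
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=895)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Precision answers how many flagged messages were really spam, while recall answers how much of the spam was caught; a filter that rarely blocks legitimate messages needs high precision in particular.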

Contributing

Contributions welcome! Please follow these steps:

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Open a pull request

License

This project is licensed under the Apache License 2.0.

Author

Au Amores - AI/ML Engineer
