⚡ Plagiarism Detector Using String Matching Algorithms

A plagiarism detection system built with Python and Streamlit. It compares documents using the naive matching, KMP, and Rabin-Karp algorithms and reports how similar they are.


🚀 Project Overview

This project detects plagiarism between two text documents by comparing their content with classic string matching algorithms. It calculates a similarity percentage, generates a detailed report, and presents the results in a Streamlit dashboard.


📂 Project Structure

Plagiarism-Detector-Using-String-Matching/

documents/

  • original.txt
  • submitted.txt

outputs/

  • sample outputs

reports/

  • generated plagiarism reports

images/

  • UI screenshots and results

docs/

  • architecture
  • algorithm explanation
  • future enhancements

src/

  • preprocess.py
  • naive_match.py
  • kmp.py
  • rabin_karp.py
  • similarity.py
  • report.py

main.py
app.py
requirements.txt
.gitignore
README.md

⚙️ Features

  • Text preprocessing
  • Naive string matching
  • KMP algorithm
  • Rabin-Karp algorithm
  • Similarity percentage calculation
  • Streamlit neon UI dashboard
  • Plagiarism report generation
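
The first two features can be sketched as follows. This is a minimal illustration, not the code in `src/preprocess.py` or `src/naive_match.py`: the function names `preprocess` and `naive_search`, and the choice to lowercase and strip punctuation, are assumptions made for the example.

```python
import re


def preprocess(text: str) -> list[str]:
    """Lowercase the text and keep only alphanumeric words (assumed cleaning rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_search(text: str, pattern: str) -> list[int]:
    """Return every start index where pattern occurs in text.

    Checks each alignment in turn, so the worst case is O(n * m).
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]


# Normalise a document, then look for a phrase in the cleaned text.
original = " ".join(preprocess("The quick brown fox. The quick red fox!"))
print(naive_search(original, "the quick"))  # [0, 20]
```

Normalising both documents the same way before matching keeps differences in case or punctuation from hiding a copied phrase.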

🧠 DSA Concepts Used

  • Strings
  • Pattern Matching
  • Hashing
  • Sliding Window
  • LPS Array (KMP)
  • Set Operations
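
To show how the LPS array, hashing, and sliding window concepts fit together, here is a hedged sketch of KMP and Rabin-Karp search. The function names, the `base` and `mod` values, and the decision to return all match positions are assumptions for illustration and may differ from `src/kmp.py` and `src/rabin_karp.py`.

```python
def build_lps(pattern: str) -> list[int]:
    """lps[i] = length of the longest proper prefix of pattern[:i+1] that is also its suffix."""
    lps = [0] * len(pattern)
    length = 0  # length of the current matched prefix
    i = 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length:
            length = lps[length - 1]  # fall back to a shorter prefix
        else:
            i += 1  # lps[i] stays 0
    return lps


def kmp_search(text: str, pattern: str) -> list[int]:
    """Find all occurrences in O(n + m) by never re-reading text characters."""
    if not pattern:
        return []
    lps = build_lps(pattern)
    matches, j = [], 0
    for i, ch in enumerate(text):
        while j and ch != pattern[j]:
            j = lps[j - 1]
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = lps[j - 1]  # keep going to allow overlapping matches
    return matches


def rabin_karp_search(text: str, pattern: str,
                      base: int = 256, mod: int = 1_000_000_007) -> list[int]:
    """Slide a window over text, comparing rolling hashes before comparing strings."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)  # weight of the window's leading character
    p_hash = w_hash = 0
    for k in range(m):
        p_hash = (p_hash * base + ord(pattern[k])) % mod
        w_hash = (w_hash * base + ord(text[k])) % mod
    matches = []
    for i in range(n - m + 1):
        # Verify on a hash hit to rule out collisions.
        if w_hash == p_hash and text[i:i + m] == pattern:
            matches.append(i)
        if i < n - m:
            # Drop text[i] from the window and append text[i + m].
            w_hash = ((w_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return matches


print(build_lps("abab"))                     # [0, 0, 1, 2]
print(kmp_search("abababa", "aba"))          # [0, 2, 4]
print(rabin_karp_search("abababa", "aba"))   # [0, 2, 4]
```

KMP uses the LPS array to skip alignments that cannot match, while Rabin-Karp updates the window hash in constant time per shift and only compares characters when the hashes agree.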

▶️ Run Project

CLI version: python main.py

UI version: streamlit run app.py


📊 Output

  • Similarity Score (%)
  • Matched Words
  • Detailed Report
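
One way the similarity score and matched words could be derived with set operations is sketched below. The formula is an assumption (shared unique words divided by unique words in the submitted document), and `similarity_report` is a hypothetical name; the project's `src/similarity.py` may compute the score differently.

```python
def similarity_report(original_words: list[str], submitted_words: list[str]) -> dict:
    """Compare two word lists and summarise their overlap.

    Assumed formula: unique shared words / unique submitted words * 100.
    """
    original_set = set(original_words)
    submitted_set = set(submitted_words)
    matched = sorted(original_set & submitted_set)  # set intersection
    score = 100.0 * len(matched) / len(submitted_set) if submitted_set else 0.0
    return {"score": round(score, 2), "matched_words": matched}


report = similarity_report(
    "the cat sat on the mat".split(),
    "the cat lay on a rug".split(),
)
print(report)  # {'score': 50.0, 'matched_words': ['cat', 'on', 'the']}
```

Dividing by the submitted document's vocabulary answers "how much of the submission appears in the original"; a symmetric measure such as Jaccard similarity would divide by the union of both sets instead.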

🌟 Future Enhancements

  • PDF/DOCX support
  • Sentence-level detection
  • Highlight copied text
  • AI-based semantic similarity
  • Multi-document comparison