A Python and Streamlit-based plagiarism detection system that uses the naive matching, KMP, and Rabin-Karp algorithms to compare documents and measure their similarity.
This project detects plagiarism between two text documents by comparing their content with classic string matching algorithms. It calculates a similarity percentage and generates a detailed report; results can also be viewed in a Streamlit UI dashboard.
Plagiarism-Detector-Using-String-Matching/
documents/
- original.txt
- submitted.txt
outputs/
- sample outputs
reports/
- generated plagiarism reports
images/
- UI screenshots and results
docs/
- architecture
- algorithm explanation
- future enhancements
src/
- preprocess.py
- naive_match.py
- kmp.py
- rabin_karp.py
- similarity.py
- report.py
main.py
app.py
requirements.txt
.gitignore
README.md
Features:
- Text preprocessing
- Naive string matching
- KMP algorithm
- Rabin-Karp algorithm
- Similarity percentage calculation
- Streamlit neon UI dashboard
- Plagiarism report generation
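The first two steps in the list above, text preprocessing and naive string matching, can be sketched as follows. This is a minimal illustration and not the project's actual code: the function names `preprocess` and `naive_search` are hypothetical, and the normalization rules (lowercasing, stripping punctuation) are assumed rather than taken from `src/preprocess.py`.

```python
import re


def preprocess(text: str) -> list[str]:
    """Lowercase the text, drop punctuation, and split it into words (assumed rules)."""
    # Anything that is not a lowercase letter, digit, or whitespace becomes a space.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return cleaned.split()


def naive_search(text: str, pattern: str) -> list[int]:
    """Return every start index where pattern occurs in text (O(n*m) worst case)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    matches = []
    for start in range(n - m + 1):
        # Compare the pattern against the window that begins at `start`.
        if text[start:start + m] == pattern:
            matches.append(start)
    return matches


original = " ".join(preprocess("The quick brown fox jumps over the lazy dog."))
phrase = " ".join(preprocess("Quick, brown fox!"))
print(naive_search(original, phrase))  # prints [4]
```

Normalizing both documents before matching means that differences in case or punctuation alone do not hide a copied phrase.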
Concepts used:
- Strings
- Pattern Matching
- Hashing
- Sliding Window
- LPS Array (KMP)
- Set Operations
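To make the LPS array, hashing, and sliding window ideas concrete, here is a rough sketch of KMP and Rabin-Karp search. The function names and the `base` and `mod` values are assumptions chosen for illustration; the versions in `src/kmp.py` and `src/rabin_karp.py` may differ.

```python
def build_lps(pattern: str) -> list[int]:
    """lps[i] = length of the longest proper prefix of pattern[:i+1] that is also its suffix."""
    lps = [0] * len(pattern)
    length = 0  # length of the prefix matched so far
    i = 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length:
            length = lps[length - 1]  # fall back to a shorter candidate prefix
        else:
            i += 1  # no prefix matches here; lps[i] stays 0
    return lps


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return all match positions without re-reading text characters."""
    if not pattern:
        return []
    lps = build_lps(pattern)
    matches, j = [], 0
    for i, ch in enumerate(text):
        while j and ch != pattern[j]:
            j = lps[j - 1]  # reuse the part of the pattern already matched
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = lps[j - 1]  # keep going to find overlapping matches
    return matches


def rabin_karp_search(text: str, pattern: str,
                      base: int = 256, mod: int = 1_000_000_007) -> list[int]:
    """Return all match positions using a rolling hash over a sliding window."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)  # weight of the window's leading character
    p_hash = w_hash = 0
    for k in range(m):
        p_hash = (p_hash * base + ord(pattern[k])) % mod
        w_hash = (w_hash * base + ord(text[k])) % mod
    matches = []
    for start in range(n - m + 1):
        # Equal hashes can still be a collision, so confirm with a direct comparison.
        if p_hash == w_hash and text[start:start + m] == pattern:
            matches.append(start)
        if start < n - m:
            # Slide the window: remove text[start], append text[start + m].
            w_hash = ((w_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % mod
    return matches


print(build_lps("abab"))                    # prints [0, 0, 1, 2]
print(kmp_search("abababa", "aba"))         # prints [0, 2, 4]
print(rabin_karp_search("abababa", "aba"))  # prints [0, 2, 4]
```

Both functions return the same positions as a naive scan. KMP avoids backtracking in the text by consulting the LPS array, while Rabin-Karp compares one hash per window and only falls back to a character comparison when the hashes agree.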
CLI version: python main.py
UI version: streamlit run app.py
Output:
- Similarity Score (%)
- Matched Words
- Detailed Report
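One plausible way to produce the similarity score and matched words is with set operations on the preprocessed word lists. The formula below (shared unique words divided by the unique words in the submitted document) is an assumption made for this sketch, as is the function name `similarity_report`; the project's `src/similarity.py` may compute the score differently.

```python
def similarity_report(original_words: list[str],
                      submitted_words: list[str]) -> tuple[float, list[str]]:
    """Return (similarity percentage, sorted matched words) under the assumed formula."""
    original_set = set(original_words)
    submitted_set = set(submitted_words)
    matched = original_set & submitted_set  # words present in both documents
    # Assumed definition: share of the submitted document's unique words
    # that also appear in the original.
    score = 100.0 * len(matched) / len(submitted_set) if submitted_set else 0.0
    return round(score, 2), sorted(matched)


score, matched = similarity_report(
    "the cat sat on the mat".split(),
    "the cat lay on the rug".split(),
)
print(f"Similarity Score: {score}%")         # prints Similarity Score: 60.0%
print("Matched Words:", ", ".join(matched))  # prints Matched Words: cat, on, the
```

A word-set score ignores word order and repetition, so it is best read alongside the phrase-level matches found by the string matching algorithms.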
Future enhancements:
- PDF/DOCX support
- Sentence-level detection
- Highlight copied text
- AI-based semantic similarity
- Multi-document comparison