Commit 27ddb4f

Refactor README formatting and content
Updated formatting and removed emojis for consistency.
1 parent 6e7983f commit 27ddb4f

1 file changed

Lines changed: 30 additions & 30 deletions

README.md

@@ -1,29 +1,29 @@
-# 🎬 Personalized Movie Recommendation System using PySpark & Collaborative Filtering
+# Personalized Movie Recommendation System using PySpark & Collaborative Filtering

-> 📌 Project 1 of 6 | Pushed as part of my academic + real-world ML portfolio 🚀
+> Project 1 of 6 | Pushed as part of my academic + real-world ML portfolio

-## 🧠 Overview
+## Overview
 In the ever-growing jungle of streaming content, users often get lost in endless scrolls and mediocre suggestions. Our project dives into solving this problem by building a **personalized movie recommendation system** powered by **collaborative filtering** and **Apache Spark**, capable of processing massive datasets and giving spot-on suggestions based on user behavior.

-## 📈 Key Features
-- 💡 Personalized suggestions based on **user-item interaction**
-- Built with **PySpark** on **Apache Spark** for large-scale performance
-- 🧪 Evaluated using **RMSE**, **precision**, and **recall**
-- 🤝 Scalable, fast, and adaptable to various streaming platforms
-- 🔒 Acknowledges **bias and privacy** issues in recommender systems
+## Key Features
+- Personalized suggestions based on **user-item interaction**
+- Built with **PySpark** on **Apache Spark** for large-scale performance
+- Evaluated using **RMSE**, **precision**, and **recall**
+- Scalable, fast, and adaptable to various streaming platforms
+- Acknowledges **bias and privacy** issues in recommender systems

-## 🛠️ Tech Stack
+## Tech Stack
 - **Language**: Python
 - **Frameworks**: PySpark, Apache Hadoop (HDFS)
 - **Tools**: MLlib, Jupyter, VS Code
 - **Algorithm**: User-based Collaborative Filtering

-## 📂 Dataset
+## Dataset
 - Contains over **8,000+** user interactions and movie ratings
 - Publicly sourced, includes diverse genres, languages, and release years
 - Preprocessing steps include handling nulls, normalization, and outlier removal

-## 📊 Dataset
+## Dataset

 This project uses the ([https://www.kaggle.com/datasets/grouplens/movielens-20m-dataset](https://www.kaggle.com/datasets/arzubesiroglu/netflix-titles)) which contains millions of user-movie interactions, ratings, and metadata.
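
The Tech Stack in the hunk above names user-based collaborative filtering as the algorithm. As a rough illustration of that idea only (this is not the repository's PySpark code), here is a minimal pure-Python sketch. The toy `ratings` dictionary, the helper names, and the choice of cosine similarity are all assumptions made for the example; the README does not say which similarity measure the project uses.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {movie: rating}. Not from the project's dataset.
ratings = {
    "alice": {"Heat": 5.0, "Alien": 4.0, "Up": 1.0},
    "bob":   {"Heat": 4.0, "Alien": 5.0, "Brazil": 4.0},
    "carol": {"Up": 5.0, "Brazil": 2.0, "Heat": 1.0},
}

def cosine_similarity(a, b):
    """Cosine similarity over the movies both users rated (0.0 if none shared)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)

def predict(user, movie, ratings):
    """Similarity-weighted average of other users' ratings for `movie`.

    Returns None when no other user has rated the movie.
    """
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or movie not in their:
            continue
        sim = cosine_similarity(ratings[user], their)
        num += sim * their[movie]
        den += abs(sim)
    return num / den if den else None

def recommend(user, ratings, top_n=3):
    """Rank movies the user has not rated by predicted rating, best first."""
    seen = set(ratings[user])
    candidates = {m for r in ratings.values() for m in r} - seen
    scored = [(m, predict(user, m, ratings)) for m in candidates]
    scored = [(m, s) for m, s in scored if s is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Calling `recommend("alice", ratings)` scores the one movie alice has not rated, weighting bob's rating more heavily than carol's because bob's tastes overlap with alice's more closely. At the scale the README describes, the same pairwise-similarity step would have to be expressed as distributed Spark operations rather than Python loops.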

@@ -35,7 +35,7 @@ To use the full dataset:
 3. Place it in the root directory or update the path in the code accordingly


-## 📊 Results
+## Results
 - Achieved **RMSE = 3.7725** on our baseline implementation
 - Compared with benchmark paper achieving **RMSE = 1.0742**
 - Insights into how **parameter tuning** (lambda, iterations, rank) affects performance
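
For readers unfamiliar with the metric quoted in the Results hunk above, RMSE is the square root of the mean squared difference between predicted and actual ratings. The sketch below is a plain-Python illustration with made-up numbers; the function name and sample values are assumptions and do not reproduce the project's evaluation code or its reported figures.

```python
from math import sqrt

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical predictions and ground-truth ratings, for illustration only.
predicted = [4.0, 3.0, 5.0]
actual = [5.0, 3.0, 3.0]
score = rmse(predicted, actual)  # sqrt((1 + 0 + 4) / 3)
```

Lower is better. The parameter names in the last bullet (lambda, iterations, rank) resemble the regularization, iteration-count, and latent-factor settings of a matrix-factorization model, but the README does not state which model was tuned, so that reading is an assumption.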
@@ -49,29 +49,29 @@ We’ve drawn inspiration and technical strategies from key works including:

 _For the full IEEE-style paper, check the documenation folder in this repo :)

-## 🧠 Authors & Credits
-Built with ❤️ by a team of graduate students as part of our coursework under the guidance of our incredible supervisor (see acknowledgments in paper). Shoutout to all contributors and cited researchers!
+## Authors & Credits
+Built by a team of graduate students as part of our coursework under the guidance of our incredible supervisor (see acknowledgments in paper). Shoutout to all contributors and cited researchers!

-## 📌 Future Work
-- 🧠 Incorporating **hybrid models** (content + collaborative)
-- 🔒 Introducing **privacy-preserving mechanisms**
-- 🎯 Deploying the system on a cloud platform for live inference
+## Future Work
+- Incorporating **hybrid models** (content + collaborative)
+- Introducing **privacy-preserving mechanisms**
+- Deploying the system on a cloud platform for live inference

-## 📎 License
+## License
 feel free to fork, star, and remix with credit!

 ## 📁 Project Structure

-📦 PySparkFlicks_MovieRecommender/
+PySparkFlicks_MovieRecommender/
 ```
-|---🧠 code/ → PySpark code and scripts
-├── 📒 notebooks/ → Jupyter Notebooks for exploration
-├── 📊 data/ → Sample Netflix dataset
-├── 📄 documentation/ → IEEE paper, diagrams, and references
-├── ⚙️ .github/workflows/ → CI/CD workflows (Python)
-├── 📦 requirements.txt → Python dependencies
-├── 🛠️ setup.py → Installable package setup (optional)
-├── 📘 README.md → This very file
-└── 🧾 LICENSE → Open-source license
+|--- code/ → PySpark code and scripts
+├── notebooks/ → Jupyter Notebooks for exploration
+├── data/ → Sample Netflix dataset
+├── documentation/ → IEEE paper, diagrams, and references
+├── .github/workflows/ → CI/CD workflows (Python)
+├── requirements.txt → Python dependencies
+├── setup.py → Installable package setup (optional)
+├── README.md → This very file
+└── LICENSE → Open-source license
 ```
