-# 🎬 Personalized Movie Recommendation System using PySpark & Collaborative Filtering
+# Personalized Movie Recommendation System using PySpark & Collaborative Filtering

-> 📌 Project 1 of 6 | Pushed as part of my academic + real-world ML portfolio 🚀
+> Project 1 of 6 | Pushed as part of my academic + real-world ML portfolio

-## 🧠 Overview
+## Overview
 In the ever-growing jungle of streaming content, users often get lost in endless scrolls and mediocre suggestions. Our project dives into solving this problem by building a **personalized movie recommendation system** powered by **collaborative filtering** and **Apache Spark**, capable of processing massive datasets and giving spot-on suggestions based on user behavior.

-## 📈 Key Features
--💡 Personalized suggestions based on **user-item interaction**
--⚡ Built with **PySpark** on **Apache Spark** for large-scale performance
--🧪 Evaluated using **RMSE**, **precision**, and **recall**
--🤝 Scalable, fast, and adaptable to various streaming platforms
--🔒 Acknowledges **bias and privacy** issues in recommender systems
+## Key Features
+- Personalized suggestions based on **user-item interaction**
+- Built with **PySpark** on **Apache Spark** for large-scale performance
+- Evaluated using **RMSE**, **precision**, and **recall**
+- Scalable, fast, and adaptable to various streaming platforms
+- Acknowledges **bias and privacy** issues in recommender systems
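
The feature list names precision and recall as evaluation metrics alongside RMSE. As a rough illustration of the two ranking metrics only, here is a minimal pure-Python sketch of precision and recall at a cutoff `k` for a single user. The function name `precision_recall_at_k`, the cutoff, and the sample movie IDs are assumptions made for this example and are not taken from the project's code.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Return (precision@k, recall@k) for one user.

    recommended: ranked list of item IDs, best first.
    relevant: set of item IDs the user actually liked (hypothetical ground truth).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    # Count how many of the top-k recommendations are truly relevant.
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    # Recall is undefined with no relevant items; report 0.0 in that case.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: five ranked recommendations and three relevant movies.
recommended = [10, 42, 7, 99, 3]
relevant = {42, 3, 55}
p, r = precision_recall_at_k(recommended, relevant, k=5)
print(f"precision@5={p:.2f} recall@5={r:.2f}")
```

In a full evaluation these per-user values would typically be averaged over all users in the test split.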
 - Contains over **8,000** user interactions and movie ratings
 - Publicly sourced, includes diverse genres, languages, and release years
 - Preprocessing steps include handling nulls, normalization, and outlier removal

-## 📊 Dataset
+## Dataset

 This project uses the dataset at [https://www.kaggle.com/datasets/grouplens/movielens-20m-dataset](https://www.kaggle.com/datasets/arzubesiroglu/netflix-titles), which contains millions of user-movie interactions, ratings, and metadata.

@@ -35,7 +35,7 @@ To use the full dataset:
 3. Place it in the root directory or update the path in the code accordingly


-## 📊 Results
+## Results
 - Achieved **RMSE = 3.7725** on our baseline implementation
 - Compared with a benchmark paper achieving **RMSE = 1.0742**
 - Insights into how **parameter tuning** (lambda, iterations, rank) affects performance
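
For readers unfamiliar with the metric behind the RMSE figures above, the following is a minimal sketch of how root mean squared error is computed from predicted and actual ratings. The helper name `rmse` and the sample ratings are hypothetical and chosen for illustration; the project's own evaluation code may differ.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each prediction error, average them, then take the square root.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical ratings on a 1-5 scale.
actual = [4.0, 3.0, 5.0, 1.0]
predicted = [3.0, 3.0, 4.0, 3.0]
score = rmse(predicted, actual)
print(f"RMSE = {score:.4f}")
```

A lower value means predictions sit closer to the true ratings. If the model is trained with Spark's `pyspark.ml.recommendation.ALS` (an assumption, since the README does not name the class), the tuned lambda, iterations, and rank would correspond to its `regParam`, `maxIter`, and `rank` parameters.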
@@ -49,29 +49,29 @@ We’ve drawn inspiration and technical strategies from key works including:

 _For the full IEEE-style paper, check the documentation folder in this repo :)_

-## 🧠 Authors & Credits
-Built with ❤️ by a team of graduate students as part of our coursework under the guidance of our incredible supervisor (see acknowledgments in paper). Shoutout to all contributors and cited researchers!
+## Authors & Credits
+Built by a team of graduate students as part of our coursework under the guidance of our incredible supervisor (see acknowledgments in paper). Shoutout to all contributors and cited researchers!