HIVE_Big_Data

MovieLens Genre Analytics Using Hive on Hadoop

Objective

Analyze the MovieLens dataset with Apache Hive on Hadoop, running in the Cloudera QuickStart VM, to identify the most popular genres. The results are meant to inform content and recommendation decisions for streaming platforms such as Netflix or Prime Video.

Tech Stack

Apache Hive
Hadoop HDFS
Cloudera QuickStart VM
Linux Shell
Excel / Matplotlib for visualization

Key Features

Genre-wise popularity using Hive explode and split
Data stored and queried in HDFS
Business insights for recommendation engines
Visualization charts
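
As a rough illustration of the explode and split feature, the sketch below reproduces the same logic in plain Python on a three-row sample in the movies.csv layout. It assumes that "popularity" means the number of movies per genre and that genres are pipe-delimited, as in the standard MovieLens files; the project's own queries may measure popularity differently. In Hive, the equivalent would look something like `SELECT genre, COUNT(*) FROM movies LATERAL VIEW explode(split(genres, '\\|')) g AS genre GROUP BY genre`, where the table name `movies` and the alias `g` are assumptions.

```python
import csv
import io
from collections import Counter

# Tiny in-memory sample in the MovieLens movies.csv layout (movieId,title,genres).
# Illustrative input only, not output from this project.
SAMPLE = """movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
"""


def genre_counts(csv_text):
    """Count how many movies carry each genre label."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Like Hive's split(): turn the pipe-delimited string into a list.
        # Like Hive's explode(): handle each list element as its own row.
        for genre in row["genres"].split("|"):
            counts[genre] += 1
    return counts


counts = genre_counts(SAMPLE)
for genre, n in counts.most_common():
    print(genre, n)
```

A movie with several genres contributes one count to each of them, which is exactly what exploding the genre array before grouping does.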

Project Structure

datasets/: Input CSV files
Hive_Queries/: All Hive scripts
visualizations/: Charts generated from query output, plus Hive CLI output captured as proof of execution

Report

See Movie analytics.docx for the full write-up.

How to Run

Set up the Cloudera QuickStart VM
Load movies.csv into HDFS
Create an external Hive table over the HDFS location
Run the queries from Hive_Queries/
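
The last three steps above can be sketched as a small Python helper that assembles the shell commands to run inside the VM. Every path, the table name, and the query file name below are hypothetical placeholders, not taken from this repository, and the table definition is only one plausible choice: it assumes the three-column MovieLens layout (movieId, title, genres) and uses Hive's OpenCSVSerde so that quoted titles containing commas parse correctly. The script only prints the commands; it does not execute them.

```python
import shlex

# Hypothetical placeholders: none of these paths or names come from the repository.
LOCAL_CSV = "datasets/movies.csv"
HDFS_DIR = "/user/cloudera/movielens/movies"
TABLE = "movies"
QUERY_FILE = "Hive_Queries/genre_popularity.hql"


def create_table_ddl(table, hdfs_dir):
    """Return HiveQL for an external table over the CSV directory in HDFS."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} "
        "(movieId STRING, title STRING, genres STRING) "
        # OpenCSVSerde handles quoted fields; it reads every column as a string.
        "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' "
        "STORED AS TEXTFILE "
        f"LOCATION '{hdfs_dir}' "
        # Skip the header row of movies.csv.
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


def build_commands(local_csv, hdfs_dir, table, query_file):
    """Return the commands as argument lists, in the order they should run."""
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],           # create the target directory
        ["hdfs", "dfs", "-put", "-f", local_csv, hdfs_dir],  # load the CSV into HDFS
        ["hive", "-e", create_table_ddl(table, hdfs_dir)],   # create the external table
        ["hive", "-f", query_file],                          # run a query script
    ]


commands = build_commands(LOCAL_CSV, HDFS_DIR, TABLE, QUERY_FILE)
for cmd in commands:
    # Quote each argument so the printed line can be pasted into a shell.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Because the table is external, dropping it in Hive leaves movies.csv in HDFS untouched. To run the commands instead of printing them, each list could be passed to `subprocess.run(cmd, check=True)` from a terminal inside the VM.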
