TutorTask86_Spring2025_Real_Time_Bitcoin_Price_Analysis_Using_Amazon_EMR_2#252

Merged
tkpratardan merged 14 commits into master from TutorTask86_Spring2025_Real_Time_Bitcoin_Price_Analysis_Using_Amazon_EMR_2 on Jun 12, 2025

@RithikaBaskaran
Collaborator

This PR includes the main scripts and notebooks for the project titled "Real-Time Bitcoin Price Analysis Using Amazon EMR", under TutorTask86_Spring2025.

Files added:

  • bitcoin_ingest.API.py: Pulls real-time Bitcoin price data from an API
  • bitcoin_processing.example.py: Handles transformation and aggregation of raw price data
  • bitcoin_streaming.example.py: Implements real-time processing using PySpark
  • bitcoin_timeseries.example.py: Performs basic time-series analytics
  • spark_test.py: A lightweight Spark job to validate setup
  • template.API.ipynb: Exploratory notebook with annotated logic and tests

Path:
DATA605/Spring2025/projects/TutorTask86_Spring2025_Real_Time_Bitcoin_Price_Analysis_Using_Amazon_EMR/

The README.md file will be added in a follow-up commit.
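As a rough illustration of the "transformation and aggregation" step listed above, the sketch below averages prices over fixed time windows in plain Python. It is not taken from bitcoin_processing.example.py, and the scripts in this PR use PySpark rather than plain Python. The record fields `timestamp` and `price_usd` and the function name `average_by_window` are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def average_by_window(records, window_seconds=60):
    """Average the price within fixed, non-overlapping time windows.

    `records` is an iterable of dicts with hypothetical keys
    `timestamp` (Unix seconds) and `price_usd`.
    """
    buckets = defaultdict(list)
    for rec in records:
        # Floor each timestamp to the start of its window.
        window_start = rec["timestamp"] - rec["timestamp"] % window_seconds
        buckets[window_start].append(rec["price_usd"])
    # Return the windows in chronological order.
    return {start: mean(prices) for start, prices in sorted(buckets.items())}
```

A Spark version of the same idea would typically group on a time window instead of bucketing by hand; the plain-Python form is used here only to keep the example dependency-free.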

Collaborator

@Prahar08modi left a comment

I would like you to make the following changes:

  • Move all your utility/wrapper functions (e.g., fetch_bitcoin_price, save_record) into a utility file (e.g., emr_utils.py).
  • Convert all your .py files into two Jupyter notebooks (e.g., bitcoin_emr.API.ipynb and bitcoin_emr.example.ipynb).
  • The notebooks should only call the functions defined in emr_utils.py.
  • In addition, add the Markdown files for your project.

Note: Please go through the required files for submission here
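To make the requested layout concrete, here is a minimal sketch of what a utility module along these lines could look like. The function names `fetch_bitcoin_price` and `save_record` come from the review comment; everything else is an assumption for illustration, including the use of CoinGecko's public `/simple/price` endpoint, the record fields, the JSON Lines output format, and the hypothetical helper `parse_price`.

```python
import json
import time
import urllib.request
from pathlib import Path

# Assumed endpoint: CoinGecko's public simple-price API for BTC in USD.
COINGECKO_URL = (
    "https://api.coingecko.com/api/v3/simple/price"
    "?ids=bitcoin&vs_currencies=usd"
)


def parse_price(payload):
    """Flatten a payload shaped like {"bitcoin": {"usd": ...}} into a record."""
    return {
        "timestamp": int(time.time()),
        "price_usd": float(payload["bitcoin"]["usd"]),
    }


def fetch_bitcoin_price(url=COINGECKO_URL, timeout=10.0):
    """Fetch the current price over HTTP and return a flat record."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return parse_price(payload)


def save_record(record, path):
    """Append one record to a JSON Lines file, creating parent folders."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

With a module like this, a notebook cell would reduce to calls such as `save_record(fetch_bitcoin_price(), "data/prices.jsonl")`, which is the separation the review asks for.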

@RithikaBaskaran
Collaborator Author

RithikaBaskaran commented May 16, 2025 via email

@RithikaBaskaran
Collaborator Author

This PR includes the final working version of my project: Real-Time Bitcoin Price Analysis Using Amazon EMR.

Key Updates:

  • Real-time pipeline using CoinGecko API → S3 → Spark on EMR
  • Producer and consumer scripts refactored and tested
  • All reusable logic moved to bitcoin_emr_utils.py
  • Updated notebooks, markdowns, and final README.md
  • Removed older/placeholder files from early commits

📁 Final Files:

  • bitcoin_kafka/bitcoin_producer.py
  • bitcoin_kafka/bitcoin_streaming_consumer_emr_debug.py
  • bitcoin_kafka/bitcoin_emr_utils.py
  • bitcoin_emr.API.ipynb, bitcoin_emr.example.ipynb
  • bitcoin_emr.API.md, bitcoin_emr.example.md
  • README.md

@RithikaBaskaran
Collaborator Author

Final Submission: Real-Time Bitcoin Price Analysis Using Amazon EMR

This PR includes the final, tested version of my project including:

  • Real-time pipeline from CoinGecko API → S3 → Spark on EMR
  • Updated notebooks:
    • bitcoin_emr.API.ipynb (utility functions with fallback)
    • bitcoin_emr.example.ipynb (simulated full pipeline)
  • Final README.md with:
    • Docker setup
    • AWS credential handling
    • EMR cluster instructions
  • Docker setup files:
    • Dockerfile, run_jupyter.sh, docker_build.sh, docker_bash.sh

All code runs end-to-end in Docker. AWS interactions fall back gracefully when credentials are missing.
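The PR text does not show how the credential fallback is implemented, so the following is only a sketch of one way such behavior could work: try the S3 upload, and write the record to a local folder if credentials or boto3 are unavailable. The names `CredentialsUnavailable`, `make_s3_uploader`, `store_record`, and the `fallback_data` directory are hypothetical and are not taken from bitcoin_emr_utils.py.

```python
import json
from pathlib import Path


class CredentialsUnavailable(Exception):
    """Hypothetical error raised when an upload cannot be authenticated."""


def make_s3_uploader(bucket):
    """Build an uploader backed by boto3 (imported lazily, so it is optional)."""
    def upload(key, body):
        try:
            import boto3
            from botocore.exceptions import NoCredentialsError
        except ImportError as exc:
            raise CredentialsUnavailable("boto3 is not installed") from exc
        try:
            boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body)
        except NoCredentialsError as exc:
            raise CredentialsUnavailable(str(exc)) from exc
    return upload


def store_record(record, key, uploader, local_dir="fallback_data"):
    """Upload a record, or save it locally if credentials are missing.

    Returns "s3" or "local" to indicate where the record ended up.
    """
    body = json.dumps(record).encode("utf-8")
    try:
        uploader(key, body)
        return "s3"
    except CredentialsUnavailable:
        # Assumed fallback: mirror the S3 key as a path under local_dir.
        path = Path(local_dir) / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(body)
        return "local"
```

Passing the uploader in as a parameter keeps the fallback logic testable without AWS access: a notebook could call `store_record(rec, "raw/price.json", make_s3_uploader("my-bucket"))`, where the bucket name is a placeholder, while a test can substitute a stub that raises `CredentialsUnavailable`.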

Ready for grading.

@tkpratardan merged commit a45f2a9 into master on Jun 12, 2025
1 check passed
@tkpratardan deleted the TutorTask86_Spring2025_Real_Time_Bitcoin_Price_Analysis_Using_Amazon_EMR_2 branch on June 12, 2025 06:23

3 participants