
Uttam-38/Scalable-Data-Systems


Scalable Data Systems

This repository showcases end-to-end data engineering projects focused on building scalable, high-performance data systems using relational databases, distributed processing frameworks, spatial analytics, and NoSQL storage engines.

Tech Stack

  • PostgreSQL
  • Apache Spark & SparkSQL
  • Scala
  • SQL
  • Docker
  • RocksDB
  • C++

Projects Included

  1. Relational Database Design & Query Optimization
  2. Distributed Spatial Queries using SparkSQL
  3. Spatio-Temporal Hotspot Analysis using Apache Spark
  4. NoSQL Key-Value Store Implementation using RocksDB

Key Concepts

  • Large-scale data ingestion
  • Distributed query processing
  • Spatial and spatio-temporal analytics
  • Storage engine internals (LSM trees)
  • Performance-aware system design

Note: Datasets are not included due to size and licensing constraints.

