faysalalmahmud/python-postgres-etl-pipeline

Python to PostgreSQL ETL Pipeline

This project demonstrates a simple ETL (Extract, Transform, Load) pipeline using Python and PostgreSQL.

Files Included

  • task1_d.json: A raw data file that is not valid JSON despite its extension; its contents use Ruby hash syntax.
  • db_connect.py: Handles the connection to the PostgreSQL database.
  • ingest_data.py: A Python script that reads the raw file, uses regular expressions to rewrite its contents as valid JSON, and loads the result into the raw_books table.
  • transformation.sql: A pure SQL script that standardizes currency symbols and generates a book_summary table.
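
The cleaning step in ingest_data.py can be pictured with a minimal sketch. The actual patterns used by the script are not shown here, so the function name ruby_hash_to_json, the sample record, and the assumption that the file uses symbol keys with hash rockets (:key => value) are all hypothetical. The substitutions are also naive: they would rewrite matching text inside string values too, which is acceptable only if the data does not contain such text.

```python
import json
import re


def ruby_hash_to_json(text: str) -> str:
    """Rewrite simple Ruby hash syntax as JSON text (hypothetical helper)."""
    # Symbol keys with a hash rocket (:title =>) become quoted JSON keys ("title":).
    text = re.sub(r':(\w+)\s*=>', r'"\1":', text)
    # Any remaining hash rockets ("title" => value) become plain colons.
    text = re.sub(r'\s*=>\s*', ': ', text)
    # Ruby's nil corresponds to JSON null.
    text = re.sub(r'\bnil\b', 'null', text)
    return text


# Made-up sample record, not taken from task1_d.json.
sample = '[{:id=>1, :title=>"Dune", :price=>"$9.99", :year=>nil}]'
records = json.loads(ruby_hash_to_json(sample))
```

Parsing the result with json.loads right after the substitutions is a cheap validity check: if the cleaned text is still not JSON, the script fails before anything is written to raw_books.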

How to Run

  1. Ensure PostgreSQL is installed and running.
  2. Update the database credentials in db_connect.py.
  3. Run python ingest_data.py to populate the database.
  4. Execute transformation.sql in pgAdmin or via psql to generate the summary table.
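
As an alternative to editing credentials directly in db_connect.py (step 2), the connection settings could be read from environment variables. The sketch below is an assumption, not the repository's code: the function names are hypothetical, the psycopg2 driver is assumed, and the variable names follow the standard libpq convention (PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD).

```python
import os


def build_connection_params(env=None):
    """Collect connection settings from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        "host": env.get("PGHOST", "localhost"),
        "port": int(env.get("PGPORT", "5432")),
        "dbname": env.get("PGDATABASE", "postgres"),
        "user": env.get("PGUSER", "postgres"),
        "password": env.get("PGPASSWORD", ""),
    }


def get_connection(env=None):
    """Open a connection; assumes the third-party psycopg2 package is installed."""
    import psycopg2  # imported lazily so the helper above works without the driver

    return psycopg2.connect(**build_connection_params(env))


# Example with an explicit mapping instead of the real environment.
params = build_connection_params({"PGHOST": "db.local", "PGPORT": "5433"})
```

Keeping the password out of the source file means db_connect.py can be committed without exposing credentials.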
