millibar

An end-to-end data engineering pipeline for meteorological data. It starts with the ~700 automatic weather stations in Brazil's national meteorological network (INMET), with records running from 2000 to the present day.

The project is designed to grow over time. The goal is to gradually expand coverage to other countries and data sources, building towards a broader multi-national meteorological platform.

Work in progress. Architecture and implementation are being developed iteratively and documented as the project evolves.

Motivation

Weather data is messy, large, and spatially structured. INMET's historical archive spans over two decades of hourly readings across all Brazilian states. The dataset is large enough to make distributed processing a natural fit, and rich enough to support meaningful spatiotemporal feature engineering and forecasting.

Data Source

Data is sourced from INMET (Instituto Nacional de Meteorologia), Brazil's national meteorological institute. The network currently comprises ~700 automatic weather stations with hourly granularity. Coverage was significantly smaller in 2000 and has grown steadily over the years.
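To give a sense of scale, the figures above can be turned into a rough upper bound on the number of hourly readings. This is only a back-of-the-envelope sketch: the function name `max_hourly_rows` is hypothetical, the cut-off date is an arbitrary assumption, and because the network was much smaller in 2000 the real row count will be lower than this ceiling.

```python
from datetime import date


def max_hourly_rows(stations: int, start_year: int, end: date) -> int:
    """Upper bound on hourly readings, assuming every station
    reported every hour from 1 January of start_year until `end`."""
    days = (end - date(start_year, 1, 1)).days
    return stations * days * 24


# Assumed inputs: 700 stations (the approximate current count) and an
# arbitrary cut-off of 1 January 2025, chosen only to keep the result fixed.
rows = max_hourly_rows(700, 2000, date(2025, 1, 1))
print(f"{rows:,}")  # 153,417,600
```

Even as a loose ceiling, a figure in the hundreds of millions of rows is consistent with treating the archive as a distributed-processing workload.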

Raw data files are not included in this repository. To download them, run:

python ingest/historical_data.py

Stack

| Layer | Tools |
|---|---|
| Language | Python 3.14+ |
| Package manager | uv |
| Processing | PySpark |
| Runtime | Java 21 |
| IDE | PyCharm |
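Before setting up the project, it may help to confirm that the tools listed above are available locally. The following is a minimal sketch, not part of the repository: the helper names `meets_minimum` and `check_environment` are hypothetical, and the check only confirms that `java` and `uv` are on the PATH. It does not verify that the Java runtime is version 21.

```python
import shutil
import sys


def meets_minimum(version: tuple[int, ...], minimum: tuple[int, ...]) -> bool:
    """Compare the leading components of `version` against `minimum`."""
    return tuple(version[: len(minimum)]) >= minimum


def check_environment() -> dict[str, bool]:
    """Report which prerequisites appear to be present on this machine."""
    return {
        "python>=3.14": meets_minimum(tuple(sys.version_info[:2]), (3, 14)),
        "java on PATH": shutil.which("java") is not None,
        "uv on PATH": shutil.which("uv") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{'ok' if ok else 'MISSING':<8}{name}")
```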

Project Status

| Phase | Status |
|---|---|
| Exploratory data analysis | 🔄 In progress |
| Ingestion layer | 🔜 Planned |
| Processing layer | 🔜 Planned |
| Storage layer | 🔜 Planned |
| Model training | 🔜 Planned |
| Serving layer | 🔜 Planned |
| Spatial interpolation (Phase 2) | 🔜 Planned |
| Multi-country expansion (Phase 3) | 🔜 Planned |

Getting Started

git clone https://github.com/DanielTrivelli/millibar.git
cd millibar
uv sync

License

This project is released under the MIT License. See the LICENSE file for details.
