Feature/klimadashboard #27
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -87,7 +87,7 @@ Find the scraper you created in the `ddj_cloud/scrapers` folder and open the `.p | |
| You can run the following command to test your scraper: | ||
|
|
||
| uv run manage test <scraper_name> | ||
|
|
||
|
Member commented: ?
||
| where `<scraper_name>` is the Python module name of your scraper. | ||
|
|
||
| If a local `.env` file exists in the repository root, `manage test` will load it automatically before importing the scraper. | ||
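The `.env` behaviour described in this README line could look roughly like the following. This is a minimal standard-library sketch; the helper name `load_dotenv_file` and its parsing rules are assumptions for illustration, not the actual `manage test` implementation (which may well rely on a library instead).

```python
import os
from pathlib import Path


def load_dotenv_file(path: Path, override: bool = False) -> dict[str, str]:
    """Hypothetical helper: copy KEY=VALUE lines from a .env file into os.environ."""
    loaded: dict[str, str] = {}
    if not path.exists():
        return loaded  # no .env present: nothing to load
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop optional surrounding quotes
        # Variables already set in the environment win unless override is requested
        if override or key not in os.environ:
            os.environ[key] = value
            loaded[key] = value
    return loaded
```

Calling such a helper before the scraper module is imported matters because module-level code in a scraper may already read environment variables at import time.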
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,13 @@ | ||||||||||||||||||||
| { | ||||||||||||||||||||
| "permissions": { | ||||||||||||||||||||
| "allow": [ | ||||||||||||||||||||
| "WebFetch(domain:github.com)", | ||||||||||||||||||||
| "WebFetch(domain:raw.githubusercontent.com)", | ||||||||||||||||||||
| "Read(//Users/janeggers/Code/wdr-ddj-cloud/ddj_cloud/scrapers/talsperren/**)", | ||||||||||||||||||||
| "WebFetch(domain:open-mastr.readthedocs.io)", | ||||||||||||||||||||
| "WebFetch(domain:api.github.com)", | ||||||||||||||||||||
| "Read(//Users/janeggers/miniconda3/lib/python3.12/site-packages/open_mastr/**)", | ||||||||||||||||||||
| "Read(//Users/janeggers/Code/wdr-ddj-cloud/**)" | ||||||||||||||||||||
|
Comment on lines +6 to +10
|
||||||||||||||||||||
| "Read(//Users/janeggers/Code/wdr-ddj-cloud/ddj_cloud/scrapers/talsperren/**)", | |
| "WebFetch(domain:open-mastr.readthedocs.io)", | |
| "WebFetch(domain:api.github.com)", | |
| "Read(//Users/janeggers/miniconda3/lib/python3.12/site-packages/open_mastr/**)", | |
| "Read(//Users/janeggers/Code/wdr-ddj-cloud/**)" | |
| "Read(ddj_cloud/scrapers/talsperren/**)", | |
| "WebFetch(domain:open-mastr.readthedocs.io)", | |
| "WebFetch(domain:api.github.com)", | |
| "Read(ddj_cloud/**)" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # Changes here will be overwritten by Copier; NEVER EDIT MANUALLY | ||
| _src_path: /Users/janeggers/Code/wdr-ddj-cloud/scraper_template | ||
|
jh0ker marked this conversation as resolved.
|
||
| contact_email: jan.eggers@fm.wdr.de | ||
| contact_name: Jan Eggers | ||
| description: 'Automation für Quarks.de: Ausbau von Wind- und Solarenergie, Energiemix | ||
| in D und mehr | ||
|
|
||
| ' | ||
| display_name: klimadashboard | ||
| ephemeral_storage: '512' | ||
| interval: daily | ||
| memory_size: '1024' | ||
| preset: pandas | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| # MaStR databases | ||
| *.db |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,52 @@ | ||||||
| # Technology stack | ||||||
|
|
||||||
| - Python 3.11 | ||||||
| - uv | ||||||
| - Datawrapper (Charts) | ||||||
| - SQLite database (MaStR data) | ||||||
| - Fraunhofer Energy Charts API (energy mix) | ||||||
| - MaStR SOAP API (wind power expansion) | ||||||
|
||||||
| - MaStR SOAP API (wind power expansion) | |
| - open-mastr (bulk download for wind power expansion) |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,112 @@ | ||||||
| # klimadashboard | ||||||
|
|
||||||
| **Contact:** Jan Eggers (jan.eggers@fm.wdr.de) | ||||||
|
|
||||||
| Automation for Quarks.de: expansion of wind and solar energy, energy mix in Germany, and more | ||||||
|
|
||||||
| ## Architecture | ||||||
|
|
||||||
| ``` | ||||||
| klimadashboard.py (Orchestrator) | ||||||
| │ | ||||||
| ├── msr_scraper.py → all energy types from MaStR (isolated venv via uv run) | ||||||
| ├── msr_wind_processor.py → compute daily wind data | ||||||
| ├── msr_solar_processor.py → compute daily solar data | ||||||
| ├── msr_dw_display.py → update Datawrapper charts | ||||||
| ├── S3: upload mastr.db + CSVs | ||||||
| └── energiemix.py → Fraunhofer data + DW charts | ||||||
| ``` | ||||||
|
|
||||||
|
|
||||||
| ## MaStR scraper; analysis of wind and solar energy | ||||||
|
|
||||||
| Expansion status of wind and solar energy: How is it progressing? What has to happen to reach the EEG targets? | ||||||
|
|
||||||
| Originally a Python port of the PHP scripts `msr_php/wka_daily.php` and `msr_php/wka_to_data.php`, now based on the [open-mastr](https://github.com/OpenEnergyPlatform/open-mastr) library from the [Reiner Lemoine Institut](https://wam.rl-institut.de/#showcase). The maintainers there are Jonathan Amme and Ludwig Hülk, who develop it more or less on the side and are open to props and cooperation. | ||||||
|
|
||||||
|
|
||||||
| ### 1. Scraper (`src/msr_scraper.py`) | ||||||
|
|
||||||
| Loads all energy types (wind, solar, biomass, hydro, nuclear, combustion, geothermal/mine gas, storage) | ||||||
| via the open-mastr bulk download and stores them in `mastr.db`. | ||||||
|
|
||||||
| **No API key needed:** uses the public bulk data of the MaStR. | ||||||
|
|
||||||
| **Isolated venv:** The scraper uses PEP 723 inline script metadata and is run via `uv run` | ||||||
| in its own virtual environment (open-mastr requires pandas>=2.2, | ||||||
| the main project uses pandas~=1.5). | ||||||
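To illustrate the isolated-venv setup, here is a hedged sketch of what a PEP 723 header and the corresponding `uv run` invocation might look like. The dependency list is inferred from the README text only, and the helper names `build_uv_command` and `run_isolated` are hypothetical; the real `msr_scraper.py` may be wired up differently.

```python
import subprocess
from pathlib import Path

# Example PEP 723 inline metadata block, as it could appear at the top of a
# script such as msr_scraper.py (the dependency pins are assumptions):
PEP723_HEADER = '''\
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "open-mastr",
#     "pandas>=2.2",
# ]
# ///
'''


def build_uv_command(script: Path, *args: str) -> list[str]:
    """Hypothetical helper: command that runs `script` in its own uv-managed environment."""
    return ["uv", "run", "--script", str(script), *args]


def run_isolated(script: Path, *args: str) -> int:
    """Run the script via uv and return its exit code (requires uv on PATH)."""
    completed = subprocess.run(build_uv_command(script, *args), check=False)
    return completed.returncode
```

Running the scraper as a subprocess rather than importing it is what keeps its pandas>=2.2 requirement from colliding with the main project's pandas~=1.5.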
|
|
||||||
|
Comment on lines +35 to +38
|
||||||
| **Caching:** If `mastr.db` already contains data from today (`DatumDownload`), the download is skipped. | ||||||
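The caching rule could be checked along these lines. This is a sketch under assumptions: the table name `wind_extended` and the text format of `DatumDownload` are guesses, and `has_todays_download` is a hypothetical helper rather than the function used in the PR.

```python
import sqlite3
from contextlib import closing
from datetime import date
from pathlib import Path
from typing import Optional


def has_todays_download(db_path: Path, table: str = "wind_extended",
                        today: Optional[date] = None) -> bool:
    """Return True if `table` holds at least one row whose DatumDownload is today."""
    if not db_path.exists():
        return False  # no database yet: a download is needed
    today = today or date.today()
    try:
        with closing(sqlite3.connect(db_path)) as conn:
            row = conn.execute(
                f'SELECT MAX(DatumDownload) FROM "{table}"'
            ).fetchone()
    except sqlite3.Error:
        return False  # table or column missing, or file unreadable: treat as no cache
    latest = row[0] if row else None
    # Compare only the date part so that plain dates and timestamps both match
    return latest is not None and str(latest)[:10] == today.isoformat()
```

A caller would skip the bulk download only when this returns `True`, and fall back to a fresh download in every uncertain case.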
|
|
||||||
| ### 2. Wind processor (`src/msr_wind_processor.py`) | ||||||
|
|
||||||
| Computes daily expansion data (2010-2030) for onshore and offshore wind: | ||||||
| - Cumulative installed capacity (GW) | ||||||
| - Daily additions/removals (MW) | ||||||
| - Planned future installations | ||||||
| - Daily expansion required to meet the 2030 climate targets | ||||||
| - Monthly and yearly summaries | ||||||
|
|
||||||
| **Climate targets 2030:** | ||||||
| - Onshore: 115 GW (Wind-an-Land-Gesetz, since 1 February 2023) | ||||||
| - Offshore: 30 GW (Wind-auf-See-Gesetz, since 1 January 2023) | ||||||
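One plausible reading of the required daily expansion is a simple linear gap calculation: remaining capacity divided by remaining days. The sketch below assumes exactly that, with a deadline of 31 December 2030; the processor in this PR may use a different deadline or method, and `required_daily_mw` is a hypothetical name.

```python
from datetime import date

# 2030 targets in GW as listed in the README
TARGETS_GW = {"onshore": 115.0, "offshore": 30.0}


def required_daily_mw(installed_gw: float, target_gw: float, today: date,
                      deadline: date = date(2030, 12, 31)) -> float:
    """Average MW per day still needed to reach the target by the deadline."""
    days_left = (deadline - today).days
    if days_left <= 0:
        raise ValueError("deadline has passed")
    # A target that is already met requires no further expansion
    gap_gw = max(target_gw - installed_gw, 0.0)
    return gap_gw * 1000.0 / days_left  # convert GW to MW
```

For a hypothetical installed onshore capacity of 65 GW ten days before the deadline, `required_daily_mw(65.0, TARGETS_GW["onshore"], date(2030, 12, 21))` gives 5000.0 MW per day.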
|
|
||||||
| ### 3. Solar processor (`src/msr_solar_processor.py`) | ||||||
|
|
||||||
| Computes daily expansion data (2010-2030) for solar energy: | ||||||
| - Cumulative installed capacity (GW) | ||||||
| - Daily additions/removals (MW) | ||||||
| - Planned future installations | ||||||
| - Daily expansion required to meet the 2030 climate target | ||||||
| - Monthly and yearly summaries | ||||||
|
|
||||||
| **Climate target 2030:** 215 GW (EEG 2023) | ||||||
|
|
||||||
| ### 4. Datawrapper display (`src/msr_dw_display.py`) | ||||||
|
|
||||||
| Uploads processed data to Datawrapper charts: | ||||||
| - **Wind expansion** (`EgOti`): total capacity onshore/offshore | ||||||
| - **Solar expansion** (`1rxLQ`): total solar capacity | ||||||
| - **Wind additions** (`7yMTK`): additions per month/year | ||||||
| - **Solar additions** (`kPzGf`): additions per month/year | ||||||
|
|
||||||
| Switchable between monthly and yearly aggregation via `YEARLY_AGGREGATES`. | ||||||
|
|
||||||
| ## Required secrets / environment variables | ||||||
|
|
||||||
| | Variable | Description | Where? | | ||||||
| |----------|-------------|-----| | ||||||
| | `DATAWRAPPER_API_KEY` | API token for Datawrapper charts | [Datawrapper Account Settings](https://app.datawrapper.de/account/api-tokens), in the project's .env | | ||||||
|
||||||
| | `DATAWRAPPER_API_KEY` | API token for Datawrapper charts | [Datawrapper Account Settings](https://app.datawrapper.de/account/api-tokens), in the project's .env | | |
| | `DW_API_KEY_JE` | API token for Datawrapper charts; the scripts and the deploy configuration expect this environment variable | [Datawrapper Account Settings](https://app.datawrapper.de/account/api-tokens), in the project's `.env` | |
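Since the variable name in the README and the name the scripts expect diverged here, a fail-fast lookup can make such mismatches obvious at startup. A minimal sketch, assuming the token is read from `DW_API_KEY_JE` as the suggestion states; `get_datawrapper_token` is a hypothetical helper, not code from this PR.

```python
import os
from typing import Mapping


def get_datawrapper_token(env: Mapping[str, str] = os.environ,
                          name: str = "DW_API_KEY_JE") -> str:
    """Return the Datawrapper API token or raise a clear error if it is missing."""
    token = env.get(name, "").strip()
    if not token:
        # Name the variable in the message so a README/config mismatch is easy to spot
        raise RuntimeError(f"Environment variable {name} is not set or empty")
    return token
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.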
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,73 @@ | ||
| from pathlib import Path | ||
|
|
||
| import pandas as pd | ||
|
|
||
| from ddj_cloud.scrapers.klimadashboard.src.energiemix import update_energiemix | ||
| from ddj_cloud.scrapers.klimadashboard.src.msr_dw_display import upload_all as upload_dw_charts | ||
| from ddj_cloud.scrapers.klimadashboard.src.msr_scraper import scrape_mastr | ||
| from ddj_cloud.scrapers.klimadashboard.src.msr_solar_processor import process_solar | ||
| from ddj_cloud.scrapers.klimadashboard.src.msr_wind_processor import process_wind | ||
| from ddj_cloud.utils.storage import ( | ||
| upload_dataframe, | ||
| upload_file, | ||
| ) | ||
|
|
||
| VERSION_STRING = "V0.05 vom 13.04.2026" | ||
|
|
||
| # mastr.db in local_storage (analogous to other scrapers) | ||
| DB_LOCAL_PATH = Path(__file__).parent.parent.parent.parent / "local_storage" / "klimadashboard" / "mastr.db" | ||
| DB_S3_KEY = "klimadashboard/mastr.db" | ||
|
|
||
|
Comment on lines +17 to +20
|
||
|
|
||
| def _upload_db(): | ||
|
Member commented: This function is currently gloriously pointless 😄 It loads the DB from
||
| """Upload mastr.db to S3.""" | ||
| if not DB_LOCAL_PATH.exists(): | ||
| print(" Warnung: mastr.db nicht gefunden, Upload übersprungen.") | ||
| return | ||
| upload_file( | ||
| DB_LOCAL_PATH.read_bytes(), | ||
| DB_S3_KEY, | ||
| archive=False, | ||
| ) | ||
| size_mb = DB_LOCAL_PATH.stat().st_size / 1024 / 1024 | ||
| print(f" mastr.db auf S3 hochgeladen ({size_mb:.1f} MB)") | ||
|
|
||
|
|
||
| def run(): | ||
| # Energiemix (Fraunhofer API) | ||
| df = update_energiemix() | ||
| upload_dataframe(df, "klimadashboard/test_energiemix1.csv") | ||
|
|
||
|
Comment on lines +37 to +40
|
||
| # MaStR: scraper, processors, DB to S3 | ||
| print("MaStR-Daten aktualisieren...") | ||
| DB_LOCAL_PATH.parent.mkdir(parents=True, exist_ok=True) | ||
| counts = scrape_mastr(DB_LOCAL_PATH) | ||
| total = sum(counts.values()) | ||
| print(f" MaStR-Scraper: {total} Einheiten geladen") | ||
|
|
||
| # Wind | ||
| print("Wind-Daten verarbeiten...") | ||
| df_onshore, df_offshore, wind_summaries = process_wind(DB_LOCAL_PATH) | ||
| df_wind = pd.concat([df_onshore, df_offshore], ignore_index=True) | ||
| upload_dataframe(df_wind, "klimadashboard/wind_taeglich.csv") | ||
| for name, df_summary in wind_summaries.items(): | ||
| upload_dataframe(df_summary, f"klimadashboard/wind_{name}.csv") | ||
|
|
||
| # Solar | ||
| print("Solar-Daten verarbeiten...") | ||
| df_solar, solar_summaries = process_solar(DB_LOCAL_PATH) | ||
| upload_dataframe(df_solar, "klimadashboard/solar_taeglich.csv") | ||
| for name, df_summary in solar_summaries.items(): | ||
| upload_dataframe(df_summary, f"klimadashboard/solar_{name}.csv") | ||
|
|
||
| # Update Datawrapper charts | ||
| print("Datawrapper-Charts aktualisieren...") | ||
| upload_dw_charts( | ||
| wind_summaries=wind_summaries, | ||
| solar_summaries=solar_summaries, | ||
| ) | ||
|
|
||
| # Upload DB to S3 | ||
| _upload_db() | ||
|
|
||
| print("MaStR-Daten aktualisiert.") | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| -- | ||
| -- Table structure for table `ee_wind` | ||
| -- (table with all wind turbines) | ||
| -- | ||
|
|
||
| CREATE TABLE `ee_wind` ( | ||
| `mastrnr_einheit` varchar(32) NOT NULL, | ||
| `name_einheit` text DEFAULT NULL, | ||
| `betriebsstatus` text DEFAULT NULL, | ||
| `bruttoleistung` decimal(10,1) DEFAULT NULL, | ||
| `nettonennleistung` decimal(10,1) DEFAULT NULL, | ||
| `datum_inbetriebnahme` date DEFAULT NULL, | ||
| `datum_registrierung` date DEFAULT NULL, | ||
| `bundesland` text DEFAULT NULL, | ||
| `landkreis` mediumtext DEFAULT NULL, | ||
| `gemeinde` mediumtext DEFAULT NULL, | ||
| `plz` mediumtext DEFAULT NULL, | ||
| `ort` mediumtext DEFAULT NULL, | ||
| `strasse` mediumtext DEFAULT NULL, | ||
| `hausnummer` mediumtext DEFAULT NULL, | ||
| `gemarkung` mediumtext DEFAULT NULL, | ||
| `flurstueck` mediumtext DEFAULT NULL, | ||
| `gemeindeschluessel` int(11) DEFAULT NULL, | ||
| `breitengrad` decimal(10,6) DEFAULT NULL, | ||
| `laengengrad` decimal(10,6) DEFAULT NULL, | ||
| `name_windpark` mediumtext DEFAULT NULL, | ||
| `nabenhoehe` decimal(10,2) DEFAULT NULL, | ||
| `rotordurchmesser` decimal(10,2) DEFAULT NULL, | ||
| `hersteller_windanlage` mediumtext DEFAULT NULL, | ||
| `typenbezeichnung` mediumtext DEFAULT NULL, | ||
| `technologie` mediumtext DEFAULT NULL, | ||
| `lage_einheit` mediumtext DEFAULT NULL, | ||
| `letzte_aktualisierung` date DEFAULT NULL, | ||
| `datum_stilllegung` date DEFAULT NULL, | ||
| `datum_geplante_inbetriebnahme` date DEFAULT NULL | ||
| ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin; | ||
|
|
||
| -- | ||
| -- Indexes for table `ee_wind` | ||
| -- | ||
| ALTER TABLE `ee_wind` | ||
| ADD KEY `idx_lage_einheit` (`lage_einheit`(17)), | ||
| ADD KEY `idx_betriebsstatus` (`betriebsstatus`(25)), | ||
| ADD KEY `idx_datum_inbetriebnahme` (`datum_inbetriebnahme`), | ||
| ADD KEY `idx_datum_stilllegung` (`datum_stilllegung`), | ||
| ADD KEY `idx_datum_geplante_inbetriebnahme` (`datum_geplante_inbetriebnahme`), | ||
| ADD KEY `idx_lage_status_datum_inbetriebnahme` (`lage_einheit`(17),`betriebsstatus`(25),`datum_inbetriebnahme`), | ||
| ADD KEY `idx_lage_status_datum_stilllegung` (`lage_einheit`(17),`betriebsstatus`(25),`datum_stilllegung`), | ||
| ADD KEY `idx_lage_status_datum_geplante_inbetriebnahme` (`lage_einheit`(17),`betriebsstatus`(25),`datum_geplante_inbetriebnahme`), | ||
| ADD KEY `idx_mastrnr` (`mastrnr_einheit`); | ||
| COMMIT; | ||
|
|
||
|
|
||
| ------------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| -- | ||
| -- Table structure for table `ee_wind_taeglich` | ||
| -- (table from which I generate my output for the charts: time series etc.) | ||
| -- | ||
|
|
||
| CREATE TABLE `ee_wind_taeglich` ( | ||
| `datum` date NOT NULL, | ||
| `lage_einheit` text NOT NULL, | ||
| `installiert_gesamt` text NOT NULL, | ||
| `installiert_taeglich` text NOT NULL, | ||
| `geplant_gesamt` text NOT NULL, | ||
| `geplant_taeglich` text NOT NULL, | ||
| `noetig_gesamt` text NOT NULL, | ||
| `noetig_taeglich` text NOT NULL, | ||
| `stand` date NOT NULL, | ||
| `installiert_taeglich_wert` float NOT NULL, | ||
| `geplant_taeglich_wert` float NOT NULL, | ||
| `noetig_taeglich_wert` float NOT NULL | ||
| ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_bin; | ||
|
|
||
|
|
||
| -- | ||
| -- Indexes for table `ee_wind_taeglich` | ||
| -- | ||
| ALTER TABLE `ee_wind_taeglich` | ||
| ADD KEY `idx_datum` (`datum`), | ||
| ADD KEY `idx_lage_einheit` (`lage_einheit`(17)); | ||
| COMMIT; | ||
|
|
Is this needed for anything?