Commit cb4d3c3

release: v0.8.0 — execution & orchestration as first-class architecture
- introduce explicit execution plan and batch-based orchestration
- add load_run_snapshot and extended load_run_log metadata
- implement deterministic retries, blocking and abort semantics
- add execution diagnostics and snapshot diffing
- stabilize BigQuery, MSSQL, Postgres and DuckDB meta persistence
- enable platform-wide execution guarantees without external orchestrators
1 parent 13055b3 commit cb4d3c3

58 files changed

Lines changed: 5224 additions & 1625 deletions

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -1,6 +1,6 @@
 # ======================================================
 # elevata - Metadata-driven Data Platform Framework
-# Copyright © 2025 Ilona Tag
+# Copyright © 2025-2026 Ilona Tag
 #
 # SPDX-License-Identifier: AGPL-3.0-only
 #
@@ -49,3 +49,6 @@ Thumbs.db
 .idea/
 *.swp
 *.swo
+
+# Log
+*log/

CHANGELOG.md

Lines changed: 53 additions & 0 deletions
@@ -22,6 +22,59 @@ This project adheres to [Semantic Versioning](https://semver.org/) and [Keep a C
 
 ---
 
+## [0.8.0] – 2026-01-xx
+
+### ⚙️ Execution & Orchestration as First-Class Architecture
+
+This release introduces an explicit, metadata-driven execution model,
+establishing **orchestration, failure semantics, and observability** as first-class concerns in elevata.
+
+Execution is now planned, executed, and explained independently of SQL generation,
+providing a robust foundation for platform-native orchestration and governance.
+
+### ✨ Added
+- Explicit **Execution Plan** model separating planning from execution
+- Dependency-graph–based dataset execution with deterministic ordering
+- Multi-dataset batch execution with a shared `batch_run_id`
+- Structured execution policies (`continue_on_error`, `max_retries`)
+- Retry semantics with per-attempt tracking (`attempt_no`)
+- Distinct failure semantics:
+  - `blocked` (dependency-based non-execution)
+  - `aborted` (policy-based fail-fast non-execution)
+- **Load Run Snapshot** (`meta.load_run_snapshot`)
+  - Batch-level, JSON-based execution state
+  - Captures plan, policy, dependencies, and aggregated outcomes
+- Extended **Load Run Log** (`meta.load_run_log`)
+  - Orchestration-only events (blocked / aborted)
+  - Best-effort, non-blocking meta logging
+- CLI execution diagnostics:
+  - Execution snapshot printing (`--debug-execution`)
+  - Snapshot persistence (`--write-execution-snapshot`)
+- Deterministic BigQuery table qualification for execution and metadata writes
+  (prevents sporadic cross-project `NotFound` errors during streaming inserts)
+- Global execution modes:
+  - single-dataset execution with dependencies (default)
+  - platform-wide execution in deterministic order (`--all`)
+  - optional schema-scoped execution (`--schema`)
+
+### 🔄 Changed
+- Execution semantics are no longer implicit in SQL or CLI flow
+- Load execution is now driven by an explicit execution model
+- Fail-fast behavior is deterministic and explicitly reported
+- Execution observability is metadata-first and dialect-agnostic
+
+### 🧪 Quality & Stability
+- Extensive unit tests for execution ordering, retries, fail-fast, and blocking
+- Guardrails for orchestration-only events and best-effort persistence
+- Clear separation of execution core vs CLI and dialect adapters
+- No destructive changes to existing materialization or SQL generation logic
+
+> This release establishes elevata as a **self-orchestrating, explainable
+> data platform core**, laying the groundwork for native scheduling,
+> governance rules, and external orchestration integrations.
+
+---
+
 ## [0.7.1] – 2025-12-29
 
 ### 🧱 Metadata-Driven Schema Evolution
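
The changelog entry above describes deterministic dependency ordering, retries tracked by `attempt_no`, and the distinction between `blocked` and `aborted`. The following is a minimal sketch of how those semantics could fit together; it is not elevata's implementation. The names `ExecutionPolicy`, `plan_order`, `run_batch`, and the `success` / `failed` status values are assumptions made for illustration; only `continue_on_error`, `max_retries`, `attempt_no`, `blocked`, and `aborted` come from the changelog.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class ExecutionPolicy:
    # Hypothetical container for the policy fields named in the changelog.
    continue_on_error: bool = False
    max_retries: int = 0


def plan_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return a dependency order; ties are broken alphabetically so the
    same graph always yields the same plan."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    order: List[str] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        order.extend(ready)
        sorter.done(*ready)
    return order


def run_batch(
    deps: Dict[str, Set[str]],
    run_one: Callable[[str, int], bool],
    policy: ExecutionPolicy,
) -> Dict[str, dict]:
    """Execute datasets in plan order and record one outcome per dataset."""
    results: Dict[str, dict] = {}
    aborted = False
    for name in plan_order(deps):
        if aborted:
            # Fail-fast: a previous failure stopped the whole batch.
            results[name] = {"status": "aborted", "attempt_no": 0}
            continue
        if any(results[u]["status"] != "success" for u in deps.get(name, set())):
            # An upstream dataset did not succeed, so this one never runs.
            results[name] = {"status": "blocked", "attempt_no": 0}
            continue
        attempt_no, ok = 0, False
        while not ok and attempt_no <= policy.max_retries:
            attempt_no += 1
            ok = run_one(name, attempt_no)
        results[name] = {
            "status": "success" if ok else "failed",
            "attempt_no": attempt_no,
        }
        if not ok and not policy.continue_on_error:
            aborted = True
    return results
```

In this sketch a dataset runs at most `1 + max_retries` times. With `continue_on_error=True` only the downstream datasets of a failure are marked `blocked`; with the default fail-fast policy every dataset after the failure is marked `aborted`.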

README.md

Lines changed: 109 additions & 76 deletions
@@ -42,7 +42,7 @@ Unlike transformation-centric tools, elevata treats metadata, lineage, and execu
 
 
 <p align="center">
-<img src="https://raw.githubusercontent.com/elevata-labs/elevata/main/docs/elevata_v0_7_0.png" alt="elevata UI preview" width="700"/>
+<img src="https://raw.githubusercontent.com/elevata-labs/elevata/main/docs/elevata_v0_8_0.png" alt="elevata UI preview" width="700"/>
 <br/>
 <em>Dataset detail view with lineage, metadata, and dialect-aware SQL previews</em>
 </p>
@@ -144,7 +144,8 @@ Rendered SQL is executed directly in the warehouse:
 - historization (SCD Type 2)
 - delete detection
 - execution timing and row counts
-- structured load logging via `meta.load_run_log`
+- structured load logging via `meta.load_run_log`
+- batch-level execution snapshots via `meta.load_run_snapshot`
 
 elevata is designed for execution — not just preview.
 
@@ -172,6 +173,10 @@ Execution semantics depend on the target dataset and its layer:
 The load runner supports dry runs, execution diagnostics, dependency resolution,
 and execution logging.
 
+Execution behavior is fully deterministic and observable.
+Each run produces a structured execution log and an optional
+batch-level execution snapshot explaining plan, policy, and outcomes.
+
 ---
 
 ## 🔮 Roadmap
@@ -181,104 +186,123 @@ The roadmap reflects this direction: structured, ambitious, and aligned with ele
 
 ---
 
-### v0.7.x — Productivity & Governance Layer
-> *Guiding question: Can execution be explained, governed, and safely extended?*
-
-- **Metadata-driven ingestion (optional)**
-  Ingestion into Raw datasets based on source metadata, with support for
-  native, external, or no-ingestion strategies. Pipelines may also start
-  directly at the Stage layer using federated access.
-
-- **Automated schema evolution detection**
-  Detects warehouse–model drifts, identifies breaking changes.
+### v0.8.x — Platform Orchestration Layer
+> *Guiding question: Can elevata orchestrate itself reliably at scale?*
 
-- **Data Quality & Metadata Rule Engine**
-  Rule-based validation directly inside the load pipeline (nullability, domains, patterns, etc.).
+- **Warehouse-native task orchestration**
+  (retries, idempotency, execution semantics; scheduling optional via integration)
 
-- **Execution plan transparency**
-  Refine execution plan annotations (polish)
+- **Dependency graph–driven pipeline execution**
+  with deterministic ordering and batching
 
-- **Column-level lineage & impact analysis**
-  Rich dependency graphing and change-impact visibility.
+- **Multi-dataset execution with explicit failure handling strategies**
+  (blocked vs aborted, fail-fast vs continue-on-error)
 
-- **Developer tooling & debugger**
-  Deep SQL preview, AST inspection, execution diagnostics, step-wise load traceability.
+- **Integrations with orchestration frameworks**
+  (initial adapters and execution hooks)
 
-- **Optional: simplified steward interface**
-  Lightweight UI for business/data owners to view datasets and rules.
+- **Extended execution monitoring & explainability**
+  (latency, throughput, volume, change rates, execution snapshots)
 
-- **Extended schema evolution & drift detection**
-  Detection and governance of breaking schema changes, including type changes
-  and destructive modifications.
-  Deterministic synchronization of physical warehouse schemas based on metadata:
-  - safe table provisioning
-  - additive column evolution
-  - rename-aware planning
-  - non-destructive incremental execution
+- **Global execution modes**
+  Ability to execute:
+  - a single target dataset with its dependencies (default)
+  - all datasets in deterministic dependency order (`--all`)
+  - optional schema-scoped execution (`--schema`)
 
-- **Safe dataset & column renames**
-  First-class rename semantics via metadata:
-  - former_names tracking
-  - lineage-aware physical renames
-  - ambiguity detection and guardrails
+This enables platform-wide batch runs without requiring external orchestration tools.
 
 **Intent:**
-elevata becomes **governable, productive, and capable of sourcing its own data**.
-
-The 0.7.x line focuses on completing and extending these capabilities.
+elevata becomes a **self-contained data platform core**, orchestrable and observable without external wrappers.
 
 ---
 
-### v0.8.x — Platform Orchestration Layer
-> *Guiding question: Can elevata orchestrate itself reliably at scale?*
-
-- **Warehouse-native task orchestration**
-  (retries, idempotency, scheduling)
-
-- **Dependency graph–driven pipeline execution**
-  with parallelization and batching
-
-- **Multi-dataset execution with failure handling strategies**
-
-- **Integrations with orchestration frameworks**
-  (Airflow, Dagster, Prefect)
-
-- **Extended execution monitoring**
-  (latency, throughput, volume, change rates)
+### v0.9.x — Business Semantics & Bizcore Layer
+> *Guiding question: Can business meaning and business logic be modeled explicitly — without introducing a semantic BI layer?*
+
+- **Bizcore as a first-class business semantics layer**
+  Bizcore datasets represent business concepts, rules, and calculations
+  derived explicitly from Core datasets — not technical projections
+  and not consumption-specific semantic models.
+
+- **Explicit business logic and calculations (Bizcore MVP)**
+  Bizcore supports:
+  - derived business fields
+  - rule-based classifications
+  - business calculations and KPIs expressed as dataset fields
+    (e.g. margins, normalized revenues, activity flags, domain rules).
+
+  These definitions are:
+  - metadata-driven
+  - deterministic
+  - compiled into executable plans
+  without introducing a BI-style semantic or metrics layer.
+
+- **Clear separation of responsibilities**
+  - RAW / STAGE / CORE: technical correctness and data truth
+  - BIZCORE: business meaning, rules, and calculations
+  - SERVING (optional): tool- or consumer-specific shaping
+
+- **Semantic lineage & explainability**
+  Every Bizcore field is traceable to its Core inputs, transformations,
+  and assumptions — enabling impact analysis and auditability.
+
+- **Execution remains metadata-driven and deterministic**
+  Bizcore logic is planned and executed through the same execution model
+  as technical datasets, preserving elevata’s guarantees around
+  predictability, transparency, and reproducibility.
+
+**Explicit non-goals (by design):**
+- No BI semantic layer
+- No metric store or query-time metric resolution
+- No time-intelligence abstractions
+- No dbt-style macro or templating system
 
 **Intent:**
-elevata becomes a **self-contained data platform core**, orchestrable without external wrappers.
+elevata becomes **business-capable by design**, allowing teams to define
+business logic and KPIs natively — while deliberately avoiding
+tool-specific semantic layers or BI-driven abstractions.
 
 ---
 
-### Future Directions (Post-0.8)
-> *Long-term ambitions and ecosystem expansion.*
+### Future Directions (Post-0.9)
+> *Guiding question: Can execution be governed, validated, and integrated without breaking determinism?*
+
+- **Run- and dataset-level governance rules**
+  Declarative policies evaluated before and after execution
+  (e.g. schema drift, delete detection, retry limits, environment guards).
 
-- **Bizcore as Data Product Layer**
-  Elevate Bizcore datasets from technical projections to first-class data products.
+- **Rule-based validation framework**
+  Metadata-defined checks on schema, volumes, and execution outcomes
+  (non-blocking warnings vs blocking violations).
 
-- **Business logic & semantic modeling**
-  Explicit modeling of business rules, derivations, and analytical intent
-  at the business domain level.
+- **Execution hooks & lifecycle callbacks**
+  Stable hook API for external orchestration frameworks and platforms
+  (Airflow, Dagster, Prefect, custom controllers).
 
-- **Product-level metadata & ownership**
-  Ownership, contracts, and usage semantics for business-facing datasets.
+- **Policy-aware execution outcomes**
+  Explicit distinction between execution failures and policy violations,
+  surfaced consistently in logs and snapshots.
 
-- **Explicit separation of business and presentation layers**
-  Clear distinction between business logic (Bizcore) and consumption-oriented
-  presentation layers (Serving), enabling tool-specific semantic models
-  without polluting core business datasets.
+- **First-class execution metadata**
+  Structured access to load run logs and snapshots for governance,
+  observability, and external consumers.
 
-- **Optional metrics & analytical abstractions**
-  Foundations for a native metrics layer and reusable analytical definitions.
+---
 
-- **Extended catalog capabilities**
-  (contracts, schema registry, dataset capabilities)
+### Vision (Towards 1.0)
 
-- **Additional dialects and warehouse platforms**
-  e.g. Snowflake, BigQuery, Databricks SQL, Microsoft Fabric
+elevata aims to become a **metadata-native data platform engine**:
+a system where structure, execution, governance, and business intent are derived from
+explicit definitions rather than implicit SQL behavior.
 
-- **Warehouse-native metadata and observability features**
+By building on deterministic execution, explainable orchestration, and policy-aware governance,
+elevata provides a stable core on which organizations can model data products, business semantics,
+and analytical contracts without coupling them to specific tools or warehouses.
+
+The long-term goal is not to replace orchestration frameworks or BI tools,
+but to act as a **reliable, transparent backbone** that makes data pipelines
+predictable, governable, and evolvable across teams and platforms.
 
 ---
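
The global execution modes listed in this hunk (single target with dependencies, `--all`, `--schema`) can be read as three ways of choosing which datasets enter a run. The sketch below illustrates one possible interpretation and is not taken from the elevata code base: `upstream_closure`, `select_datasets`, and the `schema_of` mapping are hypothetical names, and treating `--schema` as a filter applied on top of the other two modes is an assumption, since the README does not say how the flags combine.

```python
from typing import Dict, Optional, Set


def upstream_closure(deps: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the target plus everything it depends on, directly or not."""
    seen: Set[str] = set()
    stack = [target]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(deps.get(name, set()))
    return seen


def select_datasets(
    deps: Dict[str, Set[str]],
    schema_of: Dict[str, str],
    target: Optional[str] = None,
    run_all: bool = False,
    schema: Optional[str] = None,
) -> Set[str]:
    """Pick the datasets for a run (assumes every dataset is a key of deps)."""
    if run_all:
        selected = set(deps)  # platform-wide run
    elif target is not None:
        selected = upstream_closure(deps, target)  # default mode
    else:
        raise ValueError("either a target dataset or run_all is required")
    if schema is not None:
        # Assumption: schema scoping narrows whatever was selected above.
        selected = {name for name in selected if schema_of[name] == schema}
    return selected
```

The selected set would then be handed to a planner that orders it by dependencies, so selection and ordering stay separate concerns.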

@@ -292,6 +316,15 @@ elevata becomes a **self-contained data platform core**, orchestrable without ex
 
 - **Extensible:** Dialects, rules, orchestrators and catalog integrations can grow as the platform evolves.
 
+- **Explainable by design:** Execution decisions, failures, and outcomes are observable and reproducible.
+
+---
+
+### ♟️ Architecture & Strategy
+
+For a deeper architectural and strategic overview of elevata’s direction,
+see the [elevata Platform Strategy](https://github.com/elevata-labs/elevata/blob/main/docs/strategy/elevata_platform_strategy.md).
+
 ---
 
 ## 🛡️ Data Privacy (GDPR/DSGVO)
@@ -324,7 +357,7 @@ The project is published under the AGPL v3 license and open for use by any organ
 
 ## 🧾 License & Notices
 
-© 2025 Ilona Tag — All rights reserved.
+© 2025-2026 Ilona Tag — All rights reserved.
 **elevata™** is an open-source software project for data & analytics innovation.
 The name *elevata* is currently under trademark registration with the German Patent and Trade Mark Office (DPMA).
 Other product names, logos, and brands mentioned here are property of their respective owners.
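
The README changes above describe a batch-level execution snapshot that explains plan, policy, and outcomes, and the commit message mentions snapshot diffing. As a rough illustration of what comparing two snapshots might involve, the sketch below diffs per-dataset outcomes between two runs. The JSON shape (an `outcomes` object keyed by dataset name) and the function name `diff_outcomes` are assumptions; the actual layout of `meta.load_run_snapshot` is not shown in this commit excerpt.

```python
import json
from typing import Any, Dict, Optional, Tuple


def diff_outcomes(
    before: Dict[str, Any], after: Dict[str, Any]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each dataset whose status differs to (old_status, new_status).

    A dataset that exists in only one snapshot shows None on the other side.
    """
    old = before.get("outcomes", {})
    new = after.get("outcomes", {})
    changes: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes


# Hypothetical snapshots of two consecutive batch runs.
first_run = json.loads(
    '{"batch_run_id": "b1", "outcomes":'
    ' {"raw_a": "success", "stage_a": "failed", "core_a": "blocked"}}'
)
second_run = json.loads(
    '{"batch_run_id": "b2", "outcomes":'
    ' {"raw_a": "success", "stage_a": "success", "core_a": "success"}}'
)
changed = diff_outcomes(first_run, second_run)
```

Here `changed` contains only `stage_a` and `core_a`, which is the kind of summary that makes a recovered run easy to explain.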

core/elevata_site/settings.py

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 """
 elevata - Metadata-driven Data Platform Framework
-Copyright © 2025 Ilona Tag
+Copyright © 2025-2026 Ilona Tag
 
 This file is part of elevata.
 
@@ -43,7 +43,7 @@
 BASE_DIR = Path(__file__).resolve().parent.parent
 load_dotenv(find_dotenv(filename=".env", raise_error_if_not_found=False))
 
-ELEVATA_VERSION = "0.7.1"
+ELEVATA_VERSION = "0.8.0"
 
 ELEVATA_PROFILES_PATH = os.getenv("ELEVATA_PROFILES_PATH", str((BASE_DIR.parent / "config" / "elevata_profiles.yaml")))
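
The `ELEVATA_PROFILES_PATH` setting in this hunk follows the common pattern of an environment variable with a path default derived from `BASE_DIR`. The sketch below isolates that lookup so it can be exercised without Django; `resolve_profiles_path` and its `env` parameter are hypothetical helpers introduced for illustration and do not exist in `settings.py`.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_profiles_path(
    base_dir: Path, env: Optional[Mapping[str, str]] = None
) -> str:
    """Return the profiles path: the environment value if set, otherwise
    <base_dir>/../config/elevata_profiles.yaml."""
    env = os.environ if env is None else env
    default = base_dir.parent / "config" / "elevata_profiles.yaml"
    return env.get("ELEVATA_PROFILES_PATH", str(default))
```

As with `os.getenv`, the default applies only when the variable is unset; a variable set to an empty string is returned as-is.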
