|
1 | 1 | # Databricks Lakeflow Framework |
2 | 2 |
|
| 3 | +[](https://databricks-solutions.github.io/lakeflow_framework/) |
| 4 | +[](https://github.com/databricks-solutions/lakeflow_framework/actions/workflows/main-build.yml?query=branch%3Amain) |
| 5 | +[](https://github.com/databricks-solutions/lakeflow_framework/releases) |
| 6 | +[](https://github.com/databricks-solutions/lakeflow_framework/blob/main/LICENSE.md) |
| 7 | + |
3 | 8 | <!-- Top bar will be removed from PyPi packaged versions --> |
4 | 9 | <!-- Dont remove: exclude package --> |
5 | 10 | [Documentation](https://databricks-solutions.github.io/lakeflow_framework/) | |
|
8 | 13 |
|
9 | 14 | ## Project Description |
10 | 15 |
|
11 | | -The Lakeflow Framework is a meta-data driven framework designed to: |
12 | | -- accelerate and simplify the deployment of Spark Declarative Pipelines, and support their deployment through your SDLC. |
13 | | -- support a wide variety of patterns across the medallion architecture for both batch and streaming workloads. |
| 16 | +The Lakeflow Framework is a metadata-driven framework for building Databricks Lakeflow Spark Declarative Pipelines. It uses a configuration-driven, pattern-based approach to support both batch and streaming workloads across the medallion architecture. |
| 17 | + |
| 18 | +The framework supports centralized and domain-oriented operating models, and accommodates multiple modelling paradigms (including dimensional, Data Vault, and enterprise canonical models). It is designed for simplicity, performance, maintainability, and extensibility as the Databricks product evolves. |
| 19 | + |
| 20 | +## Why use Lakeflow Framework |
| 21 | + |
| 22 | +- Configuration-driven pipeline delivery built on reusable implementation patterns |
| 23 | +- Support for batch and streaming pipelines across Bronze/Silver/Gold, aligned to your chosen modelling pattern |
| 24 | +- Flexible for centralized and domain-oriented operating models |
| 25 | + |
| 26 | +## Quick start |
| 27 | + |
| 28 | +```bash |
| 29 | +git clone https://github.com/databricks-solutions/lakeflow_framework.git |
| 30 | +cd lakeflow_framework |
| 31 | +pip install -r requirements-dev.txt |
| 32 | +``` |
| 33 | + |
| 34 | +Then: |
| 35 | + |
| 36 | +1. Open the hosted docs: https://databricks-solutions.github.io/lakeflow_framework/ |
| 37 | +2. Deploy the framework using the `Deploy Framework` guide |
| 38 | +3. Deploy samples from `samples/` using the documentation walkthroughs |
| 39 | +4. Build your first pipeline bundle using the `Build a Pipeline Bundle` guide |
| 40 | + |
| 41 | +## Prerequisites |
| 42 | + |
| 43 | +- Access to a Databricks workspace |
| 44 | +- Databricks CLI installed and configured |
| 45 | +- Python environment with project dependencies installed |
| 46 | +- Familiarity with Databricks Lakeflow Spark Declarative Pipelines concepts |
14 | 47 |
|
15 | | -The Framework is designed for simplicity, performance and alignment to the Databricks Product Roadmap. The Framework is designed in such away to allow ease of maintenance and extensibility as the SDP product evolves. |
| 48 | +## Repository structure |
| 49 | + |
| 50 | +- `docs/` - Sphinx documentation and versioned docs build tooling |
| 51 | +- `samples/` - example framework and pipeline bundles |
| 52 | +- `src/` - framework source code and runtime components |
| 53 | + |
| 54 | +## Version compatibility |
| 55 | + |
| 56 | +This project tracks Databricks Lakeflow Spark Declarative Pipelines capabilities and evolves with platform changes. Validate runtime, feature, and API compatibility against your target Databricks workspace and the latest project documentation before production rollout. |
| 57 | + |
| 58 | +## Project status and support |
| 59 | + |
| 60 | +The framework is actively maintained. Databricks support does not cover this repository; questions and bugs are handled on a best-effort basis through GitHub issues. |
| 61 | + |
| 62 | +## Releases and changelog |
| 63 | + |
| 64 | +- Releases: https://github.com/databricks-solutions/lakeflow_framework/releases |
| 65 | +- Tags: https://github.com/databricks-solutions/lakeflow_framework/tags |
16 | 66 |
|
17 | 67 | ## Documentation |
18 | 68 |
|
19 | 69 | Please refer to the [documentation](https://databricks-solutions.github.io/lakeflow_framework/) for further details and an explanation of the samples. |
20 | 70 | The documentation needs to be deployed as HTML or Markdown within your org before it can be used. |
21 | 71 |
|
| 72 | +### Local docs development (optional) |
| 73 | + |
| 74 | +```bash |
| 75 | +pip install -r requirements-docs.txt |
| 76 | +make -C docs html |
| 77 | +``` |
| 78 | + |
22 | 79 | ## How to get help |
23 | 80 |
|
24 | 81 | Databricks support doesn't cover this content. For questions or bugs, please open a GitHub issue and the team will help on a best-effort basis. |
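
The Quick start and Prerequisites sections above assume that `git`, `pip`, and the Databricks CLI (`databricks`) are already on your `PATH`. A minimal pre-flight sketch for checking this before you clone and install might look like the following. The helper names `check_tool` and `preflight` are hypothetical and are not part of the lakeflow_framework repository. The script only confirms that the commands exist; it does not verify that the CLI is configured against a workspace.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; not shipped with lakeflow_framework.

# Print "ok: <name>" and return 0 if the command is on PATH,
# otherwise print "missing: <name>" and return 1.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check every tool passed as an argument.
# The return status is the number of missing tools (0 means all present).
preflight() {
    local missing=0
    local tool
    for tool in "$@"; do
        check_tool "$tool" || missing=$((missing + 1))
    done
    return "$missing"
}

# Tools named in the Quick start and Prerequisites sections.
preflight git pip databricks || echo "Install the missing tools before continuing."
```

If your environment exposes `pip3` rather than `pip`, pass that name instead; the function takes any list of command names.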
|