Analytics Data Platform

Audience: This is the developer documentation for contributors to the platform. For user-facing documentation (Superset guides, data dictionaries, etc.), see the published site built from docs/.

Introduction

The platform is an Apache Iceberg-based data lakehouse that supports analytics for the facility. See Background for a high-level overview of what a data lakehouse is.

Repository overview

This repository is a monorepo containing all of the code for the platform. It may be split into separate repositories in the future.

.
├── docs/                   # User-facing documentation site (MkDocs). See docs/src for content.
├── docs-devel/             # Developer documentation (this directory).
├── elt-common/             # Reusable Python package with common ELT helpers used by the warehouses
├── infra/
│   ├── ansible/            # Ansible playbooks/roles to deploy the system to the STFC (OpenStack) cloud.
│   ├── container-images/   # Container definitions for deployed services
│   └── local/              # docker-compose configuration for local development and end-to-end CI tests.
└── warehouses/             # One subdirectory per Lakekeeper warehouse. Each contains ELT code for that warehouse.
    ├── facility_ops_landing/   # Ingestion scripts (bronze layer)
    └── facility_ops/           # dbt transformation models (silver/gold layers)
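
Because the layout above uses one subdirectory per warehouse, the set of warehouses can be discovered from the directory tree alone. The following is a minimal sketch of that idea, not code from this repository: `list_warehouses` is a hypothetical helper, and it assumes only that the `warehouses/` directory sits at the repository root as shown in the tree.

```python
from pathlib import Path


def list_warehouses(repo_root: Path) -> list[str]:
    """Return the sorted names of the subdirectories under warehouses/.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    warehouses_dir = repo_root / "warehouses"
    if not warehouses_dir.is_dir():
        # No warehouses/ directory at this root, so nothing to list.
        return []
    # Each subdirectory is treated as one warehouse; plain files are skipped.
    return sorted(p.name for p in warehouses_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for name in list_warehouses(Path.cwd()):
        print(name)
```

Run from the repository root, this prints one warehouse directory name per line.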

Getting started

New to the project? Start with the Getting Started guide to set up your local development environment.
