Strake

The AI Data Layer

Strake is the AI Data Layer. Not a query tool. Not a RAG pipeline. The sandboxed execution environment where agents meet your data and return answers, not rows.

Built on Apache Arrow DataFusion, Strake lets AI agents discover, query, and process data across your entire stack (PostgreSQL, Snowflake, S3, and more) without data movement or ETL, giving them structured, safe access to that data.

📚 Full Documentation: Check out the complete documentation for installation, architecture, and API references.

Key Features

MCP-Native Discovery: Built for the Model Context Protocol. Your agents immediately discover your entire data catalog and schemas.
Run Python, Not Prompts: Every agent execution runs inside strict native OS sandboxes for performance, or ephemeral MicroVMs for hardware-level isolation.
Zero-Copy Federation: Query Postgres, S3, local files, REST, gRPC, and more in a single query, with pushdown optimization via Apache Arrow.
Read-Only by Default: Strict read-only enforcement, dynamic Row-Level Security (RLS), and PII masking out of the box.
Developer First: Built for engineers shipping agents to production. Type-safe configuration, rich CLI tooling, and local development workflows.
Python Native: Zero-copy integration with Pandas and Polars via PyO3.
GitOps Native: Manage your data mesh configuration as code. Version control your sources, policies, and metrics.
Observability: Built-in OpenTelemetry tracing and Prometheus metrics.
Enterprise Capabilities: OIDC Authentication, Row-Level Security, and Data Contracts (Enterprise Edition).
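
To make the Row-Level Security and PII masking features above more concrete, here is a minimal sketch of the general idea: filter rows by who is asking, then mask sensitive columns before anything leaves the data layer. This is not Strake's implementation or policy format. The `POLICY` shape, the `apply_policy` and `mask` helpers, and the sample rows are all hypothetical names invented for illustration.

```python
import hashlib

# Hypothetical policy shape, for illustration only (not Strake's policy format).
POLICY = {
    "row_filter": lambda row, user: row["tenant"] == user["tenant"],
    "masked_columns": {"email"},
}


def mask(value: str) -> str:
    """Replace a value with a short, stable token derived from its hash."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"masked:{digest[:8]}"


def apply_policy(rows, user, policy=POLICY):
    """Drop rows the user may not see, then mask sensitive columns."""
    visible = []
    for row in rows:
        if not policy["row_filter"](row, user):
            continue
        visible.append({
            col: mask(str(val)) if col in policy["masked_columns"] else val
            for col, val in row.items()
        })
    return visible


# Synthetic rows standing in for a query result.
rows = [
    {"tenant": "acme", "email": "ana@acme.test", "plan": "pro"},
    {"tenant": "globex", "email": "bo@globex.test", "plan": "free"},
]
result = apply_policy(rows, user={"tenant": "acme"})
print(result)
```

In this sketch the agent only ever receives the filtered, masked rows, so the raw email address never reaches the model's context.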

Code Mode: Don't Compute in Context

Many agents fail because they pull thousands of raw SQL rows into the context window. Strake's Code Mode lets them process data in Python where it lives, inside a secure sandbox, and send the LLM only the results that matter.

import asyncio

import strake
from strake.mcp import run_python

script = """
# 1. Query 10M rows instantly via DataFusion
df = strake.sql("SELECT * FROM user_events")

# 2. Aggregate in Python to prevent context bloat
summary = df.groupby('feature_flag')['latency'].median()

# 3. Print exactly what the LLM needs
print(summary.to_json())
"""


async def main():
    # Runs isolated with OS sandboxing or Firecracker VMs
    result = await run_python(script)
    print(result)


# run_python is awaited, so it must be called from inside a coroutine
asyncio.run(main())
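
To give a feel for why aggregating before returning matters, the following sketch compares the size of a raw result set with the size of its summary. It uses only the Python standard library and synthetic data, so it does not call Strake at all. The event shape and the flag names are assumptions made for the example.

```python
import json
import random
import statistics
from collections import defaultdict

random.seed(0)  # deterministic synthetic data

# Synthetic stand-in for query results; real rows would come from the data layer.
events = [
    {
        "feature_flag": random.choice(["control", "variant_a", "variant_b"]),
        "latency": random.randint(20, 500),
    }
    for _ in range(10_000)
]

# Group latencies by flag, then reduce each group to its median.
by_flag = defaultdict(list)
for event in events:
    by_flag[event["feature_flag"]].append(event["latency"])
summary = {flag: statistics.median(values) for flag, values in sorted(by_flag.items())}

# Compare what would be sent to the model in each case.
raw_payload = json.dumps(events)
summary_payload = json.dumps(summary)

print(f"raw rows: {len(raw_payload):,} characters")
print(f"summary:  {len(summary_payload):,} characters")
```

The raw payload grows with the number of rows, while the summary stays at one entry per flag regardless of table size.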

Quick Start (5-Minute Setup)

If you're building agents that need to query Postgres, S3, and a REST API in a single operation, without context overflow and without leaking credentials, Strake is the runtime you're missing.

1. Installation

Quick Install (Linux/macOS)

curl -sSfL https://strakedata.com/install.sh | sh

Install via Cargo (Rust)

cargo install --path crates/cli
cargo install --path crates/server

Python Client

pip install strake

2. Configuration (GitOps)

Initialize a new config, validate your sources, and apply them to the metadata store:

# Initialize a new config
strake-cli init

# Validate configuration
strake-cli validate sources.yaml

# Apply to the metadata store (Sync)
strake-cli apply sources.yaml --force

3. Query with Python

First, define your data sources in a sources.yaml file:

sources:
  - name: local_files
    type: csv
    path: "data/*.csv"
    has_header: true
    tables:
      - name: measurements

Then, query using the Strake Python client:

import strake
import polars as pl

# Connect using your source configuration
conn = strake.connect(sources_config="sources.yaml")

# Query across sources using standard SQL
query = "SELECT * FROM measurements LIMIT 5"
data = conn.sql(query)

# Zero-copy integration with Polars/Pandas
df = pl.from_arrow(data)
print(df)

Project Structure

Component	Description
strake-runtime	Orchestration layer (Federation Engine, Sidecar).
strake-connectors	Data source implementations (Postgres, S3, REST, etc).
strake-sql	SQL Dialects, Query Optimization, and Substrait generation.
strake-common	Shared types, configuration, and error handling.
strake-server	Arrow Flight SQL server implementation.
strake-cli	GitOps CLI for managing data mesh configurations.
strake-python	Python bindings for high-performance data access.

Contributing

We welcome contributions! Please see our Contributing Guidelines for details on how to get started.

License

Strake is licensed under the Apache 2.0 license.
