Commit 46d2dc3

whummer and claude committed
add initial version of ParadeDB extension
LocalStack extension for ParadeDB (PostgreSQL-based search and analytics).

Features:
- Runs paradedb/paradedb Docker container
- Exposes PostgreSQL port 5432 for direct connections
- Configurable via PARADEDB_POSTGRES_USER/PASSWORD/DB env vars
- Integration tests for basic SQL and pg_search BM25 functionality
- CI workflow for automated testing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent f4b4626 commit 46d2dc3

11 files changed

Lines changed: 718 additions & 0 deletions

.github/workflows/paradedb.yml

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+name: LocalStack ParadeDB Extension Tests
+
+on:
+  push:
+    paths:
+      - paradedb/**
+    branches:
+      - main
+  pull_request:
+    paths:
+      - .github/workflows/paradedb.yml
+      - paradedb/**
+  workflow_dispatch:
+
+env:
+  LOCALSTACK_DISABLE_EVENTS: "1"
+  LOCALSTACK_AUTH_TOKEN: ${{ secrets.LOCALSTACK_AUTH_TOKEN }}
+
+jobs:
+  integration-tests:
+    name: Run Integration Tests
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup LocalStack and extension
+        run: |
+          cd paradedb
+
+          docker pull localstack/localstack-pro &
+          docker pull paradedb/paradedb &
+          pip install localstack
+
+          make install
+          make lint
+          make dist
+          localstack extensions -v install file://$(ls ./dist/localstack_extension_paradedb-*.tar.gz)
+
+          DEBUG=1 localstack start -d
+          localstack wait
+
+      - name: Run integration tests
+        run: |
+          cd paradedb
+          make test
+
+      - name: Print logs
+        if: always()
+        run: |
+          localstack logs
+          localstack stop
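
The workflow runs `make test` right after `localstack wait`, which reports on LocalStack itself; whether the ParadeDB container is already accepting connections at that point is not visible from this diff. A minimal sketch of a readiness check the tests could run first, assuming the port is reachable on `localhost:5432` as the README states. The helper name `wait_for_port` is hypothetical and not part of the commit.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A successful TCP connect only shows that something is listening. It does not prove PostgreSQL has finished initializing, so a stricter check might follow up with a `SELECT 1` through a PostgreSQL client.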

paradedb/.gitignore

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+.venv
+dist
+build
+**/*.egg-info
+.eggs

paradedb/Makefile

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+VENV_BIN = python3 -m venv
+VENV_DIR ?= .venv
+VENV_ACTIVATE = $(VENV_DIR)/bin/activate
+VENV_RUN = . $(VENV_ACTIVATE)
+TEST_PATH ?= tests
+
+usage: ## Shows usage for this Makefile
+	@cat Makefile | grep -E '^[a-zA-Z_-]+:.*?## .*$$' | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-15s\033[0m %s\n", $$1, $$2}'
+
+venv: $(VENV_ACTIVATE)
+
+$(VENV_ACTIVATE): pyproject.toml
+	test -d .venv || $(VENV_BIN) .venv
+	$(VENV_RUN); pip install --upgrade pip setuptools plux
+	$(VENV_RUN); pip install -e .[dev]
+	touch $(VENV_DIR)/bin/activate
+
+clean:
+	rm -rf .venv/
+	rm -rf build/
+	rm -rf .eggs/
+	rm -rf *.egg-info/
+
+install: venv ## Install dependencies
+	$(VENV_RUN); python -m plux entrypoints
+
+dist: venv ## Create distribution
+	$(VENV_RUN); python -m build
+
+publish: clean-dist venv dist ## Publish extension to pypi
+	$(VENV_RUN); pip install --upgrade twine; twine upload dist/*
+
+entrypoints: venv ## Generate plugin entrypoints for Python package
+	$(VENV_RUN); python -m plux entrypoints
+
+format: ## Run ruff to format the codebase
+	$(VENV_RUN); python -m ruff format .; make lint
+
+lint: ## Run ruff to lint the codebase
+	$(VENV_RUN); python -m ruff check --output-format=full .
+
+test: ## Run integration tests (requires LocalStack running with the Extension installed)
+	$(VENV_RUN); pytest $(PYTEST_ARGS) $(TEST_PATH)
+
+clean-dist: clean
+	rm -rf dist/
+
+.PHONY: clean clean-dist dist install publish usage venv format test
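
The `usage` target builds its help text from the `## ` comments on target lines. The same extraction, sketched in Python for readers less familiar with the grep/awk pipeline. The function name `list_documented_targets` is illustrative and not part of the commit.

```python
import re

# Same pattern the Makefile passes to grep -E (with Make's `$$` unescaped to `$`)
TARGET_RE = re.compile(r"^([a-zA-Z_-]+):.*?## (.*)$")


def list_documented_targets(makefile_text: str) -> list[tuple[str, str]]:
    """Return sorted (target, description) pairs for lines annotated with '## '."""
    found = []
    for line in makefile_text.splitlines():
        match = TARGET_RE.match(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return sorted(found)
```

Targets without a `## ` comment, such as `clean` and `venv` in this Makefile, are left out of the listing.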

paradedb/README.md

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
+ParadeDB on LocalStack
+======================
+
+This repo contains a [LocalStack Extension](https://github.com/localstack/localstack-extensions) that facilitates developing [ParadeDB](https://www.paradedb.com)-based applications locally.
+
+ParadeDB is an Elasticsearch alternative built on Postgres. It provides full-text search with BM25 scoring, hybrid search combining semantic and keyword search, and real-time analytics capabilities.
+
+After installing the extension, a ParadeDB server instance will become available and can be accessed using standard PostgreSQL clients.
+
+## Connection Details
+
+Once the extension is running, you can connect to ParadeDB using any PostgreSQL client with the following default credentials:
+
+- **Host**: `localhost` (or the Docker host if running in a container)
+- **Port**: `5432` (mapped from the container)
+- **Database**: `postgres`
+- **Username**: `postgres`
+- **Password**: `postgres`
+
+Example connection using `psql`:
+```bash
+psql -h localhost -p 5432 -U postgres -d postgres
+```
+
+Example connection using Python:
+```python
+import psycopg2
+
+conn = psycopg2.connect(
+    host="localhost",
+    port=5432,
+    database="postgres",
+    user="postgres",
+    password="postgres"
+)
+```
+
+## ParadeDB Features
+
+ParadeDB includes several powerful extensions:
+
+- **pg_search**: Full-text search with BM25 ranking
+- **pg_analytics**: DuckDB-powered analytics for OLAP workloads
+- **pg_lakehouse**: Query data lakes (S3, Delta Lake, Iceberg) directly
+
+Example using pg_search:
+```sql
+-- Create a table with search index
+CREATE TABLE products (
+    id SERIAL PRIMARY KEY,
+    name TEXT,
+    description TEXT
+);
+
+-- Create a BM25 search index
+CALL paradedb.create_bm25(
+    index_name => 'products_idx',
+    table_name => 'products',
+    key_field => 'id',
+    text_fields => paradedb.field('name') || paradedb.field('description')
+);
+
+-- Search with BM25 scoring
+SELECT * FROM products.search('description:electronics');
+```
+
+## Configuration
+
+The following environment variables can be passed to the LocalStack container to configure the extension:
+
+* `PARADEDB_POSTGRES_USER`: PostgreSQL username (default: `postgres`)
+* `PARADEDB_POSTGRES_PASSWORD`: PostgreSQL password (default: `postgres`)
+* `PARADEDB_POSTGRES_DB`: Default database name (default: `postgres`)
+
+## Prerequisites
+
+* Docker
+* LocalStack Pro (free trial available)
+* `localstack` CLI
+* `make`
+
+## Install from GitHub repository
+
+This extension can be installed directly from this Github repo via:
+
+```bash
+localstack extensions install "git+https://github.com/localstack/localstack-extensions.git#egg=localstack-extension-paradedb&subdirectory=paradedb"
+```
+
+## Install local development version
+
+Please refer to the docs [here](https://github.com/localstack/localstack-extensions?tab=readme-ov-file#start-localstack-with-the-extension) for instructions on how to start the extension in developer mode.
+
+## Change Log
+
+* `0.1.0`: Initial version of the extension
+
+## License
+
+The code in this repo is available under the Apache 2.0 license.
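
The README's Configuration section lists three variables with `postgres` defaults, while its Python example hard-codes those defaults. A small sketch of a helper that derives the `psycopg2.connect` keyword arguments from the same variables, so a client keeps working when they are overridden. The helper name `paradedb_connect_kwargs` is hypothetical, and the sketch assumes the variables are also set in the environment of the client process, which the README does not state.

```python
import os


def paradedb_connect_kwargs(env=None, host="localhost", port=5432):
    """Build keyword arguments for psycopg2.connect from PARADEDB_* variables."""
    # Fall back to the process environment when no mapping is passed in
    env = os.environ if env is None else env
    return {
        "host": host,
        "port": port,
        "database": env.get("PARADEDB_POSTGRES_DB", "postgres"),
        "user": env.get("PARADEDB_POSTGRES_USER", "postgres"),
        "password": env.get("PARADEDB_POSTGRES_PASSWORD", "postgres"),
    }
```

With `psycopg2` installed and the extension running, `psycopg2.connect(**paradedb_connect_kwargs())` would open the same connection as the README example when no overrides are set.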
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+name = "localstack_paradedb"
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+import os
+import logging
+
+from localstack_paradedb.utils.docker import DatabaseDockerContainerExtension
+
+LOG = logging.getLogger(__name__)
+
+# Environment variables for configuration
+ENV_POSTGRES_USER = "PARADEDB_POSTGRES_USER"
+ENV_POSTGRES_PASSWORD = "PARADEDB_POSTGRES_PASSWORD"
+ENV_POSTGRES_DB = "PARADEDB_POSTGRES_DB"
+
+# Default values
+DEFAULT_POSTGRES_USER = "postgres"
+DEFAULT_POSTGRES_PASSWORD = "postgres"
+DEFAULT_POSTGRES_DB = "postgres"
+
+
+class ParadeDbExtension(DatabaseDockerContainerExtension):
+    name = "paradedb"
+
+    # Name of the Docker image to spin up
+    DOCKER_IMAGE = "paradedb/paradedb"
+    # Default port for PostgreSQL
+    POSTGRES_PORT = 5432
+
+    def __init__(self):
+        # Get configuration from environment variables
+        postgres_user = os.environ.get(ENV_POSTGRES_USER, DEFAULT_POSTGRES_USER)
+        postgres_password = os.environ.get(ENV_POSTGRES_PASSWORD, DEFAULT_POSTGRES_PASSWORD)
+        postgres_db = os.environ.get(ENV_POSTGRES_DB, DEFAULT_POSTGRES_DB)
+
+        # Environment variables to pass to the container
+        env_vars = {
+            "POSTGRES_USER": postgres_user,
+            "POSTGRES_PASSWORD": postgres_password,
+            "POSTGRES_DB": postgres_db,
+        }
+
+        super().__init__(
+            image_name=self.DOCKER_IMAGE,
+            container_ports=[self.POSTGRES_PORT],
+            env_vars=env_vars,
+        )
+
+        # Store configuration for connection info
+        self.postgres_user = postgres_user
+        self.postgres_password = postgres_password
+        self.postgres_db = postgres_db
+
+    def get_connection_info(self) -> dict:
+        """Return connection information for ParadeDB."""
+        info = super().get_connection_info()
+        info.update({
+            "database": self.postgres_db,
+            "user": self.postgres_user,
+            "password": self.postgres_password,
+            "port": self.POSTGRES_PORT,
+            "connection_string": (
+                f"postgresql://{self.postgres_user}:{self.postgres_password}"
+                f"@{self.container_host}:{self.POSTGRES_PORT}/{self.postgres_db}"
+            ),
+        })
+        return info
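
`get_connection_info` interpolates the user and password into the URL as-is, so a password containing characters such as `@` or `/` would yield an ambiguous connection string. A sketch of the same URL built with percent-encoding, using only the standard library. The function `build_connection_string` is an illustration and not part of the commit.

```python
from urllib.parse import quote


def build_connection_string(user: str, password: str, host: str, port: int, db: str) -> str:
    """Return a postgresql:// URL with credentials and database name percent-encoded."""
    # safe="" also encodes "/" so it cannot be mistaken for a path separator
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(db, safe='')}"
    )
```

With the default `postgres` credentials the result is identical to the string the extension builds; the two only differ once a value contains reserved URL characters.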

paradedb/localstack_paradedb/utils/__init__.py

Whitespace-only changes.
