
Commit 3a6a175

feat: SQL translator with ClickHouse dialect (#1)
1 parent 0d41586 commit 3a6a175

29 files changed

Lines changed: 4077 additions & 1 deletion

.github/workflows/ci.yml

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+name: CI
+
+on:
+  pull_request:
+  push:
+    branches: [ main ]
+
+jobs:
+  ci:
+    name: CI
+    runs-on: ubuntu-latest
+    env:
+      CLICKHOUSE_HOST: localhost
+      CLICKHOUSE_PORT: "8123"
+    steps:
+      - uses: actions/checkout@v5
+        with:
+          submodules: recursive
+      - uses: astral-sh/setup-uv@v7
+        with:
+          enable-cache: true
+      - run: make lint
+      - run: make typecheck
+      - name: Start ClickHouse
+        run: docker compose up --detach --wait clickhouse
+      - run: make test
+      - name: Check Coverage
+        uses: 5monkeys/cobertura-action@v14
+        with:
+          minimum_coverage: 100
+          fail_below_threshold: true
+          show_missing: true

.gitignore

Lines changed: 8 additions & 0 deletions
@@ -13,3 +13,11 @@ wheels/
 .pytest_cache/
 .mypy_cache/
 .ruff_cache/
+
+# Coverage
+.coverage
+coverage.xml
+htmlcov/
+
+# Local secrets
+.env

.gitmodules

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+[submodule "engine-test-data"]
+	path = engine-test-data
+	url = https://github.com/Flagsmith/engine-test-data.git
+	branch = v3.7.0

.pre-commit-config.yaml

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+repos:
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.15.6
+    hooks:
+      - id: ruff-check
+        args: [--fix]
+      - id: ruff-format
+  - repo: https://github.com/astral-sh/uv-pre-commit
+    rev: 0.10.10
+    hooks:
+      - id: uv-lock
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v6.0.0
+    hooks:
+      - id: check-yaml
+      - id: check-json
+      - id: check-toml
+  - repo: https://github.com/Flagsmith/flagsmith-common
+    rev: v3.8.2
+    hooks:
+      - id: flagsmith-lint-tests
+  - repo: local
+    hooks:
+      - id: python-typecheck
+        name: python-typecheck
+        language: system
+        entry: make typecheck
+        require_serial: true
+        pass_filenames: false
+        types: [python]

.python-version

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+3.10

CODEOWNERS

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+* @flagsmith/flagsmith-back-end

Makefile

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+.PHONY: install-packages
+install-packages: ## Install all required packages
+	uv sync
+
+.PHONY: install-pre-commit
+install-pre-commit: ## Install pre-commit hooks
+	uv run prek install
+
+.PHONY: install
+install: install-packages install-pre-commit ## Ensure the environment is set up
+
+.PHONY: lint
+lint: ## Run linters (pre-commit hooks across the tree)
+	uv run prek run --all-files
+
+.PHONY: test
+test: ## Run unit tests. Override scope with opts, e.g. `make test opts='-m engine_parity'`
+	uv run pytest $(opts)
+
+.PHONY: typecheck
+typecheck: ## Run mypy
+	uv run mypy
+
+.PHONY: help
+help:
+	@echo "Usage: make [target]"
+	@echo ""
+	@echo "Available targets:"
+	@awk 'BEGIN {FS = ":.*?## "} /^[a-zA-Z_-]+:.*?## / {printf " \033[36m%-30s\033[0m %s\n", $$1, $$2}' $(MAKEFILE_LIST)

README.md

Lines changed: 135 additions & 1 deletion
@@ -1,3 +1,137 @@
 # flagsmith-sql-flag-engine
 
-Placeholder. The initial package scaffold lands via the first pull request.
+SQL translator for Flagsmith segment predicates.
+
+Where the Python and Rust `flag_engine` implementations evaluate
+`is_context_in_segment` against an in-memory `EvaluationContext`, this
+package takes a `SegmentContext` and emits a SQL `WHERE` expression that
+evaluates the segment against an entire `IDENTITIES` table — one row per
+identity, with the identity's full trait map held in a single column
+the translator path-extracts at query time. `PERCENTAGE_SPLIT` and
+`:semver`-marked comparators compile to inline pure-SQL.
+
+## Quickstart
+
+```python
+from flag_engine.context.types import EvaluationContext, SegmentContext
+
+from flagsmith_sql_flag_engine import TranslateContext, translate_segment
+from flagsmith_sql_flag_engine.dialects import ClickHouseDialect
+
+eval_context: EvaluationContext = {
+    "environment": {"key": "n9fbf9...3ngWhb", "name": "Production"},
+}
+ctx = TranslateContext(evaluation_context=eval_context, dialect=ClickHouseDialect())
+
+segment: SegmentContext = {
+    "key": "growth-cohort",
+    "name": "Growth cohort",
+    "rules": [
+        {
+            "type": "ALL",
+            "conditions": [
+                {"operator": "EQUAL", "property": "plan", "value": "growth"},
+            ],
+        },
+    ],
+}
+where_expr = translate_segment(segment, ctx)
+# where_expr is a SQL string. Drop into:
+# SELECT COUNT(*) FROM IDENTITIES i
+# WHERE i.environment_id = 'n9fbf9...3ngWhb' AND ({where_expr})
+```
+
+`environment_id` in the `IDENTITIES` table is a string column holding
+`EnvironmentContext.key` directly — the same identifier the engine uses,
+no separate integer PK.
+
+`translate_segment` returns `None` if the segment uses an operator the
+translator can't handle — typically a REGEX pattern the active dialect's
+regex flavour can't compile. Callers should fall back to
+`flag_engine.is_context_in_segment` for those segments.
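
The fallback described in the README paragraph above could be wired up roughly as follows. This is a minimal sketch, not code from the commit: `plan_evaluation` and `fake_translate` are hypothetical names used only for illustration, and a real caller would pass in a wrapper around `translate_segment` and fall back to `flag_engine.is_context_in_segment` when the result is `None`.

```python
from typing import Any, Callable, Mapping, Optional, Tuple

Segment = Mapping[str, Any]


def plan_evaluation(
    segment: Segment,
    translate: Callable[[Segment], Optional[str]],
) -> Tuple[str, Optional[str]]:
    """Decide whether a segment can be evaluated in SQL.

    Returns ("sql", where_expr) when the translator produced a predicate,
    or ("engine", None) when the caller should fall back to in-memory
    evaluation (for example via flag_engine.is_context_in_segment).
    """
    where_expr = translate(segment)
    if where_expr is None:
        return ("engine", None)
    return ("sql", where_expr)


# Hypothetical stand-in for translate_segment, used only to exercise the
# control flow: it refuses any segment that contains a REGEX condition.
def fake_translate(segment: Segment) -> Optional[str]:
    for rule in segment.get("rules", []):
        for condition in rule.get("conditions", []):
            if condition["operator"] == "REGEX":
                return None
    return "1 = 1"
```

Keeping the decision in one small function means the SQL path and the in-memory path can be tested separately, and segments that fall back can be counted or logged in one place.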
+
+## Schema
+
+Each dialect publishes the table layout it expects via a `schema_ddl`
+constant. For ClickHouse:
+
+```sql
+CREATE TABLE IF NOT EXISTS IDENTITIES (
+    environment_id String,
+    id UInt64,
+    identifier String,
+    identity_key String,
+    traits JSON
+)
+ENGINE = MergeTree()
+ORDER BY (environment_id, id);
+```
+
+Traits live in a single `JSON` column (CH 24+, GA in 25.x). Each key is
+stored as a typed subcolumn, so trait reads are direct columnar scans
+rather than per-row JSON parses. Trait keys are *data* — new keys appear
+without schema changes — and the translator only sees the abstract path
+extraction.
+
+ClickHouse Cloud requires `SET allow_experimental_json_type = 1` when
+creating a `JSON`-column table (the type is GA on OSS 25.x); the test
+harness applies this setting automatically.
+
+Programmatic access:
+
+```python
+from flagsmith_sql_flag_engine.dialects.clickhouse import SCHEMA_DDL
+```
+
+## Engine parity
+
+Validated against [Flagsmith/engine-test-data](https://github.com/Flagsmith/engine-test-data),
+the test suite every engine implementation is checked against. The
+engine-parity suite loads each test case's identity into a per-dialect
+scratch table, translates the case's segments, runs the generated SQL,
+and compares to `flag_engine.is_context_in_segment`.
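
The comparison step of the parity suite can be pictured with a small sketch. The names below (`Case`, `find_parity_mismatches`, and the two callables) are hypothetical and are not taken from the repository's harness; the sketch only illustrates the idea of asking both evaluators the same question and collecting disagreements.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed case shape for illustration: (identity key, segment key).
Case = Tuple[str, str]


def find_parity_mismatches(
    cases: Iterable[Case],
    sql_matches: Callable[[str, str], bool],
    engine_matches: Callable[[str, str], bool],
) -> List[Case]:
    """Return every case where the SQL result and the engine result differ.

    sql_matches would run the translated WHERE expression against the
    scratch table; engine_matches would call the in-memory engine.
    Both are injected so the comparison logic stays dialect-agnostic.
    """
    return [
        case
        for case in cases
        if sql_matches(*case) != engine_matches(*case)
    ]
```

An empty result means the translated SQL agreed with the engine on every case that was checked.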
+
+To run the engine-parity suite locally:
+
+```bash
+git submodule update --init # pull engine-test-data
+docker compose up --detach --wait clickhouse
+uv run pytest tests/test_engine.py
+```
+
+Adding a new dialect's parity coverage is one harness module — see
+`tests/harnesses/` for the shape.
+
+## Dialects
+
+The translator is dialect-aware: a `Dialect` protocol abstracts the
+SQL fragments that differ across SQL engines — MD5 hex, hex-to-int
+parsing, prefix-anchored regex, padded-version comparison, type-aware
+trait predicates, regex flavour. Today `ClickHouseDialect` is the only
+implementation; adding another engine such as Snowflake, DuckDB or
+Postgres means writing one class.
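
The README names "padded-version comparison" without showing it. One common way to compare versions with plain string operators is to left-pad each numeric component to a fixed width, so that lexicographic order matches numeric order. The sketch below illustrates that idea in Python; the function name, the width of 10, and the handling of prerelease and build suffixes are assumptions, and the dialect's actual SQL may differ.

```python
def padded_version_key(version: str, width: int = 10) -> str:
    """Build a string key whose lexicographic order follows version order.

    Only major.minor.patch are considered; anything after a "-" or "+"
    (prerelease or build metadata) is dropped, and missing components
    are treated as 0.
    """
    core = version.split("-", 1)[0].split("+", 1)[0]
    parts = core.split(".")[:3]
    parts += ["0"] * (3 - len(parts))
    # Zero-padding makes "9" sort before "10" once both are the same width.
    return ".".join(part.zfill(width) for part in parts)
```

For example, the raw strings `"1.10.0"` and `"1.9.3"` compare in the wrong order, while their padded keys compare correctly, which is why a SQL engine can then use ordinary `<` and `>` on the padded form.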
+
+## Operator coverage
+
+| Operator | Translatable | Notes |
+| -------------------------------------------- | :----------: | -------------------------------------------------------------- |
+| `EQUAL`, `NOT_EQUAL`, `IN` | yes | |
+| `IS_SET`, `IS_NOT_SET` | yes | trait subcolumn `IS NOT NULL` / `IS NULL` |
+| `CONTAINS`, `NOT_CONTAINS` | yes | |
+| `GREATER_THAN`, `LESS_THAN` plus `_INCLUSIVE`| yes | |
+| `MODULO` | yes | |
+| `PERCENTAGE_SPLIT` | yes | inlined MD5-mod-9999; ~0.005% diverge on hash==9998 |
+| `REGEX` | partial | dialect-flavour gated; unsupported patterns → caller fallback |
+| `:semver`-marked comparators | yes | major.minor.patch only; ignores prerelease |
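
To make the `PERCENTAGE_SPLIT` note concrete, here is a sketch of an MD5-mod-9999 bucketing function. It follows my understanding of how the reference engine hashes, which is an assumption and is not confirmed by this commit: join the object ids with commas, take the MD5 hex digest as an integer, reduce it mod 9999, scale to a percentage, and re-hash when the result is exactly 100. The name `hashed_percentage` is hypothetical.

```python
import hashlib
from typing import Sequence


def hashed_percentage(object_ids: Sequence[str], iterations: int = 1) -> float:
    """Map a list of ids to a stable percentage in the range [0, 100)."""
    # Repeating the id list on each retry changes the hashed string.
    to_hash = ",".join(str(i) for i in list(object_ids) * iterations)
    digest_int = int(hashlib.md5(to_hash.encode("utf-8")).hexdigest(), 16)
    value = (digest_int % 9999) / 9998 * 100
    if value == 100:
        # Bucket 9998 maps to exactly 100; hash again with one more
        # repetition of the ids so the result stays below 100.
        return hashed_percentage(object_ids, iterations + 1)
    return value
```

Under this reading, a single inlined SQL expression has no natural place for the re-hash loop, which is presumably the edge case the table's `hash==9998` note refers to.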
+
+## Development
+
+```bash
+make install # uv sync + pre-commit install
+make lint # run pre-commit hooks across the tree
+make typecheck # mypy
+make test # unit tests
+```
+
+Ruff (lint + format) runs as a pre-commit hook on every commit. Mypy
+runs as a `make typecheck` hook on staged Python files.

docker-compose.yml

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+services:
+  clickhouse:
+    image: clickhouse/clickhouse-server:25.5.6
+    environment:
+      # Skip the random-password bootstrap. The container is only ever
+      # reachable from the harness on the same compose network / host
+      # loopback, so the default `default` user with no password is fine.
+      CLICKHOUSE_SKIP_USER_SETUP: "1"
+    ports:
+      - "8123:8123"
+    ulimits:
+      nofile:
+        soft: 262144
+        hard: 262144
+    healthcheck:
+      test: ["CMD", "wget", "--spider", "-q", "http://localhost:8123/ping"]
+      interval: 2s
+      timeout: 2s
+      retries: 15

engine-test-data

Submodule engine-test-data added at 4b29dc7
