Commit 755a94b

refactor(v3): finish ore_block_256 rename + sync codegen goldens
Complete the eql_v3.ore_block_u64_8_256 -> ore_block_256 rename across the docs (CLAUDE.md, the scalar guide, eql-functions, sql-support; eql_v2 names left unchanged) and the v3 SEM file header — the @file/subset comment had over-applied the rename to the v2-origin path (src/ore_block_u64_8_256).

Regenerate the codegen reference goldens for every catalog type after rebasing onto eql_v3: add the numeric reference dir and expand timestamptz to its now-ordered shape so the parity gate passes.

Also address review feedback:
- ScalarKind::rust_type returns the now-real numeric type (rust_decimal::Decimal); only jsonb remains surfaceless. De-stale its doc and the 'timestamptz is equality-only' test comment; add numeric_maps_to_decimal.
- Correct the guide's stale 'timestamptz is equality-only' prose and a dangling link to the removed design plan doc.
- Add comparator_rejects_mismatched_block_widths (8-block vs 14-block terms must raise via the different-lengths guard).
- Add the PR link (#276) to the numeric and N-block changelog entries.
1 parent 2d3fa18 commit 755a94b

35 files changed

Lines changed: 4309 additions & 74 deletions

CHANGELOG.md

Lines changed: 2 additions & 2 deletions
@@ -27,7 +27,7 @@ Each entry that ships in a published release links to the PR that introduced it.
- **`eql_v3.int8` encrypted-domain type family.** Four jsonb-backed domains for encrypted `int8` columns — `eql_v3.int8` (storage-only), `eql_v3.int8_eq` (`=` / `<>` via HMAC), and `eql_v3.int8_ord` / `eql_v3.int8_ord_ore` (also `<` `<=` `>` `>=` via ORE block terms, with `MIN` / `MAX` aggregates) — generated from the `int8` row in `eql-scalars::CATALOG` by the same materializer as the `eql_v3.int4` reference. Index via a functional index on the `eql_v3.eq_term` / `eql_v3.ord_term` extractors, not an operator class on the domain. Why: a type-safe, per-capability encrypted `bigint` column, extending the scalar generator across the full 64-bit integer width. ([#253](https://github.com/cipherstash/encrypt-query-language/pull/253))
- **`eql_v3.date` encrypted-domain type family.** Four jsonb-backed domains for encrypted `date` columns — `eql_v3.date` (storage-only), `eql_v3.date_eq` (`=` / `<>` via HMAC), and `eql_v3.date_ord` / `eql_v3.date_ord_ore` (also `<` `<=` `>` `>=` via ORE block terms, with `MIN` / `MAX` aggregates) — generated from the `date` row in `eql-scalars::CATALOG` by the same materializer as the `eql_v3.int4` reference. Plaintexts encrypt under the `date` cast and compare via the same ORE block terms as the integer scalars (ORE is plaintext-agnostic — dates order like integers). Index via a functional index on the `eql_v3.eq_term` / `eql_v3.ord_term` extractors, not an operator class on the domain. Why: the first **non-integer ordered** scalar encrypted-domain type — a type-safe, per-capability encrypted `date` column — proving the generator and SQLx test matrix generalize beyond fixed-width integers. ([#256](https://github.com/cipherstash/encrypt-query-language/pull/256))
- **`eql_v3.timestamptz` encrypted-domain type family (ordered).** Four jsonb-backed domains for encrypted `timestamptz` columns — `eql_v3.timestamptz` (storage-only), `eql_v3.timestamptz_eq` (`=` / `<>` via HMAC), and `eql_v3.timestamptz_ord` / `eql_v3.timestamptz_ord_ore` (also `<` `<=` `>` `>=`, `MIN` / `MAX` via 12-block ORE) — generated from the `timestamptz` row in `eql-scalars::CATALOG` by the same materializer as the `eql_v3.date` family. Values are **UTC-normalized** (cipherstash has no timezone-preserving type): plaintexts encrypt under the `timestamp` cast. Ordering works because the `eql_v3` ORE block comparator now derives its block count from the ciphertext width (see the comparator entry below) instead of assuming 8. Index via a functional index on the `eql_v3.eq_term` / `eql_v3.ord_term` extractors, not an operator class on the domain. ([#257](https://github.com/cipherstash/encrypt-query-language/pull/257))
-- **`eql_v3.numeric` encrypted-domain type family (ordered).** Four jsonb-backed domains for encrypted `numeric` / `decimal` columns — `eql_v3.numeric` (storage-only), `eql_v3.numeric_eq` (`=` / `<>` via HMAC), and `eql_v3.numeric_ord` / `eql_v3.numeric_ord_ore` (also `<` `<=` `>` `>=`, `MIN` / `MAX` via 14-block ORE) — generated from the `numeric` row in `eql-scalars::CATALOG`. cipherstash encrypts `Plaintext::Decimal` at native 14-block ORE width; ordering matches `rust_decimal::Decimal` ordering exactly (equivalent scales such as `1` and `1.0` collide, like Postgres `numeric`). Index via a functional index on the `eql_v3.eq_term` / `eql_v3.ord_term` extractors. Why: a type-safe, ordered encrypted decimal column, the first scalar to exercise an ORE term wider than 8 blocks. ([#241](https://github.com/cipherstash/encrypt-query-language/issues/241))
+- **`eql_v3.numeric` encrypted-domain type family (ordered).** Four jsonb-backed domains for encrypted `numeric` / `decimal` columns — `eql_v3.numeric` (storage-only), `eql_v3.numeric_eq` (`=` / `<>` via HMAC), and `eql_v3.numeric_ord` / `eql_v3.numeric_ord_ore` (also `<` `<=` `>` `>=`, `MIN` / `MAX` via 14-block ORE) — generated from the `numeric` row in `eql-scalars::CATALOG`. cipherstash encrypts `Plaintext::Decimal` at native 14-block ORE width; ordering matches `rust_decimal::Decimal` ordering exactly (equivalent scales such as `1` and `1.0` collide, like Postgres `numeric`). Index via a functional index on the `eql_v3.eq_term` / `eql_v3.ord_term` extractors. Why: a type-safe, ordered encrypted decimal column, the first scalar to exercise an ORE term wider than 8 blocks. ([#241](https://github.com/cipherstash/encrypt-query-language/issues/241), [#276](https://github.com/cipherstash/encrypt-query-language/pull/276))
- **Per-domain `MIN` / `MAX` aggregates for the encrypted-domain family.** `eql_v3.min(eql_v3.<T>_ord)` / `eql_v3.max(eql_v3.<T>_ord)` (and the `_ord_ore` twin) are generated for every ord-capable scalar variant, giving type-safe extrema on domain-typed columns — comparison routes through the variant's `<` / `>` operator (ORE block term, no decryption). The aggregates are declared `PARALLEL = SAFE` with a combine function (the state function itself — min/max are associative), so PostgreSQL can use partial/parallel aggregation on large `GROUP BY` workloads. Why: the new domain types previously had no equivalent of the composite-type aggregates. The existing `eql_v2.min(eql_v2_encrypted)` / `eql_v2.max(eql_v2_encrypted)` aggregates are **retained** and continue to work on `eql_v2_encrypted` columns; the per-domain aggregates are additive and coexist with them. ([#239](https://github.com/cipherstash/encrypt-query-language/pull/239))
- **`eql_v3.text` encrypted-domain family (`text`, `text_eq`, `text_match`, `text_ord`, `text_ord_ore`).** Adds equality (`=` / `<>` via HMAC), match (`@>` / `<@` via a new self-contained `eql_v3.bloom_filter` SEM index term), and ORE ordering (`<` `<=` `>` `>=`, `min` / `max`) for encrypted text, at parity with EQL v2 text — generated from the `text` row in `eql-scalars::CATALOG` by the same materializer as the `eql_v3.int4` reference. `text` is the first scalar to add a new index `Term` (`Bloom`) and the first non-integer, unbounded ordered kind (lexicographic pivots, hand-written `impl ScalarType`). Index via a functional index on the `eql_v3.eq_term` / `eql_v3.ord_term` / `eql_v3.match_term` extractors, not an operator class on the domain. Why: brings searchable encrypted text to the namespaced, `eql_v2`-free `eql_v3` surface. Match is exposed as bloom-filter containment on the `text_match` domain — deliberately *not* SQL `LIKE` (no wildcard/anchoring; probabilistic ngram containment) — and never backs equality (which always routes through `Hm`). ([#260](https://github.com/cipherstash/encrypt-query-language/pull/260))
- **Self-contained `eql_v3` schema + standalone `release/cipherstash-encrypt-v3.sql` installer.** The `eql_v3` encrypted-domain surface no longer depends on `eql_v2` at runtime: it now owns its own copies of the searchable-encrypted-metadata (SEM) index-term types — `eql_v3.hmac_256` and `eql_v3.ore_block_256` (with its btree operator class) — so the `eql_v3.eq_term` / `eql_v3.ord_term` extractors return `eql_v3` types and no `eql_v2.<symbol>` appears anywhere in the v3 SQL. The whole v3 surface relocated under a single `src/v3/` tree (`src/v3/sem/` for the hand-written SEM types, `src/v3/scalars/` for the generated domain families). A new build variant ships the `eql_v3` schema on its own as `release/cipherstash-encrypt-v3.sql`, installable into a database with no `eql_v2` present; a CI gate greps that artifact and its dependency closure to keep it `eql_v2`-free. Why: a clean foundation for the per-scalar encrypted-domain model to stand alone, ahead of it replacing the `eql_v2_encrypted` composite column type. This is additive — a new schema and a new artifact — and leaves `eql_v2` byte-for-byte unchanged. ([#255](https://github.com/cipherstash/encrypt-query-language/pull/255))
@@ -38,7 +38,7 @@ Each entry that ships in a published release links to the PR that introduced it.

### Fixed

-- **The `eql_v3` ORE block comparator now orders ciphertexts of any block count, not just 8.** `eql_v3.compare_ore_block_256_term` derives the block count `N` from the term length (`octet_length = 49·N + 16`) instead of hardcoding 8, so encrypted types whose native ORE width exceeds 8 blocks — `numeric` (14) and `timestamptz` (12) — order, range-query, `ORDER BY`, and `MIN`/`MAX` correctly instead of silently mis-ordering. Malformed terms (length not `49·N + 16` for `N ≥ 1`) now raise instead of returning a bogus comparison. The self-contained `eql_v3` SEM type was renamed `eql_v3.ore_block_u64_8_256 → eql_v3.ore_block_256` to reflect that it is width-agnostic (the `eql_v2` type is unchanged). No effect on existing 8-block types (a no-op for `N = 8`). ([#241](https://github.com/cipherstash/encrypt-query-language/issues/241))
+- **The `eql_v3` ORE block comparator now orders ciphertexts of any block count, not just 8.** `eql_v3.compare_ore_block_256_term` derives the block count `N` from the term length (`octet_length = 49·N + 16`) instead of hardcoding 8, so encrypted types whose native ORE width exceeds 8 blocks — `numeric` (14) and `timestamptz` (12) — order, range-query, `ORDER BY`, and `MIN`/`MAX` correctly instead of silently mis-ordering. Malformed terms (length not `49·N + 16` for `N ≥ 1`) now raise instead of returning a bogus comparison. The self-contained `eql_v3` SEM type was renamed `eql_v3.ore_block_u64_8_256 → eql_v3.ore_block_256` to reflect that it is width-agnostic (the `eql_v2` type is unchanged). No effect on existing 8-block types (a no-op for `N = 8`). ([#241](https://github.com/cipherstash/encrypt-query-language/issues/241), [#276](https://github.com/cipherstash/encrypt-query-language/pull/276))

## [2.3.1] — 2026-05-21
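
The length rule in the comparator changelog entry above (`octet_length = 49·N + 16`, malformed lengths raise, mismatched widths raise) can be sketched in Rust. This is only an illustration under stated assumptions, not the project's SQL implementation: `block_count`, `check_comparable`, `TermError`, and both constant names are hypothetical, and the order in which the two guards run is assumed.

```rust
/// Hypothetical error type for this sketch (not from the repository).
#[derive(Debug, PartialEq, Eq)]
pub enum TermError {
    /// Length is not 49*N + 16 for any N >= 1.
    Malformed { len: usize },
    /// The two terms have different byte lengths (e.g. 8-block vs 14-block).
    DifferentLengths { left: usize, right: usize },
}

/// Bytes contributed per block, taken from the formula in the changelog.
const PER_BLOCK_BYTES: usize = 49;
/// Fixed bytes in the formula; the source does not say what they hold.
const FIXED_BYTES: usize = 16;

/// Derive the block count N from a term's byte length instead of assuming 8.
pub fn block_count(len: usize) -> Result<usize, TermError> {
    // Reject anything shorter than one block, then anything that is not an
    // exact multiple of the per-block size after the fixed part is removed.
    if len < PER_BLOCK_BYTES + FIXED_BYTES || (len - FIXED_BYTES) % PER_BLOCK_BYTES != 0 {
        return Err(TermError::Malformed { len });
    }
    Ok((len - FIXED_BYTES) / PER_BLOCK_BYTES)
}

/// Guard run before comparing two terms: the lengths must match, and the
/// shared length must be well formed. Returns the common block count.
pub fn check_comparable(left_len: usize, right_len: usize) -> Result<usize, TermError> {
    if left_len != right_len {
        return Err(TermError::DifferentLengths { left: left_len, right: right_len });
    }
    block_count(left_len)
}
```

Under this formula an 8-block term is 408 bytes, a 12-block `timestamptz` term is 604 bytes, and a 14-block `numeric` term is 702 bytes, so comparing a 408-byte term with a 702-byte term fails the length guard rather than producing an ordering.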

CLAUDE.md

Lines changed: 3 additions & 3 deletions
@@ -53,7 +53,7 @@ This project uses `mise` for task management. Common commands:
This is the **Encrypt Query Language (EQL)** - a PostgreSQL extension for searchable encryption. Key architectural components:

### Core Structure
-- **Schema**: Core EQL functions/types are in the `eql_v2` PostgreSQL schema. The encrypted-domain type families (`int4` and future scalar domains) live in a separate `eql_v3` schema (see below). The `eql_v3` surface is **self-contained**: it owns its own copies of the searchable-encrypted-metadata (SEM) index-term types (`eql_v3.hmac_256`, `eql_v3.ore_block_u64_8_256`, hand-written under `src/v3/sem/`) and has no runtime dependency on `eql_v2`. `eql_v2` is unchanged and remains the documented public API.
+- **Schema**: Core EQL functions/types are in the `eql_v2` PostgreSQL schema. The encrypted-domain type families (`int4` and future scalar domains) live in a separate `eql_v3` schema (see below). The `eql_v3` surface is **self-contained**: it owns its own copies of the searchable-encrypted-metadata (SEM) index-term types (`eql_v3.hmac_256`, `eql_v3.ore_block_256`, hand-written under `src/v3/sem/`) and has no runtime dependency on `eql_v2`. `eql_v2` is unchanged and remains the documented public API.
- **Main Type**: `eql_v2_encrypted` - composite type for encrypted columns (stored as JSONB)
- **Configuration**: `eql_v2_configuration` table tracks encryption configs
- **Index Types**: Various encrypted index types (blake3, hmac_256, bloom_filter, ore variants)
@@ -64,7 +64,7 @@ This is the **Encrypt Query Language (EQL)** - a PostgreSQL extension for search
- `src/operators/` - SQL operators for encrypted data comparisons
- `src/config/` - Configuration management functions
- `src/blake3/`, `src/hmac_256/`, `src/bloom_filter/`, `src/ore_*` - Index implementations
-- `src/v3/` - Self-contained `eql_v3` surface: `src/v3/schema.sql`, forked `src/v3/crypto.sql` / `src/v3/common.sql`, hand-written SEM index-term types under `src/v3/sem/` (`hmac_256`, `ore_block_u64_8_256`), and the generated scalar encrypted-domain families under `src/v3/scalars/<T>/` (plus the shared blocker `src/v3/scalars/functions.sql`)
+- `src/v3/` - Self-contained `eql_v3` surface: `src/v3/schema.sql`, forked `src/v3/crypto.sql` / `src/v3/common.sql`, hand-written SEM index-term types under `src/v3/sem/` (`hmac_256`, `ore_block_256`), and the generated scalar encrypted-domain families under `src/v3/scalars/<T>/` (plus the shared blocker `src/v3/scalars/functions.sql`)
- `tasks/` - mise task scripts
- `tests/sqlx/` - Rust/SQLx test framework (PostgreSQL 14-17 support)
- `release/` - Generated SQL installation files
@@ -78,7 +78,7 @@ This is the **Encrypt Query Language (EQL)** - a PostgreSQL extension for search

### Encrypted-Domain Types

-`src/v3/scalars/` holds the generated **encrypted-domain type families** — jsonb-backed PostgreSQL domains in the **`eql_v3` schema**, one domain per operator/index capability (`eql_v3.<T>` storage-only, `eql_v3.<T>_eq`, `eql_v3.<T>_ord`). The schema qualifier replaces the old version-prefixed name, so the domains are `eql_v3.int4`, `eql_v3.int4_eq`, `eql_v3.int4_ord`, `eql_v3.int4_ord_ore` — created in `eql_v3`, not `public`. Their extractors/wrappers/aggregates (`eql_v3.eq_term`, `eql_v3.ord_term`, `eql_v3.eq`/`lt`/…, `eql_v3.min`/`max`) also live in `eql_v3`, and the SEM index-term types they return and construct (`eql_v3.hmac_256`, `eql_v3.ore_block_u64_8_256`) are **also `eql_v3`** — hand-written under `src/v3/sem/` so the whole v3 surface is self-contained (no `eql_v2.<symbol>` appears anywhere in v3 SQL; CI gates this via `mise run test:self_contained_v3` and the standalone `release/cipherstash-encrypt-v3.sql` installer). `eql_v3.int4` (PR #239, supersedes #225) is the reference scalar implementation; future scalar types such as `int8`, `bool`, `date`, `float`, `numeric`, `timestamp`, `text`, and `jsonb` follow this materializer pattern. `text`, `numeric`, and `jsonb` are planned but have no generated SQL surface yet — `jsonb` in particular needs a separate SQL design beyond the ordered-scalar materializer. The `eql-scalars` fixture catalog (`crates/eql-scalars`) already models their fixture values ahead of the SQL surface.
+`src/v3/scalars/` holds the generated **encrypted-domain type families** — jsonb-backed PostgreSQL domains in the **`eql_v3` schema**, one domain per operator/index capability (`eql_v3.<T>` storage-only, `eql_v3.<T>_eq`, `eql_v3.<T>_ord`). The schema qualifier replaces the old version-prefixed name, so the domains are `eql_v3.int4`, `eql_v3.int4_eq`, `eql_v3.int4_ord`, `eql_v3.int4_ord_ore` — created in `eql_v3`, not `public`. Their extractors/wrappers/aggregates (`eql_v3.eq_term`, `eql_v3.ord_term`, `eql_v3.eq`/`lt`/…, `eql_v3.min`/`max`) also live in `eql_v3`, and the SEM index-term types they return and construct (`eql_v3.hmac_256`, `eql_v3.ore_block_256`) are **also `eql_v3`** — hand-written under `src/v3/sem/` so the whole v3 surface is self-contained (no `eql_v2.<symbol>` appears anywhere in v3 SQL; CI gates this via `mise run test:self_contained_v3` and the standalone `release/cipherstash-encrypt-v3.sql` installer). `eql_v3.int4` (PR #239, supersedes #225) is the reference scalar implementation; future scalar types such as `int8`, `bool`, `date`, `float`, `numeric`, `timestamp`, `text`, and `jsonb` follow this materializer pattern. `text`, `numeric`, and `jsonb` are planned but have no generated SQL surface yet — `jsonb` in particular needs a separate SQL design beyond the ordered-scalar materializer. The `eql-scalars` fixture catalog (`crates/eql-scalars`) already models their fixture values ahead of the SQL surface.

Adding a scalar encrypted-domain type is one row in the Rust catalog `eql-scalars::CATALOG` (`crates/eql-scalars/src/lib.rs`): a `ScalarSpec` giving the type `token` (e.g. `int8`), its `ScalarKind` (the `kind` field), the `DomainSpec`s mapping each generated domain suffix to its fixed index `Term`s (`_eq => [Hm]`, `_ord`/`_ord_ore => [Ore]`), and the `Fixture` value list. Term capabilities are fixed in the `Term` enum's `impl` methods (with unit tests): `Hm` provides equality, and `Ore` provides equality plus ordering. There is no TOML manifest and no Python — the catalog is the source of truth, validated by the compiler (an undefined term or unknown scalar is a compile error) plus catalog `#[test]`s. `mise run build` runs `cargo run -p eql-codegen`, which regenerates the scalar SQL surface into `src/v3/scalars/<T>/` from `CATALOG` at the start of every build; that surface includes supported comparison wrappers plus blockers for native `jsonb` operators that would otherwise be reachable through domain fallback. `cargo run -p eql-codegen` regenerates every type at once (the same call `mise run build` uses; there is no per-type codegen task). The generated `*_types.sql` / `*_functions.sql` / `*_operators.sql` / `*_aggregates.sql` files are gitignored and never committed. The per-type plaintext fixture lists the SQLx matrix consumes are **not** a generated file — they are materialised from each `CATALOG` row at compile time as `eql_scalars::INT4_VALUES` / `INT2_VALUES` (the `int_values!` macro) and read directly by `ScalarType::FIXTURE_VALUES`; a Rust source of truth no longer round-trips through a committed generated `.rs`. Generated SQL carries a `-- AUTOMATICALLY GENERATED FILE` header (the project-wide marker `docs:validate` greps on); change the catalog and rebuild, never hand-edit. Hand-written SQL beyond the fixed surface goes in `src/v3/scalars/<T>/<T>_extensions.sql` with no auto-generated header and explicit `-- REQUIRE:` edges — that file IS committed. 
`text` and `jsonb` are out of scope for this scalar materializer.
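
The catalog shape described in the CLAUDE.md paragraph above (a `ScalarSpec` with a `token`, `DomainSpec`s mapping each suffix to fixed `Term`s, and capabilities fixed in `impl Term`) might look roughly like the sketch below. It is a simplified stand-in, not the real `eql-scalars` definitions: only the `token` field name and the `Hm` / `Ore` terms come from the text, while the remaining field names, the method names, and the `INT8_SKETCH` constant are hypothetical, and the `kind` and fixture fields are omitted.

```rust
/// Index terms named in the guide; the real enum may carry more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Term {
    Hm,
    Ore,
}

impl Term {
    /// Both terms provide equality (`Hm` via HMAC, `Ore` via block terms).
    pub const fn supports_equality(self) -> bool {
        matches!(self, Term::Hm | Term::Ore)
    }

    /// Only `Ore` additionally provides ordering.
    pub const fn supports_ordering(self) -> bool {
        matches!(self, Term::Ore)
    }
}

/// One generated domain: a name suffix plus its fixed index terms.
/// Field names here are assumptions made for the sketch.
pub struct DomainSpec {
    pub suffix: &'static str,
    pub terms: &'static [Term],
}

/// One catalog row (the `kind` and fixture fields are left out).
pub struct ScalarSpec {
    pub token: &'static str,
    pub domains: &'static [DomainSpec],
}

/// Hypothetical row mirroring `_eq => [Hm]`, `_ord`/`_ord_ore => [Ore]`.
pub const INT8_SKETCH: ScalarSpec = ScalarSpec {
    token: "int8",
    domains: &[
        DomainSpec { suffix: "_eq", terms: &[Term::Hm] },
        DomainSpec { suffix: "_ord", terms: &[Term::Ore] },
        DomainSpec { suffix: "_ord_ore", terms: &[Term::Ore] },
    ],
};

/// A domain can expose `<` / `>` only if one of its terms supports ordering.
pub fn domain_supports_ordering(domain: &DomainSpec) -> bool {
    domain.terms.iter().any(|term| term.supports_ordering())
}
```

Keeping the rows as `const` data means a misspelled term or scalar fails at compile time, which matches the guide's point that the compiler validates the catalog.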

crates/eql-scalars/src/kind.rs

Lines changed: 8 additions & 7 deletions
@@ -97,11 +97,11 @@ impl ScalarKind {
     }
 
     /// A debug/identifier string for the kind: the canonical Rust plaintext type
-    /// name (`"i32"`, `"chrono::NaiveDate"`). `Numeric`/`Jsonb` have **no
-    /// generated SQL surface** and no catalog row, so calling this on them is a
-    /// programming error and panics loudly rather than returning a plausible SQL
-    /// token a premature caller might feed into codegen. Only call site today is
-    /// `crates/eql-scalars/src/tests.rs`.
+    /// name (`"i32"`, `"chrono::NaiveDate"`, `"rust_decimal::Decimal"`). `Jsonb`
+    /// has **no generated SQL surface** and no catalog row, so calling this on it
+    /// is a programming error and panics loudly rather than returning a plausible
+    /// SQL token a premature caller might feed into codegen. Only call site today
+    /// is `crates/eql-scalars/src/tests.rs`.
     pub const fn rust_type(self) -> &'static str {
         match self {
             ScalarKind::I16 => "i16",
@@ -110,8 +110,9 @@ impl ScalarKind {
             ScalarKind::Text => "text",
             ScalarKind::Date => "chrono::NaiveDate",
             ScalarKind::Timestamptz => "chrono::DateTime<Utc>",
-            ScalarKind::Numeric | ScalarKind::Jsonb => {
-                panic!("ScalarKind::rust_type: numeric/jsonb have no generated surface yet")
+            ScalarKind::Numeric => "rust_decimal::Decimal",
+            ScalarKind::Jsonb => {
+                panic!("ScalarKind::rust_type: jsonb has no generated surface yet")
             }
         }
     }
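
The behaviour after this hunk can be reproduced in isolation with a reduced enum. The sketch below is an assumption-laden stand-in: it keeps only some of the variants visible in the diff, and the real `ScalarKind` in `crates/eql-scalars/src/kind.rs` has more variants and methods.

```rust
/// Reduced stand-in for `ScalarKind` (subset of variants, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScalarKind {
    I16,
    Date,
    Timestamptz,
    Numeric,
    Jsonb,
}

impl ScalarKind {
    /// Canonical Rust plaintext type name for the kind. `Jsonb` still has no
    /// generated surface, so it panics instead of returning a plausible token.
    pub const fn rust_type(self) -> &'static str {
        match self {
            ScalarKind::I16 => "i16",
            ScalarKind::Date => "chrono::NaiveDate",
            ScalarKind::Timestamptz => "chrono::DateTime<Utc>",
            ScalarKind::Numeric => "rust_decimal::Decimal",
            ScalarKind::Jsonb => {
                panic!("ScalarKind::rust_type: jsonb has no generated surface yet")
            }
        }
    }
}
```

A test in the spirit of the `numeric_maps_to_decimal` test named in the commit message would presumably assert `ScalarKind::Numeric.rust_type() == "rust_decimal::Decimal"`; the exact body of the real test is not shown in this diff.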
