Commit 3dff874

Merge pull request #10 from utopia-php/feat/clickhouse-schema-extras-2
feat(clickhouse): add UUID, Decimal, Array/Tuple, UInt8/Int8, raw ORDER BY, rawColumn passthrough
2 parents fb0b086 + ebcdaba commit 3dff874

22 files changed

Lines changed: 985 additions & 29 deletions

README.md

Lines changed: 100 additions & 3 deletions
@@ -1849,9 +1849,11 @@ $result = $schema->table('users')
18491849
->createIfNotExists();
18501850
```
18511851

1852-
Available column types: `id`, `string`, `text`, `mediumText`, `longText`, `integer`, `bigInteger`, `serial`, `bigSerial`, `smallSerial`, `float`, `boolean`, `datetime`, `timestamp`, `json`, `binary`, `enum`, `point`, `linestring`, `polygon`, `vector` (PostgreSQL only), `timestamps`.
1852+
Available column types: `id`, `uuid`, `string`, `text`, `mediumText`, `longText`, `tinyInteger`, `smallInteger`, `integer`, `bigInteger`, `serial`, `bigSerial`, `smallSerial`, `float`, `decimal`, `boolean`, `datetime`, `timestamp`, `json`, `binary`, `enum`, `point`, `linestring`, `polygon`, `vector` (PostgreSQL only), `timestamps`.
18531853

1854-
Column modifiers: `nullable()`, `default($value)`, `unsigned()`, `unique()`, `primary()`, `autoIncrement()`, `after($column)`, `comment($text)`, `collation($collation)`, `check($expression)`, `generatedAs($expression)` + `stored()` / `virtual()`, `ttl($expression)` (ClickHouse), `userType($name)` (PostgreSQL).
1854+
Column modifiers: `nullable()`, `default($value)`, `defaultRaw($expression)`, `unsigned()`, `unique()`, `primary()`, `autoIncrement()`, `after($column)`, `comment($text)`, `collation($collation)`, `check($expression)`, `generatedAs($expression)` + `stored()` / `virtual()`, `ttl($expression)` (ClickHouse), `userType($name)` (PostgreSQL).
1855+
1856+
**Raw default expressions** — use `defaultRaw($expression)` for dialect-specific server-generated defaults that `default()` would otherwise quote as a string literal (`now()`, `CURRENT_TIMESTAMP`, `gen_random_uuid()`, `generateUUIDv4()`, `UUID()`, …). The expression is emitted verbatim and must come from a trusted source; it must not be empty or contain a semicolon. Takes precedence over `default()` when both are set.
18551857

18561858
**SERIAL types** — auto-incrementing integers. PostgreSQL emits native `SERIAL` / `BIGSERIAL` / `SMALLSERIAL`; MySQL/MariaDB compile to `INT AUTO_INCREMENT` / `BIGINT AUTO_INCREMENT` / `SMALLINT AUTO_INCREMENT`; SQLite maps to `INTEGER`. ClickHouse and MongoDB throw `UnsupportedException`:
18571859

@@ -2270,7 +2272,102 @@ $schema->table('events')
22702272

22712273
The expression is emitted verbatim and must not be empty or contain a semicolon. `SAMPLE BY` only applies to engines that take an `ORDER BY` clause (the MergeTree family); using it with `Memory`, `Log`, `TinyLog`, or `StripeLog` throws `UnsupportedException`. The `sampleBy()` method is only available on the ClickHouse builder.
22722274

2273-
These OLAP-shaped modifiers live on the ClickHouse-specific `Column\ClickHouse` and `Table\ClickHouse` builders. Because the methods only exist on the dialect's own builder subclasses, calling `->lowCardinality()` or `->sampleBy()` on a `MySQL`, `PostgreSQL`, `SQLite`, or `MongoDB` builder fails at the type level, with no runtime branch needed.
2275+
**`UInt8` / `Int8` via `tinyInteger()` and `UInt16` / `Int16` via `smallInteger()`**: small integer columns suit bounded enumerations, percentage values, scroll depth, and similar fields whose value range fits well below 32 bits. A `UInt8` takes one byte per value instead of the four bytes of the default `UInt32` produced by `integer()->unsigned()`, a 75% reduction in uncompressed size:
2276+
2277+
```php
2278+
$schema->table('events')
2279+
->bigInteger('id')->primary()
2280+
->tinyInteger('scroll_depth')->unsigned() // 0–100 percentage
2281+
->smallInteger('year_offset') // signed, fits years from epoch
2282+
->create();
2283+
2284+
// CREATE TABLE `events` (`id` Int64, `scroll_depth` UInt8, `year_offset` Int16)
2285+
// ENGINE = MergeTree() ORDER BY (`id`)
2286+
```
2287+
2288+
`tinyInteger()` and `smallInteger()` are on the base builder, so the same calls map to `TINYINT` / `SMALLINT` on MySQL, `SMALLINT` for both on PostgreSQL (which has no `TINYINT`), and `INTEGER` on SQLite.
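
The per-dialect mapping described above can be sketched as a single lookup. This is an illustration only: `smallIntType()` and its dialect keys are hypothetical names, not part of the library, and the sketch ignores `unsigned()` outside ClickHouse because the text above does not say how other dialects render it.

```php
/**
 * Hypothetical helper (not a library function) mirroring the documented
 * mapping of tinyInteger()/smallInteger() to native column types.
 */
function smallIntType(string $dialect, bool $tiny, bool $unsigned = false): string
{
    return match ($dialect) {
        // ClickHouse picks width and signedness independently.
        'clickhouse' => ($unsigned ? 'UInt' : 'Int') . ($tiny ? '8' : '16'),
        'mysql' => $tiny ? 'TINYINT' : 'SMALLINT',
        // PostgreSQL has no TINYINT, so both sizes collapse to SMALLINT.
        'postgresql' => 'SMALLINT',
        'sqlite' => 'INTEGER',
        default => throw new InvalidArgumentException("Unknown dialect: {$dialect}"),
    };
}

echo smallIntType('clickhouse', tiny: true, unsigned: true), PHP_EOL; // UInt8
echo smallIntType('postgresql', tiny: true), PHP_EOL;                 // SMALLINT
```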
2289+
2290+
**`Array(T)` and `Tuple(...)` column types** — model multi-valued attributes (tags, labels, parallel-array nested records) and fixed-arity composites (geo points, key/value pairs) directly on the builder:
2291+
2292+
```php
2293+
use Utopia\Query\Schema\ColumnType;
2294+
2295+
$schema->table('events')
2296+
->bigInteger('id')->primary()
2297+
->array('meta.key', ColumnType::String)
2298+
->array('meta.value', ColumnType::String)
2299+
->array('user_ids', ColumnType::BigInteger)->unsigned()
2300+
->tuple('coords', [ColumnType::Float, ColumnType::Float])
2301+
->array('scores', ColumnType::Float)
2302+
->create();
2303+
2304+
// CREATE TABLE `events` (`id` Int64,
2305+
// `meta.key` Array(String), `meta.value` Array(String),
2306+
// `user_ids` Array(UInt64),
2307+
// `coords` Tuple(Float64, Float64),
2308+
// `scores` Array(Float64)) ENGINE = MergeTree() ORDER BY (`id`)
2309+
```
2310+
2311+
Element types are compiled by a dedicated helper that falls back to the parent column's `unsigned()`, `precision`, and `scale` settings, so those carry through to the inner type. Calling `nullable()` on either column kind throws `UnsupportedException`: ClickHouse has no `Nullable(Array(...))` (use an empty array to represent a missing value), and `Nullable(Tuple(...))` is an experimental ClickHouse feature that the compiler refuses. `LowCardinality(...)` is likewise rejected on these columns because it cannot wrap an `Array` or `Tuple`. Both methods are only available on the ClickHouse builder.
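
As a rough, self-contained sketch of how element types could be resolved, the snippet below follows the compiler code in this commit, which throws when an array column is marked nullable. The `ElementType` enum and the `compileElement()`, `compileArray()`, and `compileTuple()` functions are hypothetical stand-ins for the library's `ColumnType` enum and its private helper, cut down to three element types.

```php
// Hypothetical stand-in for the library's ColumnType enum.
enum ElementType
{
    case String;
    case BigInteger;
    case Float;
}

// The element inherits signedness from the parent column.
function compileElement(ElementType $element, bool $parentUnsigned): string
{
    return match ($element) {
        ElementType::String => 'String',
        ElementType::BigInteger => $parentUnsigned ? 'UInt64' : 'Int64',
        ElementType::Float => 'Float64',
    };
}

function compileArray(ElementType $element, bool $unsigned = false, bool $nullable = false): string
{
    if ($nullable) {
        // Mirrors the rejection in the commit's compiler; the exception class is an assumption.
        throw new LogicException('Nullable(Array(...)) is not supported in ClickHouse.');
    }

    return 'Array(' . compileElement($element, $unsigned) . ')';
}

/** @param list<ElementType> $elements */
function compileTuple(array $elements, bool $unsigned = false): string
{
    $inner = array_map(
        fn (ElementType $e): string => compileElement($e, $unsigned),
        $elements,
    );

    return 'Tuple(' . implode(', ', $inner) . ')';
}

echo compileArray(ElementType::BigInteger, unsigned: true), PHP_EOL;  // Array(UInt64)
echo compileTuple([ElementType::Float, ElementType::Float]), PHP_EOL; // Tuple(Float64, Float64)
```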
2312+
2313+
**`decimal(precision, scale)`** — fixed-point numeric column for monetary or precision-sensitive values where binary floating-point error is unacceptable:
2314+
2315+
```php
2316+
$schema->table('orders')
2317+
->bigInteger('id')->primary()
2318+
->decimal('amount', precision: 18, scale: 3)
2319+
->decimal('rate', precision: 5, scale: 4)->nullable()
2320+
->create();
2321+
2322+
// CREATE TABLE `orders` (`id` Int64,
2323+
// `amount` Decimal(18, 3),
2324+
// `rate` Nullable(Decimal(5, 4))) ENGINE = MergeTree() ORDER BY (`id`)
2325+
```
2326+
2327+
`decimal()` is on the base builder: ClickHouse emits `Decimal(P, S)`, MySQL and PostgreSQL emit `DECIMAL(P, S)`, SQLite emits `NUMERIC(P, S)`, and MongoDB maps to the `decimal` BSON type. Scale must not be negative or exceed precision.
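
The scale rule and the per-dialect keyword can be illustrated with a small standalone function. `decimalType()` is a hypothetical name, the dialect keys are assumptions, and `InvalidArgumentException` stands in for whatever exception the library actually throws; MongoDB is left out because it maps to a BSON type rather than a SQL keyword.

```php
/**
 * Hypothetical sketch: validate precision/scale and render the SQL type.
 * Defaults (10, 0) match the decimal() signature added in this commit.
 */
function decimalType(string $dialect, int $precision = 10, int $scale = 0): string
{
    // Scale must not be negative or exceed precision.
    if ($scale < 0 || $scale > $precision) {
        throw new InvalidArgumentException('Scale must be between 0 and the precision.');
    }

    $keyword = match ($dialect) {
        'clickhouse' => 'Decimal',
        'mysql', 'postgresql' => 'DECIMAL',
        'sqlite' => 'NUMERIC',
        default => throw new InvalidArgumentException("Unknown dialect: {$dialect}"),
    };

    return sprintf('%s(%d, %d)', $keyword, $precision, $scale);
}

echo decimalType('clickhouse', 18, 3), PHP_EOL; // Decimal(18, 3)
echo decimalType('sqlite', 5, 4), PHP_EOL;      // NUMERIC(5, 4)
```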
2328+
2329+
**`UUID` column type with `defaultRaw()`** — UUIDs are a first-class, fixed-width identifier type in ClickHouse and PostgreSQL, and a 36-character string elsewhere. Pair with `defaultRaw()` to attach a server-generated default expression that the standard `default()` would otherwise quote as a literal:
2330+
2331+
```php
2332+
$schema->table('events')
2333+
->uuid('event_id')->defaultRaw('generateUUIDv4()')->primary()
2334+
->datetime('ts', 3)
2335+
->create();
2336+
2337+
// CREATE TABLE `events` (`event_id` UUID DEFAULT generateUUIDv4(), `ts` DateTime64(3))
2338+
// ENGINE = MergeTree() ORDER BY (`event_id`)
2339+
```
2340+
2341+
`uuid()` compiles to the native `UUID` type on ClickHouse and PostgreSQL, `CHAR(36)` on MySQL, `TEXT` on SQLite, and the `string` BSON type on MongoDB. `defaultRaw(string)` is on the base `Column` and emits the expression verbatim — use for `generateUUIDv4()` (ClickHouse), `gen_random_uuid()` (PostgreSQL), `UUID()` (MySQL), `now()`, `CURRENT_TIMESTAMP`, and similar dialect-specific server-generated defaults. The expression must come from a trusted source; it must not be empty or contain a semicolon. `defaultRaw()` takes precedence over `default()` when both are set.
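
To make the difference between `default()` and `defaultRaw()` concrete, here is a minimal standalone sketch of the validation and precedence rules described above. The function names are hypothetical and the literal quoting is deliberately simplified; the library's own `compileDefaultValue()` may behave differently.

```php
// Reject empty expressions and statement separators, as the README describes.
function validateRawDefault(string $expression): string
{
    $trimmed = trim($expression);

    if ($trimmed === '') {
        throw new InvalidArgumentException('Raw default expression must not be empty.');
    }

    if (str_contains($trimmed, ';')) {
        throw new InvalidArgumentException('Raw default expression must not contain ";".');
    }

    return $trimmed;
}

// The raw expression wins over the quoted literal when both are present.
function compileDefaultClause(?string $raw, bool $hasDefault, string|int|float|bool|null $value): ?string
{
    if ($raw !== null) {
        return 'DEFAULT ' . validateRawDefault($raw);
    }

    if (! $hasDefault) {
        return null;
    }

    // Simplified quoting for illustration only (an assumption, not the library's logic).
    $literal = match (true) {
        $value === null => 'NULL',
        is_bool($value) => $value ? '1' : '0',
        is_string($value) => "'" . str_replace("'", "''", $value) . "'",
        default => (string) $value,
    };

    return 'DEFAULT ' . $literal;
}

echo compileDefaultClause('generateUUIDv4()', false, null), PHP_EOL; // DEFAULT generateUUIDv4()
echo compileDefaultClause(null, true, 'now()'), PHP_EOL;             // DEFAULT 'now()'
```

The second call shows why `defaultRaw()` exists: passed through the literal path, `now()` becomes a quoted string rather than a function call.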
2342+
2343+
**Raw expressions in `ORDER BY`** — MergeTree `ORDER BY` clauses routinely include scalar function calls (`toDate(ts)`, `cityHash64(...)`, `intHash32(user_id)`) to control sparse-index cardinality. `orderBy(array)` restricts each entry to a plain identifier; use `orderByRaw(string)` to emit the full tuple verbatim:
2344+
2345+
```php
2346+
$schema->table('events')
2347+
->string('tenant')
2348+
->bigInteger('id')
2349+
->datetime('ts')
2350+
->orderByRaw('(`tenant`, toDate(`ts`), `id`)')
2351+
->create();
2352+
2353+
// CREATE TABLE `events` (`tenant` String, `id` Int64, `ts` DateTime)
2354+
// ENGINE = MergeTree() ORDER BY (`tenant`, toDate(`ts`), `id`)
2355+
```
2356+
2357+
`orderByRaw()` follows the existing `partitionBy(string)` convention: the expression is emitted verbatim and must come from a trusted source. It takes precedence over `orderBy()` when both are set and is only available on the ClickHouse builder.
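
The resolution order for the `ORDER BY` clause can be sketched as a standalone function: raw expression first, then the `orderBy()` columns, then the primary key, and finally `tuple()`. `compileOrderBy()` here is a hypothetical free function that takes unquoted column names; the backtick escaping is an assumption about how identifiers are quoted.

```php
/**
 * Hypothetical sketch of the ORDER BY fallback chain.
 *
 * @param list<string> $orderBy     Plain column names from orderBy()
 * @param list<string> $primaryKeys Plain primary-key column names
 */
function compileOrderBy(?string $orderByRaw, array $orderBy, array $primaryKeys): string
{
    // 1. A raw expression is emitted verbatim and wins over everything else.
    if ($orderByRaw !== null) {
        return 'ORDER BY ' . $orderByRaw;
    }

    // 2. Explicit orderBy() columns, otherwise 3. the primary key.
    $columns = $orderBy !== [] ? $orderBy : $primaryKeys;

    // 4. Nothing to sort by: ClickHouse accepts an empty tuple.
    if ($columns === []) {
        return 'ORDER BY tuple()';
    }

    $quoted = array_map(
        fn (string $c): string => '`' . str_replace('`', '``', $c) . '`',
        $columns,
    );

    return 'ORDER BY (' . implode(', ', $quoted) . ')';
}

echo compileOrderBy('(`tenant`, toDate(`ts`), `id`)', ['id'], []), PHP_EOL; // ORDER BY (`tenant`, toDate(`ts`), `id`)
echo compileOrderBy(null, [], ['id']), PHP_EOL;                             // ORDER BY (`id`)
```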
2358+
2359+
**`rawColumn()` passthrough**: `Table::rawColumn(string $definition)` is the standard escape hatch for column types the builder does not yet model. It is honoured on every dialect, including ClickHouse:
2360+
2361+
```php
2362+
$schema->table('events')
2363+
->bigInteger('id')->primary()
2364+
->rawColumn('`payload` JSON CODEC(ZSTD(3))')
2365+
->create();
2366+
2367+
// CREATE TABLE `events` (`id` Int64, `payload` JSON CODEC(ZSTD(3))) ...
2368+
```
2369+
2370+
The ClickHouse-only modifiers above (`lowCardinality()`, `sampleBy()`, `array()`, `tuple()`, and `orderByRaw()`) live on the ClickHouse-specific `Column\ClickHouse` and `Table\ClickHouse` builders. Because these methods exist only on the dialect's own builder subclasses, calling any of them on a `MySQL`, `PostgreSQL`, `SQLite`, or `MongoDB` builder fails at the type level, with no runtime branch needed.
22742371

22752372
### SQLite Schema
22762373

src/Query/Schema.php

Lines changed: 3 additions & 1 deletion
@@ -308,7 +308,9 @@ protected function compileColumnDefinition(Column $column): string
308308
$parts[] = 'NULL';
309309
}
310310

311-
if ($column->hasDefault) {
311+
if ($column->defaultRaw !== null) {
312+
$parts[] = 'DEFAULT ' . $column->defaultRaw;
313+
} elseif ($column->hasDefault) {
312314
$parts[] = 'DEFAULT ' . $this->compileDefaultValue($column->default);
313315
}
314316

src/Query/Schema/ClickHouse.php

Lines changed: 101 additions & 8 deletions
@@ -48,13 +48,52 @@ protected function compileColumnType(Column $column): string
4848
return $type;
4949
}
5050

51+
if ($column instanceof Column\ClickHouse && $column->arrayElementType !== null) {
52+
if ($column->isLowCardinality) {
53+
throw new UnsupportedException('LowCardinality is not supported inside Array(...). Wrap the element type instead.');
54+
}
55+
56+
if ($column->isNullable) {
57+
throw new UnsupportedException('Nullable(Array(...)) is not supported in ClickHouse. Use an empty array [] to represent a missing value instead.');
58+
}
59+
60+
$inner = $this->compileNestedElementType($column->arrayElementType, $column);
61+
$type = 'Array(' . $inner . ')';
62+
63+
return $type;
64+
}
65+
66+
if ($column instanceof Column\ClickHouse && $column->tupleElementTypes !== []) {
67+
if ($column->isLowCardinality) {
68+
throw new UnsupportedException('LowCardinality is not supported on Tuple(...) columns.');
69+
}
70+
71+
$inner = \implode(
72+
', ',
73+
\array_map(
74+
fn (ColumnType $element): string => $this->compileNestedElementType($element, $column),
75+
$column->tupleElementTypes,
76+
),
77+
);
78+
$type = 'Tuple(' . $inner . ')';
79+
80+
if ($column->isNullable) {
81+
throw new UnsupportedException('Nullable(Tuple(...)) is an experimental ClickHouse feature and requires allow_experimental_nullable_tuple_type = 1. Use Tuple(Nullable(T1), Nullable(T2), ...) instead.');
82+
}
83+
84+
return $type;
85+
}
86+
5187
$type = match ($column->type) {
5288
ColumnType::String, ColumnType::Varchar, ColumnType::Relationship => 'String',
5389
ColumnType::Text => 'String',
5490
ColumnType::MediumText, ColumnType::LongText => 'String',
91+
ColumnType::TinyInteger => $column->isUnsigned ? 'UInt8' : 'Int8',
92+
ColumnType::SmallInteger => $column->isUnsigned ? 'UInt16' : 'Int16',
5593
ColumnType::Integer => $column->isUnsigned ? 'UInt32' : 'Int32',
5694
ColumnType::BigInteger, ColumnType::Id => $column->isUnsigned ? 'UInt64' : 'Int64',
5795
ColumnType::Float, ColumnType::Double => 'Float64',
96+
ColumnType::Decimal => 'Decimal(' . ($column->precision ?? 10) . ', ' . ($column->scale ?? 0) . ')',
5897
ColumnType::Boolean => 'UInt8',
5998
ColumnType::Datetime => $column->precision ? 'DateTime64(' . $column->precision . ')' : 'DateTime',
6099
ColumnType::Timestamp => $column->precision ? 'DateTime64(' . $column->precision . ')' : 'DateTime',
@@ -64,9 +103,13 @@ protected function compileColumnType(Column $column): string
64103
ColumnType::Point => 'Tuple(Float64, Float64)',
65104
ColumnType::Linestring => 'Array(Tuple(Float64, Float64))',
66105
ColumnType::Polygon => 'Array(Array(Tuple(Float64, Float64)))',
106+
ColumnType::Uuid => 'UUID',
67107
ColumnType::Uuid7 => 'FixedString(36)',
68108
ColumnType::Vector => 'Array(Float64)',
69109
ColumnType::Serial, ColumnType::BigSerial, ColumnType::SmallSerial => throw new UnsupportedException('SERIAL types are not supported in ClickHouse.'),
110+
ColumnType::Array, ColumnType::Tuple => throw new UnsupportedException(
111+
'Array/Tuple columns must be declared via Table\\ClickHouse::array() or ::tuple().'
112+
),
70113
};
71114

72115
if ($column instanceof Column\ClickHouse && $column->isLowCardinality) {
@@ -105,7 +148,9 @@ protected function compileColumnDefinition(Column $column): string
105148
$this->compileColumnType($column),
106149
];
107150

108-
if ($column->hasDefault) {
151+
if ($column->defaultRaw !== null) {
152+
$parts[] = 'DEFAULT ' . $column->defaultRaw;
153+
} elseif ($column->hasDefault) {
109154
$parts[] = 'DEFAULT ' . $this->compileDefaultValue($column->default);
110155
}
111156

@@ -213,6 +258,10 @@ public function compileCreate(Table $table, bool $ifNotExists = false): Statemen
213258
$primaryKeys = \array_map(fn (string $c): string => $this->quote($c), $table->compositePrimaryKey);
214259
}
215260

261+
foreach ($table->rawColumnDefs as $rawDef) {
262+
$columnDefs[] = $rawDef;
263+
}
264+
216265
foreach ($table->indexes as $index) {
217266
if ($index->type !== IndexType::Index) {
218267
throw new UnsupportedException(
@@ -241,13 +290,17 @@ public function compileCreate(Table $table, bool $ifNotExists = false): Statemen
241290
}
242291

243292
if ($engine->requiresOrderBy()) {
244-
$orderBy = ! empty($table->orderBy)
245-
? \array_map(fn (string $c): string => $this->quote($c), $table->orderBy)
246-
: $primaryKeys;
247-
248-
$sql .= ! empty($orderBy)
249-
? ' ORDER BY (' . \implode(', ', $orderBy) . ')'
250-
: ' ORDER BY tuple()';
293+
if ($table instanceof Table\ClickHouse && $table->orderByRaw !== null) {
294+
$sql .= ' ORDER BY ' . $table->orderByRaw;
295+
} else {
296+
$orderBy = ! empty($table->orderBy)
297+
? \array_map(fn (string $c): string => $this->quote($c), $table->orderBy)
298+
: $primaryKeys;
299+
300+
$sql .= ! empty($orderBy)
301+
? ' ORDER BY (' . \implode(', ', $orderBy) . ')'
302+
: ' ORDER BY tuple()';
303+
}
251304
}
252305

253306
if ($table instanceof Table\ClickHouse && $table->sampleBy !== null) {
@@ -352,6 +405,46 @@ private function compileEngine(Engine $engine, array $args): string
352405
};
353406
}
354407

408+
/**
409+
* Compile an element type for use inside `Array(T)` or `Tuple(...)`.
410+
*
411+
* Element types come from the {@see ColumnType} enum directly, so they
412+
* lack the per-column state (precision, unsigned flag, etc.) that
413+
* {@see compileColumnType()} relies on. This helper falls back to the
414+
* parent column's `isUnsigned` flag for integer elements and to the
415+
* parent's `precision` and `scale` for `Decimal` elements so callers can spell common
416+
* shapes (`Array(UInt64)`, `Array(Decimal(18, 3))`) without leaking the
417+
* inner-type complexity into the public API.
418+
*/
419+
private function compileNestedElementType(ColumnType $element, Column $parent): string
420+
{
421+
return match ($element) {
422+
ColumnType::String, ColumnType::Varchar, ColumnType::Relationship,
423+
ColumnType::Text, ColumnType::MediumText, ColumnType::LongText,
424+
ColumnType::Json, ColumnType::Object, ColumnType::Binary => 'String',
425+
ColumnType::TinyInteger => $parent->isUnsigned ? 'UInt8' : 'Int8',
426+
ColumnType::SmallInteger => $parent->isUnsigned ? 'UInt16' : 'Int16',
427+
ColumnType::Integer => $parent->isUnsigned ? 'UInt32' : 'Int32',
428+
ColumnType::BigInteger, ColumnType::Id => $parent->isUnsigned ? 'UInt64' : 'Int64',
429+
ColumnType::Float, ColumnType::Double => 'Float64',
430+
ColumnType::Decimal => 'Decimal(' . ($parent->precision ?? 10) . ', ' . ($parent->scale ?? 0) . ')',
431+
ColumnType::Boolean => 'UInt8',
432+
ColumnType::Datetime, ColumnType::Timestamp => $parent->precision
433+
? 'DateTime64(' . $parent->precision . ')'
434+
: 'DateTime',
435+
ColumnType::Uuid => 'UUID',
436+
ColumnType::Uuid7 => 'FixedString(36)',
437+
ColumnType::Point => 'Tuple(Float64, Float64)',
438+
ColumnType::Linestring => 'Array(Tuple(Float64, Float64))',
439+
ColumnType::Polygon => 'Array(Array(Tuple(Float64, Float64)))',
440+
ColumnType::Vector => 'Array(Float64)',
441+
ColumnType::Enum, ColumnType::Serial, ColumnType::BigSerial,
442+
ColumnType::SmallSerial, ColumnType::Array, ColumnType::Tuple => throw new UnsupportedException(
443+
'Nested element type ' . $element->value . ' is not supported inside Array/Tuple.'
444+
),
445+
};
446+
}
447+
355448
/**
356449
* @param string[] $values
357450
*/

src/Query/Schema/Column.php

Lines changed: 59 additions & 0 deletions
@@ -17,6 +17,13 @@ class Column
1717

1818
public protected(set) bool $hasDefault = false;
1919

20+
/**
21+
* Raw default expression emitted verbatim after `DEFAULT` (e.g. `now()`,
22+
* `generateUUIDv4()`, `gen_random_uuid()`). Distinct from {@see $default},
23+
* which is rendered as a quoted literal.
24+
*/
25+
public protected(set) ?string $defaultRaw = null;
26+
2027
public protected(set) bool $isUnsigned = false;
2128

2229
public protected(set) bool $isUnique = false;
@@ -63,6 +70,7 @@ public function __construct(
6370
public ColumnType $type,
6471
public ?int $length = null,
6572
public ?int $precision = null,
73+
public ?int $scale = null,
6674
) {
6775
}
6876

@@ -81,6 +89,33 @@ public function default(mixed $value): static
8189
return $this;
8290
}
8391

92+
/**
93+
* Set a raw default expression rendered verbatim after `DEFAULT`.
94+
*
95+
* Use for dialect-specific server-generated defaults that {@see default()}
96+
* would otherwise quote: `now()`, `CURRENT_TIMESTAMP`, `gen_random_uuid()`,
97+
* `generateUUIDv4()`, etc. The expression is emitted unquoted and must come
98+
* from a trusted (developer-controlled) source.
99+
*
100+
* @throws ValidationException if the expression is empty or contains ";".
101+
*/
102+
public function defaultRaw(string $expression): static
103+
{
104+
$trimmed = \trim($expression);
105+
106+
if ($trimmed === '') {
107+
throw new ValidationException('Raw default expression must not be empty.');
108+
}
109+
110+
if (\str_contains($trimmed, ';')) {
111+
throw new ValidationException('Raw default expression must not contain ";".');
112+
}
113+
114+
$this->defaultRaw = $trimmed;
115+
116+
return $this;
117+
}
118+
84119
public function unsigned(): static
85120
{
86121
$this->isUnsigned = true;
@@ -285,6 +320,18 @@ public function longText(string $name): static
285320
return $this->table->longText($name);
286321
}
287322

323+
public function tinyInteger(string $name): static
324+
{
325+
/** @var static */
326+
return $this->table->tinyInteger($name);
327+
}
328+
329+
public function smallInteger(string $name): static
330+
{
331+
/** @var static */
332+
return $this->table->smallInteger($name);
333+
}
334+
288335
public function integer(string $name): static
289336
{
290337
/** @var static */
@@ -297,6 +344,18 @@ public function bigInteger(string $name): static
297344
return $this->table->bigInteger($name);
298345
}
299346

347+
public function decimal(string $name, int $precision = 10, int $scale = 0): static
348+
{
349+
/** @var static */
350+
return $this->table->decimal($name, $precision, $scale);
351+
}
352+
353+
public function uuid(string $name): static
354+
{
355+
/** @var static */
356+
return $this->table->uuid($name);
357+
}
358+
300359
public function serial(string $name): static
301360
{
302361
/** @var static */
