This document describes the internal architecture of CH.Toolkit: its package layering, data flow, and core design patterns.
                CH.Toolkit.Types (no deps)
                             |
                 CH.Toolkit.Sql (-> Types)
                             |
             CH.Toolkit.Schema (-> Types, Sql)
          /                  |                         \
CH.Toolkit.Query    CH.Toolkit.Modeling          CH.Toolkit.Introspection
(-> Sql, Types)     (-> Schema, Query, Types)    (-> Schema, Types, Sql + ClickHouse.Driver)
          \                  |                         /
                   CH.Toolkit.Migrations
(-> Schema, Sql, Introspection, Types + Microsoft.CodeAnalysis.CSharp)
                  /                       \
        CH.Toolkit.Cli              CH.Toolkit.DependencyInjection
        (-> Introspection,          (-> Migrations + MS.Extensions.DI)
            Migrations +
            System.CommandLine)
Dependencies point upward in the diagram: foundational packages such as Types and Sql never reference the packages that build on them.
The pipeline for turning a C# model into DDL follows this path:
C# class (e.g., Event)
    |  SchemaBuilder.Table<T>() + fluent config
    v
TableSchema / DatabaseSchema            (Schema package -- immutable records)
    |  DdlCompiler.CompileCreateTable()
    v
SqlNode AST (CreateTableNode, etc.)     (Sql package -- immutable records)
    |  ClickHouseSqlRenderer.Render()
    v
SQL string (e.g., CREATE TABLE IF NOT EXISTS ...)
DdlCompiler converts schema model objects into SqlNode AST nodes. It lives in the Schema package (not Sql) to avoid a circular dependency -- Schema depends on Sql for the AST types, and moving the compiler into Sql would require Sql to depend on Schema.
ClickHouseSqlRenderer is a SqlVisitor<string> that walks the AST and produces ClickHouse-dialect SQL. It handles identifier quoting, ON CLUSTER clauses, engine configuration, TTL, codecs, indexes, projections, and all query constructs.
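The three stages above can be sketched with simplified stand-in types. Everything named `Mini*` here is hypothetical and exists only for illustration; the real schema records and AST nodes carry far more (engines, TTL, codecs, projections), and identifier quoting is reduced to plain backticks with no escaping.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the schema model (not the real CH.Toolkit types).
public sealed record MiniColumn(string Name, string Type);
public sealed record MiniTable(
    string Database,
    string Name,
    IReadOnlyList<MiniColumn> Columns,
    IReadOnlyList<string> OrderBy);

// Hypothetical stand-ins for AST nodes: SQL-shaped, but still structured data.
public sealed record MiniColumnDefNode(string Name, string Type);
public sealed record MiniCreateTableNode(
    string QualifiedName,
    IReadOnlyList<MiniColumnDefNode> Columns,
    string Engine,
    IReadOnlyList<string> OrderBy,
    bool IfNotExists);

public static class MiniCompiler
{
    // Stage 2: schema model -> AST. No SQL text is produced here.
    public static MiniCreateTableNode CompileCreateTable(MiniTable table) =>
        new(
            $"{table.Database}.{table.Name}",
            table.Columns.Select(c => new MiniColumnDefNode(c.Name, c.Type)).ToList(),
            "MergeTree()",
            table.OrderBy,
            IfNotExists: true);
}

public static class MiniRenderer
{
    // Stage 3: AST -> SQL text. Dialect details live only in this step.
    public static string Render(MiniCreateTableNode node)
    {
        var columns = string.Join(", ", node.Columns.Select(c => $"`{c.Name}` {c.Type}"));
        var orderBy = string.Join(", ", node.OrderBy.Select(c => $"`{c}`"));
        var ifNotExists = node.IfNotExists ? "IF NOT EXISTS " : "";
        return $"CREATE TABLE {ifNotExists}{node.QualifiedName} ({columns}) " +
               $"ENGINE = {node.Engine} ORDER BY ({orderBy})";
    }
}
```

Keeping the compile step free of string building is what allows the same AST to be inspected, compared, or rendered differently without touching the schema model.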
The migration pipeline chains schema diffing, code generation, and DDL execution:
SchemaBuilder (desired schema)      SchemaIntrospector (current DB)
              \                        /
               v                      v
DatabaseSchema (desired)            DatabaseSchema (current)
              \                        /
               SchemaDiffer.Diff()
                        |
                        v
SchemaOperation[] (AddColumnOp, CreateTableOp, etc.)
                        |
                        v  MigrationCodeGenerator.GenerateMigration()
C# migration files (MigrationBase subclasses)
                        |
                        v  MigrationRunner.MigrateAsync()
DdlCompiler -> SqlNode AST -> ClickHouseSqlRenderer -> DDL execution
The runner also manages locking (distributed TTL-based lock table), checksum validation, history tracking, and safety policy enforcement.
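To make the diffing step concrete, here is a minimal sketch that compares the columns of a single table and emits add/drop operations. The `Sketch*` types are hypothetical simplifications, not the real SchemaOperation hierarchy, and the sketch deliberately ignores type changes, renames, and table-level differences.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified operation types for illustration only.
public abstract record SketchOp;
public sealed record SketchAddColumn(string Table, string Column, string Type) : SketchOp;
public sealed record SketchDropColumn(string Table, string Column) : SketchOp;

public static class SketchDiffer
{
    // Compares one table's columns: current (from the database) vs desired (from code).
    // Both dictionaries map column name -> ClickHouse type.
    public static IReadOnlyList<SketchOp> Diff(
        string table,
        IReadOnlyDictionary<string, string> current,
        IReadOnlyDictionary<string, string> desired)
    {
        var ops = new List<SketchOp>();

        // Desired but missing from the database: add it.
        foreach (var (name, type) in desired.OrderBy(kv => kv.Key, StringComparer.Ordinal))
        {
            if (!current.ContainsKey(name))
                ops.Add(new SketchAddColumn(table, name, type));
        }

        // In the database but no longer desired: drop it.
        foreach (var name in current.Keys.OrderBy(k => k, StringComparer.Ordinal))
        {
            if (!desired.ContainsKey(name))
                ops.Add(new SketchDropColumn(table, name));
        }

        return ops;
    }
}
```

Sorting the keys keeps the operation list deterministic, which matters when the output is turned into generated source files that are checksummed later.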
All domain types are sealed records: ClickHouseType, SqlNode subtypes, ColumnSchema, TableSchema, SchemaOperation, and more. This provides value equality, immutability, and support for with expressions.
Records that contain IReadOnlyList fields override Equals and GetHashCode using SequenceEqual, because the compiler-generated equality compares list fields by reference. On sealed records the override is declared as bool Equals(T? other) rather than virtual bool Equals, and it is always paired with a GetHashCode override, which avoids the CS8851 warning.
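A minimal sketch of such an override follows. `SketchIndex` is a hypothetical record invented for this example, not a CH.Toolkit type.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record for illustration only.
public sealed record SketchIndex(string Name, IReadOnlyList<string> Columns)
{
    // Non-virtual because the record is sealed. Compares the list by content;
    // the compiler-generated version would compare the list references.
    public bool Equals(SketchIndex? other) =>
        other is not null
        && Name == other.Name
        && Columns.SequenceEqual(other.Columns);

    // Must agree with Equals: fold every element into the hash, in order.
    public override int GetHashCode()
    {
        var hash = new HashCode();
        hash.Add(Name);
        foreach (var column in Columns)
            hash.Add(column);
        return hash.ToHashCode();
    }
}
```

With this in place, two instances built from different list objects holding the same elements compare equal, which is what a schema differ needs in order to avoid reporting spurious changes.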
SqlVisitor<TResult> is the abstract base for walking the SqlNode AST:
public abstract class SqlVisitor<T>
{
    public T Visit(SqlNode node) => node switch { ... };

    protected abstract T VisitCreateTable(CreateTableNode node);
    protected abstract T VisitAlterTable(AlterTableNode node);
    protected abstract T VisitDropTable(DropTableNode node);
    protected abstract T VisitSelectQuery(SelectQueryNode node);
    protected abstract T VisitSetOperationQuery(SetOperationQueryNode node);
    protected abstract T VisitInsertSelect(InsertSelectNode node);
    // ... one method per node type
}
ClickHouseSqlRenderer is the primary visitor implementation. New SQL features are added by extending SqlNode with a new record type, adding a Visit* method to SqlVisitor<T>, and implementing rendering in ClickHouseSqlRenderer.
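The three-step recipe for adding a feature can be sketched on a miniature AST. All `Sketch*` names are hypothetical, and identifier quoting is omitted to keep the example short; `SketchTruncateTable` plays the role of the newly added node.

```csharp
using System;

// Hypothetical miniature AST; names are illustrative only.
public abstract record SketchNode;
public sealed record SketchDropTable(string Table, bool IfExists) : SketchNode;
// Step 1: the new feature gets its own record type.
public sealed record SketchTruncateTable(string Table) : SketchNode;

public abstract class SketchVisitor<T>
{
    // Central dispatch: one switch arm per node type.
    public T Visit(SketchNode node) => node switch
    {
        SketchDropTable n => VisitDropTable(n),
        SketchTruncateTable n => VisitTruncateTable(n),
        _ => throw new NotSupportedException($"Unknown node type: {node.GetType().Name}")
    };

    protected abstract T VisitDropTable(SketchDropTable node);
    // Step 2: a matching abstract method forces every visitor to handle the new node.
    protected abstract T VisitTruncateTable(SketchTruncateTable node);
}

public sealed class SketchRenderer : SketchVisitor<string>
{
    protected override string VisitDropTable(SketchDropTable node)
    {
        var ifExists = node.IfExists ? "IF EXISTS " : "";
        return $"DROP TABLE {ifExists}{node.Table}";
    }

    // Step 3: the renderer supplies the dialect-specific SQL.
    protected override string VisitTruncateTable(SketchTruncateTable node) =>
        $"TRUNCATE TABLE {node.Table}";
}
```

Because the new method is abstract, forgetting step 3 is a compile error rather than a runtime surprise.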
The modeling API uses a builder pattern with back-references for uninterrupted fluent chains:
var schema = new SchemaBuilder()
    .Table<Event>()
        .MergeTree()
        .OrderBy(e => e.Timestamp, e => e.UserId)
        .Column(e => e.EventType).LowCardinalityString().Table  // .Table returns to TableBuilder
        .Ttl("timestamp + INTERVAL 90 DAY", "DELETE")
    .Build("analytics");
TableBuilder<T> holds a reference back to SchemaBuilder, so .Build(database) delegates to the root builder. ColumnBuilder<T> holds a reference to TableBuilder<T>, so .Table chains back.
Column overrides are keyed by PascalCase property name (the raw C# member name), not the snake_case database column name.
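The back-reference mechanics can be shown with a stripped-down set of builders. The `Sketch*` classes below are hypothetical: they take plain strings instead of generic types and expressions, and `Build` returns descriptive strings rather than a schema object.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature builders for illustration only.
public sealed class SketchSchemaBuilder
{
    private readonly List<SketchTableBuilder> _tables = new();

    public SketchTableBuilder Table(string name)
    {
        var table = new SketchTableBuilder(this, name);
        _tables.Add(table);
        return table;
    }

    // Produces one "database.table: Property=type, ..." line per table.
    public IReadOnlyList<string> Build(string database)
    {
        var result = new List<string>();
        foreach (var table in _tables)
        {
            var overrides = string.Join(", ", table.Describe());
            result.Add($"{database}.{table.Name}: {overrides}");
        }
        return result;
    }
}

public sealed class SketchTableBuilder
{
    private readonly SketchSchemaBuilder _root;
    // Keyed by the C# property name (PascalCase), not the database column name.
    private readonly Dictionary<string, string> _overrides = new();

    internal SketchTableBuilder(SketchSchemaBuilder root, string name)
    {
        _root = root;
        Name = name;
    }

    public string Name { get; }

    public SketchColumnBuilder Column(string propertyName) => new(this, propertyName);

    internal void SetOverride(string propertyName, string type) => _overrides[propertyName] = type;

    internal IEnumerable<string> Describe()
    {
        foreach (var pair in _overrides)
            yield return $"{pair.Key}={pair.Value}";
    }

    // Delegates to the root builder so the chain can end at table level.
    public IReadOnlyList<string> Build(string database) => _root.Build(database);
}

public sealed class SketchColumnBuilder
{
    private readonly string _propertyName;

    internal SketchColumnBuilder(SketchTableBuilder table, string propertyName)
    {
        Table = table;
        _propertyName = propertyName;
    }

    // Back-reference that returns the chain to table-level configuration.
    public SketchTableBuilder Table { get; }

    public SketchColumnBuilder LowCardinalityString()
    {
        Table.SetOverride(_propertyName, "LowCardinality(String)");
        return this;
    }
}
```

A chain such as `new SketchSchemaBuilder().Table("events").Column("EventType").LowCardinalityString().Table.Build("analytics")` never needs an intermediate variable, which is the point of holding the parent references.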
TableBuilder<T> uses Expression<Func<T, object>> for compile-time-safe column references:
.OrderBy(e => e.Timestamp)           // extracts property name "Timestamp"
.Column(e => e.EventType)            // creates override keyed by "EventType"
.PartitionByMonth(e => e.CreatedAt)  // generates toYYYYMM(created_at)
Property names are converted to snake_case for database column names via ColumnNamingConvention.SnakeCase.
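As a rough sketch of how a property name can be pulled out of such an expression and then converted, consider the helper below. `SketchColumnRef` and `SketchEvent` are hypothetical, and the snake_case conversion is a naive one that may differ from what ColumnNamingConvention.SnakeCase actually does (for instance, it does not treat acronyms specially).

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

public static class SketchColumnRef
{
    // Extracts "Timestamp" from e => e.Timestamp.
    // Value-type properties are boxed to object, so the member access
    // arrives wrapped in a Convert node that has to be unwrapped first.
    public static string PropertyName<T>(Expression<Func<T, object>> selector)
    {
        var body = selector.Body;
        if (body is UnaryExpression { NodeType: ExpressionType.Convert } unary)
            body = unary.Operand;

        if (body is MemberExpression member)
            return member.Member.Name;

        throw new ArgumentException(
            "Expected a simple property access such as e => e.Timestamp.",
            nameof(selector));
    }

    // Naive PascalCase -> snake_case: '_' before every upper-case letter except the first.
    public static string ToSnakeCase(string name)
    {
        var sb = new StringBuilder(name.Length + 4);
        for (var i = 0; i < name.Length; i++)
        {
            var c = name[i];
            if (char.IsUpper(c) && i > 0)
                sb.Append('_');
            sb.Append(char.ToLowerInvariant(c));
        }
        return sb.ToString();
    }
}

// Hypothetical model class used only to exercise the helper.
public sealed class SketchEvent
{
    public DateTime Timestamp { get; set; }
    public string EventType { get; set; } = "";
    public DateTime CreatedAt { get; set; }
}
```

Unwrapping the Convert node is the step that is easiest to miss: without it, selectors on DateTime or numeric properties would fail while string properties would work.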
ClrTypeMapper converts C# types to ClickHouse types automatically when building table schemas from generic types. The mapping is extensible via TypeMappingOptions (see Type System).
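The idea of a default mapping with caller-supplied overrides can be sketched as follows. `SketchTypeMapper` is hypothetical, the default table is an assumption made for illustration (the actual ClrTypeMapper defaults may differ), and the `overrides` parameter merely stands in for whatever TypeMappingOptions exposes.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class SketchTypeMapper
{
    // Assumed defaults for illustration; not taken from the real ClrTypeMapper.
    private static readonly Dictionary<Type, string> Defaults = new()
    {
        [typeof(string)] = "String",
        [typeof(int)] = "Int32",
        [typeof(long)] = "Int64",
        [typeof(ulong)] = "UInt64",
        [typeof(double)] = "Float64",
        [typeof(bool)] = "Bool",
        [typeof(DateTime)] = "DateTime",
        [typeof(Guid)] = "UUID",
    };

    public static string Map(Type clrType, IReadOnlyDictionary<Type, string>? overrides = null)
    {
        // Nullable<T> maps to Nullable(<mapping of T>).
        var underlying = Nullable.GetUnderlyingType(clrType);
        if (underlying is not null)
            return $"Nullable({Map(underlying, overrides)})";

        // Caller-supplied mappings win over the defaults.
        if (overrides is not null && overrides.TryGetValue(clrType, out var custom))
            return custom;

        if (Defaults.TryGetValue(clrType, out var mapped))
            return mapped;

        throw new NotSupportedException($"No ClickHouse mapping for {clrType.FullName}.");
    }
}
```

Checking overrides before defaults lets a project switch, say, every DateTime to DateTime64(3) in one place without touching individual column definitions.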
DdlCompiler converts Schema records into Sql AST nodes. Its natural placement would be in the Sql package, but that would create a circular dependency:
- Sql would need to reference Schema (for TableSchema, ColumnSchema, etc.)
- Schema already references Sql (for SqlNode, ColumnDefNode, etc.)
By keeping DdlCompiler in the Schema package, the dependency direction remains clean: Schema -> Sql, never the reverse.