Skip to content

Add POCO read materialization#318

Draft
alex-clickhouse wants to merge 10 commits intomainfrom
poco-reads-final
Draft

Add POCO read materialization#318
alex-clickhouse wants to merge 10 commits intomainfrom
poco-reads-final

Conversation

@alex-clickhouse
Copy link
Copy Markdown
Collaborator

@alex-clickhouse alex-clickhouse commented May 9, 2026

Summary

Adds a streaming POCO read path for IClickHouseClient and ClickHouseDataReader, backed by a per-client PocoTypeRegistry shared with the existing binary-insert path. Strict v1 type-matching: no implicit conversions, with rich diagnostics on mismatch.

API

  • RegisterBinaryInsertType<T>() — existing, unchanged. Insert-only.
  • RegisterPocoReadType<T>() — new. Read-only registration. Validates a parameterless ctor and at least one public non-init setter.
  • RegisterPocoType<T>() — new convenience. Atomic: runs read first; insert only commits if read succeeds, so a thrown exception leaves the registry untouched.
  • client.QueryAsync<T>(sql, parameters?, options?, ct?)IAsyncEnumerable<T>. Streams rows lazily; the underlying reader is disposed on completion, fault, or early break.
  • reader.MapTo<T>() — materializes the current row into a fresh T without advancing the reader. Pairs with Read() for ADO.NET-style consumers.

[ClickHouseColumn(Name = ...)] aliases and [ClickHouseNotMapped] exclusion behave the same as on the existing write path.

How it works

  1. Registration compiles getters (insert) and setters + a raw-newobj constructor (read) via Expression.Lambda and caches them in a per-client ConcurrentDictionary. The compiled constructor uses raw newobj, so types with required properties are supported.
  2. Headers parse when the reader opens. The first MapTo<T> (or first iteration of QueryAsync<T>) builds an ordinal-ordered binding plan from (column ordinal, property index) pairs and fail-fasts on impossible mappings: each column's FrameworkType is checked against the target property type before any rows are materialized. Polymorphic columns (FrameworkType == typeof(object) — Variant/Dynamic/JSON/Object) skip the static check.
  3. Per-row, the binding plan drives the loop: null/DBNull is gated by CanAssignNull; otherwise the compiled boxed setter is invoked. An InvalidCastException from the setter (only reachable for polymorphic columns or stale mismatches) is caught and rethrown as InvalidOperationException with column name, ClickHouse type, target property, and returned CLR type.

ClickHouseDataReader.Read() is unchanged. Existing readers keep the buffered CurrentRow available for arbitrary ordinal access alongside MapTo<T>.

Type-matching rules (strict v1)

  • Case-sensitive (StringComparer.Ordinal) column-to-property matching.
  • Missing columns in the result leave properties at their default value.
  • Extra columns in the result are ignored.
  • No implicit conversions: no widening (Int8Int32), no enum coercion (Enum8 returns string, not your CLR enum), no ClickHouseDecimaldecimal interop. Mismatches throw InvalidOperationException naming the POCO type, property, column, and returned CLR type.
  • Static mismatches fail fast at the first MapTo<T> / iteration — before any rows are read.

Limitations

  • Read registration requires a public parameterless constructor. Records with primary constructors are rejected on the read side (still fine on the insert side, which doesn't construct).
  • Init-only and read-only properties are skipped on the read side. The type still registers if at least one property has a public non-init setter.
  • required properties are supported (the compiled constructor bypasses required-member enforcement via raw newobj).
  • Per-reader bindingPlanCache is a non-concurrent Dictionary — fine because DbDataReader isn't thread-safe by spec; concurrent MapTo<T> calls on a single reader are not supported.
  • No conversion hooks in v1. Future opt-in widening / enum / decimal coercions would be additive (a PocoReadOptions flag or [ClickHouseColumn(Convert = true)] attribute), with no public-API break.

Performance

PocoReadBenchmark reads system.numbers projecting (Id, Name, Value) and compares manual GetValue row construction against QueryAsync<T> POCO materialization.

Method Count Mean Ratio Allocated
ManualGetValue 100k 104.0 ms 1.00 12.73 MB
Poco 100k 115.1 ms 1.14 12.73 MB
ManualGetValue 500k 327.0 ms 1.00 61.55 MB
Poco 500k 341.0 ms 1.05 61.55 MB

Allocations are identical — column values are boxed once during reader buffering and reused. The remaining per-row overhead is the boxed Action<T, object> setter invocation; a future generic-typed-setter pass could close it.

Run on .NET 10 / Linux Ubuntu 22.04 / Intel Core Ultra 7 255U.

Tests

Coverage spans registration edge cases (records, init-only, abstract, write-only, indexers, statics, fields, required, inheritance), the strict-v1 read matrix (alias, missing/extra, case sensitivity, nullable/non-nullable, decimal with and without UseCustomDecimals, type mismatch, unregistered, before-Read, after-end-of-stream, derived, multiple types per reader, parameters, query options), and three review-driven additions:

  • atomicity of RegisterPocoType<T> after a failure,
  • QueryAsync<T> cancellation mid-iteration with a successful follow-up query,
  • mid-stream server error surfacing as ClickHouseServerException through QueryAsync<T> (gated on ClickHouse 25.11+).

A new example, examples/Select/Select_005_PocoSelect.cs, demonstrates the three calling shapes end-to-end.

Introduce QueryAsync<T> on IClickHouseClient (streaming IAsyncEnumerable<T>)
and GetRecord<T>() on ClickHouseDataReader, both backed by a per-client
PocoTypeRegistry shared with the existing binary-insert path. v1 assignment
rules are strict: no widening, no enum coercion, no ClickHouseDecimal->decimal.

API:
  - RegisterBinaryInsertType<T>() — existing, unchanged
  - RegisterPocoReadType<T>()     — read-only registration
  - RegisterPocoType<T>()         — convenience: read + insert

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 9, 2026 06:51
@alex-clickhouse alex-clickhouse requested a review from mzitnik as a code owner May 9, 2026 06:51
@alex-clickhouse alex-clickhouse marked this pull request as draft May 9, 2026 06:51
@codecov
Copy link
Copy Markdown

codecov Bot commented May 9, 2026

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces first-class POCO read materialization to the ClickHouse .NET client: a per-client POCO registry is shared between the existing binary insert path and new read APIs, enabling streaming materialization via IAsyncEnumerable<T> and row materialization directly from ClickHouseDataReader.

Changes:

  • Added POCO read registration APIs (RegisterPocoReadType<T>(), RegisterPocoType<T>()) and a streaming query API (QueryAsync<T>()) on IClickHouseClient / ClickHouseClient.
  • Implemented a unified PocoTypeRegistry that builds insert/read mappings and is threaded through ClickHouseDataReader for GetRecord<T>().
  • Added extensive NUnit coverage for registration and read materialization behavior, plus a benchmark, and updated public API + release notes/changelog.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
RELEASENOTES.md Documents the new POCO read feature set and API surface.
CHANGELOG.md Adds a v1.4.0 entry describing POCO read support and behavior.
ClickHouse.Driver/PublicAPI/PublicAPI.Unshipped.txt Declares newly added public APIs for compatibility tracking.
ClickHouse.Driver/IClickHouseClient.cs Extends the public client interface with POCO read registration and QueryAsync<T>().
ClickHouse.Driver/ClickHouseClient.cs Implements the new registry-backed registration methods and QueryAsync<T>(); passes registry into readers.
ClickHouse.Driver/ADO/Readers/ClickHouseDataReader.cs Adds GetRecord<T>() materialization and binding plan caching; stores the per-client registry.
ClickHouse.Driver/ADO/ClickHouseConnection.cs Exposes POCO read registration passthroughs for ADO.NET users.
ClickHouse.Driver/ADO/ClickHouseCommand.cs Ensures commands create readers using the client’s shared POCO registry.
ClickHouse.Driver/Copy/PocoTypeRegistry.cs New unified per-client registry building insert/read mappings and compiled delegates.
ClickHouse.Driver/Copy/PocoMappings.cs New mapping model types for insert/read paths.
ClickHouse.Driver/Copy/BinaryInsertPropertyInfo.cs Extends cached property metadata to support both insert and read scenarios.
ClickHouse.Driver/Copy/BinaryInsertTypeRegistry.cs Removed in favor of PocoTypeRegistry.
ClickHouse.Driver/Copy/BinaryInsertTypeMapping.cs Removed in favor of split insert/read mapping types.
ClickHouse.Driver.Tests/Copy/PocoRegistrationTests.cs Adds unit tests for POCO registration validation rules.
ClickHouse.Driver.Tests/Copy/PocoReadTests.cs Adds end-to-end tests for GetRecord<T>() and QueryAsync<T>().
ClickHouse.Driver.Tests/Copy/PocoEdgeCaseTests.cs Adds extensive edge-case coverage for supported/unsupported POCO shapes.
ClickHouse.Driver.Benchmark/PocoReadBenchmark.cs Adds a benchmark comparing POCO materialization vs manual GetValue.

Comment thread ClickHouse.Driver/ADO/Readers/ClickHouseDataReader.cs Outdated
Comment thread ClickHouse.Driver/ADO/Readers/ClickHouseDataReader.cs Outdated
Comment thread ClickHouse.Driver/IClickHouseClient.cs
Comment thread ClickHouse.Driver/ADO/Readers/ClickHouseDataReader.cs Outdated
Comment thread RELEASENOTES.md Outdated
Comment thread CHANGELOG.md Outdated
alex-clickhouse and others added 5 commits May 9, 2026 09:39
- Validate-once: column FrameworkType vs property type is checked at plan
  build (fail-fast for impossible mappings before any rows are read).
- Per-row: drop IsAssignableFrom; rely on the compiled setter, catching
  InvalidCastException to rethrow with rich diagnostics. Trims the per-row
  overhead vs. manual GetValue from ~25% to ~5% on 500k rows; allocations
  unchanged.
- Rename ClickHouseDataReader.GetRecord<T>() to MapTo<T>() to avoid
  confusion with C# `record` types and to read more naturally next to
  Read() ("Read advances; MapTo materializes the current row").
- Doc fixes per review: clarify that read materialization requires
  RegisterPocoReadType<T>() or RegisterPocoType<T>() (not
  RegisterBinaryInsertType<T>()), correct the `required` properties claim
  (they are supported), and split the CHANGELOG/RELEASENOTES entries from
  a wall of text into scannable sub-bullets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- RegisterPocoType<T> is now atomic: read registration runs first (its
  rules are stricter), insert only commits if read succeeds. A thrown
  exception leaves the registry untouched.
- Fix the misleading XML doc on IClickHouseClient.RegisterBinaryInsertType
  to clarify that only the insert side is registered.
- Defensively clear hasCurrentRow before the per-column Read loop so a
  mid-row throw cannot leave a stale CurrentRow visible to MapTo<T>.
- Document why QueryAsync<T> uses sync reader.Read() inside the async
  iterator (no async overload; underlying stream is buffered).
- Add tests: atomicity post-failure, QueryAsync cancellation
  mid-iteration with follow-up query, server-side error mid-stream
  surfaces ClickHouseServerException through QueryAsync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Demonstrates the new POCO read APIs:
  - QueryAsync<T> streaming via IAsyncEnumerable<T>
  - [ClickHouseColumn(Name)] aliases and [ClickHouseNotMapped] on the read
    path (same semantics as on the write path)
  - ClickHouseDataReader.MapTo<T> on the current row alongside typed
    accessors, for ADO.NET-style consumers

Wired into Program.cs in the SELECTING DATA section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single query exercises all three rows of the nullable-property matrix:
Nullable(Int32) NULL, Nullable(Int32) non-null, and plain Int32 — each
mapped to a separate int? property. Closes the gap where existing tests
covered only the NULL and plain cases but not Nullable(Int32) non-null.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the existing null checks on httpResponse and reader. Internal-only
call sites all pass a non-null instance today; this is for defensive
symmetry, not a correctness fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.

Comment thread ClickHouse.Driver/ADO/Readers/ClickHouseDataReader.cs Outdated
Comment thread ClickHouse.Driver/ClickHouseClient.cs Outdated
Comment thread ClickHouse.Driver/IClickHouseClient.cs
Comment thread ClickHouse.Driver/ADO/ClickHouseConnection.cs
alex-clickhouse and others added 4 commits May 9, 2026 10:48
- XML doc on RegisterPocoReadType<T> and RegisterPocoType<T> now spells
  out that init-only, read-only, and non-public-setter properties are
  silently skipped on the read path and keep their default value even
  when a matching result column is present.
- New runtime test pins the guarantee: a POCO with one mapped property
  alongside init-only, read-only, and private-setter properties is
  materialized from a result that contains all four columns; only the
  set; property is filled, the others stay at their CLR default.
- Drop the redundant `Copy.` qualifier in ClickHouseDataReader (the
  using directive already imports ClickHouse.Driver.Copy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the POCO-only types (PocoMappings, PocoTypeRegistry, and the
property-info struct) out of ClickHouse.Driver.Copy — which historically
held bulk-copy machinery — into a dedicated ClickHouse.Driver.Poco
namespace and folder. Bulk-copy serializers stay in Copy/. Also rename
the BinaryInsertPropertyInfo type to PocoPropertyInfo now that it's
shared by both the insert and read paths and lives under Poco/, so the
old "BinaryInsert" prefix would be misleading.

Internal-only refactor; no public API change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Order alone wasn't enough: a type can pass read validation but fail
insert validation (e.g., a property with a private getter is insert-
unmappable but read-mappable). The previous implementation would have
left a tentative read mapping behind when insert later threw.

PocoTypeRegistry.RegisterForBoth<T>() now builds both mappings up front
and only commits them after both pass validation, so a thrown exception
cannot leave a partial registration behind regardless of which side
fails.

Add a test for the read-passes-insert-fails direction (the new path) to
sit alongside the existing test for the read-fails direction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Direct happy-path coverage was missing — every existing read-side test
registered through the RegisterPocoType convenience. Pin that read-only
registration is sufficient on its own: a fresh type is registered via
RegisterPocoReadType<T>(), QueryAsync<T> materializes its rows, and the
type is *not* registered for binary insert.

Uses a dedicated POCO so the "no insert mapping" assertion isn't
polluted by other tests in the shared fixture that go through
RegisterPocoType.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants