Add ODBC-native row-wise array fetch for fast bulk retrieval by christianparpart · Pull Request #511 · LASTRADA-Software/Lightweight

christianparpart · 2026-06-22T08:52:44Z

Customers on high-latency links wait the longest when retrieving many rows because each row is its own SQLFetch round-trip — fetching 1000 rows costs 1000 round-trips. This adds the read-side counterpart of the existing CreateAll/UpdateAll batch-write path: Query<Record>().All()/Range() (and two-record JOIN tuples) now bind result columns row-wise directly into the caller's std::vector<Record> storage and pull whole row blocks per SQLFetchScroll, collapsing the round-trips from N to ~ceil(N/depth). Values land in place (zero-copy); results are byte-identical to the per-row path.

The fast path is transparent and gated — a record qualifies when every result column is row-bindable (the same SqlRowBindableColumn set the write side uses: primitives, date/time/datetime, numeric, char fixed-capacity strings, and non-numeric optionals of those) and the driver supports row-array fetching. Anything else (growable strings/binary, GUID, variant) transparently falls back to the unchanged per-row path. Verified against SQLite, SQL Server 2022 and PostgreSQL.
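The gating rule above can be pictured as a compile-time check over a record's column types. The following is a minimal sketch under stated assumptions, not the library's actual trait: `FixedString`, `IsRowBindable` and `CanRowWiseFetch` are hypothetical names, the record is modelled as a plain list of column types, and the sketch ignores the date/time and numeric types as well as the restriction on which optionals qualify.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a fixed-capacity char string column.
template <std::size_t N>
struct FixedString
{
    char data[N + 1] {};
    std::size_t length = 0;
};

// Simplified rule (an assumption of this sketch): a column is row-bindable
// when it has a fixed size that a driver could write in place.
template <typename T>
struct IsRowBindable: std::bool_constant<std::is_arithmetic_v<T>>
{
};

template <std::size_t N>
struct IsRowBindable<FixedString<N>>: std::true_type
{
};

// An optional qualifies when the wrapped type does.
template <typename T>
struct IsRowBindable<std::optional<T>>: IsRowBindable<T>
{
};

// A record, modelled as its column types, qualifies only if every column does.
template <typename... Columns>
inline constexpr bool CanRowWiseFetch = (IsRowBindable<Columns>::value && ...);

static_assert(CanRowWiseFetch<int, double, std::optional<std::int64_t>, FixedString<16>>);
static_assert(!CanRowWiseFetch<int, std::string>); // growable string: per-row fallback
```

Because the check is a constant expression, a caller can select the fast or the per-row path with `if constexpr` and pay nothing at run time for ineligible records.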

Changes

SqlStatement::FetchAllRowWise: new low-level primitive mirroring ExecuteBatchNativeRowWise. It sets both SQL_ATTR_ROW_BIND_TYPE (to sizeof(Record)) and SQL_ATTR_ROW_ARRAY_SIZE, grows and rebinds the destination per block, clamps depth to a memory budget, and restores single-row statement state via a Finally guard on every exit (including exceptions). Nullable columns use an over-allocated row-strided NULL indicator; optionals are pre-engaged and reset on NULL. Char fixed strings bind SQL_C_CHAR inline with a per-row length/trim fixup.
SqlConnection::SupportsNativeRowArrayFetch and RoundTripsNarrowTextByteExact — driver capability checks kept on the connection. The latter carves out PostgreSQL (whose psqlODBC transcodes SQL_C_CHAR through the client codepage), so records carrying a fixed-capacity string fall back to the per-row wide path there rather than risk mangling non-ASCII bytes.
DataMapper retrieval wiring — compile-time eligibility (CanRowWiseFetchRecord/CanRowWiseFetchTuple plus the narrow-text carve-out) and the ReadResults branch for both single-record and two-record tuple result sets.
RowWiseFetchTests — covers fixed/nullable/temporal/fixed-string columns, NULL/empty/full-capacity values, multi-block boundaries, empty results, Where/Range, statement reuse and the std::string fallback, asserting via a block-fetch-counting logger that the fast path actually ran (and that fixed-string records fall back on PostgreSQL). A hidden opt-in benchmark ([.rowwisefetchbench]) reproduces the comparison below.
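To make the row-wise binding and the depth clamp in the list above more concrete, here is a small sketch of the address arithmetic a row-wise bind implies and of a budget clamp. It is an illustration only and makes no ODBC calls: `Record`, `CellAddress`, `PriceCell` and `ClampDepth` are hypothetical names, and the byte budget is a made-up parameter because the PR does not state the real one.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical record; any standard-layout struct behaves the same way.
struct Record
{
    std::int32_t id;
    double price;
};

// With row-wise binding the stride is the row size, so the cell of a column
// in row `row` lives at: base + row * stride + columnOffset.
inline std::byte* CellAddress(std::byte* base, std::size_t row, std::size_t stride, std::size_t columnOffset)
{
    return base + (row * stride) + columnOffset;
}

// Example: where the `price` value of row `row` lands inside a Record array.
inline double* PriceCell(Record* rows, std::size_t row)
{
    auto* base = reinterpret_cast<std::byte*>(rows);
    return reinterpret_cast<double*>(CellAddress(base, row, sizeof(Record), offsetof(Record, price)));
}

// Clamp the requested block depth so one block fits a byte budget,
// but never go below a single row. Assumes rowSize > 0.
inline std::size_t ClampDepth(std::size_t requestedDepth, std::size_t rowSize, std::size_t budgetBytes)
{
    auto const maxRowsInBudget = std::max<std::size_t>(1, budgetBytes / rowSize);
    return std::min(requestedDepth, maxRowsInBudget);
}
```

Since the computed cell address is exactly the address of the member inside the destination array, values written there need no later copy, which is the zero-copy property the description refers to. Wide rows shrink the effective depth: in this sketch a 4096-byte row under a 1 MiB budget clamps a requested depth of 1000 down to 256.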

Performance

The structural win is round-trip count: for 200k rows the row-wise path issues ~196 SQLFetchScroll calls instead of 200,000 SQLFetch calls (~1000x fewer round-trips). Measured end-to-end (release build, median of 5 reps, Query<>().All() vs the per-row fallback, both materializing the same std::vector<Record>):

Backend	Transport	Rows	per-row `SQLFetch`	row-wise block fetch	speedup
SQLite	in-process	500k	643 ms	541 ms	1.19x
PostgreSQL	localhost TCP	200k	167 ms	109 ms	1.53x
SQL Server 2022	localhost TCP	200k	134 ms	67 ms	2.01x

The speedup scales with per-round-trip transport cost. In-process SQLite has no socket, so its ~1.2x is purely reduced per-call CPU/ODBC overhead — the floor, and a latency-independent constant. Over a localhost socket (sub-millisecond RTT) it is already 1.5-2x.

On a high-latency link the round-trip term dominates: wall-clock ≈ round-trips × RTT, so the per-row path pays ~N × RTT while the block path pays ~ceil(N/depth) × RTT, and the speedup approaches the effective array depth (up to ~1000x for narrow records, less for wide rows that clamp to a smaller depth, and divided by whatever the driver already prefetches per fetch). For example, modelled at 50 ms RTT for 100k rows: ~83 min (per-row) vs ~5 s (block). The local numbers above are a conservative lower bound demonstrating the mechanism; the round-trip-count reduction is what carries the win to the WAN case.
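The modelled figures can be reproduced with a few lines of integer arithmetic. This sketch assumes a block depth of 1000 for the block path (the PR only says the speedup approaches the effective depth) and ignores everything except round-trip time; `RoundTrips` and `ModelledMillis` are hypothetical helper names.

```cpp
#include <cstdint>

// Number of fetch round-trips for `rows` rows at a given block depth,
// i.e. ceil(rows / depth). Assumes depth >= 1.
constexpr std::uint64_t RoundTrips(std::uint64_t rows, std::uint64_t depth)
{
    return (rows + depth - 1) / depth;
}

// Modelled wall-clock in milliseconds when round-trips dominate.
constexpr std::uint64_t ModelledMillis(std::uint64_t rows, std::uint64_t depth, std::uint64_t rttMillis)
{
    return RoundTrips(rows, depth) * rttMillis;
}

// Per-row path: 100k round-trips at 50 ms each = 5000 s, about 83 min.
static_assert(ModelledMillis(100'000, 1, 50) == 5'000'000);
// Block path at an assumed depth of 1000: 100 round-trips = 5 s.
static_assert(ModelledMillis(100'000, 1000, 50) == 5'000);
```

As a side note, `RoundTrips(200'000, 1024)` evaluates to 196, which is consistent with the call count reported for the 200k-row run, although the PR does not state the depth that run used.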

…eval

Query<Record>().All()/Range() (and two-record JOIN tuples) now bind result columns row-wise directly into the caller's std::vector<Record> storage and pull whole row blocks per SQLFetchScroll round-trip, instead of one SQLFetch per row. This is the read-side mirror of the CreateAll/UpdateAll batch-write path and collapses ODBC round-trips from N to ~ceil(N/depth), which is the win on high-latency links where receiving e.g. 1000 rows previously cost 1000 fetches.

The path is transparent and gated: a record qualifies when every result column is row-bindable (the write-side SqlRowBindableColumn set: primitives, date/time/datetime, numeric, char fixed-capacity strings, and non-numeric optionals of those) and the driver supports row-array fetching. Anything else (growable strings/binary, GUID, variant) falls back to the unchanged per-row path with byte-identical results.

Values land in place (zero-copy); nullable columns use an over-allocated row-strided NULL indicator, and optionals are pre-engaged and reset on NULL. Char fixed strings bind SQL_C_CHAR inline with a per-row length/trim fixup; on PostgreSQL (psqlODBC transcodes SQL_C_CHAR through the client codepage) records carrying one fall back to the per-row wide path, gated by the new SqlConnection::RoundTripsNarrowTextByteExact capability so the server-type decision stays on the connection.

- SqlStatement::FetchAllRowWise + BindRowWiseValue/FinalizeRowWiseOutputColumn: SQL_ATTR_ROW_BIND_TYPE = sizeof(Record), grow-and-rebind per block, memory-budget-clamped depth, and a Finally guard restoring single-row state on every exit (incl. exceptions).
- SqlConnection::SupportsNativeRowArrayFetch / RoundTripsNarrowTextByteExact.
- DataMapper eligibility (CanRowWiseFetchRecord/CanRowWiseFetchTuple + narrow-text carve-out) and the ReadResults wiring for both single and tuple results.

Tested against sqlite3, mssql2022 and postgres: new RowWiseFetchTests cover fixed/nullable/temporal/fixed-string types, NULL/empty/full-capacity values, multi-block boundaries, empty results, Where/Range, statement reuse and the std::string fallback, asserting via a block-fetch-counting logger that the fast path actually ran (and that fixed-string records fall back on PostgreSQL).

Signed-off-by: Christian Parpart <c.parpart@lastrada.net>

A hidden ([.rowwisefetchbench]) benchmark times the shipped row-wise block fetch (Query<>().All()) against a faithful reproduction of the per-row fallback over a large dataset, materializing the same vector. Tunable via the ROWFETCH_BENCH_ROWS env var (default 500'000).

Measured (release, median of 5 reps):
- SQLite in-process : ~1.2x (per-SQLFetch overhead only; no socket)
- PostgreSQL (TCP) : ~1.5x
- SQL Server (TCP) : ~2.0x

The win grows with per-round-trip transport cost: in-process gains little, a localhost socket already 1.5-2x. For 200k rows the row-wise path issues ~196 SQLFetchScroll calls vs 200k SQLFetch calls (~1000x fewer round-trips), so on a high-latency link, where wall-clock is dominated by RTT * round-trips, the speedup approaches the array depth.

Signed-off-by: Christian Parpart <c.parpart@lastrada.net>


- RowWiseFetchTests: parenthesize depth*2+7 (readability-math-missing-parentheses) and use std::cmp_equal for the depth/total spot-checks (modernize-use-integer-sign-comparison).
- Doc comments: doxygen cannot resolve @ref to private members or concepts, so the public comments now use @c for SqlRowBindableColumn, FetchAllRowWise, BindRowWiseOutputColumn, FinalizeRowWiseOutputColumn and RoundTripsNarrowTextByteExact; also fix a stale @ref to a renamed method.

Signed-off-by: Christian Parpart <c.parpart@lastrada.net>

…ch loops

Classic result iteration (while(cursor.FetchRow()){ GetColumn<T>(i) }, bound output columns, SqlRowIterator<T>, SqlVariantRowCursor) issued one SQLFetch per row -- one network round-trip per row, which dominates wall-clock for large result sets on TCP backends. The recent row-wise array fetch only sped up the materializing DataMapper paths (All/Range/First); the lazy cursor loops were left on the per-row path and cannot be rewritten across all client code.

Back the classic cursor transparently with the existing RowArrayCursor: on the first fetch of an eligible result set the statement arms a block buffer (SQL_ATTR_ROW_ARRAY_SIZE) and serves FetchRow()/GetColumn<T>() and bound-column scatters from it, cutting round-trips from N to ceil(N/depth). On by default; depth is a single connection-level knob (SqlConnection::SetDefaultPrefetchDepth, default PrefetchDepthDefault = 1000; <= 1 disables). Capability-gated by SupportsNativeRowArrayFetch().

Eligibility is restricted to fixed-width numeric, temporal and GUID columns, whose block reconstruction is byte-identical to the per-row binder on every backend. Result sets carrying character/text, NUMERIC, TIME, binary or LOB columns transparently stay on the per-row path (faithful materialization of those is not uniform across backends: MSSQL returns narrow text in the client codepage, SQLite's dynamic typing reports unreliable text sizes).

A new SqlLogger::OnFetchBlock hook makes the round-trip reduction observable/testable.

Performance: round-trips drop ~1000x at depth 1000 for eligible sets; no regression on the per-row path (no allocation when disabled/ineligible).

Risk: an active cursor reads ahead up to one block (a few MB, budget-clamped); the connection knob is the global escape hatch.

Tested: sqlite3, mssql2022 (Docker), postgres (Docker 16.4) -- full suite shows no regression vs the pre-change baseline; new [prefetch] suite green on all three. Build clean under clangcl-debug (PEDANTIC /WX).

Signed-off-by: Christian Parpart <c.parpart@lastrada.net>
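The idea of backing a classic FetchRow() loop with a block buffer can be sketched without any database. The class below is a simulation under stated assumptions, not the library's RowArrayCursor: `PrefetchCursor`, `BlockSource`, `MakeRangeSource` and `Drain` are hypothetical names, rows are plain `int` values, and the callback stands in for one driver round-trip.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

class PrefetchCursor
{
  public:
    // Stands in for one driver round-trip: appends up to `depth` rows to `out`.
    using BlockSource = std::function<void(std::vector<int>& out, std::size_t depth)>;

    PrefetchCursor(BlockSource fetchBlock, std::size_t depth):
        _fetchBlock { std::move(fetchBlock) },
        _depth { depth <= 1 ? 1 : depth } // depth <= 1 degenerates to per-row fetching
    {
    }

    // Advances to the next row; only touches the source when the buffer is drained.
    bool FetchRow()
    {
        if (_next == _buffer.size())
        {
            if (_endOfData)
                return false;
            _buffer.clear();
            _next = 0;
            _fetchBlock(_buffer, _depth);
            ++_blockFetches;
            _endOfData = _buffer.size() < _depth; // a short block means no more rows
            if (_buffer.empty())
                return false;
        }
        _current = _buffer[_next++];
        return true;
    }

    [[nodiscard]] int Current() const noexcept { return _current; }
    [[nodiscard]] std::size_t BlockFetches() const noexcept { return _blockFetches; }

  private:
    BlockSource _fetchBlock;
    std::size_t _depth;
    std::vector<int> _buffer;
    std::size_t _next = 0;
    int _current = 0;
    std::size_t _blockFetches = 0;
    bool _endOfData = false;
};

// Test source producing the values 0 .. total-1; it keeps its own position.
inline PrefetchCursor::BlockSource MakeRangeSource(int total)
{
    return [next = 0, total](std::vector<int>& out, std::size_t depth) mutable {
        while (next < total && out.size() < depth)
            out.push_back(next++);
    };
}

// Runs the classic loop to completion and returns the number of rows seen.
inline std::size_t Drain(PrefetchCursor& cursor)
{
    std::size_t rows = 0;
    while (cursor.FetchRow())
        ++rows;
    return rows;
}
```

Draining 2500 rows at depth 1000 costs three block fetches in this sketch, while the calling loop is unchanged. When the row count is an exact multiple of the depth, the sketch issues one extra, empty fetch to detect the end; how the real cursor detects end-of-data is not described in the commit message.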


…g-tidy, docs, style

Address the CI matrix failures from the block-prefetch commit:

- Cross-type read regression (the PostgreSQL/Windows dbtool failures): reading a prefetched numeric/temporal/GUID column as a string (GetColumn<std::string>, as dbtool's generic `exec` printer does) returned an empty string because ConvertCell only rendered character-bound cells. RenderCellAsUtf8 now formats every bound type to text (integers byte-identical to the driver; floating/temporal/GUID via std::formatter), matching the per-row SQLGetData(SQL_C_CHAR) behaviour. Adds a [prefetch] regression test for the all-numeric-read-as-text case.
- clang-tidy (-warnings-as-errors): split is moot; fixed at source. Test file: math-missing-parentheses, integer-sign-comparison, nested conditional operator, std::move on trivially-copyable fixed strings, unchecked optional access. Header: unused-lambda-capture (explicit this-> on the member call). ConvertCell was also split into per-category helpers to stay under the cognitive-complexity threshold.
- Doc coverage (doxygen): @ref PrefetchDepthDefault -> @c (it is a value, not a ref target) in SqlConnection.hpp and SqlConnectInfo.hpp; drop the @param naming an unnamed parameter on SqlLogger::OnFetchBlock (described in the brief instead).
- C++ style (clang-format-22): restore the single-line empty deleter lambda.

Verified: clangcl-debug builds clean; [prefetch] suite green on sqlite3, mssql2022 (Docker), postgres (Docker); dbtool `exec` renders numeric columns.

Signed-off-by: Christian Parpart <c.parpart@lastrada.net>
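The cross-type read fix amounts to rendering every bound cell type as text rather than only character cells. A rough sketch of that shape follows, with loud caveats: `Cell`, `Date` and `RenderCellAsText` are hypothetical, the sketch covers only three cell kinds, it uses `std::to_string` and `std::snprintf` instead of the `std::formatter` route the commit mentions so that it stays within C++17, and the ISO date layout is an assumption about what a text read would return.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical date cell.
struct Date
{
    int year;
    int month;
    int day;
};

// Hypothetical prefetched cell: one alternative per bound type.
using Cell = std::variant<std::int64_t, Date, std::string>;

// Renders any cell as text; every alternative has a branch, so a numeric or
// temporal cell read as a string can no longer come back empty.
inline std::string RenderCellAsText(Cell const& cell)
{
    return std::visit(
        [](auto const& value) -> std::string {
            using T = std::decay_t<decltype(value)>;
            if constexpr (std::is_same_v<T, std::string>)
                return value; // character cells pass through unchanged
            else if constexpr (std::is_same_v<T, std::int64_t>)
                return std::to_string(value);
            else
            {
                char buffer[48];
                std::snprintf(buffer, sizeof(buffer), "%04d-%02d-%02d", value.year, value.month, value.day);
                return buffer;
            }
        },
        cell);
}
```

Using `if constexpr` over the visited type means that adding a new alternative to `Cell` without deciding how it renders falls into the last branch and fails to compile unless it has `year`, `month` and `day` members, which is one way to keep such a gap from reappearing silently.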

The block-prefetch GUID test dereferenced the TryParse result after a Catch2 REQUIRE, which clang-tidy's bugprone-unchecked-optional-access does not track as a guard (-warnings-as-errors). Add an explicit `if (has_value())` so the optional access is statically checked while keeping the REQUIRE as the failure signal.

Signed-off-by: Christian Parpart <c.parpart@lastrada.net>
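The guard pattern described in this commit looks roughly like the following. This is a sketch without Catch2: `TryParseInt` and `ParseOrZero` are hypothetical stand-ins for the GUID TryParse call and the test body, and the place where the test's REQUIRE would sit is marked with a comment.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical parser standing in for the TryParse call in the test.
inline std::optional<int> TryParseInt(std::string_view text)
{
    int value = 0;
    auto const* end = text.data() + text.size();
    auto const result = std::from_chars(text.data(), end, value);
    if (result.ec != std::errc {} || result.ptr != end)
        return std::nullopt; // reject empty input and trailing garbage
    return value;
}

inline int ParseOrZero(std::string_view text)
{
    auto const parsed = TryParseInt(text);
    // In the test, REQUIRE(parsed.has_value()) would go here. The analyzer does
    // not treat that macro as a guard, so the explicit check below is what makes
    // the dereference statically safe.
    if (parsed.has_value())
        return *parsed;
    return 0;
}
```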

Yaraslaut

Thanks a lot for the improvement

christianparpart requested a review from a team as a code owner June 22, 2026 08:52

github-actions Bot added the Data Mapper, tests and Core API labels Jun 22, 2026

Christian Parpart added 3 commits June 22, 2026 11:13

github-actions Bot added the documentation Improvements or additions to documentation label Jun 22, 2026

Christian Parpart added 2 commits June 22, 2026 18:44

Yaraslaut approved these changes Jun 23, 2026


christianparpart merged commit 01db167 into master Jun 23, 2026
29 checks passed

christianparpart deleted the feature/odbc-native-fast-fetch branch June 23, 2026 06:49

christianparpart merged 6 commits into master from feature/odbc-native-fast-fetch
