feat(bigquery): expose bytes processed stats after ExecuteUpdate #125
Draft
dataders wants to merge 2 commits into
…names

Enhances `LOAD_FLAG_SEARCH_SYSTEM` to also search well-known system library directories, and teaches `search_path_list` to try the platform-aware filename (e.g. `duckdb` → `libduckdb.dylib`) in each search directory.

## Changes

**`system_lib_dirs()`** (new): returns existing well-known lib paths:

- macOS: `/opt/homebrew/lib`, `/usr/local/lib`
- Linux: `/usr/lib`, arch-specific multiarch path, `/usr/local/lib`
- Windows: empty (uses registry)

**`get_search_paths()`**: extended the `LOAD_FLAG_SEARCH_SYSTEM` block to append `system_lib_dirs()` after the ADBC config dir.

**`search_path_list()`**: after the bare-name attempt, also tries the platform-aware filename via `libloading::library_filename()`. Without this, searching `/opt/homebrew/lib` for `duckdb` would never find `libduckdb.dylib`. The comment explains the motivating constraint: macOS enforces matching Team IDs across all shared libraries in a process, so a CDN-bundled driver (signed with one key) blocks user-installed DuckDB extensions (signed with the DuckDB key); using the system library avoids the mismatch.

**`driver_manifests.rst`**: documents the new system lib dir search step under `LOAD_FLAG_SEARCH_SYSTEM`.

**Tests**: updates `test_get_search_paths` for the new behaviour; adds `test_system_lib_dirs_returns_expected_paths` and `test_search_path_list_uses_platform_filename`.

## Search order after this change (Unix/macOS)

1. `ADBC_DRIVER_PATH` env var (`LOAD_FLAG_SEARCH_ENV`)
2. Caller-provided `additional_search_paths`
3. `$CONDA_PREFIX/etc/adbc/drivers` (conda builds, `LOAD_FLAG_SEARCH_ENV`)
4. User config dir (`LOAD_FLAG_SEARCH_USER`)
5. System config dir (`LOAD_FLAG_SEARCH_SYSTEM`)
6. **NEW** System lib dirs: `/opt/homebrew/lib`, `/usr/local/lib`, etc.
7. OS dynamic linker fallback (`load_dynamic_from_name`)

## Motivation / downstream impact

This change was motivated by dbt-labs/fs#8693, which adds ~170 lines of DuckDB-specific system library discovery to the `fs` repo because the driver manager didn't search standard lib paths. After this lands, that PR can be simplified to a version bump plus a one-call replacement of its custom discovery logic:

```rust
// Before (~105 lines: try_discover_system_duckdb_driver,
// try_load_duckdb_from_env_paths, system_duckdb_search_paths,
// duckdb_library_filename, plus a bespoke test):
if let Some(driver) = Self::try_discover_system_duckdb_driver(adbc_version) {
    return Ok(driver);
}

// After (single call; system lib dirs + platform filename handled in adbc):
if let Ok(driver) = ManagedAdbcDriver::load_from_name(
    backend, "duckdb", entrypoint, adbc_version, LOAD_FLAG_SEARCH_SYSTEM, None,
) {
    return Ok(driver);
}
```

The one minor behavioural difference: `fs` currently explicitly walks `DYLD_LIBRARY_PATH`/`LD_LIBRARY_PATH` before the well-known paths; the driver manager's `LOAD_FLAG_SEARCH_ENV` only covers `ADBC_DRIVER_PATH`. In practice this is not a regression: the OS dynamic-linker fallback (step 7) naturally honours those env vars, so the only loss is a per-path tracing log line.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
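The per-directory lookup described in this commit (bare name first, then the platform-aware filename) can be illustrated independently of the Rust implementation. The sketch below is written in Go only to match the BigQuery driver code elsewhere in this PR; `platformFilename`, `candidates`, and `searchPathList` are hypothetical names for illustration, not the driver manager's functions, and the real change relies on `libloading::library_filename()` for the naming step.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// platformFilename maps a bare driver name to the conventional shared-library
// filename for the given OS. Hypothetical helper for illustration only.
func platformFilename(goos, name string) string {
	switch goos {
	case "darwin":
		return "lib" + name + ".dylib"
	case "windows":
		return name + ".dll"
	default: // Linux and other Unix-likes
		return "lib" + name + ".so"
	}
}

// candidates lists the filenames tried in each directory, in order:
// the bare name first, then the platform-aware filename.
func candidates(goos, name string) []string {
	return []string{name, platformFilename(goos, name)}
}

// searchPathList returns the first regular file matching a candidate name,
// scanning dirs in the order given.
func searchPathList(goos, name string, dirs []string) (string, bool) {
	for _, dir := range dirs {
		for _, c := range candidates(goos, name) {
			full := filepath.Join(dir, c)
			if info, err := os.Stat(full); err == nil && info.Mode().IsRegular() {
				return full, true
			}
		}
	}
	return "", false
}

func main() {
	// Directories taken from the macOS list in the commit message; on other
	// systems they may not exist, in which case nothing is found.
	dirs := []string{"/opt/homebrew/lib", "/usr/local/lib"}
	if path, ok := searchPathList(runtime.GOOS, "duckdb", dirs); ok {
		fmt.Println("found:", path)
	} else {
		fmt.Println("not found; a real loader would fall back to the OS dynamic linker")
	}
}
```

Without the second candidate, a search of `/opt/homebrew/lib` for `duckdb` would only ever look for a file literally named `duckdb`, which is the gap the commit describes.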
…cuteUpdate

After ExecuteUpdate completes (e.g. CREATE TABLE AS SELECT), store TotalBytesProcessed and TotalBytesBilled from the BigQuery JobStatistics on the statement struct. These are accessible via GetOptionInt with the new OptionIntStatBytesProcessed and OptionIntStatBytesBilled keys.

This allows consumers like dbt-fusion to log `X.X GiB processed` annotations in execution output, matching dbt-core's bigquery adapter behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary

After a BigQuery `ExecuteUpdate` call (e.g. `CREATE TABLE AS SELECT`), the driver now waits for the job to complete and stores `TotalBytesProcessed` and `TotalBytesBilled` from the BigQuery `JobStatistics` on the statement. These values are accessible via `GetOptionInt` using two new option keys:

- `adbc.bigquery.sql.stat.bytes_processed`
- `adbc.bigquery.sql.stat.bytes_billed`

Values reset to `0` at the start of each `ExecuteUpdate` call.

## Motivation
dbt-fusion users running BigQuery models currently have no visibility into query cost during development. The dbt-core BigQuery adapter has always surfaced `...GiB processed` annotations in execution logs (from `job.total_bytes_processed`). This change unblocks dbt-fusion from doing the same.

Tracked in: dbt-labs/dbt-core#14462
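For reviewers, here is a minimal sketch of how a consumer might turn the new option into a `GiB processed` annotation. It is not dbt-fusion code: `intOptionGetter`, `fakeStatement`, `formatGiB`, and `processedAnnotation` are hypothetical names, the key strings are copied from the Summary rather than imported from the driver package, and the one-decimal format is assumed from the `X.X GiB processed` wording in the commit message.

```go
package main

import "fmt"

// Option keys as listed in the Summary (the driver exposes them as constants).
const (
	optBytesProcessed = "adbc.bigquery.sql.stat.bytes_processed"
	optBytesBilled    = "adbc.bigquery.sql.stat.bytes_billed"
)

// intOptionGetter is the small slice of a statement's API this sketch needs.
type intOptionGetter interface {
	GetOptionInt(key string) (int64, error)
}

// fakeStatement stands in for a BigQuery statement after ExecuteUpdate.
type fakeStatement map[string]int64

func (f fakeStatement) GetOptionInt(key string) (int64, error) {
	v, ok := f[key]
	if !ok {
		return 0, fmt.Errorf("unknown option %q", key)
	}
	return v, nil
}

// formatGiB renders a byte count in GiB (2^30 bytes) with one decimal place.
func formatGiB(n int64) string {
	return fmt.Sprintf("%.1f GiB", float64(n)/float64(1<<30))
}

// processedAnnotation builds the log annotation from the statement's stats.
func processedAnnotation(stmt intOptionGetter) (string, error) {
	n, err := stmt.GetOptionInt(optBytesProcessed)
	if err != nil {
		return "", err
	}
	return formatGiB(n) + " processed", nil
}

func main() {
	stmt := fakeStatement{optBytesProcessed: 3 << 30, optBytesBilled: 3 << 30}
	msg, err := processedAnnotation(stmt)
	if err != nil {
		panic(err)
	}
	fmt.Println(msg) // prints "3.0 GiB processed"
}
```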
## Changes

- `driver.go`: Add `OptionIntStatBytesProcessed` and `OptionIntStatBytesBilled` constants
- `record_reader.go`: Add `jobStats` struct; extend `runQuery` signature with optional `*jobStats` out-param; populate it by calling `job.Wait()` on the DDL path (previously returned immediately without waiting)
- `statement.go`: Add `lastBytesProcessed` / `lastBytesBilled` fields; wire up `ExecuteUpdate` to capture stats; expose via `GetOptionInt`

## Test plan
- `go build ./go/adbc/driver/bigquery/...`
- `go test ./go/adbc/driver/bigquery/...`
- `ExecuteUpdate`
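The statement-side flow (reset, run and wait, capture, expose) can be sketched as follows. This is a simplified model under stated assumptions, not the code in this PR: the `jobStats` field names, the `runFunc` type standing in for `runQuery`, and the omission of `context.Context` are all illustrative, and leaving the stats at `0` when the job fails is an assumption that follows from the reset-at-start rule rather than something the description states.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	optBytesProcessed = "adbc.bigquery.sql.stat.bytes_processed"
	optBytesBilled    = "adbc.bigquery.sql.stat.bytes_billed"
)

var errJobFailed = errors.New("job failed")

// jobStats is the out-param filled once the job has completed.
// Field names are assumed for this sketch.
type jobStats struct {
	bytesProcessed int64
	bytesBilled    int64
}

// runFunc stands in for runQuery: it runs the SQL, waits for the job,
// and fills stats when the pointer is non-nil.
type runFunc func(query string, stats *jobStats) (rowsAffected int64, err error)

type statement struct {
	query              string
	run                runFunc
	lastBytesProcessed int64
	lastBytesBilled    int64
}

// ExecuteUpdate resets the stats, runs the query, and records the new stats
// only if the job succeeded.
func (s *statement) ExecuteUpdate() (int64, error) {
	s.lastBytesProcessed, s.lastBytesBilled = 0, 0

	var stats jobStats
	n, err := s.run(s.query, &stats)
	if err != nil {
		return -1, err
	}
	s.lastBytesProcessed = stats.bytesProcessed
	s.lastBytesBilled = stats.bytesBilled
	return n, nil
}

// GetOptionInt exposes the stats captured by the most recent ExecuteUpdate.
func (s *statement) GetOptionInt(key string) (int64, error) {
	switch key {
	case optBytesProcessed:
		return s.lastBytesProcessed, nil
	case optBytesBilled:
		return s.lastBytesBilled, nil
	}
	return 0, fmt.Errorf("unknown option %q", key)
}

func main() {
	succeed := func(_ string, stats *jobStats) (int64, error) {
		stats.bytesProcessed = 5 << 30
		stats.bytesBilled = 5 << 30
		return 0, nil
	}
	fail := func(string, *jobStats) (int64, error) { return -1, errJobFailed }

	stmt := &statement{query: "CREATE TABLE t AS SELECT 1", run: succeed}
	if _, err := stmt.ExecuteUpdate(); err != nil {
		panic(err)
	}
	n, _ := stmt.GetOptionInt(optBytesProcessed)
	fmt.Println("after success:", n)

	// A second call resets the stats before running, so a failed job
	// does not leave the previous call's values behind.
	stmt.run = fail
	_, err := stmt.ExecuteUpdate()
	n, _ = stmt.GetOptionInt(optBytesProcessed)
	fmt.Println("after failure:", n, err)
}
```

Resetting before the query runs means a consumer never attributes one statement's bytes to the next, at the cost of losing the previous values as soon as a new `ExecuteUpdate` starts.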