Commit ea5cabe

anandgupta42 and claude authored
fix: BigQuery finops SQL — correct INFORMATION_SCHEMA columns + multi-region support (#739)
`finops_query_history` was failing 100% on BigQuery with `Unrecognized name: error_message at [11:5]`. Telemetry traced it to one session looping 76 times over 2.5 hours. Three separate bugs in the BQ query-history template, plus a broader region-locking issue across the whole finops module.

Fixes in `packages/opencode/src/altimate/native/finops/query-history.ts` (`BIGQUERY_HISTORY_SQL`):

- `error_message` does not exist as a top-level column in `INFORMATION_SCHEMA.JOBS`. Replace with `error_result.message AS error_message`.
- `error_result.reason AS error_code` in place of the previous hardcoded NULL (more useful, and matches the struct that actually exists).
- `total_rows` is a PARTITIONS column, not JOBS. Replace with `CAST(NULL AS INT64) AS rows_produced` so the downstream summary loop doesn't error.
- BQ's `state` returns `'DONE'`, not `'SUCCESS'`. `getQueryHistory()` counts any `execution_status != 'SUCCESS'` as a failure, so bare `state AS execution_status` was flagging every completed BQ job as failed. Derive `CASE WHEN error_result IS NULL THEN 'SUCCESS' ELSE 'FAILED' END` to match the other warehouse templates.

Multi-region support across all 5 finops modules:

- All BQ `INFORMATION_SCHEMA` queries were hardcoded to `` `region-US.INFORMATION_SCHEMA.*` ``, making the tools unusable for non-US BigQuery projects. Replace with a `{region}` placeholder and interpolate from the connection's configured `location` at runtime.
- New shared helper `finops/bq-utils.ts` exposes `sanitizeBqRegion` (allowlist `[a-z0-9-]`, trim hyphens, cap at 64 chars, fall back to `us`), `interpolateBqRegion` (uses `replaceAll` so future JOIN-across-views templates stay safe), and `bqRegionFor(warehouse)`.
- Covers query-history, credit-analyzer (3 templates), warehouse-advisor (2 templates), role-access, and unused-resources. Snowflake and Databricks paths are untouched.
- The new `bqRegion?` parameter is optional on every build helper and only consumed inside `whType === "bigquery"` branches; regression tests assert that Snowflake and Databricks SQL still contains `QUERY_HISTORY` / `system.query.history` and never `region-`.

Tests:

- New `schema-finops-dbt.test.ts` assertions: column-level regression guards (no bare `error_message` / `total_rows` / `state as execution_status`), a table-driven test that no finops BQ template contains `region-US`, region-threading tests for all 5 build helpers, sanitizer coverage (injection, hyphen-trim, length cap, non-string input), and Snowflake/Databricks regression guards. 75 pass (up from 66).
- New `finops-bigquery-e2e.test.ts`: skipIf-gated on `ALTIMATE_CODE_CONN_BIGQUERY_TEST`, mirrors the Snowflake pattern. Verifies result shape, no SQL parse errors, the derived `execution_status`, error_count math, and a bad-region graceful-failure path. 10/10 pass against real BigQuery (US region).
- Full altimate suite: 2917/2917 pass (was 2908). Typecheck clean.

Reviewed via /consensus:code-review by GPT-5.4, Gemini 3.1 Pro, and DeepSeek V3.2; all three APPROVE on round 1 after incorporating their MAJOR findings (missing tests, missed files, sanitizer hardening).

Closes #738

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
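To make the `state` bug concrete, here is a minimal TypeScript sketch of the failure counting described above. The `JobRow` shape and helper names are illustrative stand-ins, not the project's actual types; the real code reads rows returned by the warehouse connector.

```typescript
// Hypothetical row shape: BigQuery jobs expose `state` plus an `error_result`
// STRUCT that is NULL on success.
type JobRow = { state: string; error_result: { reason: string; message: string } | null }

// Old template: `state as execution_status`. BigQuery reports 'DONE' for every
// finished job, success or not, so nothing ever equals 'SUCCESS'.
const oldStatus = (row: JobRow): string => row.state

// Fixed template, mirroring:
// CASE WHEN error_result IS NULL THEN 'SUCCESS' ELSE 'FAILED' END
const newStatus = (row: JobRow): string => (row.error_result === null ? "SUCCESS" : "FAILED")

// The summary loop counts every non-'SUCCESS' row as a failure.
const countFailures = (rows: JobRow[], status: (r: JobRow) => string): number =>
  rows.filter((r) => status(r) !== "SUCCESS").length

const rows: JobRow[] = [
  { state: "DONE", error_result: null }, // completed cleanly
  { state: "DONE", error_result: { reason: "invalidQuery", message: "Unrecognized name" } },
]

console.log(countFailures(rows, oldStatus)) // 2: every completed job looks failed
console.log(countFailures(rows, newStatus)) // 1: only the real error
```

This is why the tool reported a 100% failure rate even when the underlying jobs succeeded.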
1 parent a3503b0 commit ea5cabe

8 files changed

Lines changed: 496 additions & 33 deletions

packages/opencode/src/altimate/native/finops/bq-utils.ts (new file)
```ts
/**
 * Shared BigQuery helpers for the finops module.
 *
 * INFORMATION_SCHEMA queries on BigQuery must be region-qualified:
 * `region-<location>.INFORMATION_SCHEMA.<view>`. The finops tools read
 * the connection's configured `location` (e.g. "us", "eu", "us-central1") and
 * interpolate it into each SQL template via a `{region}` placeholder.
 *
 * Kept in its own file so the sanitize + interpolate logic doesn't drift across
 * query-history, credit-analyzer, warehouse-advisor, role-access, and
 * unused-resources.
 */

import * as Registry from "../connections/registry"

/**
 * Sanitize a BigQuery region/location string for safe interpolation into a
 * region-qualified INFORMATION_SCHEMA reference. The result is always safe to
 * inject into `region-<result>.INFORMATION_SCHEMA.X`: the allow-list
 * [a-z0-9-] cannot close the backtick context or introduce SQL delimiters.
 *
 * Transformations:
 * - lowercase + trim
 * - strip anything outside [a-z0-9-]
 * - trim leading/trailing hyphens (BQ region names never start or end with -)
 * - cap length at 64 chars (BQ region names are short; this guards against
 *   pathological inputs)
 * - fall back to "us" on empty input (historical default)
 */
export function sanitizeBqRegion(location: unknown): string {
  const raw = typeof location === "string" ? location : ""
  const cleaned = raw
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9-]/g, "")
    .replace(/^-+|-+$/g, "")
    .slice(0, 64)
  return cleaned || "us"
}

/**
 * Substitute the `{region}` placeholder in a BQ SQL template with the sanitized
 * region for a given warehouse config. Uses replaceAll so future templates that
 * reference multiple region-qualified views (e.g. JOINs) are handled safely.
 */
export function interpolateBqRegion(sqlTemplate: string, bqRegion?: unknown): string {
  return sqlTemplate.replaceAll("{region}", sanitizeBqRegion(bqRegion))
}

/**
 * Resolve the BigQuery `location` for a registered warehouse. Returns undefined
 * when the warehouse is not BigQuery or has no location set; callers pass the
 * result through `sanitizeBqRegion`, which defaults to "us".
 */
export function bqRegionFor(warehouse: string): unknown {
  return Registry.getConfig(warehouse)?.location
}
```
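A standalone demonstration of the sanitizer's edge cases. The function body below restates the committed logic so the snippet runs on its own:

```typescript
// Self-contained restatement of sanitizeBqRegion from bq-utils.ts.
function sanitizeBqRegion(location: unknown): string {
  const raw = typeof location === "string" ? location : ""
  const cleaned = raw
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9-]/g, "")
    .replace(/^-+|-+$/g, "")
    .slice(0, 64)
  return cleaned || "us"
}

console.log(sanitizeBqRegion("EU"))                     // "eu": lowercased
console.log(sanitizeBqRegion(" us-central1 "))          // "us-central1": trimmed, valid chars kept
console.log(sanitizeBqRegion("us`.x; DROP TABLE y;--")) // "usxdroptabley": delimiters stripped
console.log(sanitizeBqRegion("--eu--"))                 // "eu": leading/trailing hyphens trimmed
console.log(sanitizeBqRegion(null))                     // "us": non-string falls back to default
```

Because nothing outside `[a-z0-9-]` survives, the result can never close the backtick context in `` `region-<result>.INFORMATION_SCHEMA.X` ``, which is what makes the later `replaceAll` interpolation safe.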

packages/opencode/src/altimate/native/finops/credit-analyzer.ts

Lines changed: 24 additions & 12 deletions
```diff
@@ -5,6 +5,7 @@
  */

 import * as Registry from "../connections/registry"
+import { bqRegionFor, interpolateBqRegion } from "./bq-utils"
 import type {
   CreditAnalysisParams,
   CreditAnalysisResult,
@@ -80,7 +81,7 @@ SELECT
   0 as credits_cloud,
   COUNT(*) as query_count,
   AVG(total_bytes_billed) / 1099511627776.0 * 5.0 as avg_credits_per_query
-FROM \`region-US.INFORMATION_SCHEMA.JOBS\`
+FROM \`region-{region}.INFORMATION_SCHEMA.JOBS\`
 WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL ? DAY)
   AND job_type = 'QUERY'
   AND state = 'DONE'
@@ -97,7 +98,7 @@ SELECT
   0 as total_cloud_credits,
   COUNT(DISTINCT DATE(creation_time)) as active_days,
   AVG(total_bytes_billed) / 1099511627776.0 * 5.0 as avg_daily_credits
-FROM \`region-US.INFORMATION_SCHEMA.JOBS\`
+FROM \`region-{region}.INFORMATION_SCHEMA.JOBS\`
 WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL ? DAY)
   AND job_type = 'QUERY'
   AND state = 'DONE'
@@ -115,7 +116,7 @@ SELECT
   0 as rows_produced,
   total_bytes_billed / 1099511627776.0 * 5.0 as credits_used,
   start_time
-FROM \`region-US.INFORMATION_SCHEMA.JOBS\`
+FROM \`region-{region}.INFORMATION_SCHEMA.JOBS\`
 WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL ? DAY)
   AND job_type = 'QUERY'
   AND state = 'DONE'
@@ -191,7 +192,7 @@ function getWhType(warehouse: string): string {
 }

 function buildCreditUsageSql(
-  whType: string, days: number, limit: number, warehouseFilter?: string,
+  whType: string, days: number, limit: number, warehouseFilter?: string, bqRegion?: unknown,
 ): { sql: string; binds: any[] } | null {
   if (whType === "snowflake") {
     const binds: any[] = [-days]
@@ -203,33 +204,42 @@
     }
   }
   if (whType === "bigquery") {
-    return { sql: BIGQUERY_CREDIT_USAGE_SQL, binds: [days, limit] }
+    return {
+      sql: interpolateBqRegion(BIGQUERY_CREDIT_USAGE_SQL, bqRegion),
+      binds: [days, limit],
+    }
   }
   if (whType === "databricks") {
     return { sql: DATABRICKS_CREDIT_USAGE_SQL, binds: [days, limit] }
   }
   return null
 }

-function buildCreditSummarySql(whType: string, days: number): { sql: string; binds: any[] } | null {
+function buildCreditSummarySql(whType: string, days: number, bqRegion?: unknown): { sql: string; binds: any[] } | null {
   if (whType === "snowflake") {
     return { sql: SNOWFLAKE_CREDIT_SUMMARY_SQL, binds: [-days] }
   }
   if (whType === "bigquery") {
-    return { sql: BIGQUERY_CREDIT_SUMMARY_SQL, binds: [days] }
+    return {
+      sql: interpolateBqRegion(BIGQUERY_CREDIT_SUMMARY_SQL, bqRegion),
+      binds: [days],
+    }
   }
   if (whType === "databricks") {
     return { sql: DATABRICKS_CREDIT_SUMMARY_SQL, binds: [days] }
   }
   return null
 }

-function buildExpensiveSql(whType: string, days: number, limit: number): { sql: string; binds: any[] } | null {
+function buildExpensiveSql(whType: string, days: number, limit: number, bqRegion?: unknown): { sql: string; binds: any[] } | null {
   if (whType === "snowflake") {
     return { sql: SNOWFLAKE_EXPENSIVE_SQL, binds: [-days, limit] }
   }
   if (whType === "bigquery") {
-    return { sql: BIGQUERY_EXPENSIVE_SQL, binds: [days, limit] }
+    return {
+      sql: interpolateBqRegion(BIGQUERY_EXPENSIVE_SQL, bqRegion),
+      binds: [days, limit],
+    }
   }
   if (whType === "databricks") {
     return { sql: DATABRICKS_EXPENSIVE_SQL, binds: [days, limit] }
@@ -295,9 +305,10 @@ export async function analyzeCredits(params: CreditAnalysisParams): Promise<Cred
   const whType = getWhType(params.warehouse)
   const days = params.days ?? 30
   const limit = params.limit ?? 50
+  const bqRegion = whType === "bigquery" ? bqRegionFor(params.warehouse) : undefined

-  const dailyBuilt = buildCreditUsageSql(whType, days, limit, params.warehouse_filter)
-  const summaryBuilt = buildCreditSummarySql(whType, days)
+  const dailyBuilt = buildCreditUsageSql(whType, days, limit, params.warehouse_filter, bqRegion)
+  const summaryBuilt = buildCreditSummarySql(whType, days, bqRegion)

   if (!dailyBuilt || !summaryBuilt) {
     return {
@@ -346,8 +357,9 @@ export async function getExpensiveQueries(params: ExpensiveQueriesParams): Promi
   const whType = getWhType(params.warehouse)
   const days = params.days ?? 7
   const limit = params.limit ?? 20
+  const bqRegion = whType === "bigquery" ? bqRegionFor(params.warehouse) : undefined

-  const built = buildExpensiveSql(whType, days, limit)
+  const built = buildExpensiveSql(whType, days, limit, bqRegion)
   if (!built) {
     return {
       success: false,
```
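The credit-analyzer templates convert `total_bytes_billed` to a credit figure via `/ 1099511627776.0 * 5.0`. The divisor is 2^40 bytes (1 TiB); the `5.0` multiplier is presumably the classic $5-per-TiB BigQuery on-demand rate, so "credits" effectively reads as estimated dollars (that interpretation is my assumption; the source does not name the unit). The arithmetic, mirrored in TypeScript:

```typescript
const BYTES_PER_TIB = 1_099_511_627_776 // 2 ** 40

// Mirrors the SQL expression total_bytes_billed / 1099511627776.0 * 5.0.
// Assumption: 5.0 models a $5-per-TiB on-demand rate; the templates call the
// result "credits" without specifying a unit.
function creditsForBytesBilled(bytesBilled: number): number {
  return (bytesBilled / BYTES_PER_TIB) * 5.0
}

console.log(creditsForBytesBilled(BYTES_PER_TIB))     // 5: one TiB billed
console.log(creditsForBytesBilled(BYTES_PER_TIB / 2)) // 2.5: half a TiB
console.log(creditsForBytesBilled(0))                 // 0: cached/metadata queries bill nothing
```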

packages/opencode/src/altimate/native/finops/query-history.ts

Lines changed: 19 additions & 7 deletions
```diff
@@ -5,6 +5,7 @@
  */

 import * as Registry from "../connections/registry"
+import { bqRegionFor, interpolateBqRegion, sanitizeBqRegion } from "./bq-utils"
 import type { QueryHistoryParams, QueryHistoryResult } from "../types"

 // ---------------------------------------------------------------------------
@@ -59,6 +60,12 @@ ORDER BY total_exec_time DESC
 LIMIT {limit}
 `

+// BigQuery's INFORMATION_SCHEMA.JOBS does not have top-level `error_message`,
+// `error_code`, or `total_rows` columns. Errors live under the `error_result`
+// STRUCT ({reason, location, message, debug_info}) and per-row counts aren't
+// exposed at the jobs level. `state` returns 'DONE' (not 'SUCCESS'), so we
+// derive execution_status from error_result to stay consistent with the other
+// warehouse templates and the summary loop in getQueryHistory().
 const BIGQUERY_HISTORY_SQL = `
 SELECT
   job_id as query_id,
@@ -67,16 +74,16 @@ SELECT
   user_email as user_name,
   '' as warehouse_name,
   reservation_id as warehouse_size,
-  state as execution_status,
-  NULL as error_code,
-  error_message,
+  CASE WHEN error_result IS NULL THEN 'SUCCESS' ELSE 'FAILED' END as execution_status,
+  error_result.reason as error_code,
+  error_result.message as error_message,
   start_time,
   end_time,
   TIMESTAMP_DIFF(end_time, start_time, SECOND) as execution_time_sec,
   total_bytes_billed as bytes_scanned,
-  total_rows as rows_produced,
+  CAST(NULL AS INT64) as rows_produced,
   0 as credits_used_cloud_services
-FROM \`region-US.INFORMATION_SCHEMA.JOBS\`
+FROM \`region-{region}.INFORMATION_SCHEMA.JOBS\`
 WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL ? DAY)
 ORDER BY creation_time DESC
 LIMIT ?
@@ -146,6 +153,7 @@ function buildHistoryQuery(
   limit: number,
   user?: string,
   warehouseFilter?: string,
+  bqRegion?: unknown,
 ): { sql: string; binds: any[] } | null {
   if (whType === "snowflake") {
     const binds: any[] = [-days]
@@ -161,7 +169,10 @@
     return { sql: POSTGRES_HISTORY_SQL.replace("{limit}", String(Math.floor(Number(limit)))), binds: [] }
   }
   if (whType === "bigquery") {
-    return { sql: BIGQUERY_HISTORY_SQL, binds: [days, limit] }
+    return {
+      sql: interpolateBqRegion(BIGQUERY_HISTORY_SQL, bqRegion),
+      binds: [days, limit],
+    }
   }
   if (whType === "databricks") {
     return { sql: DATABRICKS_HISTORY_SQL, binds: [days, limit] }
@@ -202,8 +213,9 @@ export async function getQueryHistory(params: QueryHistoryParams): Promise<Query
   const whType = getWhType(params.warehouse)
   const days = params.days ?? 7
   const limit = params.limit ?? 100
+  const bqRegion = whType === "bigquery" ? bqRegionFor(params.warehouse) : undefined

-  const built = buildHistoryQuery(whType, days, limit, params.user, params.warehouse_filter)
+  const built = buildHistoryQuery(whType, days, limit, params.user, params.warehouse_filter, bqRegion)
   if (!built) {
     return {
       success: false,
```

packages/opencode/src/altimate/native/finops/role-access.ts

Lines changed: 6 additions & 4 deletions
```diff
@@ -5,6 +5,7 @@
  */

 import * as Registry from "../connections/registry"
+import { bqRegionFor, interpolateBqRegion } from "./bq-utils"
 import type {
   RoleGrantsParams,
   RoleGrantsResult,
@@ -75,7 +76,7 @@ SELECT
   'NO' as grant_option,
   '' as granted_by,
   '' as created_on
-FROM \`region-US.INFORMATION_SCHEMA.OBJECT_PRIVILEGES\`
+FROM \`region-{region}.INFORMATION_SCHEMA.OBJECT_PRIVILEGES\`
 WHERE 1=1
 {grantee_filter}
 ORDER BY object_type, object_name
@@ -123,7 +124,7 @@ function rowsToRecords(result: { columns: string[]; rows: any[][] }): Record<str
 }

 function buildGrantsSql(
-  whType: string, role?: string, objectName?: string, limit: number = 100,
+  whType: string, role?: string, objectName?: string, limit: number = 100, bqRegion?: unknown,
 ): { sql: string; binds: any[] } | null {
   if (whType === "snowflake") {
     const binds: any[] = []
@@ -142,7 +143,7 @@
     const granteeF = role ? (binds.push(role), "AND grantee = ?") : ""
     binds.push(limit)
     return {
-      sql: BIGQUERY_GRANTS_SQL.replace("{grantee_filter}", granteeF),
+      sql: interpolateBqRegion(BIGQUERY_GRANTS_SQL.replace("{grantee_filter}", granteeF), bqRegion),
       binds,
     }
   }
@@ -165,8 +166,9 @@
 export async function queryGrants(params: RoleGrantsParams): Promise<RoleGrantsResult> {
   const whType = getWhType(params.warehouse)
   const limit = params.limit ?? 100
+  const bqRegion = whType === "bigquery" ? bqRegionFor(params.warehouse) : undefined

-  const built = buildGrantsSql(whType, params.role, params.object_name, limit)
+  const built = buildGrantsSql(whType, params.role, params.object_name, limit, bqRegion)
   if (!built) {
     return {
       success: false,
```
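The `granteeF` line above uses a comma expression, `role ? (binds.push(role), "AND grantee = ?") : ""`, so the filter fragment and its bind value are appended together and can never drift apart. A self-contained sketch of that pattern (function name and SQL skeleton are hypothetical, only the trick matches the real builder):

```typescript
// Sketch of the conditional-bind pattern in buildGrantsSql: the comma
// expression pushes the bind value, then evaluates to the SQL fragment.
function buildGrantsFilter(role: string | undefined, limit: number): { sql: string; binds: unknown[] } {
  const binds: unknown[] = []
  const granteeFilter = role ? (binds.push(role), "AND grantee = ?") : ""
  binds.push(limit) // limit is always the last placeholder
  const sql = `SELECT ... WHERE 1=1 ${granteeFilter} ORDER BY object_name LIMIT ?`
  return { sql, binds }
}

console.log(buildGrantsFilter("analyst", 100).binds)   // [ 'analyst', 100 ]
console.log(buildGrantsFilter(undefined, 100).binds)   // [ 100 ]: no role, no extra bind
```

The payoff: every `?` placeholder in `sql` has exactly one matching entry in `binds`, in order, whether or not the optional filter is present.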

packages/opencode/src/altimate/native/finops/unused-resources.ts

Lines changed: 4 additions & 2 deletions
```diff
@@ -5,6 +5,7 @@
  */

 import * as Registry from "../connections/registry"
+import { bqRegionFor, interpolateBqRegion } from "./bq-utils"
 import type {
   UnusedResourcesParams,
   UnusedResourcesResult,
@@ -90,7 +91,7 @@ SELECT
   size_bytes,
   TIMESTAMP_MILLIS(last_modified_time) as last_altered,
   creation_time as created
-FROM \`region-US.INFORMATION_SCHEMA.TABLE_STORAGE\`
+FROM \`region-{region}.INFORMATION_SCHEMA.TABLE_STORAGE\`
 WHERE NOT deleted
   AND last_modified_time < UNIX_MILLIS(TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL ? DAY))
 ORDER BY size_bytes DESC
@@ -186,7 +187,8 @@ export async function findUnusedResources(params: UnusedResourcesParams): Promis
     }
   } else if (whType === "bigquery") {
     try {
-      const result = await connector.execute(BIGQUERY_UNUSED_TABLES_SQL, limit, [days, limit])
+      const sql = interpolateBqRegion(BIGQUERY_UNUSED_TABLES_SQL, bqRegionFor(params.warehouse))
+      const result = await connector.execute(sql, limit, [days, limit])
       unusedTables = rowsToRecords(result)
     } catch (e) {
       errors.push(`Could not query unused tables: ${e}`)
```

packages/opencode/src/altimate/native/finops/warehouse-advisor.ts

Lines changed: 14 additions & 8 deletions
```diff
@@ -5,6 +5,7 @@
  */

 import * as Registry from "../connections/registry"
+import { bqRegionFor, interpolateBqRegion } from "./bq-utils"
 import type {
   WarehouseAdvisorParams,
   WarehouseAdvisorResult,
@@ -58,7 +59,7 @@ SELECT
   0 as avg_queue_load,
   MAX(period_slot_ms / 1000.0) as peak_queue_load,
   COUNT(*) as sample_count
-FROM \`region-US.INFORMATION_SCHEMA.JOBS_TIMELINE\`
+FROM \`region-{region}.INFORMATION_SCHEMA.JOBS_TIMELINE\`
 WHERE period_start >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY)
 GROUP BY reservation_id
 ORDER BY avg_concurrency DESC
@@ -73,7 +74,7 @@ SELECT
   APPROX_QUANTILES(TIMESTAMP_DIFF(end_time, start_time, MILLISECOND), 100)[OFFSET(95)] / 1000.0 as p95_time_sec,
   AVG(total_bytes_billed) as avg_bytes_scanned,
   SUM(total_bytes_billed) / 1099511627776.0 * 5.0 as total_credits
-FROM \`region-US.INFORMATION_SCHEMA.JOBS\`
+FROM \`region-{region}.INFORMATION_SCHEMA.JOBS\`
 WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY)
   AND job_type = 'QUERY'
   AND state = 'DONE'
@@ -127,16 +128,20 @@ function getWhType(warehouse: string): string {
   return wh?.type || "unknown"
 }

-function buildLoadSql(whType: string, days: number): string | null {
+function buildLoadSql(whType: string, days: number, bqRegion?: unknown): string | null {
   if (whType === "snowflake") return SNOWFLAKE_LOAD_SQL.replace("{days}", String(days))
-  if (whType === "bigquery") return BIGQUERY_LOAD_SQL.replace("{days}", String(days))
+  if (whType === "bigquery") {
+    return interpolateBqRegion(BIGQUERY_LOAD_SQL.replace("{days}", String(days)), bqRegion)
+  }
   if (whType === "databricks") return DATABRICKS_LOAD_SQL.replace(/{days}/g, String(days))
   return null
 }

-function buildSizingSql(whType: string, days: number): string | null {
+function buildSizingSql(whType: string, days: number, bqRegion?: unknown): string | null {
   if (whType === "snowflake") return SNOWFLAKE_SIZING_SQL.replace("{days}", String(days))
-  if (whType === "bigquery") return BIGQUERY_SIZING_SQL.replace("{days}", String(days))
+  if (whType === "bigquery") {
+    return interpolateBqRegion(BIGQUERY_SIZING_SQL.replace("{days}", String(days)), bqRegion)
+  }
   if (whType === "databricks") return DATABRICKS_SIZING_SQL.replace(/{days}/g, String(days))
   return null
 }
@@ -217,9 +222,10 @@ function generateSizingRecommendations(
 export async function adviseWarehouse(params: WarehouseAdvisorParams): Promise<WarehouseAdvisorResult> {
   const whType = getWhType(params.warehouse)
   const days = params.days ?? 14
+  const bqRegion = whType === "bigquery" ? bqRegionFor(params.warehouse) : undefined

-  const loadSql = buildLoadSql(whType, days)
-  const sizingSql = buildSizingSql(whType, days)
+  const loadSql = buildLoadSql(whType, days, bqRegion)
+  const sizingSql = buildSizingSql(whType, days, bqRegion)

   if (!loadSql || !sizingSql) {
     return {
```
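The sizing template computes p95 runtime with `APPROX_QUANTILES(ms, 100)[OFFSET(95)] / 1000.0`. BigQuery's version is approximate; an exact TypeScript analog of the same rank-based idea (my own sketch, not code from this commit):

```typescript
// Exact p95 in seconds, analogous to APPROX_QUANTILES(ms, 100)[OFFSET(95)] / 1000.0.
// APPROX_QUANTILES(x, 100) returns 101 boundaries; OFFSET(95) is roughly the
// value at the 95% rank of the sorted data.
function p95Seconds(durationsMs: number[]): number {
  const sorted = [...durationsMs].sort((a, b) => a - b)
  const idx = Math.min(sorted.length - 1, Math.floor(0.95 * (sorted.length - 1)))
  return sorted[idx] / 1000.0
}

const durations = Array.from({ length: 100 }, (_, i) => (i + 1) * 1000) // 1s..100s in ms
console.log(p95Seconds(durations)) // 95
```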
