
Commit 4ce1014

anandgupta42 and claude committed
fix: [#691] harden sql_explain and altimate_core_validate input handling
`sql_explain` and `altimate_core_validate` produced high failure rates and unhelpful errors because their input validation did not match the parameter contracts they advertised.

**`sql_explain`**

- Reject empty, whitespace-only, and placeholder-only SQL (`?`, `:var`, `$1`) before hitting the warehouse. Verbatim driver errors like "SQL compilation error: syntax error line 1 at position 8 unexpected ?." are unrecoverable for LLM callers — catch the common cases early with actionable guidance.
- Reject empty / placeholder warehouse names (`?`, `:wh`, `""`) with a pointer to `warehouse_list` rather than letting the Registry emit "Connection ? not found. Available: (none)".
- Introduce `buildExplainPlan()` that returns `{prefix, actuallyAnalyzed}` so the handler can pick dialect-correct syntax AND report the true plan mode. Fixes the previous lie where Snowflake users who asked for ANALYZE saw `analyzed: true` in metadata even though we silently sent plain `EXPLAIN USING TEXT`.
- Warehouse coverage is now explicit: Snowflake (`EXPLAIN USING TEXT`), PostgreSQL (`EXPLAIN (ANALYZE, BUFFERS)`), Redshift (plain `EXPLAIN` — does NOT support ANALYZE), MySQL / MariaDB / DuckDB (`EXPLAIN ANALYZE`), Databricks / Spark (`EXPLAIN FORMATTED`), ClickHouse (`EXPLAIN`). BigQuery, Oracle, and SQL Server return an empty prefix and a clear "not supported via statement prefix — this warehouse needs a different plan mechanism" error instead of issuing a broken statement.
- New `translateExplainError()` helper rewrites common driver errors into actionable messages: connection-not-found becomes "Available warehouses: a, b, c", unsubstituted-`?` compile errors become "inline the literal value", permission denials call out role grants, etc.
- Intentional non-change: we do NOT scan mid-query for stray `?` because PostgreSQL JSONB uses `?`, `?|`, `?&` as legitimate operators. Only the bare-placeholder check runs, which has no false positives.
**`altimate_core_validate`**

- Remove the hard-gate that returned "No schema provided..." before the engine ran. The parameter schema already declared `schema_path` and `schema_context` as optional, and the Rust core (`altimate-core`) has accepted empty schemas for a long time via the existing `schemaOrEmpty` helper — the hard-gate was gratuitous, and every call without schema failed identically with no recovery path.
- When schema is absent, the engine still runs so syntax / dialect findings come through, and the tool surfaces a clear warning in the output plus `has_schema: false` in metadata so callers can distinguish full validation from schema-less runs. The title becomes `Validate: VALID (no schema)` for unambiguous signaling.
- Errors thrown by the engine are now classified as `error_class: engine_failure` to keep them distinct from user-input problems.

**Tests**

- New `test/altimate/tools/sql-explain.test.ts`: input validation (empty, placeholder, short), warehouse-name validation, the full `buildExplainPlan` matrix (including the Redshift / BigQuery / Oracle regressions above), error translation paths, and Tool.execute integration with a mocked dispatcher.
- New `test/altimate/tools/altimate-core-validate.test.ts`: schema resolution (none / `schema_context` / `schema_path` / empty object), validation error classification, output formatting, and a regression guard asserting the dispatcher is called even when no schema is given.
- Updated `test/altimate/tool-error-propagation.test.ts` to reflect the new contract: the validate tool now runs the engine without schema and returns `success: true` with a warning, instead of early-returning an error. The regression fixture keeps telemetry coverage.

Closes #691

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
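The SQL-rejection rules in the first two `sql_explain` bullets live in the tool layer, which is not among the diffs shown on this page. A minimal sketch of the bare-placeholder check, with a hypothetical helper name and messages (the real ones may differ):

```typescript
// Hypothetical sketch of the tool-layer check described in the commit
// message. It rejects only SQL that is empty or IS a bare placeholder, so
// PostgreSQL JSONB operators (`?`, `?|`, `?&`) mid-query are never flagged.
function isUnusableSql(sql: string): string | undefined {
  const trimmed = sql.trim()
  if (trimmed === "") return "SQL is empty or whitespace-only. Pass a complete statement."
  // Bare placeholders: `?`, `:name`, or `$1` with nothing else around them.
  if (/^(\?|:[A-Za-z_][A-Za-z0-9_]*|\$\d+)$/.test(trimmed)) {
    return `SQL is a bare placeholder (${trimmed}). Inline the literal value before calling sql_explain.`
  }
  return undefined // looks usable; real errors surface at the warehouse
}
```

Because the regex is anchored to the whole trimmed string, a query like `SELECT data ? 'key' FROM events` passes through untouched, which is the intentional non-change noted above.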
1 parent f030bf8 commit 4ce1014

6 files changed

Lines changed: 1070 additions & 37 deletions

File tree

packages/opencode/src/altimate/native/connections/register.ts

Lines changed: 168 additions & 9 deletions
@@ -182,6 +182,147 @@ function getWarehouseType(warehouseName?: string): string {
   return Registry.getConfig(warehouseName)?.type ?? "unknown"
 }
 
+/**
+ * Dialect-aware EXPLAIN plan describing how a warehouse should be asked for
+ * a query plan.
+ *
+ * - `prefix`: the statement prefix to prepend to the SQL ("" means this
+ *   warehouse cannot be EXPLAIN'd via a simple statement prefix — the
+ *   handler will return a clear error instead of issuing a broken query)
+ * - `actuallyAnalyzed`: whether the prefix actually runs the query and
+ *   reports runtime stats. The caller requested `analyze=true`, but some
+ *   warehouses silently downgrade to plan-only (e.g. Snowflake), so we
+ *   reflect the true mode back to the user in the result envelope.
+ */
+export interface ExplainPlan {
+  prefix: string
+  actuallyAnalyzed: boolean
+}
+
+/**
+ * Build a dialect-aware EXPLAIN plan for the given warehouse type.
+ *
+ * Warehouse-specific notes:
+ * - Snowflake: `EXPLAIN USING TEXT`. No ANALYZE variant — silently downgraded.
+ * - PostgreSQL: `EXPLAIN` or `EXPLAIN (ANALYZE, BUFFERS)` for runtime stats.
+ * - Redshift: plain `EXPLAIN` only. Redshift does NOT support ANALYZE.
+ * - MySQL / MariaDB: `EXPLAIN` or `EXPLAIN ANALYZE` (MySQL 8+).
+ * - DuckDB: `EXPLAIN` or `EXPLAIN ANALYZE`.
+ * - Databricks / Spark: `EXPLAIN` or `EXPLAIN FORMATTED`.
+ * - ClickHouse: `EXPLAIN` (no ANALYZE form accepted via statement prefix).
+ * - BigQuery: uses a dry-run API instead of any EXPLAIN statement. Not
+ *   supported via this code path — return empty prefix.
+ * - Oracle: `EXPLAIN PLAN FOR` stores the plan in PLAN_TABLE rather than
+ *   returning it, so a bare prefix does not produce output. Not supported.
+ * - SQL Server: requires `SET SHOWPLAN_TEXT ON` as a session setting.
+ *   Not supported via statement prefix.
+ */
+export function buildExplainPlan(warehouseType: string | undefined, analyze: boolean): ExplainPlan {
+  const type = (warehouseType ?? "").toLowerCase()
+  switch (type) {
+    case "snowflake":
+      // Snowflake: no ANALYZE; USING TEXT returns a readable plan.
+      return { prefix: "EXPLAIN USING TEXT", actuallyAnalyzed: false }
+    case "postgres":
+    case "postgresql":
+      return analyze
+        ? { prefix: "EXPLAIN (ANALYZE, BUFFERS)", actuallyAnalyzed: true }
+        : { prefix: "EXPLAIN", actuallyAnalyzed: false }
+    case "redshift":
+      // Redshift supports only plain EXPLAIN — no ANALYZE/BUFFERS.
+      return { prefix: "EXPLAIN", actuallyAnalyzed: false }
+    case "mysql":
+    case "mariadb":
+      // MySQL 8+ supports `EXPLAIN ANALYZE`. Older versions will reject it
+      // at the warehouse and the error will be surfaced to the caller.
+      return analyze
+        ? { prefix: "EXPLAIN ANALYZE", actuallyAnalyzed: true }
+        : { prefix: "EXPLAIN", actuallyAnalyzed: false }
+    case "duckdb":
+      return analyze
+        ? { prefix: "EXPLAIN ANALYZE", actuallyAnalyzed: true }
+        : { prefix: "EXPLAIN", actuallyAnalyzed: false }
+    case "databricks":
+    case "spark":
+      // Databricks/Spark `EXPLAIN FORMATTED` returns a more detailed plan but
+      // does not actually execute the query — still a plan-only mode.
+      return { prefix: analyze ? "EXPLAIN FORMATTED" : "EXPLAIN", actuallyAnalyzed: false }
+    case "clickhouse":
+      return { prefix: "EXPLAIN", actuallyAnalyzed: false }
+    case "bigquery":
+      // BigQuery has no EXPLAIN statement — the correct answer is a dry-run
+      // job via the BigQuery API, which this tool does not support today.
+      return { prefix: "", actuallyAnalyzed: false }
+    case "oracle":
+      // Oracle's `EXPLAIN PLAN FOR` stores the plan in PLAN_TABLE and returns
+      // no rows, so a statement-prefix approach does not work. Callers would
+      // need a follow-up `SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY)`.
+      return { prefix: "", actuallyAnalyzed: false }
+    case "mssql":
+    case "sqlserver":
+      // SQL Server requires `SET SHOWPLAN_TEXT ON` as a session setting,
+      // not a statement prefix. Not supported via this code path.
+      return { prefix: "", actuallyAnalyzed: false }
+    default:
+      // Unknown warehouse — fall back to plain EXPLAIN and let the driver
+      // surface a real error if it does not understand the syntax.
+      return analyze
+        ? { prefix: "EXPLAIN ANALYZE", actuallyAnalyzed: true }
+        : { prefix: "EXPLAIN", actuallyAnalyzed: false }
+  }
+}
+
+/** @deprecated Use buildExplainPlan for richer metadata about the plan mode. */
+export function buildExplainPrefix(warehouseType: string | undefined, analyze: boolean): string {
+  return buildExplainPlan(warehouseType, analyze).prefix
+}
+
+/**
+ * Translate a raw warehouse/Registry error into an actionable message.
+ *
+ * Verbatim driver errors like "Connection ? not found. Available: (none)" are
+ * useless to an LLM caller that has no way to fix them. Detect common patterns
+ * and rewrite them with concrete next steps.
+ */
+export function translateExplainError(
+  raw: unknown,
+  warehouseName: string | undefined,
+  availableWarehouses: string[],
+): string {
+  const msg = raw instanceof Error ? raw.message : String(raw)
+
+  // Connection/warehouse lookup failures.
+  if (/Connection\s+.+\s+not found/i.test(msg) || /warehouse\s+.+\s+not found/i.test(msg)) {
+    if (availableWarehouses.length > 0) {
+      return `Warehouse ${JSON.stringify(warehouseName ?? "")} is not configured. Available warehouses: ${availableWarehouses.join(", ")}. Pass one of these as the 'warehouse' parameter, or omit it to use the default.`
+    }
+    return "No warehouses are configured. Run `warehouse_add` to set one up before calling sql_explain."
+  }
+
+  // Unsubstituted-placeholder compilation errors from Snowflake / PG / MySQL.
+  if (/syntax error.*unexpected\s*\?/i.test(msg) || /syntax error.*position\s+\d+.*\?/i.test(msg)) {
+    return "SQL compilation error: the query contains an unsubstituted `?` placeholder. sql_explain does not support parameterized queries — inline the literal value before calling."
+  }
+
+  // Snowflake-specific: ANALYZE not supported.
+  if (/EXPLAIN\s+ANALYZE/i.test(msg) && /not\s+supported/i.test(msg)) {
+    return "This warehouse does not support EXPLAIN ANALYZE. Retry with analyze=false to get a plan-only EXPLAIN."
+  }
+
+  // Permission denials.
+  if (/permission\s+denied/i.test(msg) || /access\s+denied/i.test(msg) || /insufficient\s+privilege/i.test(msg)) {
+    return `The warehouse user lacks permission to EXPLAIN this query. Check role grants for the objects referenced in the SQL. Original error: ${msg}`
+  }
+
+  // Generic SQL compilation error fallback.
+  if (/SQL compilation error/i.test(msg) || /syntax error/i.test(msg)) {
+    return `SQL compilation error in the query passed to sql_explain. Fix the SQL and retry. Original error: ${msg}`
+  }
+
+  // Fall through: return the original but with a hint to aid debugging.
+  return msg
+}
+
 /** Register all connection-related handlers. Exported for test re-registration. */
 export function registerAll(): void {
 
@@ -261,10 +402,11 @@ register("sql.execute", async (params: SqlExecuteParams): Promise<SqlExecuteResu
 
 // --- sql.explain ---
 register("sql.explain", async (params: SqlExplainParams): Promise<SqlExplainResult> => {
+  let warehouseName: string | undefined
+  let warehouseType: string | undefined
   try {
-    const warehouseName = params.warehouse
+    warehouseName = params.warehouse
     let connector
-    let warehouseType: string | undefined
 
     if (warehouseName) {
       connector = await Registry.get(warehouseName)
@@ -276,13 +418,25 @@ register("sql.explain", async (params: SqlExplainParams): Promise<SqlExplainResu
       }
       connector = await Registry.get(warehouses[0].name)
       warehouseType = warehouses[0].type
+      warehouseName = warehouses[0].name
     }
 
-    const explainPrefix = params.analyze ? "EXPLAIN ANALYZE" : "EXPLAIN"
-    const result = await connector.execute(
-      `${explainPrefix} ${params.sql}`,
-      10000,
-    )
+    const plan = buildExplainPlan(warehouseType, params.analyze ?? false)
+    if (plan.prefix === "") {
+      // Warehouse does not support EXPLAIN via a simple statement prefix —
+      // return a clear error rather than sending a bare statement to the
+      // driver. BigQuery needs a dry-run job, SQL Server needs SHOWPLAN_TEXT,
+      // Oracle needs DBMS_XPLAN, etc.
+      return {
+        success: false,
+        plan_rows: [],
+        error: `sql_explain is not supported for warehouse type ${JSON.stringify(warehouseType)}. This warehouse requires a different plan mechanism (e.g. dry-run API, SET SHOWPLAN_TEXT ON, or DBMS_XPLAN) that sql_explain cannot issue directly.`,
+        warehouse_type: warehouseType,
+        analyzed: false,
+      }
+    }
+
+    const result = await connector.execute(`${plan.prefix} ${params.sql}`, 10000)
 
     const planText = result.rows.map((r) => String(r[0])).join("\n")
     const planRows = result.rows.map((r, i) => ({
@@ -295,13 +449,18 @@ register("sql.explain", async (params: SqlExplainParams): Promise<SqlExplainResu
       plan_text: planText,
       plan_rows: planRows,
       warehouse_type: warehouseType,
-      analyzed: params.analyze ?? false,
+      // Reflect the true mode: if Snowflake silently downgraded ANALYZE to a
+      // plan-only EXPLAIN, the caller should know the plan is estimated, not
+      // observed.
+      analyzed: plan.actuallyAnalyzed,
     }
   } catch (e) {
+    const available = Registry.list().warehouses.map((w) => w.name)
     return {
       success: false,
       plan_rows: [],
-      error: String(e),
+      error: translateExplainError(e, warehouseName, available),
+      warehouse_type: warehouseType,
       analyzed: params.analyze ?? false,
     }
   }
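As a standalone sanity check of `buildExplainPlan` above, here is an illustrative trim covering just the cases the commit message calls out (Snowflake's silent downgrade, Redshift, BigQuery) plus the generic fallback, together with the caller-side composition the handler performs. This is a shortened copy for demonstration, not the shipped function:

```typescript
// Illustrative trim of buildExplainPlan from the diff above.
interface ExplainPlan {
  prefix: string
  actuallyAnalyzed: boolean
}

function buildExplainPlan(warehouseType: string | undefined, analyze: boolean): ExplainPlan {
  switch ((warehouseType ?? "").toLowerCase()) {
    case "snowflake":
      // No ANALYZE variant; the requested mode is silently downgraded, and
      // actuallyAnalyzed reports that honestly.
      return { prefix: "EXPLAIN USING TEXT", actuallyAnalyzed: false }
    case "redshift":
      // Plain EXPLAIN only, even when analyze=true was requested.
      return { prefix: "EXPLAIN", actuallyAnalyzed: false }
    case "bigquery":
      // Needs a dry-run job via the API, not a statement prefix.
      return { prefix: "", actuallyAnalyzed: false }
    default:
      return analyze
        ? { prefix: "EXPLAIN ANALYZE", actuallyAnalyzed: true }
        : { prefix: "EXPLAIN", actuallyAnalyzed: false }
  }
}

// Caller-side composition, mirroring the handler: an empty prefix means
// "unsupported", and no bare statement is ever sent to the driver.
function composeExplain(warehouseType: string, analyze: boolean, sql: string): string | null {
  const plan = buildExplainPlan(warehouseType, analyze)
  return plan.prefix === "" ? null : `${plan.prefix} ${sql}`
}
```

The `{prefix, actuallyAnalyzed}` pair is the whole point of the refactor: the prefix drives the statement that is sent, while `actuallyAnalyzed` drives the `analyzed` field reported back to the caller, so the two can no longer disagree.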

packages/opencode/src/altimate/tools/altimate-core-validate.ts

Lines changed: 20 additions & 17 deletions
@@ -5,24 +5,14 @@ import type { Telemetry } from "../telemetry"
 
 export const AltimateCoreValidateTool = Tool.define("altimate_core_validate", {
   description:
-    "Validate SQL syntax and schema references. Checks if tables/columns exist in the schema and if SQL is valid for the target dialect. IMPORTANT: Provide schema_context or schema_path — without schema, all table/column references will report as 'not found'.",
+    "Validate SQL syntax and schema references. Checks if tables/columns exist in the schema and if SQL is valid for the target dialect. If no schema_path or schema_context is provided, validation still runs but schema-dependent checks (table/column existence) are skipped — syntax and dialect checks still apply. For full validation, run `schema_inspect` first on the referenced tables or pass `schema_context` inline.",
   parameters: z.object({
     sql: z.string().describe("SQL query to validate"),
     schema_path: z.string().optional().describe("Path to YAML/JSON schema file"),
     schema_context: z.record(z.string(), z.any()).optional().describe("Inline schema definition"),
   }),
-  async execute(args, ctx) {
+  async execute(args, _ctx) {
     const hasSchema = !!(args.schema_path || (args.schema_context && Object.keys(args.schema_context).length > 0))
-    const noSchema = !hasSchema
-    if (noSchema) {
-      const error =
-        "No schema provided. Provide schema_context or schema_path so table/column references can be resolved."
-      return {
-        title: "Validate: NO SCHEMA",
-        metadata: { success: false, valid: false, has_schema: false, error },
-        output: `Error: ${error}`,
-      }
-    }
     try {
       const result = await Dispatcher.call("altimate_core.validate", {
         sql: args.sql,
@@ -38,27 +28,34 @@ export const AltimateCoreValidateTool = Tool.define("altimate_core_validate", {
       }))
       // altimate_change end
       return {
-        title: `Validate: ${data.valid ? "VALID" : "INVALID"}`,
+        title: `Validate: ${data.valid ? "VALID" : "INVALID"}${hasSchema ? "" : " (no schema)"}`,
         metadata: {
           success: true, // engine ran — validation errors are findings, not failures
           valid: data.valid,
           has_schema: hasSchema,
           ...(error && { error }),
           ...(findings.length > 0 && { findings }),
         },
-        output: formatValidate(data),
+        output: formatValidate(data, hasSchema),
       }
     } catch (e) {
       const msg = e instanceof Error ? e.message : String(e)
       return {
         title: "Validate: ERROR",
-        metadata: { success: false, valid: false, has_schema: hasSchema, error: msg },
+        metadata: { success: false, valid: false, has_schema: hasSchema, error: msg, error_class: "engine_failure" },
         output: `Failed: ${msg}`,
       }
     }
   },
 })
 
+// Exported for unit tests.
+export const _altimateCoreValidateInternal = {
+  extractValidationErrors,
+  classifyValidationError,
+  formatValidate,
+}
+
 function extractValidationErrors(data: Record<string, any>): string | undefined {
   if (Array.isArray(data.errors) && data.errors.length > 0) {
     const msgs = data.errors.map((e: any) => e.message ?? String(e)).filter(Boolean)
@@ -77,14 +74,20 @@ function classifyValidationError(message: string): string {
   return "validation_error"
 }
 
-function formatValidate(data: Record<string, any>): string {
+function formatValidate(data: Record<string, any>, hasSchema: boolean): string {
   if (data.error) return `Error: ${data.error}`
-  if (data.valid) return "SQL is valid."
+  const schemaNote = hasSchema
+    ? ""
+    : "\n\nNote: No schema was provided, so table/column existence checks were skipped. " +
+      "To catch missing tables or columns, run `schema_inspect` on the referenced tables first " +
+      "or pass `schema_context` inline with {tables: {...}}."
+  if (data.valid) return `SQL is valid.${schemaNote}`
 
   const lines = ["Validation failed:\n"]
   for (const err of data.errors ?? []) {
     lines.push(`  • ${err.message}`)
     if (err.location) lines.push(`    at line ${err.location.line}`)
   }
+  if (schemaNote) lines.push(schemaNote)
   return lines.join("\n")
 }
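The no-schema note threads through both return paths of `formatValidate`. A condensed standalone copy (same control flow as the diff above, with the note text shortened for brevity) makes the behavior easy to check:

```typescript
// Condensed copy of formatValidate from the diff above; the note string is
// shortened, but the control flow is the same.
interface ValidateData {
  valid: boolean
  error?: string
  errors?: { message: string; location?: { line: number } }[]
}

function formatValidate(data: ValidateData, hasSchema: boolean): string {
  if (data.error) return `Error: ${data.error}`
  // Appended to BOTH the valid and invalid outputs when no schema was given.
  const schemaNote = hasSchema
    ? ""
    : "\n\nNote: No schema was provided, so table/column existence checks were skipped."
  if (data.valid) return `SQL is valid.${schemaNote}`

  const lines = ["Validation failed:\n"]
  for (const err of data.errors ?? []) {
    lines.push(`  • ${err.message}`)
    if (err.location) lines.push(`    at line ${err.location.line}`)
  }
  if (schemaNote) lines.push(schemaNote)
  return lines.join("\n")
}
```

Appending the note on the invalid path too matters: a caller seeing only syntax findings should not conclude that the referenced tables were verified to exist.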
