Skip to content

Fix SQLInterpolator treating ? in comments and literals as placeholders#1424

Open
samikshya-db wants to merge 5 commits intodatabricks:mainfrom
samikshya-db:fix/issue-1331-sql-interpolator-comments
Open

Fix SQLInterpolator treating ? in comments and literals as placeholders#1424
samikshya-db wants to merge 5 commits intodatabricks:mainfrom
samikshya-db:fix/issue-1331-sql-interpolator-comments

Conversation

@samikshya-db
Copy link
Copy Markdown
Collaborator

Summary

Fixes #1331.

When supportManyParameters=1, SQLInterpolator.interpolateSQL split the SQL on every ? and counted every occurrence as a placeholder. Question marks inside line/block comments, single-quoted strings, double-quoted identifiers, and backtick-quoted identifiers were all counted, producing spurious "Parameter count does not match" errors for queries such as:

-- does this work?
select 'hello?', * from mytable where id = ?

This had 3 ? characters but only 1 real placeholder, so the driver rejected the perfectly valid binding of 1 parameter.

What changed

  • Added SqlCommentParser.findPlaceholderPositions(sql) which walks the SQL with the existing comment/literal state machine and returns the source indices of ? characters that appear in State.NORMAL only.
  • Reworked SQLInterpolator.interpolateSQL to use those positions: a single pass that slices the original SQL between placeholders and substitutes formatted parameter values. Comments, string literals, and quoted identifiers are preserved verbatim in the output.
  • Removed the now-redundant countPlaceholders helper that scanned every ?.

DatabricksParameterMetaData.countParameters already used SqlCommentParser.forEachNonCommentChar and is unaffected.

Test plan

  • Existing SQLInterpolatorTest cases (basic interpolation, mixed types, escaped quotes, mismatch errors, etc.) all pass.
  • Added unit tests for SQLInterpolator covering ? inside line comments, block comments, single-quoted strings, double-quoted identifiers, backtick identifiers, and a combined case.
  • Added unit tests for SqlCommentParser.findPlaceholderPositions covering null/empty input, basic positions, comments, literals, quoted identifiers, and escaped quotes.
  • mvn spotless:apply clean.

samikshya-db and others added 2 commits April 27, 2026 15:36
When supportManyParameters=1, SQLInterpolator split the SQL on every '?',
so question marks inside line/block comments, string literals, and quoted
identifiers were counted as parameter placeholders. This caused
"Parameter count does not match" errors for queries like:

  -- does this work?
  select 'hello?', * from mytable where id = ?

Add SqlCommentParser.findPlaceholderPositions to locate only real '?'
markers (state == NORMAL) and use it from SQLInterpolator. Fixes databricks#1331.

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
- Add IndexedSqlCharConsumer overload to forEachNonCommentChar so callers
  can recover the source-string index of each emitted character. Refactor
  findPlaceholderPositions to delegate to it, removing ~70 lines of
  duplicated state-machine logic.
- Apply the same comment/literal-aware fix to surroundPlaceholdersWithQuotes,
  which previously used a regex that ignored comments, double-quoted
  identifiers, and backtick identifiers. A '?' inside any of those is now
  preserved instead of being wrapped in single quotes.
- Drop the now-unused Matcher/Pattern imports from SQLInterpolator.
- Add tests for CRLF line comments, nested block comments containing '?',
  adjacent placeholders, leading/trailing placeholders, escaped backticks,
  and the new surroundPlaceholdersWithQuotes cases.

Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Question marks in comments and string literals interpreted as parameters with supportManyParameters

1 participant