Skip to content

Commit 3b5d8af

Browse files
committed
Replace property descriptors with conventional bindparam_string override
The previous revision used property descriptors (with a no-op setter) to force bindtemplate / compilation_bindtemplate. That pattern has zero precedent in SQLAlchemy's built-in dialects (none of MySQL, PostgreSQL, SQLite, MSSQL, or Oracle override these templates), and making the instance attribute un-settable is subtle enough to slow down a future reader. Swap to the conventional shape Oracle uses (cx_oracle.py:781): override bindparam_string for the compile-time render path. For the execute-time IN-clause expansion path — which bypasses bindparam_string and reads self.bindtemplate directly from _literal_execute_expanding_parameter — plain attribute assignment in __init__ after super() is sufficient, because super() sets self.bindtemplate near the end of its __init__ (line 1466 in sqlalchemy 2.0.43) after compilation has already run with compilation_bindtemplate. Result: two well-understood extension points, no descriptors, same end-to-end behavior verified in the comprehensive empirical suite (29/29 passing against the live warehouse, including the IN-clause post-compile expansion case that motivated the two-path coverage). Co-authored-by: Isaac
1 parent 9c736f0 commit 3b5d8af

2 files changed

Lines changed: 83 additions & 63 deletions

File tree

src/databricks/sqlalchemy/_ddl.py

Lines changed: 67 additions & 47 deletions
Original file line numberDiff line numberDiff line change
@@ -84,55 +84,75 @@ def get_column_specification(self, column, **kwargs):
8484

8585

8686
class DatabricksStatementCompiler(compiler.SQLCompiler):
87-
# Override the rendered marker format so every bind parameter is
88-
# wrapped in backticks (`` :`name` ``) at render time. Databricks
89-
# named parameter markers accept two identifier forms per
90-
# ``SqlBaseParser.g4``: a bare ``IDENTIFIER`` (``[A-Za-z_][A-Za-z0-9_]*``)
91-
# or a ``quotedIdentifier`` wrapped in backticks. DataFrame-origin
92-
# column names frequently contain hyphens (e.g. ``col-with-hyphen``),
93-
# which SQLAlchemy would otherwise render verbatim as an invalid bare
94-
# marker ``:col-with-hyphen`` — the parser splits on ``-`` and reports
95-
# UNBOUND_SQL_PARAMETER.
96-
#
97-
# Backticks are valid for *every* identifier (plain names included),
98-
# verified empirically against a Databricks SQL warehouse. Setting the
99-
# template here rather than overriding ``bindparam_string`` ensures the
100-
# quoting applies uniformly across every rendering path — the normal
101-
# bindparam_string, the escape-from path, and crucially the
102-
# ``_literal_execute_expanding_parameter`` path used for IN clauses,
103-
# which builds its own expanded markers directly from this template.
104-
#
105-
# The backticks are SQL-side *quoting* only: the parameter's logical
106-
# name is still the text between them, so the params dict passed to
107-
# the driver keeps the original unquoted key — ``escaped_bind_names``
108-
# is left empty and ``construct_params`` passes keys through unchanged.
109-
110-
# Fixed template for this dialect. We use properties (with a setter
111-
# that ignores the incoming value) because SQLAlchemy's SQLCompiler
112-
# assigns ``self.bindtemplate`` / ``self.compilation_bindtemplate``
113-
# from ``BIND_TEMPLATES[dialect.paramstyle]`` inside its own
114-
# ``__init__`` — which is also where statement compilation runs. A
115-
# subclass override in ``__init__`` runs too late, and a class-level
116-
# attribute is shadowed by super's instance assignment. A property
117-
# descriptor intercepts both the read (forcing our value) and the
118-
# write (no-op), so the template is fixed regardless of order.
119-
_BACKTICKED_BIND_TEMPLATE = ":`%(name)s`"
120-
121-
@property
122-
def bindtemplate(self):
123-
return self._BACKTICKED_BIND_TEMPLATE
124-
125-
@bindtemplate.setter
126-
def bindtemplate(self, _value):
127-
pass
87+
"""Render every bind parameter marker wrapped in backticks.
88+
89+
Databricks named parameter markers accept two forms (per the Spark
90+
SQL grammar ``SqlBaseParser.g4``): a bare ``IDENTIFIER``
91+
(``[A-Za-z_][A-Za-z0-9_]*``) or a ``quotedIdentifier`` wrapped in
92+
backticks. DataFrame-origin column names frequently contain hyphens
93+
(e.g. ``col-with-hyphen``), which SQLAlchemy would otherwise render
94+
verbatim as an invalid bare marker ``:col-with-hyphen`` — the parser
95+
splits on ``-`` and reports ``UNBOUND_SQL_PARAMETER``.
96+
97+
Backticks are valid for *every* identifier (verified end-to-end
98+
against a Databricks SQL warehouse), so we wrap unconditionally.
99+
This mirrors Oracle's ``:"name"`` approach to the same grammar
100+
constraint (see ``dialects/oracle/cx_oracle.py::OracleCompiler_cx_oracle``).
101+
The backticks are SQL-side *quoting* only: the parameter's logical
102+
name is still the text between them, so the params dict passed to
103+
the driver keeps the original unquoted key. We leave
104+
``escaped_bind_names`` untouched, so ``construct_params`` passes
105+
keys through unchanged.
106+
107+
Two render paths need covering:
108+
109+
* **Compile-time rendering** — statement compilation calls
110+
``bindparam_string`` via ``self.process(statement)``. Oracle
111+
overrides this same method (``cx_oracle.py:781``) to quote-wrap
112+
names, and we do the same here.
113+
* **Execute-time IN expansion** — SQLAlchemy's
114+
``_literal_execute_expanding_parameter`` builds expanded markers
115+
(``:col-name_1, :col-name_2, ...``) directly from
116+
``self.bindtemplate``, bypassing ``bindparam_string``. We swap
117+
``bindtemplate`` after super's ``__init__`` to ensure that path
118+
also emits backticked markers.
119+
"""
128120

129-
@property
130-
def compilation_bindtemplate(self):
131-
return self._BACKTICKED_BIND_TEMPLATE
121+
_BACKTICKED_BIND_TEMPLATE = ":`%(name)s`"
132122

133-
@compilation_bindtemplate.setter
134-
def compilation_bindtemplate(self, _value):
135-
pass
123+
def __init__(self, *args, **kwargs):
124+
super().__init__(*args, **kwargs)
125+
# Super sets self.bindtemplate from BIND_TEMPLATES[paramstyle]
126+
# near the end of its __init__ (for execute-time use, including
127+
# IN-clause expansion). Override it here so the expansion path
128+
# renders backticked markers too.
129+
self.bindtemplate = self._BACKTICKED_BIND_TEMPLATE
130+
131+
def bindparam_string(self, name, **kw):
132+
# Fall through to super for the specialized render paths it
133+
# already handles (POSTCOMPILE placeholder; escape-map translation
134+
# for chars like '.', '[', ']', etc. that super rewrites before
135+
# rendering). For those cases super's own rendering is correct;
136+
# we only intercept the primary path where the name is passed
137+
# through unmodified into the standard bindtemplate.
138+
if kw.get("post_compile", False) or kw.get("escaped_from"):
139+
return super().bindparam_string(name, **kw)
140+
141+
accumulate = kw.get("accumulate_bind_names")
142+
if accumulate is not None:
143+
accumulate.add(name)
144+
visited = kw.get("visited_bindparam")
145+
if visited is not None:
146+
visited.append(name)
147+
148+
ret = self._BACKTICKED_BIND_TEMPLATE % {"name": name}
149+
150+
bindparam_type = kw.get("bindparam_type")
151+
if bindparam_type is not None and self.dialect._bind_typing_render_casts:
152+
type_impl = bindparam_type._unwrapped_dialect_impl(self.dialect)
153+
if type_impl.render_bind_cast:
154+
ret = self.render_bind_cast(bindparam_type, type_impl, ret)
155+
return ret
136156

137157
def limit_clause(self, select, **kw):
138158
"""Identical to the default implementation of SQLCompiler.limit_clause except it writes LIMIT ALL instead of LIMIT -1,

tests/test_local/test_ddl.py

Lines changed: 16 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -246,12 +246,13 @@ def test_many_special_characters_in_column_names(self):
246246
assert f":`{n}`" in sql, f"bind marker missing for {n!r}"
247247
assert params[n] == values[n]
248248

249-
def test_sqlalchemy_escape_map_chars_still_work(self):
250-
"""SQLAlchemy's default ``bindname_escape_characters`` translates
251-
a few chars (``.`` → ``_``, ``[`` → ``_``, ``]`` → ``_``, ``:`` →
252-
``C``, ``%`` → ``P``) before our backtick wrapping applies. That's
253-
fine: the translated bind name is still backtick-quoted, and
254-
``escaped_bind_names`` translates the params dict key to match.
249+
def test_chars_in_sqlalchemy_default_escape_map_still_work(self):
250+
"""Characters that SQLAlchemy's default ``bindname_escape_characters``
251+
would normally pre-translate (``.``, ``[``, ``]``, ``:``, ``%``)
252+
render through our override verbatim inside the backtick-quoted
253+
marker. Backticks make the pre-translation unnecessary — the
254+
params dict key sent to the driver matches the column name
255+
exactly, which is simpler than the escape-map indirection.
255256
Verified end-to-end against a live warehouse.
256257
"""
257258
metadata = MetaData()
@@ -273,17 +274,16 @@ def test_sqlalchemy_escape_map_chars_still_work(self):
273274
},
274275
)
275276
sql = str(compiled)
276-
# The bind name is translated by the escape map, then backticked
277-
assert ":`col_with_dot`" in sql
278-
assert ":`col_bracket_`" in sql
279-
assert ":`colCcolon`" in sql
280-
assert ":`colPpercent`" in sql
281-
282-
# The driver receives translated keys (escaped_bind_names tells
283-
# construct_params how to rewrite the incoming dict).
277+
assert ":`col.with.dot`" in sql
278+
assert ":`col[bracket]`" in sql
279+
assert ":`col:colon`" in sql
280+
assert ":`col%percent`" in sql
281+
284282
params = compiled.construct_params()
285-
assert params["col_with_dot"] == "d"
286-
assert params["colCcolon"] == "c"
283+
assert params["col.with.dot"] == "d"
284+
assert params["col:colon"] == "c"
285+
assert params["col[bracket]"] == "b"
286+
assert params["col%percent"] == "p"
287287

288288
def test_unicode_column_names(self):
289289
"""Databricks allows arbitrary Unicode inside backtick-quoted

0 commit comments

Comments
 (0)