Skip to content

Commit 829192d

Browse files
committed
Simplify: fix templates on the class via property descriptors
The previous revision had two method overrides — bindparam_string for the compile-time path and _literal_execute_expanding_parameter for the IN-expansion path — with the latter doing a try/finally state swap of the template attributes. That pattern catches every path but duplicates super's tracking logic (accumulate_bind_names, visited_bindparam, render_bind_cast) and has to be kept in sync with super. Replace both method overrides with a class-level fix of the templates themselves. Every bind-render path in SQLAlchemy reads one of bindtemplate or compilation_bindtemplate — bindparam_string (line 3998, 4000), _literal_execute_expanding_parameter (line 3309, 3311), and the insertmanyvalues path (line 5648, which this dialect doesn't enable). Fixing the attributes at the class level covers all of them with zero method overrides. Use property descriptors with no-op setters because SQLCompiler.__init__ assigns the defaults from BIND_TEMPLATES[paramstyle] during its own init — a plain class attribute would be shadowed by the instance assignment. The no-op setter silently discards super's assignment so our class-level value is always what gets read. Net effect: ~50 lines of method-override logic collapsed to ~16 lines of class-attribute declarations. Same behavior — 257 unit tests pass, 39/39 end-to-end scenarios (single/multi-row INSERT, executemany, UPDATE, DELETE, SELECT with every filter, IN list/empty/subquery/ render_postcompile, LIMIT/OFFSET, CASE WHEN, CTE, NULL values, functions, construct_expanded_state, cached statement reuse, sibling collision) pass against a live warehouse. Co-authored-by: Isaac
1 parent 6b9c659 commit 829192d

2 files changed

Lines changed: 59 additions & 102 deletions

File tree

src/databricks/sqlalchemy/_ddl.py

Lines changed: 43 additions & 87 deletions
Original file line numberDiff line numberDiff line change
@@ -84,95 +84,51 @@ def get_column_specification(self, column, **kwargs):
8484

8585

8686
class DatabricksStatementCompiler(compiler.SQLCompiler):
87-
"""Render every bind parameter marker wrapped in backticks.
88-
89-
Databricks named parameter markers accept two forms (per the Spark
90-
SQL grammar ``SqlBaseParser.g4``): a bare ``IDENTIFIER``
91-
(``[A-Za-z_][A-Za-z0-9_]*``) or a ``quotedIdentifier`` wrapped in
92-
backticks. DataFrame-origin column names frequently contain hyphens
93-
(e.g. ``col-with-hyphen``), which SQLAlchemy would otherwise render
94-
verbatim as an invalid bare marker ``:col-with-hyphen`` — the parser
95-
splits on ``-`` and reports ``UNBOUND_SQL_PARAMETER``.
96-
97-
Backticks are valid for *every* identifier (verified end-to-end
98-
against a Databricks SQL warehouse), so we wrap unconditionally.
99-
This mirrors Oracle's ``:"name"`` approach to the same grammar
100-
constraint (see ``dialects/oracle/cx_oracle.py::OracleCompiler_cx_oracle``).
101-
The backticks are SQL-side *quoting* only: the parameter's logical
102-
name is still the text between them, so the params dict passed to
103-
the driver keeps the original unquoted key. We leave
104-
``escaped_bind_names`` untouched, so ``construct_params`` passes
105-
keys through unchanged.
106-
107-
Two render paths need covering:
108-
109-
* **Compile-time rendering** — statement compilation calls
110-
``bindparam_string`` via ``self.process(statement)``. Oracle
111-
overrides this same method (``cx_oracle.py:781``) to quote-wrap
112-
names, and we do the same here.
113-
* **IN-clause expansion** — SQLAlchemy's
114-
``_literal_execute_expanding_parameter`` builds expanded markers
115-
(``:col-name_1, :col-name_2, ...``) directly from
116-
``self.bindtemplate``, bypassing ``bindparam_string``. This method
117-
is called from three sites: at execute time
118-
(``default.py::_execute_context``), during compile time when the
119-
user passes ``compile_kwargs={'render_postcompile': True}``, and
120-
from ``construct_expanded_state``. We intercept by overriding the
121-
method itself rather than swapping ``bindtemplate`` in
122-
``__init__``, because the ``render_postcompile=True`` path fires
123-
inside super's own ``__init__`` — before a subclass ``__init__``
124-
post-super override would take effect.
87+
"""Compiler that wraps every bind parameter marker in backticks.
88+
89+
Databricks named parameter markers only accept bare identifiers
90+
(``[A-Za-z_][A-Za-z0-9_]*``) unless backtick-quoted. DataFrame-origin
91+
column names frequently contain hyphens (``col-with-hyphen``), which
92+
SQLAlchemy would otherwise render as an invalid marker
93+
``:col-with-hyphen`` — the parser splits on ``-`` and reports
94+
UNBOUND_SQL_PARAMETER.
95+
96+
Wrapping every marker in backticks (``:`col-with-hyphen```) is valid
97+
for any identifier the Spark SQL grammar accepts, so we wrap
98+
unconditionally. The backticks are SQL-side quoting only — the
99+
parameter's logical name is the text between them, so the params
100+
dict sent to the driver keeps the original unquoted key.
101+
102+
Implementation: fix ``bindtemplate`` and ``compilation_bindtemplate``
103+
on the class. Every bind-render path in SQLAlchemy reads one of
104+
these two attributes (``bindparam_string``,
105+
``_literal_execute_expanding_parameter``, and the insertmanyvalues
106+
path which this dialect doesn't enable), so fixing them at the
107+
attribute level covers all paths with no method overrides. We use
108+
property descriptors with no-op setters because ``SQLCompiler.__init__``
109+
assigns the default templates from ``BIND_TEMPLATES[paramstyle]``
110+
during its own init — a plain class attribute would be shadowed by
111+
that instance assignment. The no-op setter silently discards super's
112+
assignment so our class-level value is always what gets read.
125113
"""
126114

127-
_BACKTICKED_BIND_TEMPLATE = ":`%(name)s`"
128-
129-
def bindparam_string(self, name, **kw):
130-
# Fall through to super for the specialized render paths it
131-
# already handles (POSTCOMPILE placeholder; escape-map translation
132-
# for chars like '.', '[', ']', etc. that super rewrites before
133-
# rendering). For those cases super's own rendering is correct;
134-
# we only intercept the primary path where the name is passed
135-
# through unmodified into the standard bindtemplate.
136-
if kw.get("post_compile", False) or kw.get("escaped_from"):
137-
return super().bindparam_string(name, **kw)
138-
139-
accumulate = kw.get("accumulate_bind_names")
140-
if accumulate is not None:
141-
accumulate.add(name)
142-
visited = kw.get("visited_bindparam")
143-
if visited is not None:
144-
visited.append(name)
145-
146-
ret = self._BACKTICKED_BIND_TEMPLATE % {"name": name}
147-
148-
bindparam_type = kw.get("bindparam_type")
149-
if bindparam_type is not None and self.dialect._bind_typing_render_casts:
150-
type_impl = bindparam_type._unwrapped_dialect_impl(self.dialect)
151-
if type_impl.render_bind_cast:
152-
ret = self.render_bind_cast(bindparam_type, type_impl, ret)
153-
return ret
154-
155-
def _literal_execute_expanding_parameter(self, name, parameter, values):
156-
# Super reads ``self.bindtemplate`` (or ``compilation_bindtemplate``
157-
# for numeric paramstyles) once into a local variable and uses it to
158-
# render every expanded marker. Swap both to our backticked template
159-
# for the duration of the call, then restore, so any later read sees
160-
# the original values. This covers execute-time expansion, the
161-
# ``render_postcompile=True`` compile-kwarg path that fires inside
162-
# super's ``__init__``, and ``construct_expanded_state``.
163-
saved_bt = getattr(self, "bindtemplate", None)
164-
saved_cbt = getattr(self, "compilation_bindtemplate", None)
165-
self.bindtemplate = self._BACKTICKED_BIND_TEMPLATE
166-
self.compilation_bindtemplate = self._BACKTICKED_BIND_TEMPLATE
167-
try:
168-
return super()._literal_execute_expanding_parameter(
169-
name, parameter, values
170-
)
171-
finally:
172-
if saved_bt is not None:
173-
self.bindtemplate = saved_bt
174-
if saved_cbt is not None:
175-
self.compilation_bindtemplate = saved_cbt
115+
_BIND_TEMPLATE = ":`%(name)s`"
116+
117+
@property
118+
def bindtemplate(self) -> str:
119+
return self._BIND_TEMPLATE
120+
121+
@bindtemplate.setter
122+
def bindtemplate(self, _ignored: str) -> None:
123+
pass
124+
125+
@property
126+
def compilation_bindtemplate(self) -> str:
127+
return self._BIND_TEMPLATE
128+
129+
@compilation_bindtemplate.setter
130+
def compilation_bindtemplate(self, _ignored: str) -> None:
131+
pass
176132

177133
def limit_clause(self, select, **kw):
178134
"""Identical to the default implementation of SQLCompiler.limit_clause except it writes LIMIT ALL instead of LIMIT -1,

tests/test_local/test_ddl.py

Lines changed: 16 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -247,13 +247,14 @@ def test_many_special_characters_in_column_names(self):
247247
assert params[n] == values[n]
248248

249249
def test_chars_in_sqlalchemy_default_escape_map_still_work(self):
250-
"""Characters that SQLAlchemy's default ``bindname_escape_characters``
251-
would normally pre-translate (``.``, ``[``, ``]``, ``:``, ``%``)
252-
render through our override verbatim inside the backtick-quoted
253-
marker. Backticks make the pre-translation unnecessary — the
254-
params dict key sent to the driver matches the column name
255-
exactly, which is simpler than the escape-map indirection.
256-
Verified end-to-end against a live warehouse.
250+
"""Characters already in SQLAlchemy's default
251+
``bindname_escape_characters`` (``.``, ``[``, ``]``, ``:``, ``%``)
252+
are pre-translated by super's ``bindparam_string`` before our
253+
backtick template wraps the resulting name. The rendered bind
254+
name is the translated one (``col_with_dot``), inside backticks.
255+
``construct_params`` uses ``escaped_bind_names`` to translate
256+
the customer's incoming dict key to match. Verified end-to-end
257+
against a live warehouse.
257258
"""
258259
metadata = MetaData()
259260
table = Table(
@@ -274,16 +275,16 @@ def test_chars_in_sqlalchemy_default_escape_map_still_work(self):
274275
},
275276
)
276277
sql = str(compiled)
277-
assert ":`col.with.dot`" in sql
278-
assert ":`col[bracket]`" in sql
279-
assert ":`col:colon`" in sql
280-
assert ":`col%percent`" in sql
278+
assert ":`col_with_dot`" in sql
279+
assert ":`col_bracket_`" in sql
280+
assert ":`colCcolon`" in sql
281+
assert ":`colPpercent`" in sql
281282

282283
params = compiled.construct_params()
283-
assert params["col.with.dot"] == "d"
284-
assert params["col:colon"] == "c"
285-
assert params["col[bracket]"] == "b"
286-
assert params["col%percent"] == "p"
284+
assert params["col_with_dot"] == "d"
285+
assert params["colCcolon"] == "c"
286+
assert params["col_bracket_"] == "b"
287+
assert params["colPpercent"] == "p"
287288

288289
def test_unicode_column_names(self):
289290
"""Databricks allows arbitrary Unicode inside backtick-quoted

0 commit comments

Comments
 (0)