-
Notifications
You must be signed in to change notification settings - Fork 2.1k
fix: SQLAlchemy 2.0 e2e fixes for Redshift, Oracle, Athena, MSSQL profiler #26411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
32d8092
f32a2b4
327dc55
555482e
1d5991a
0922db4
f0a4796
8ee6014
437d80b
ac87175
a0d7b54
f2df9a6
20c954c
1ec778a
4312d72
a7d1caa
e2537b9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,7 +15,7 @@ | |
| import hashlib | ||
| from typing import List, Optional, Union, cast | ||
|
|
||
| from sqlalchemy import Column, inspect, text | ||
| from sqlalchemy import Column, inspect, select, text | ||
| from sqlalchemy.orm import Query | ||
| from sqlalchemy.orm.util import AliasedClass | ||
| from sqlalchemy.schema import Table | ||
|
|
@@ -199,8 +199,7 @@ def get_dataset(self, column=None, **__) -> Union[type, AliasedClass]: | |
| and self.sample_config.profileSample == 100 | ||
| ): | ||
| if self.partition_details: | ||
| partitioned = self._partitioned_table() | ||
| return partitioned.cte(f"{self.get_sampler_table_name()}_partitioned") | ||
| return self._partitioned_table() | ||
|
|
||
| return self.raw_dataset | ||
|
|
||
|
|
@@ -306,9 +305,18 @@ def _rdn_sample_from_user_query(self) -> Query: | |
| f"{self.get_sampler_table_name()}_user_sampled" | ||
| ) | ||
|
|
||
| def _partitioned_table(self) -> Query: | ||
| """Return the Query object for partitioned tables""" | ||
| return self.get_partitioned_query() | ||
| def _partitioned_table(self): | ||
| """Return a CTE for partitioned tables. | ||
|
|
||
| Build the CTE using Core select() so it does not require an active Session. | ||
| """ | ||
| self.partition_details = cast(PartitionProfilerConfig, self.partition_details) | ||
| partition_filter = build_partition_predicate( | ||
| self.partition_details, | ||
| self.raw_dataset.__table__.c, | ||
| ) | ||
| stmt = select(self.raw_dataset).where(partition_filter) | ||
| return stmt.cte(f"{self.get_sampler_table_name()}_partitioned") | ||
|
|
||
| def get_partitioned_query(self, query=None) -> Query: | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 💡 Quality: get_partitioned_query return type annotation is incorrectAfter this change, Similarly, Suggested fix: Was this helpful? React with 👍 / 👎 | Reply |
||
| """Return the partitioned query""" | ||
|
|
@@ -322,8 +330,8 @@ def get_partitioned_query(self, query=None) -> Query: | |
| if query is not None: | ||
| return query.filter(partition_filter) | ||
|
|
||
| with self.session_factory() as client: | ||
| return client.query(self.raw_dataset).filter(partition_filter) | ||
| # Return a Core select so callers do not require an active Session | ||
| return select(self.raw_dataset).where(partition_filter) | ||
|
|
||
| def get_columns(self): | ||
| """get columns from entity""" | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR changes
ometa_to_sqa_ormto returnNonewhen a table has no columns (base.py:131-136), but none of its callers handle this. Specifically,SQAInterfaceMixin.build_table_orminsqa_mixin.py:125passes the result through, andSQASampler.__init__atsampler.py:74assigns it toself._tablewithout a None check. Later accesses likeself.raw_dataset.__table__(sampler.py:134) will raiseAttributeError: 'NoneType' object has no attribute '__table__'.The new guards in
ProfilerProcessor._runandSamplerProcessor._run(checkingrecord.entity.columns) should prevent this path in the normal pipeline, butometa_to_sqa_ormandbuild_table_ormare also called from other places (e.g., tests, integration paths). The function contract now promisesOptional[type]but callers still assume a non-None return.Suggested fix:
Was this helpful? React with 👍 / 👎 | Reply
gitar fixto apply this suggestion