Skip to content

fix(bigframes): Fix bugs compiling ambiguous ids and in subqueries#16617

Open
TrevorBergeron wants to merge 6 commits intomainfrom
tbergeron_isin_test_fixing
Open

fix(bigframes): Fix bugs compiling ambiguous ids and in subqueries#16617
TrevorBergeron wants to merge 6 commits intomainfrom
tbergeron_isin_test_fixing

Conversation

@TrevorBergeron
Copy link
Copy Markdown
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

@TrevorBergeron TrevorBergeron requested review from a team as code owners April 10, 2026 22:53
@TrevorBergeron TrevorBergeron requested review from sycai and removed request for a team April 10, 2026 22:53
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several updates, including adding a fillna(False) to isin operations, refactoring BigQuery query generation to use explicit table aliasing (_bf_source), and adjusting the SQL parser to handle table aliases before time travel clauses. Feedback suggests qualifying the wildcard selector in BigQuery queries for consistency and warns that the parser change might cause regressions for SQL dialects that expect aliases after table samples.

select_clause = "SELECT " + ", ".join(f"`{column}`" for column in columns)
select_clause = "SELECT " + ", ".join(f"`_bf_source`.`{column}`" for column in columns)
else:
select_clause = "SELECT *"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To maintain consistency with the qualified column selection in the if block and to further prevent ambiguity when this query is used as a subquery, consider qualifying the wildcard selector as well.

Suggested change
select_clause = "SELECT *"
select_clause = "SELECT _bf_source.*"

Comment on lines +4474 to +4478
alias = self._parse_table_alias(
alias_tokens=alias_tokens or self.TABLE_ALIAS_TOKENS
)
if alias:
this.set("alias", alias)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Moving the alias parsing to the top of _parse_table correctly addresses BigQuery's syntax requirements (where the alias precedes the FOR SYSTEM_TIME AS OF clause). However, this change unconditionally parses the alias before the table sample, which may break dialects that set ALIAS_POST_TABLESAMPLE = True. Since this is a vendored parser, if it's intended to support multiple dialects, consider making this move conditional on the dialect's settings or ensuring that the alias can still be parsed after the sample if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant