This repository was archived by the owner on Apr 1, 2026. It is now read-only.

Commit 2d9dad5

Merge remote-tracking branch 'github/main' into cte_extract2
2 parents 97a5a7f + d29a609 commit 2d9dad5

63 files changed: +1501 -1304 lines changed

.librarian/state.yaml

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
-image: us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/python-librarian-generator@sha256:1a2a85ab507aea26d787c06cc7979decb117164c81dd78a745982dfda80d4f68
+image: us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/python-librarian-generator@sha256:160860d189ff1c2f7515638478823712fa5b243e27ccc33a2728669fa1e2ed0c
 libraries:
   - id: bigframes
-    version: 2.36.0
+    version: 2.37.0
     last_generated_commit: ""
     apis: []
     source_roots:

CHANGELOG.md

Lines changed: 30 additions & 0 deletions
@@ -4,6 +4,36 @@
 
 [1]: https://pypi.org/project/bigframes/#history
 
+## [2.37.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.36.0...v2.37.0) (2026-03-03)
+
+
+### Documentation
+
+* Fix recall_score doc example (#2477) ([a6f499c1e225a962b53621158f9d4a19ca220ccd](https://github.com/googleapis/python-bigquery-dataframes/commit/a6f499c1e225a962b53621158f9d4a19ca220ccd))
+* add code sample and docstring for bpd.options.experiments.sql_compiler (#2474) ([867951bcabcff12e2fce88143b45d929d3237088](https://github.com/googleapis/python-bigquery-dataframes/commit/867951bcabcff12e2fce88143b45d929d3237088))
+* use direct API for image (#2465) ([8a1a82f7a0fd224f2b075c68ab116d1f580d1d82](https://github.com/googleapis/python-bigquery-dataframes/commit/8a1a82f7a0fd224f2b075c68ab116d1f580d1d82))
+* add bigframes default connection warning (#2471) ([f1bbba23667f01d3b8e7c51b18fe64641a4b135f](https://github.com/googleapis/python-bigquery-dataframes/commit/f1bbba23667f01d3b8e7c51b18fe64641a4b135f))
+* Move readme content to new User Guide section (#2464) ([61a948451baeb1caa323e721ad88b31c7cd0b3cb](https://github.com/googleapis/python-bigquery-dataframes/commit/61a948451baeb1caa323e721ad88b31c7cd0b3cb))
+* Skip inherited methods, use autosummary only for big classes (#2470) ([a9512498ef39b9d5260cad2ca0513c701a6d3592](https://github.com/googleapis/python-bigquery-dataframes/commit/a9512498ef39b9d5260cad2ca0513c701a6d3592))
+* Add code examples to configuration docstrings (#2352) ([3c21993e6fca474c32f3c2371c41ef2be146267e](https://github.com/googleapis/python-bigquery-dataframes/commit/3c21993e6fca474c32f3c2371c41ef2be146267e))
+
+
+### Features
+
+* Add cloud_function_cpus option to remote_function (#2475) ([4caf74ccaeb9608d91da864bb80eddf1148a1502](https://github.com/googleapis/python-bigquery-dataframes/commit/4caf74ccaeb9608d91da864bb80eddf1148a1502))
+* Support pd.col simple aggregates (#2480) ([cb00daabce49f067be8e16627166dda00d5d8134](https://github.com/googleapis/python-bigquery-dataframes/commit/cb00daabce49f067be8e16627166dda00d5d8134))
+* add display.render_mode to control DataFrame/Series visualization (#2413) ([7813eaa6fa2ae42943b90583e600c95beaf5d75e](https://github.com/googleapis/python-bigquery-dataframes/commit/7813eaa6fa2ae42943b90583e600c95beaf5d75e))
+* add support for Python 3.14 (#2232) ([c25a6d0151380dde74368a35e13deb7a930b494f](https://github.com/googleapis/python-bigquery-dataframes/commit/c25a6d0151380dde74368a35e13deb7a930b494f))
+* Support pd.col expressions with .loc and getitem (#2473) ([ae5c8b322765aef51eed016bfacaff5a7a917a7b](https://github.com/googleapis/python-bigquery-dataframes/commit/ae5c8b322765aef51eed016bfacaff5a7a917a7b))
+* add dt.tz_localize() (#2469) ([f70f93a1227add1627d522d7e55a37f42fc3549e](https://github.com/googleapis/python-bigquery-dataframes/commit/f70f93a1227add1627d522d7e55a37f42fc3549e))
+* Update bigquery.ai.generate_table output_schema to allow Mapping type (#2463) ([f7fd1895e64a133fe63eddeb90f57a42a35c29b2](https://github.com/googleapis/python-bigquery-dataframes/commit/f7fd1895e64a133fe63eddeb90f57a42a35c29b2))
+
+
+### Bug Fixes
+
+* upload local data through write API if nested JSONs detected (#2478) ([01dc5a34e09171351575d5cbdc9f301e505e1567](https://github.com/googleapis/python-bigquery-dataframes/commit/01dc5a34e09171351575d5cbdc9f301e505e1567))
+* allow IsInOp with same dtypes regardless nullable (#2466) ([1d81b414acbc964502ca624eae72cdb8c14e1576](https://github.com/googleapis/python-bigquery-dataframes/commit/1d81b414acbc964502ca624eae72cdb8c14e1576))
+
 ## [2.36.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.35.0...v2.36.0) (2026-02-17)
 
 
GEMINI.md

Lines changed: 0 additions & 5 deletions
This file was deleted.

bigframes/bigquery/_operations/ml.py

Lines changed: 33 additions & 0 deletions
@@ -480,6 +480,39 @@ def generate_text(
         return session.read_gbq_query(sql)
 
 
+@log_adapter.method_logger(custom_base_name="bigquery_ml")
+def get_insights(
+    model: Union[bigframes.ml.base.BaseEstimator, str, pd.Series],
+) -> dataframe.DataFrame:
+    """
+    Gets insights from a BigQuery ML model.
+
+    See the `BigQuery ML GET_INSIGHTS function syntax
+    <https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-get-insights>`_
+    for additional reference.
+
+    Args:
+        model (bigframes.ml.base.BaseEstimator, str, or pd.Series):
+            The model to get insights from.
+
+    Returns:
+        bigframes.pandas.DataFrame:
+            The insights.
+    """
+    import bigframes.pandas as bpd
+
+    model_name, session = utils.get_model_name_and_session(model)
+
+    sql = bigframes.core.sql.ml.get_insights(
+        model_name=model_name,
+    )
+
+    if session is None:
+        return bpd.read_gbq_query(sql)
+    else:
+        return session.read_gbq_query(sql)
+
+
 @log_adapter.method_logger(custom_base_name="bigquery_ml")
 def generate_embedding(
     model: Union[bigframes.ml.base.BaseEstimator, str, pd.Series],

bigframes/bigquery/_operations/sql.py

Lines changed: 2 additions & 5 deletions
@@ -20,7 +20,7 @@
 
 import google.cloud.bigquery
 
-import bigframes.core.compile.sqlglot.sqlglot_ir as sqlglot_ir
+from bigframes.core.compile.sqlglot import sql
 import bigframes.dtypes
 import bigframes.operations
 import bigframes.series
@@ -68,10 +68,7 @@ def sql_scalar(
     # Another benefit of this is that if there is a syntax error in the SQL
     # template, then this will fail with an error earlier in the process,
     # aiding users in debugging.
-    literals_sql = [
-        sqlglot_ir._literal(None, column.dtype).sql(dialect="bigquery")
-        for column in columns
-    ]
+    literals_sql = [sql.to_sql(sql.literal(None, column.dtype)) for column in columns]
     select_sql = sql_template.format(*literals_sql)
     dry_run_sql = f"SELECT {select_sql}"
 
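
The `sql_scalar` hunk above fills the user's SQL template with typed NULL literals so that the resulting query can be dry-run before any real data is involved. A minimal sketch of that substitution step in plain Python follows. The helper name `build_dry_run_sql` and the `CAST(NULL AS <type>)` rendering are assumptions made for illustration; the diff does not show what `sql.literal(None, dtype)` actually renders to.

```python
from typing import Sequence


def build_dry_run_sql(sql_template: str, column_types: Sequence[str]) -> str:
    """Fill a positional SQL template with typed NULL placeholders.

    Hypothetical helper: one placeholder is produced per input column, in
    the same order as the columns, mirroring the list comprehension in the
    diff above.
    """
    # Assumed rendering of a typed NULL literal (not confirmed by the diff).
    literals_sql = [f"CAST(NULL AS {sql_type})" for sql_type in column_types]
    select_sql = sql_template.format(*literals_sql)
    return f"SELECT {select_sql}"


dry_run_sql = build_dry_run_sql("{0} + {1}", ["INT64", "INT64"])
print(dry_run_sql)
```

Only the string produced here would be sent to a dry run; this sketch does not contact BigQuery.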

bigframes/bigquery/ml.py

Lines changed: 2 additions & 0 deletions
@@ -25,6 +25,7 @@
     explain_predict,
     generate_embedding,
     generate_text,
+    get_insights,
     global_explain,
     predict,
     transform,
@@ -39,4 +40,5 @@
     "transform",
     "generate_text",
     "generate_embedding",
+    "get_insights",
 ]

bigframes/core/col.py

Lines changed: 80 additions & 28 deletions
@@ -14,12 +14,18 @@
 from __future__ import annotations
 
 import dataclasses
-from typing import Any, Hashable
+from typing import Any, Hashable, Literal, TYPE_CHECKING
 
 import bigframes_vendored.pandas.core.col as pd_col
 
+from bigframes.core import agg_expressions, window_spec
 import bigframes.core.expression as bf_expression
 import bigframes.operations as bf_ops
+import bigframes.operations.aggregations as agg_ops
+
+if TYPE_CHECKING:
+    import bigframes.operations.datetimes as datetimes
+    import bigframes.operations.strings as strings
 
 
 # Not to be confused with the Expression class in `bigframes.core.expressions`
@@ -30,10 +36,26 @@ class Expression:
 
     _value: bf_expression.Expression
 
-    def _apply_unary(self, op: bf_ops.UnaryOp) -> Expression:
+    def _apply_unary_op(self, op: bf_ops.UnaryOp) -> Expression:
         return Expression(op.as_expr(self._value))
 
-    def _apply_binary(self, other: Any, op: bf_ops.BinaryOp, reverse: bool = False):
+    def _apply_unary_agg(self, op: agg_ops.UnaryAggregateOp) -> Expression:
+        # We probably shouldn't need to windowize here, but block apis expect pre-windowized expressions
+        # Later on, we will probably have col expressions in windowed context, so will need to defer windowization
+        # instead of automatically applying the default unbound window
+        agg_expr = op.as_expr(self._value)
+        return Expression(
+            agg_expressions.WindowExpression(agg_expr, window_spec.unbound())
+        )
+
+    # alignment is purely for series compatibility, and is ignored here
+    def _apply_binary_op(
+        self,
+        other: Any,
+        op: bf_ops.BinaryOp,
+        alignment: Literal["outer", "left"] = "outer",
+        reverse: bool = False,
+    ):
         if isinstance(other, Expression):
             other_value = other._value
         else:
@@ -44,79 +66,109 @@ def _apply_binary(self, other: Any, op: bf_ops.BinaryOp, reverse: bool = False):
         return Expression(op.as_expr(self._value, other_value))
 
     def __add__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.add_op)
+        return self._apply_binary_op(other, bf_ops.add_op)
 
     def __radd__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.add_op, reverse=True)
+        return self._apply_binary_op(other, bf_ops.add_op, reverse=True)
 
     def __sub__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.sub_op)
+        return self._apply_binary_op(other, bf_ops.sub_op)
 
     def __rsub__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.sub_op, reverse=True)
+        return self._apply_binary_op(other, bf_ops.sub_op, reverse=True)
 
     def __mul__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.mul_op)
+        return self._apply_binary_op(other, bf_ops.mul_op)
 
     def __rmul__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.mul_op, reverse=True)
+        return self._apply_binary_op(other, bf_ops.mul_op, reverse=True)
 
     def __truediv__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.div_op)
+        return self._apply_binary_op(other, bf_ops.div_op)
 
     def __rtruediv__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.div_op, reverse=True)
+        return self._apply_binary_op(other, bf_ops.div_op, reverse=True)
 
     def __floordiv__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.floordiv_op)
+        return self._apply_binary_op(other, bf_ops.floordiv_op)
 
     def __rfloordiv__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.floordiv_op, reverse=True)
+        return self._apply_binary_op(other, bf_ops.floordiv_op, reverse=True)
 
     def __ge__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.ge_op)
+        return self._apply_binary_op(other, bf_ops.ge_op)
 
     def __gt__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.gt_op)
+        return self._apply_binary_op(other, bf_ops.gt_op)
 
     def __le__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.le_op)
+        return self._apply_binary_op(other, bf_ops.le_op)
 
     def __lt__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.lt_op)
+        return self._apply_binary_op(other, bf_ops.lt_op)
 
     def __eq__(self, other: object) -> Expression:  # type: ignore
-        return self._apply_binary(other, bf_ops.eq_op)
+        return self._apply_binary_op(other, bf_ops.eq_op)
 
     def __ne__(self, other: object) -> Expression:  # type: ignore
-        return self._apply_binary(other, bf_ops.ne_op)
+        return self._apply_binary_op(other, bf_ops.ne_op)
 
     def __mod__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.mod_op)
+        return self._apply_binary_op(other, bf_ops.mod_op)
 
     def __rmod__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.mod_op, reverse=True)
+        return self._apply_binary_op(other, bf_ops.mod_op, reverse=True)
 
     def __and__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.and_op)
+        return self._apply_binary_op(other, bf_ops.and_op)
 
     def __rand__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.and_op, reverse=True)
+        return self._apply_binary_op(other, bf_ops.and_op, reverse=True)
 
     def __or__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.or_op)
+        return self._apply_binary_op(other, bf_ops.or_op)
 
     def __ror__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.or_op, reverse=True)
+        return self._apply_binary_op(other, bf_ops.or_op, reverse=True)
 
     def __xor__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.xor_op)
+        return self._apply_binary_op(other, bf_ops.xor_op)
 
     def __rxor__(self, other: Any) -> Expression:
-        return self._apply_binary(other, bf_ops.xor_op, reverse=True)
+        return self._apply_binary_op(other, bf_ops.xor_op, reverse=True)
 
     def __invert__(self) -> Expression:
-        return self._apply_unary(bf_ops.invert_op)
+        return self._apply_unary_op(bf_ops.invert_op)
+
+    def sum(self) -> Expression:
+        return self._apply_unary_agg(agg_ops.sum_op)
+
+    def mean(self) -> Expression:
+        return self._apply_unary_agg(agg_ops.mean_op)
+
+    def var(self) -> Expression:
+        return self._apply_unary_agg(agg_ops.var_op)
+
+    def std(self) -> Expression:
+        return self._apply_unary_agg(agg_ops.std_op)
+
+    def min(self) -> Expression:
+        return self._apply_unary_agg(agg_ops.min_op)
+
+    def max(self) -> Expression:
+        return self._apply_unary_agg(agg_ops.max_op)
+
+    @property
+    def dt(self) -> datetimes.DatetimeSimpleMethods:
+        import bigframes.operations.datetimes as datetimes
+
+        return datetimes.DatetimeSimpleMethods(self)
+
+    @property
+    def str(self) -> strings.StringMethods:
+        import bigframes.operations.strings as strings
+
+        return strings.StringMethods(self)
 
 
 def col(col_name: Hashable) -> Expression:
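
The `Expression` class in this diff routes every operator dunder through one helper that takes a `reverse` flag, so `col("x") - 1` and `1 - col("x")` share a single code path. The sketch below shows that pattern on its own in plain Python. It is not the bigframes implementation: `Expr`, `evaluate`, and the row-dictionary model are hypothetical stand-ins, and the aggregate, `alignment`, `.dt`, and `.str` parts of the diff are left out.

```python
from __future__ import annotations

import dataclasses
import operator
from typing import Any, Callable, Mapping


@dataclasses.dataclass(frozen=True)
class Expr:
    """Deferred expression: records how to compute a value from a row."""

    _fn: Callable[[Mapping[str, Any]], Any]

    def _apply_binary_op(
        self, other: Any, op: Callable[[Any, Any], Any], reverse: bool = False
    ) -> Expr:
        # Plain Python values are wrapped as constant expressions.
        other_fn = other._fn if isinstance(other, Expr) else (lambda row: other)
        if reverse:
            # Reflected operators (e.g. 10 - col("x")) swap the operand order.
            return Expr(lambda row: op(other_fn(row), self._fn(row)))
        return Expr(lambda row: op(self._fn(row), other_fn(row)))

    def __add__(self, other: Any) -> Expr:
        return self._apply_binary_op(other, operator.add)

    def __radd__(self, other: Any) -> Expr:
        return self._apply_binary_op(other, operator.add, reverse=True)

    def __sub__(self, other: Any) -> Expr:
        return self._apply_binary_op(other, operator.sub)

    def __rsub__(self, other: Any) -> Expr:
        return self._apply_binary_op(other, operator.sub, reverse=True)

    def __mul__(self, other: Any) -> Expr:
        return self._apply_binary_op(other, operator.mul)

    def __rmul__(self, other: Any) -> Expr:
        return self._apply_binary_op(other, operator.mul, reverse=True)

    def evaluate(self, row: Mapping[str, Any]) -> Any:
        """Evaluate the deferred expression against one row."""
        return self._fn(row)


def col(name: str) -> Expr:
    """Build an expression that looks up a column by name."""
    return Expr(lambda row: row[name])


row = {"x": 3, "y": 4}
total = (col("x") + col("y") * 2).evaluate(row)
reflected = (10 - col("x")).evaluate(row)
```

Nothing is computed until `evaluate` is called, which is the property that lets a deferred column expression be built before it is bound to a particular DataFrame.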

bigframes/core/compile/compiled.py

Lines changed: 17 additions & 9 deletions
@@ -23,12 +23,12 @@
 import bigframes_vendored.ibis.expr.datatypes as ibis_dtypes
 import bigframes_vendored.ibis.expr.operations as ibis_ops
 import bigframes_vendored.ibis.expr.types as ibis_types
+import bigframes_vendored.sqlglot.expressions as sge
 from google.cloud import bigquery
 import pyarrow as pa
 
 from bigframes.core import agg_expressions, rewrite
 import bigframes.core.agg_expressions as ex_types
-import bigframes.core.compile.googlesql
 import bigframes.core.compile.ibis_compiler.aggregate_compiler as agg_compiler
 import bigframes.core.compile.ibis_compiler.scalar_op_compiler as op_compilers
 import bigframes.core.compile.ibis_types
@@ -82,13 +82,21 @@ def to_sql(
         )
 
         if order_by or limit or not is_noop_selection:
-            sql = ibis_bigquery.Backend().compile(ibis_table)
-            sql = (
-                bigframes.core.compile.googlesql.Select()
-                .from_(sql)
-                .select(selection_strings)
-                .sql()
-            )
+            # selections are (ref.id.sql, name) where ref.id.sql is escaped identifier
+            to_select = [
+                sge.Alias(
+                    this=sge.to_identifier(src, quoted=True),
+                    alias=sge.to_identifier(alias, quoted=True),
+                )
+                if src != alias
+                else sge.to_identifier(src, quoted=True)
+                for src, alias in selection_strings
+            ]
+            # Use string formatting for FROM clause to avoid re-parsing potentially complex SQL (like ARRAY<STRUCT<...>>)
+            # that sqlglot might not handle perfectly when parsing BigQuery dialect strings.
+            select_sql = sge.Select().select(*to_select).sql(dialect="bigquery")
+            ibis_sql = ibis_bigquery.Backend().compile(ibis_table)
+            sql = f"{select_sql} FROM ({ibis_sql}) AS `t`"
 
             # Single row frames may not have any ordering columns
             if len(order_by) > 0:
@@ -99,7 +107,7 @@ def to_sql(
                     raise TypeError(f"Limit param: {limit} must be an int.")
                 sql += f"\nLIMIT {limit}"
         else:
-            sql = ibis_bigquery.Backend().compile(ibis_table)
+            sql = ibis_bigquery.Backend().compile(self._to_ibis_expr())
         return typing.cast(str, sql)
 
     @property
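
The `to_sql` hunk above builds only the SELECT list as an expression tree and then splices the already-compiled inner query into the FROM clause by string formatting, so the inner SQL is never re-parsed. The sketch below reproduces that shape with plain strings rather than sqlglot. `quote_identifier` and `wrap_with_selection` are hypothetical names, the backtick escaping is a simplified assumption, and the inputs are treated as raw column names, whereas the diff's comment says `ref.id.sql` is already an escaped identifier.

```python
from typing import Iterable, Tuple


def quote_identifier(name: str) -> str:
    """Backtick-quote an identifier (simplified escaping, for illustration)."""
    escaped = name.replace("\\", "\\\\").replace("`", "\\`")
    return f"`{escaped}`"


def wrap_with_selection(
    inner_sql: str, selections: Iterable[Tuple[str, str]]
) -> str:
    """Wrap already-compiled SQL in an outer SELECT that renames columns.

    Each selection is a (source, alias) pair; an alias is emitted only when
    it differs from the source, as in the list comprehension in the diff.
    """
    items = []
    for src, alias in selections:
        if src != alias:
            items.append(f"{quote_identifier(src)} AS {quote_identifier(alias)}")
        else:
            items.append(quote_identifier(src))
    # The inner query is embedded verbatim, never parsed.
    return f"SELECT {', '.join(items)} FROM ({inner_sql}) AS `t`"


wrapped = wrap_with_selection(
    "SELECT 1 AS a, 2 AS b", [("a", "x"), ("b", "b")]
)
print(wrapped)
```

Embedding the inner query verbatim trades structural validation for robustness: whatever the upstream compiler emitted passes through unchanged.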
