This repository was archived by the owner on Apr 1, 2026. It is now read-only.

Commit 3d1dad9

docs: improve more index-y pages
1 parent 2f11b88, commit 3d1dad9

4 files changed: +131 -9 lines changed

bigframes/bigquery/__init__.py

Lines changed: 32 additions & 3 deletions
@@ -12,9 +12,38 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-"""This module integrates BigQuery built-in functions for use with DataFrame objects,
-such as array functions:
-https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions. """
+"""
+Access BigQuery-specific operations and namespaces within BigQuery DataFrames.
+
+This module provides specialized functions and sub-modules that expose BigQuery's
+advanced capabilities to DataFrames and Series. It acts as a bridge between the
+pandas-compatible API and the full power of BigQuery SQL.
+
+Key sub-modules include:
+
+* :mod:`bigframes.bigquery.ai`: Generative and predictive AI functions (Gemini, BQML).
+* :mod:`bigframes.bigquery.ml`: Direct access to BigQuery ML model operations.
+* :mod:`bigframes.bigquery.obj`: Support for BigQuery object tables.
+
+This module also provides direct access to optimized BigQuery functions for:
+
+* **JSON Processing:** High-performance functions like ``json_extract``, ``json_value``,
+  and ``parse_json`` for handling semi-structured data.
+* **Geospatial Analysis:** Comprehensive geographic functions such as ``st_area``,
+  ``st_distance``, and ``st_centroid`` (``ST_`` prefixed functions).
+* **Array Operations:** Tools for working with BigQuery arrays, including ``array_agg``
+  and ``array_length``.
+* **Vector Search:** Integration with BigQuery's vector search and indexing
+  capabilities for high-dimensional data.
+* **Custom SQL:** The ``sql_scalar`` function allows embedding raw SQL snippets for
+  advanced operations not yet directly mapped in the API.
+
+By using these functions, you can leverage BigQuery's high-performance engine for
+domain-specific tasks while maintaining a Python-centric development experience.
+
+For the full list of BigQuery standard SQL functions, see:
+https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-reference
+"""

 import sys
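
The "Vector Search" bullet in the new docstring refers to nearest-neighbor lookup over embedding vectors. As a rough, library-free illustration of that idea (not the BigQuery or bigframes implementation), the sketch below ranks a few toy vectors by cosine similarity to a query vector. The function names, document names, and vector values are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=2):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in corpus.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real embeddings have hundreds of dimensions.
corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

results = top_k([1.0, 0.0, 0.0], corpus, k=2)
print(results)
```

A brute-force scan like this is only workable for small inputs; the indexing capabilities the docstring mentions exist to avoid comparing the query against every stored vector.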

bigframes/bigquery/ai.py

Lines changed: 39 additions & 3 deletions
@@ -12,9 +12,45 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-"""This module integrates BigQuery built-in AI functions for use with Series/DataFrame objects,
-such as AI.GENERATE_BOOL:
-https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-ai-generate-bool"""
+"""
+Integrate BigQuery built-in AI functions into your BigQuery DataFrames workflow.
+
+The ``bigframes.bigquery.ai`` module provides a Pythonic interface to leverage BigQuery ML's
+generative AI and predictive functions directly on BigQuery DataFrames and Series objects.
+These functions enable you to perform advanced AI tasks at scale without moving data
+out of BigQuery.
+
+Key capabilities include:
+
+* **Generative AI:** Use :func:`bigframes.bigquery.ai.generate` (Gemini) to
+  perform text analysis, translation, or
+  content generation. Specialized versions like
+  :func:`~bigframes.bigquery.ai.generate_bool`,
+  :func:`~bigframes.bigquery.ai.generate_int`, and
+  :func:`~bigframes.bigquery.ai.generate_double` are available for structured
+  outputs.
+* **Embeddings:** Generate vector embeddings for text using
+  :func:`~bigframes.bigquery.ai.generate_embedding`, which are essential for
+  semantic search and retrieval-augmented generation (RAG) workflows.
+* **Classification and Scoring:** Apply machine learning models to your data for
+  predictive tasks with :func:`~bigframes.bigquery.ai.classify` and
+  :func:`~bigframes.bigquery.ai.score`.
+* **Forecasting:** Predict future values in time-series data using
+  :func:`~bigframes.bigquery.ai.forecast`.
+
+**Example usage:**
+
+>>> import bigframes.pandas as bpd
+>>> import bigframes.bigquery as bbq
+
+>>> df = bpd.DataFrame({"text_input": ["Is this a positive review?", "The food was terrible."]}) # doctest: +SKIP
+
+>>> # Assuming a Gemini model has been created in BigQuery as 'my_gemini_model'
+>>> result = bbq.ai.generate_text(df["text_input"], model_name="my_gemini_model") # doctest: +SKIP
+
+For more information on the underlying BigQuery ML syntax, see:
+https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-ai-generate-bool
+"""

 from bigframes.bigquery._operations.ai import (
     classify,
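
The docstring examples in this hunk are marked `# doctest: +SKIP`, presumably because they need a live BigQuery session. As a minimal sketch of what that directive does, using only the standard-library `doctest` module and a hypothetical `demo` function: a skipped example is never executed, so a mistake inside it (for instance an undefined name) is not caught by the doctest run.

```python
import doctest


def demo():
    """
    >>> 1 + 1
    2
    >>> undefined_name.call()  # doctest: +SKIP
    """


# Parse the examples out of the docstring and run them in an empty namespace.
parser = doctest.DocTestParser()
test = parser.get_doctest(demo.__doc__, globs={}, name="demo", filename=None, lineno=0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

# The skipped example is not attempted, so its NameError never surfaces.
print(results)
```

Skipped examples therefore need manual review, since the test suite gives no signal about whether they still run.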

bigframes/pandas/__init__.py

Lines changed: 58 additions & 1 deletion
@@ -12,7 +12,64 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-"""BigQuery DataFrames provides a DataFrame API backed by the BigQuery engine."""
+"""
+The primary entry point for the BigQuery DataFrames (BigFrames) pandas-compatible API.
+
+**BigQuery DataFrames** provides a Pythonic DataFrame and machine learning (ML) API
+powered by the BigQuery engine. The ``bigframes.pandas`` module implements a large
+subset of the pandas API, allowing you to perform large-scale data analysis
+using familiar pandas syntax while the computations are executed in the cloud.
+
+**Key Features:**
+
+* **Petabyte-Scale Scalability:** Handle datasets that exceed local memory by
+  offloading computation to the BigQuery distributed engine.
+* **Pandas Compatibility:** Use common pandas methods like
+  :func:`~bigframes.pandas.DataFrame.groupby`,
+  :func:`~bigframes.pandas.DataFrame.merge`,
+  :func:`~bigframes.pandas.DataFrame.pivot_table`, and more on BigQuery-backed
+  :class:`~bigframes.pandas.DataFrame` objects.
+* **Direct BigQuery Integration:** Read from and write to BigQuery tables and
+  queries with :func:`bigframes.pandas.read_gbq` and
+  :meth:`bigframes.pandas.DataFrame.to_gbq`.
+* **User-defined Functions (UDFs):** Effortlessly deploy Python functions
+  using the :func:`bigframes.pandas.remote_function` and
+  :func:`bigframes.pandas.udf` decorators.
+* **Data Ingestion:** Support for various formats including CSV, Parquet, JSON,
+  and Arrow via :func:`bigframes.pandas.read_csv`,
+  :func:`bigframes.pandas.read_parquet`, etc., which are automatically uploaded
+  to BigQuery for processing. Convert any pandas DataFrame into a BigQuery
+  DataFrame using :func:`bigframes.pandas.read_pandas`.
+
+**Example usage:**
+
+>>> import bigframes.pandas as bpd
+
+Initialize session and set options.
+
+>>> bpd.options.bigquery.project = "your-project-id" # doctest: +SKIP
+
+Load data from a BigQuery public dataset.
+
+>>> df = bpd.read_gbq("bigquery-public-data.usa_names.usa_1910_2013") # doctest: +SKIP
+
+Perform familiar pandas operations that execute in the cloud.
+
+>>> top_names = (
+...     df.groupby("name")
+...     .agg({"number": "sum"})
+...     .sort_values("number", ascending=False)
+...     .head(10)
+... ) # doctest: +SKIP
+
+Bring the final, aggregated results back to local memory if needed.
+
+>>> local_df = top_names.to_pandas() # doctest: +SKIP
+
+BigQuery DataFrames is designed for data scientists and analysts who need the
+power of BigQuery with the ease of use of pandas. It eliminates the "data
+movement bottleneck" by keeping your data in BigQuery for processing.
+"""

 from __future__ import annotations
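
For readers unfamiliar with the pandas idiom in the docstring example, the `groupby("name")`, `agg({"number": "sum"})`, `sort_values(..., ascending=False)`, `head(...)` chain totals `number` per `name` and keeps the largest totals. A rough standard-library sketch of the same computation follows. It runs on a tiny invented sample rather than the public dataset, and keeps the top 2 instead of the top 10.

```python
from collections import Counter

# Hypothetical rows standing in for the (name, number) columns of the table.
rows = [
    {"name": "Mary", "number": 5},
    {"name": "John", "number": 7},
    {"name": "Mary", "number": 4},
    {"name": "Anna", "number": 3},
    {"name": "John", "number": 1},
]

# groupby("name") + agg({"number": "sum"}): accumulate a total per name.
totals = Counter()
for row in rows:
    totals[row["name"]] += row["number"]

# sort_values("number", ascending=False) + head(2): keep the largest totals.
top_names = totals.most_common(2)
print(top_names)
```

In the docstring's version the same steps are expressed on a BigQuery-backed DataFrame, and only the final `to_pandas()` call brings the small aggregated result into local memory.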

docs/index.rst

Lines changed: 2 additions & 2 deletions
@@ -68,14 +68,14 @@ Explore the Documentation
    user_guide/index

 .. toctree::
-   :maxdepth: 3
+   :maxdepth: 2
    :caption: API Reference

    reference/index
    supported_pandas_apis

 .. toctree::
-   :maxdepth: 2
+   :maxdepth: 1
    :caption: Community & Updates

    changelog
