|
17 | 17 |
|
18 | 18 | **BigQuery DataFrames** provides a Pythonic DataFrame and machine learning (ML) API |
19 | 19 | powered by the BigQuery engine. The ``bigframes.pandas`` module implements a large |
20 | | -subset of the pandas API, allowing you to perform large-scale data analysis |
21 | | -using familiar pandas syntax while the computations are executed in the cloud. |
| 20 | +subset of the pandas API, allowing you to perform large-scale data analysis, |
| 21 | +data engineering, and AI/ML workflows using familiar pandas syntax while the computations |
| 22 | +are seamlessly executed in the cloud. |
22 | 23 |
|
23 | | -**Key Features:** |
| 24 | +**Key Features for Data Scientists, Data Engineers, and Data Analysts:** |
24 | 25 |
|
25 | | -* **Petabyte-Scale Scalability:** Handle datasets that exceed local memory by |
26 | | - offloading computation to the BigQuery distributed engine. |
| 26 | +* **Petabyte-Scale Scalability:** Handle huge datasets that exceed local memory limits by |
| 27 | + offloading big data computation directly to the BigQuery distributed engine. |
27 | 28 | * **Pandas Compatibility:** Use common pandas methods like |
28 | 29 | :func:`~bigframes.pandas.DataFrame.groupby`, |
29 | 30 | :func:`~bigframes.pandas.DataFrame.merge`, |
30 | 31 | :func:`~bigframes.pandas.DataFrame.pivot_table`, and more on BigQuery-backed |
31 | | - :class:`~bigframes.pandas.DataFrame` objects. |
| 32 | + :class:`~bigframes.pandas.DataFrame` objects without rewriting existing pandas pipelines. |
32 | 33 | * **Direct BigQuery Integration:** Read from and write to BigQuery tables and |
33 | 34 | queries with :func:`bigframes.pandas.read_gbq` and |
34 | | - :func:`bigframes.pandas.DataFrame.to_gbq`. |
35 | | -* **User-defined Functions (UDFs):** Effortlessly deploy Python functions |
36 | | - functions using the :func:`bigframes.pandas.remote_function` and |
37 | | - :func:`bigframes.pandas.udf` decorators. |
| 35 | + :func:`bigframes.pandas.DataFrame.to_gbq`. Perfect for data engineers constructing scalable ETL pipelines. |
| 36 | +* **Seamless AI and Machine Learning:** Rapidly train models or use Generative AI (like Gemini) directly on large datasets, reducing data movement and time-to-insight for data scientists. |
| 37 | +* **User-defined Functions (UDFs):** Effortlessly deploy custom Python functions |
| 38 | + using the :func:`bigframes.pandas.remote_function` and |
| 39 | + :func:`bigframes.pandas.udf` decorators for custom business logic. |
38 | 40 | * **Data Ingestion:** Support for various formats including CSV, Parquet, JSON, |
39 | 41 | and Arrow via :func:`bigframes.pandas.read_csv`, |
40 | 42 | :func:`bigframes.pandas.read_parquet`, etc., which are automatically uploaded |
|
66 | 68 |
|
67 | 69 | >>> local_df = top_names.to_pandas() # doctest: +SKIP |
68 | 70 |
|
69 | | -BigQuery DataFrames is designed for data scientists and analysts who need the |
70 | | -power of BigQuery with the ease of use of pandas. It eliminates the "data |
71 | | -movement bottleneck" by keeping your data in BigQuery for processing. |
| 71 | +BigQuery DataFrames is designed for data scientists, data engineers, and data analysts who need the |
| 72 | +power of BigQuery's distributed compute with the ease of use of pandas. It eliminates the "data |
| 73 | +movement bottleneck" by keeping your big data within BigQuery for secure, scalable processing. |
72 | 74 | """ |
73 | 75 |
|
74 | 76 | from __future__ import annotations |
|