Basic Operations
================

In this section, you will learn how to inspect a DataFrame: preview its rows, view its schema, convert it to pandas, and summarize its contents.

.. ipython:: python

    from datafusion import SessionContext
    import random

    ctx = SessionContext()
    df = ctx.from_pydict({
        "nrs": [1, 2, 3, 4, 5],
        "names": ["python", "ruby", "java", "haskell", "go"],
        "random": random.sample(range(1000), 5),
        "groups": ["A", "A", "B", "C", "B"],
    })
    df

Use :py:func:`~datafusion.dataframe.DataFrame.limit` to view the first rows of the DataFrame:

.. ipython:: python

    df.limit(2)

Display the column names and data types of the DataFrame using :py:func:`~datafusion.dataframe.DataFrame.schema`:

.. ipython:: python

    df.schema()

The method :py:func:`~datafusion.dataframe.DataFrame.to_pandas` converts the DataFrame to a pandas DataFrame using pyarrow: it collects the record batches, assembles them into an Arrow table, and then converts that table to pandas.

.. ipython:: python

    df.to_pandas()

:py:func:`~datafusion.dataframe.DataFrame.describe` shows a quick statistical summary of your data:

.. ipython:: python

    df.describe()