
Add missing SessionContext read/write and registration methods #1458

@timsaucer

Summary

Several SessionContext methods for reading data sources, writing query results, and registering tables exist in upstream DataFusion v53 but are not yet exposed in datafusion-python.

Missing Methods

Read methods:

  • read_arrow — read an Arrow IPC file into a DataFrame
  • read_batch — read a single RecordBatch into a DataFrame
  • read_batches — read multiple RecordBatches into a DataFrame
  • read_empty — create an empty DataFrame with a given schema

Write methods:

  • write_csv — write query results to CSV directly from context
  • write_json — write query results to JSON directly from context
  • write_parquet — write query results to Parquet directly from context

Registration:

  • register_arrow — register an Arrow IPC file as a table
  • register_batch — register a single RecordBatch as a table

Upstream Reference

Implementation

  • Rust bindings: crates/core/src/context.rs
  • Python wrappers: python/datafusion/context.py

Note: This gap analysis was performed using an AI agent comparing upstream DataFusion v53 documentation against the current datafusion-python codebase.
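
Since the list above came from an automated comparison, it may be worth confirming it against an installed build before starting work. A minimal sketch, assuming only that `datafusion` exposes `SessionContext` at the top level; the helper name `missing_methods` and the `CANDIDATES` list are illustrative and not part of the library:

```python
from typing import Iterable, List

# Method names copied from the lists in this issue.
CANDIDATES = [
    "read_arrow", "read_batch", "read_batches", "read_empty",
    "write_csv", "write_json", "write_parquet",
    "register_arrow", "register_batch",
]


def missing_methods(cls: type, names: Iterable[str]) -> List[str]:
    """Return the names that are not callable attributes of cls."""
    return [n for n in names if not callable(getattr(cls, n, None))]


if __name__ == "__main__":
    try:
        # Assumes datafusion-python is installed in this environment.
        from datafusion import SessionContext
    except ImportError:
        print("datafusion is not installed; nothing to check")
    else:
        for name in missing_methods(SessionContext, CANDIDATES):
            print(f"not exposed: {name}")
```

This only checks that an attribute with the given name exists on the Python class. It does not verify that the signature or behavior matches the upstream Rust method.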
