Reading Tables

IceFrame provides a simple API to read Iceberg tables into Polars DataFrames.

Basic Reading

df = ice.read_table("my_table")

Column Selection

Use the columns argument to read only the columns you need, which reduces the amount of data loaded.

df = ice.read_table("users", columns=["id", "email"])

Filtering

Use filter_expr to filter rows either locally, after the data is read, or at the source through predicate pushdown.

Local Filtering (String)

# Filter expression using SQL-like syntax (applied locally by Polars)
df = ice.read_table("sales", filter_expr="amount > 100 AND region = 'US'")

Predicate Pushdown (Expression)

For better performance, use IceFrame Expressions to filter at the source (Iceberg).

from iceframe.expressions import col

# Filter applied by Iceberg (scans less data)
df = ice.read_table(
    "sales",
    filter_expr=(col("amount") > 100) & (col("region") == "US")
)

Limiting Results

Limit the number of rows returned.

df = ice.read_table("logs", limit=100)

Time Travel

Read the table as it existed at a specific point in time.

By Snapshot ID

df = ice.read_table("my_table", snapshot_id=123456789012345)
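One way to find a snapshot ID to pass here is to inspect the snapshots of the underlying PyIceberg table. The sketch below assumes that `ice.get_table(...)` returns a PyIceberg `Table` whose `snapshots()` method yields objects with `snapshot_id` and `timestamp_ms` attributes. The helper name `latest_snapshot_before` is hypothetical and not part of IceFrame.

```python
def latest_snapshot_before(snapshots, cutoff_ms):
    """Return the ID of the newest snapshot committed at or before cutoff_ms.

    `snapshots` is any iterable of objects exposing `snapshot_id` and
    `timestamp_ms` (assumed to match PyIceberg's Snapshot objects).
    Returns None when no snapshot is old enough.
    """
    eligible = [s for s in snapshots if s.timestamp_ms <= cutoff_ms]
    if not eligible:
        return None
    # Pick the most recent snapshot among those at or before the cutoff.
    return max(eligible, key=lambda s: s.timestamp_ms).snapshot_id
```

Under those assumptions, usage might look like `snapshot_id = latest_snapshot_before(ice.get_table("my_table").snapshots(), cutoff_ms)`, followed by `ice.read_table("my_table", snapshot_id=snapshot_id)` when the result is not None.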

By Timestamp

import time

# Read as of 1 hour ago (timestamp in milliseconds since the Unix epoch)
timestamp_ms = int((time.time() - 3600) * 1000)
df = ice.read_table("my_table", as_of_timestamp=timestamp_ms)
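When the target time is a calendar date rather than an offset from now, it can help to convert a `datetime` to epoch milliseconds explicitly. This is a minimal sketch using only the standard library. The helper name `to_epoch_ms` is hypothetical, and it rejects naive datetimes so the time zone is never guessed.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division of timedeltas avoids float rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


# Example: the start of 2024 in UTC, as a millisecond timestamp.
timestamp_ms = to_epoch_ms(datetime(2024, 1, 1, tzinfo=timezone.utc))
```

The resulting integer can then be passed as `as_of_timestamp`, as in the example above.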

Accessing Underlying Table

For advanced operations, you can access the underlying PyIceberg Table object.

table = ice.get_table("my_table")
# Use PyIceberg API directly
scan = table.scan()
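As an illustration of working with the underlying table directly, the sketch below wraps a PyIceberg scan with a row filter, a column projection, and a row limit. It assumes the object returned by `ice.get_table(...)` is a PyIceberg `Table`, whose `scan()` accepts `row_filter`, `selected_fields`, and `limit`, and whose scan result offers `to_arrow()`. The wrapper name `scan_to_arrow` is hypothetical.

```python
def scan_to_arrow(table, row_filter, columns, limit=None):
    """Run a filtered, projected scan and materialize it as an Arrow table.

    `table` is assumed to be a PyIceberg Table; `row_filter` may be a
    filter string such as "amount > 100".
    """
    scan = table.scan(
        row_filter=row_filter,           # evaluated by PyIceberg during scan planning
        selected_fields=tuple(columns),  # project only the requested columns
        limit=limit,                     # cap on the number of rows returned
    )
    return scan.to_arrow()
```

Under those assumptions, a call might look like `scan_to_arrow(ice.get_table("sales"), "amount > 100", ["id", "amount"], limit=10)`.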
