Reading Tables

IceFrame provides a simple API to read Iceberg tables into Polars DataFrames.

Basic Reading

# `ice` is assumed to be an already-configured IceFrame instance
df = ice.read_table("my_table")

Column Selection

Read only the columns you need, so that less data is loaded.

df = ice.read_table("users", columns=["id", "email"])

Filtering

Filters can be applied at the source by Iceberg (predicate pushdown) or locally by Polars after the data has been read.

Local Filtering (String)

# Filter expression using SQL-like syntax (applied locally by Polars)
df = ice.read_table("sales", filter_expr="amount > 100 AND region = 'US'")

Predicate Pushdown (Expression)

For better performance, use IceFrame Expressions to filter at the source (Iceberg).

from iceframe.expressions import col

# Filter applied by Iceberg (scans less data)
df = ice.read_table(
    "sales", 
    filter_expr=(col("amount") > 100) & (col("region") == "US")
)
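When the conditions are assembled at runtime, they can be folded into a single expression. The following is a minimal sketch, assuming IceFrame expressions combine with `&` as in the example above; `all_of` is a hypothetical helper and not part of IceFrame.

```python
from functools import reduce
from operator import and_


def all_of(conditions):
    """Combine conditions with `&`, left to right, into one expression."""
    conditions = list(conditions)
    if not conditions:
        raise ValueError("at least one condition is required")
    # reduce applies `a & b` pairwise: ((c1 & c2) & c3) ...
    return reduce(and_, conditions)


# Hypothetical usage with IceFrame expressions:
# from iceframe.expressions import col
# filter_expr = all_of([col("amount") > 100, col("region") == "US"])
# df = ice.read_table("sales", filter_expr=filter_expr)
```

The helper only relies on the `&` operator, so it works with any objects that support it.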

Limiting Results

Limit the number of rows returned.

df = ice.read_table("logs", limit=100)

Time Travel

Read the table as it existed at a specific point in time.

By Snapshot ID

df = ice.read_table("my_table", snapshot_id=123456789012345)
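Snapshot IDs are usually looked up rather than typed by hand. The sketch below picks the newest snapshot committed at or before a cutoff. It assumes each snapshot object exposes `snapshot_id` and `timestamp_ms` attributes, as PyIceberg snapshots do; `snapshot_id_as_of` is a hypothetical helper.

```python
def snapshot_id_as_of(snapshots, cutoff_ms):
    """Return the ID of the newest snapshot at or before cutoff_ms, or None."""
    eligible = [s for s in snapshots if s.timestamp_ms <= cutoff_ms]
    if not eligible:
        return None  # no snapshot existed yet at the cutoff
    return max(eligible, key=lambda s: s.timestamp_ms).snapshot_id


# Hypothetical usage, assuming ice.get_table returns a PyIceberg Table:
# table = ice.get_table("my_table")
# snapshot_id = snapshot_id_as_of(table.snapshots(), cutoff_ms)
# df = ice.read_table("my_table", snapshot_id=snapshot_id)
```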

By Timestamp

import time

# Read the table as of one hour ago (timestamp in milliseconds since the Unix epoch)
timestamp_ms = int((time.time() - 3600) * 1000)
df = ice.read_table("my_table", as_of_timestamp=timestamp_ms)
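To read as of a fixed calendar time rather than a relative offset, a datetime can be converted to epoch milliseconds first. This is a small sketch using only the standard library; `to_epoch_ms` is a hypothetical helper, and treating naive datetimes as UTC is an assumption of this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment: datetime) -> int:
    """Convert a datetime to whole milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are interpreted as UTC, not local time.
        moment = moment.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


timestamp_ms = to_epoch_ms(datetime(2024, 1, 1, tzinfo=timezone.utc))
# df = ice.read_table("my_table", as_of_timestamp=timestamp_ms)
```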

Accessing Underlying Table

For advanced operations, you can access the underlying PyIceberg Table object.

table = ice.get_table("my_table")
# Use PyIceberg API directly
scan = table.scan()
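A short sketch of what direct PyIceberg access might look like. `preview_rows` is a hypothetical helper; it assumes `ice.get_table` returns a PyIceberg `Table`, whose `scan` accepts `row_filter`, `selected_fields`, and `limit`, and whose result exposes `to_arrow()`.

```python
def preview_rows(table, columns, row_filter, n=10):
    """Scan a PyIceberg table with a filter and return up to n rows as Arrow."""
    scan = table.scan(
        row_filter=row_filter,            # e.g. "amount > 100"
        selected_fields=tuple(columns),   # PyIceberg expects a tuple of names
        limit=n,
    )
    return scan.to_arrow()


# Hypothetical usage:
# table = ice.get_table("sales")
# arrow_table = preview_rows(table, ["id", "amount"], "amount > 100", n=5)
```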