 title: Count Recipe - Efficiently Count Rows in Iceberg Tables
 ---

 # Counting Rows in an Iceberg Table

-This recipe demonstrates how to use the `count()` function to efficiently count rows in an Iceberg table using PyIceberg.
+This recipe demonstrates how to use the `count()` function to efficiently count rows in an Iceberg table using PyIceberg. The count operation is optimized for performance by reading file metadata rather than scanning actual data.
+
+## How Count Works
+
+The `count()` method leverages Iceberg's metadata architecture to provide fast row counts by:
+
+1. **Reading file manifests**: Examines metadata about data files without loading the actual data
+2. **Aggregating record counts**: Sums the per-file record counts recorded in that metadata
+3. **Applying filters at metadata level**: Pushes down predicates to skip irrelevant files
+4. **Handling deletes**: Automatically accounts for delete files and tombstones
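
The four steps listed above can be pictured with a toy model. The sketch below is not PyIceberg's implementation: `DataFileEntry`, `count_rows`, and the single-column `greater_than` filter are hypothetical names invented for illustration. It assumes each file's metadata carries a record count and min/max bounds for one column. A file's record count is used directly when the file has no deletes and every row must match; otherwise the sketch falls back to reading rows.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class DataFileEntry:
    """Hypothetical stand-in for the metadata kept about one data file."""
    record_count: int   # row count recorded in metadata
    min_value: int      # lower bound of the filtered column
    max_value: int      # upper bound of the filtered column
    values: List[int] = field(default_factory=list)           # actual rows, read on the slow path only
    deleted_positions: Set[int] = field(default_factory=set)  # row positions removed by deletes


def count_rows(files: List[DataFileEntry], greater_than: Optional[int] = None) -> int:
    """Count rows, optionally only those whose value exceeds `greater_than`."""
    total = 0
    for f in files:
        # Prune: the bounds show that no row in this file can match.
        if greater_than is not None and f.max_value <= greater_than:
            continue
        every_row_matches = greater_than is None or f.min_value > greater_than
        if every_row_matches and not f.deleted_positions:
            # Fast path: trust the record count stored in metadata.
            total += f.record_count
        else:
            # Slow path: read rows, skip deleted positions, apply the filter.
            total += sum(
                1
                for pos, value in enumerate(f.values)
                if pos not in f.deleted_positions
                and (greater_than is None or value > greater_than)
            )
    return total


# Made-up example data: three files, one of which has a deleted row.
files = [
    DataFileEntry(3, 10, 30, [10, 20, 30]),
    DataFileEntry(3, 100, 300, [100, 200, 300], deleted_positions={0}),
    DataFileEntry(2, 1, 5, [1, 5]),
]
print(count_rows(files))
print(count_rows(files, greater_than=50))
```

In this model only files with deletes, or files whose bounds straddle the filter value, cost a data read; every other file is either skipped or counted from metadata alone.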

 ## Basic Usage

-To count all rows in a table:
+Count all rows in a table:

 ```python
 from pyiceberg.catalog import load_catalog

 catalog = load_catalog("default")
 table = catalog.load_table("default.cities")

-row_count = table.count()
+# Get total row count
+row_count = table.scan().count()
 print(f"Total rows in table: {row_count}")
 ```
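
If the same count is needed in several places, it can be wrapped in a small helper. The sketch below is a convenience under stated assumptions, not part of PyIceberg: `count_table_rows` is a hypothetical name, and it relies only on the `scan().count()` call used in this recipe plus the `row_filter` argument of `scan()`.

```python
def count_table_rows(table, row_filter=None):
    """Count rows in `table`, optionally restricted by `row_filter`.

    `table` is expected to behave like a PyIceberg table: `scan()` returns
    an object with a `count()` method. `row_filter` may be a filter string
    or an expression object; None counts every row.
    """
    # Only pass row_filter when one was given, so the scan's own default applies otherwise.
    scan = table.scan() if row_filter is None else table.scan(row_filter=row_filter)
    return scan.count()
```

With a table loaded through a catalog, `count_table_rows(table)` gives the total, and passing a filter such as `"population > 1000000"` (a hypothetical column name) restricts the count to matching rows.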

-## Count with a Filter
+## Count with Filters

-To count only rows matching a filter:
+Count rows matching specific conditions:

 ```python
-from pyiceberg.expressions import EqualTo
+from pyiceberg.expressions import GreaterThan, EqualTo, And