Commit d61f46f
feat: Add ArrivalOrder to ArrowScan for bounded-memory concurrent reads (#44)
> Backport of apache/iceberg-python#3046
## Summary
Addresses #3036 — ArrowScan.to_record_batches() uses executor.map + list() which eagerly materializes all record batches per file into memory, causing OOM on large tables.
This PR adds a new `order` parameter to `to_arrow_batch_reader()` with two implementations:
- `TaskOrder` (default): preserves existing behavior. Batches are grouped by file in task submission order, and each file is fully materialized before proceeding to the next.
- `ArrivalOrder`: yields batches as they are produced across files without materializing entire files into memory. Accepts three sub-parameters:
  - `concurrent_streams: int`: number of files to read concurrently (default: 8). A per-scan `ThreadPoolExecutor(max_workers=concurrent_streams)` bounds concurrency.
  - `batch_size: int | None`: number of rows per batch passed to PyArrow's `ds.Scanner` (default: PyArrow's built-in 131,072).
  - `max_buffered_batches: int`: size of the bounded queue between producers and consumer (default: 16), providing backpressure to cap memory usage.
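The option types described above could be modelled roughly as follows. This is a minimal sketch for illustration, not the classes added by this PR: the class names and defaults are taken from the description, while the use of frozen dataclasses and the validation in `__post_init__` are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


class ScanOrder:
    """Base type for batch-ordering strategies (illustrative only)."""


@dataclass(frozen=True)
class TaskOrder(ScanOrder):
    """Batches grouped by file, in task submission order."""


@dataclass(frozen=True)
class ArrivalOrder(ScanOrder):
    """Batches yielded as they are produced across files."""

    concurrent_streams: int = 8        # files read concurrently
    batch_size: Optional[int] = None   # None means "use PyArrow's default"
    max_buffered_batches: int = 16     # capacity of the bounded queue

    def __post_init__(self) -> None:
        # Assumption of this sketch: reject values that could never make progress.
        if self.concurrent_streams < 1:
            raise ValueError("concurrent_streams must be >= 1")
        if self.max_buffered_batches < 1:
            raise ValueError("max_buffered_batches must be >= 1")
```

Keeping the tuning knobs on the `ArrivalOrder` value rather than on the method signature means that `TaskOrder` callers never see parameters that do not apply to them.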
## Problem
The current implementation materializes all batches from each file via list() inside executor.map, which runs up to min(32, cpu_count+4) files in parallel. For large files this means all batches from ~20 files are held in memory simultaneously before any are yielded to the consumer.
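The eager pattern can be reproduced in isolation. The following is a simplified sketch, not the pyiceberg code: `read_file` is a hypothetical stand-in for reading one data file, and strings stand in for record batches.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List


def read_file(file_id: int, batches_per_file: int = 3) -> Iterator[str]:
    """Hypothetical stand-in for streaming the record batches of one file."""
    for i in range(batches_per_file):
        yield f"file{file_id}-batch{i}"


def eager_batches(file_ids: List[int]) -> Iterator[str]:
    # With no max_workers argument, ThreadPoolExecutor picks its own default
    # worker count, so many files can be in flight at once.
    with ThreadPoolExecutor() as executor:
        # list(...) drains each file's generator inside the worker thread, so
        # every batch of a file is resident before its first batch is yielded.
        for file_batches in executor.map(lambda f: list(read_file(f)), file_ids):
            yield from file_batches
```

Because `executor.map` returns results in submission order, the output stays grouped by file, but memory use grows with the number of in-flight files multiplied by the size of each file.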
## Solution
### Before: OOM on large tables
```python
batches = table.scan().to_arrow_batch_reader()
```
### After: bounded memory, tunable parallelism
```python
from pyiceberg.table import ArrivalOrder

batches = table.scan().to_arrow_batch_reader(
    order=ArrivalOrder(concurrent_streams=4, batch_size=10000),
)
```
Default behavior is unchanged — `TaskOrder` preserves the existing executor.map + list() path for backwards compatibility.
## Architecture
When `order=ArrivalOrder(...)`, batches flow through `_bounded_concurrent_batches`:
1. All file tasks are submitted to a per-scan `ThreadPoolExecutor(max_workers=concurrent_streams)`
2. Workers push batches into a bounded `Queue(maxsize=max_buffered_batches)` — when full, workers block (backpressure)
3. The consumer yields batches from the queue via blocking `queue.get()`
4. A sentinel value signals completion — no timeout-based polling
5. On early termination (consumer stops), a cancel event is set and the queue is drained until the sentinel to unblock all stuck workers
6. The executor context manager handles deterministic shutdown
Refactored `to_record_batches` into helpers: `_prepare_tasks_and_deletes`, `_iter_batches_arrival`, `_iter_batches_materialized`, `_apply_limit`.
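The six steps above can be sketched as a standalone generator. This is an illustrative approximation under stated assumptions, not the body of `_bounded_concurrent_batches`: the function name, the `sources` argument (one zero-argument callable per file), the remaining-worker counter, and the way worker exceptions are forwarded are all choices made for this sketch.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
_SENTINEL = object()  # marks "all workers are done"


def bounded_concurrent_batches(
    sources: List[Callable[[], Iterable[T]]],
    concurrent_streams: int = 8,
    max_buffered_batches: int = 16,
) -> Iterator[T]:
    if not sources:
        return
    buffer: queue.Queue = queue.Queue(maxsize=max_buffered_batches)
    cancel = threading.Event()
    lock = threading.Lock()
    remaining = len(sources)

    def worker(source: Callable[[], Iterable[T]]) -> None:
        nonlocal remaining
        try:
            if not cancel.is_set():
                for batch in source():
                    if cancel.is_set():
                        break
                    buffer.put(batch)  # blocks while the queue is full (backpressure)
        except Exception as exc:
            cancel.set()
            buffer.put(exc)  # assumption: errors are forwarded to the consumer
        finally:
            with lock:
                remaining -= 1
                last = remaining == 0
            if last:
                buffer.put(_SENTINEL)  # exactly one sentinel, from the last worker

    with ThreadPoolExecutor(max_workers=concurrent_streams) as executor:
        for source in sources:
            executor.submit(worker, source)
        finished = False
        try:
            while True:
                item = buffer.get()  # blocking get, no timeout polling
                if item is _SENTINEL:
                    finished = True
                    break
                if isinstance(item, Exception):
                    raise item
                yield item
        finally:
            if not finished:
                # Early termination or error: stop producers, then drain until
                # the sentinel so that no worker stays blocked in put().
                cancel.set()
                while buffer.get() is not _SENTINEL:
                    pass
```

At most `max_buffered_batches` batches wait in the queue, and each of the `concurrent_streams` workers can hold one more batch while blocked in `put()`, which is what bounds memory independently of file size.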
## Ordering semantics
| Configuration | File ordering | Within-file ordering |
|---|---|---|
| `TaskOrder()` (default) | Batches grouped by file, in task submission order | Row order |
| `ArrivalOrder(concurrent_streams=1)` | Grouped by file, sequential | Row order |
| `ArrivalOrder(concurrent_streams>1)` | **Interleaved** (no grouping guarantee) | Row order within each file |
## Benchmark results
32 files × 500K rows, 5 columns (int64, float64, string, bool, timestamp), batch_size=131,072 (PyArrow default):
| Config | Throughput (rows/s) | TTFR (ms) | Peak Arrow Memory |
|---|---|---|---|
| default (TaskOrder) | 190,250,192 | 73.4 | 642.2 MB |
| ArrivalOrder(cs=1) | 59,317,085 | 27.7 | 10.3 MB |
| ArrivalOrder(cs=2) | 105,414,909 | 28.8 | 42.0 MB |
| ArrivalOrder(cs=4) | 175,840,782 | 28.4 | 105.5 MB |
| ArrivalOrder(cs=8) | 211,922,538 | 32.3 | 271.7 MB |
| ArrivalOrder(cs=16) | 209,011,424 | 45.0 | 473.3 MB |
*TTFR = Time to First Record, cs = concurrent_streams*
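For readers who want to collect comparable numbers on their own tables, throughput and TTFR can be measured over any batch iterator along these lines. This is a hypothetical helper, not the micro-benchmark included in this PR; `measure_stream` and its `rows_in` parameter are names invented for the sketch, and it does not measure peak memory.

```python
import time
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def measure_stream(
    batches: Iterable[T],
    rows_in: Callable[[T], int] = len,
) -> Tuple[int, float, Optional[float]]:
    """Return (total_rows, rows_per_second, ttfr_ms) for one pass over batches."""
    start = time.perf_counter()
    ttfr_ms: Optional[float] = None
    total_rows = 0
    for batch in batches:
        if ttfr_ms is None:
            # Time until the first batch reaches the consumer.
            ttfr_ms = (time.perf_counter() - start) * 1000.0
        total_rows += rows_in(batch)
    elapsed = time.perf_counter() - start
    throughput = total_rows / elapsed if elapsed > 0 else 0.0
    return total_rows, throughput, ttfr_ms
```

With PyArrow record batches, passing `rows_in=lambda b: b.num_rows` would count rows per batch; the default `len` is used here only so the helper works on plain lists.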
## Are these changes tested?
Yes. 25 new unit tests across two test files, plus a micro-benchmark.
## Are there any user-facing changes?
Yes. New `order` parameter on `DataScan.to_arrow_batch_reader()`:
- `order: ScanOrder | None` — controls batch ordering. Pass `TaskOrder()` (default) or `ArrivalOrder(concurrent_streams=N, batch_size=B, max_buffered_batches=M)`.
New public classes `TaskOrder` and `ArrivalOrder` (subclasses of `ScanOrder`) exported from `pyiceberg.table`.
All parameters are optional with backwards-compatible defaults. Existing code is unaffected.
Documentation updated in `mkdocs/docs/api.md` with usage examples, ordering semantics, and a configuration guidance table.
1 parent 714a804 commit d61f46f
File tree
6 files changed, +1074 -40 lines changed
- mkdocs/docs
- pyiceberg
  - io
  - table
- tests
  - benchmark
  - io