Commit 22baf60
[python] Fix duplicate _ROW_ID when file exceeds read batch size (#7626)
When a data file contains more than 1024 rows, upsert_by_arrow_with_key
fails with: `ValueError: Input data contains duplicate _ROW_ID values`.
This PR fixes the issue above by advancing first_row_id in
DataFileBatchReader._assign_row_tracking after each batch.

1 parent 6a8167f commit 22baf60
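
The failure mode described in the commit message can be reproduced with a minimal, self-contained sketch. This is not the pypaimon implementation: `batches` and `assign_row_ids` are hypothetical helpers, and the batch size of 1024 is an assumption inferred from the row threshold mentioned above. It only illustrates why row IDs repeat when `first_row_id` is not advanced between batches.

```python
from typing import Iterable, Iterator, List

# Assumed read batch size, inferred from the 1024-row threshold in the commit message.
BATCH_SIZE = 1024


def batches(num_rows: int, batch_size: int = BATCH_SIZE) -> Iterator[int]:
    """Yield the row count of each successive read batch for one file (hypothetical helper)."""
    remaining = num_rows
    while remaining > 0:
        n = min(batch_size, remaining)
        yield n
        remaining -= n


def assign_row_ids(batch_sizes: Iterable[int], first_row_id: int, advance: bool) -> List[int]:
    """Assign row IDs batch by batch (hypothetical helper).

    advance=False mimics the behavior before the fix: every batch starts
    again from the same first_row_id. advance=True moves first_row_id
    forward by the batch length, as the commit message describes.
    """
    row_ids: List[int] = []
    for n in batch_sizes:
        row_ids.extend(range(first_row_id, first_row_id + n))
        if advance:
            first_row_id += n
    return row_ids


# A file with 1500 rows is read as two batches: 1024 rows, then 476 rows.
buggy = assign_row_ids(batches(1500), first_row_id=0, advance=False)
fixed = assign_row_ids(batches(1500), first_row_id=0, advance=True)

print(len(buggy) - len(set(buggy)))  # count of duplicated IDs without advancing
print(fixed == list(range(1500)))    # whether the advanced IDs are contiguous and unique
```

With a file of 1024 rows or fewer there is only one batch, so both variants produce the same IDs, which is consistent with the bug appearing only above that size.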
File tree (2 files changed, +41 -0)
- paimon-python/pypaimon
  - read/reader
  - tests
Lines changed: 1 addition & 0 deletions (one line added at diff line 214, between original lines 213 and 214)
Lines changed: 40 additions & 0 deletions (lines added at diff lines 1596-1635, after original line 1595)