Commit 85fe1c5
sinks: walk the input arrangement via cursor per batch (#36165)
## Summary
Refactor the sink rendering path so sinks walk their input arrangement's
cursors directly, eliminating the `Vec<DiffPair<Row>>`-per-group
materialization and the trace-reader overhead that `zip_into_diff_pairs`
→ `combine_at_timestamp` → `flat_map` previously incurred. Motivated by
profiles of very large snapshot sinks where that pipeline dominated
allocation and caused major page faults.
## What this buys
- **No per-`(key, time)` `Vec<DiffPair<Row>>` allocation.** Biggest
direct win — this was the dominant allocator in `zip_into_diff_pairs` on
the profile.
- **Key owned once per key, not per `(key, time)` group.**
- **Fewer operator boundaries.** Kafka: arrangement → encode (was:
arrangement → `combine_at_timestamp` → `flat_map` → encode). Iceberg: one
fewer hop too.
- **Spine compacts aggressively.** Dropping the `TraceAgent` lets spine
compaction advance its frontier, releasing historical batch state rather
than accumulating it.
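
The first two bullets can be illustrated with a minimal sketch. It does not use the differential-dataflow cursor API or the actual sink code: the `Update` tuple, `emit_materialized`, and `emit_streaming` are hypothetical names, and a sorted slice stands in for one arrangement batch. The old shape allocates a `Vec` and owns the key once per `(key, time)` group; the streaming shape hands each update to the encoder as it is visited and owns the key once per key.

```rust
/// One update in a batch sorted by (key, time): (key, time, value, diff).
/// Hypothetical stand-in for an arrangement batch; `&'static str` is used
/// only to keep the sketch short.
type Update = (&'static str, u64, &'static str, i64);

/// Old shape (simplified): materialize one Vec per (key, time) group and
/// own the key per group, then flatten the group into output records.
/// Returns the records and the number of per-group Vec allocations.
fn emit_materialized(batch: &[Update]) -> (Vec<String>, usize) {
    let mut out = Vec::new();
    let mut group_allocs = 0;
    let mut i = 0;
    while i < batch.len() {
        let (key, time) = (batch[i].0, batch[i].1);
        let owned_key = key.to_owned(); // key owned once per (key, time)
        let mut group: Vec<(&str, i64)> = Vec::new();
        group_allocs += 1;
        while i < batch.len() && batch[i].0 == key && batch[i].1 == time {
            group.push((batch[i].2, batch[i].3));
            i += 1;
        }
        for (val, diff) in group {
            out.push(format!("{}@{}: {} {:+}", owned_key, time, val, diff));
        }
    }
    (out, group_allocs)
}

/// New shape (simplified): walk the batch in order and pass each update
/// straight to `sink`. No per-group Vec; the key is owned once per key.
/// Returns the number of times a key was copied.
fn emit_streaming(batch: &[Update], mut sink: impl FnMut(&str, u64, &str, i64)) -> usize {
    let mut key_copies = 0;
    let mut current_key: Option<String> = None;
    for &(key, time, val, diff) in batch {
        if current_key.as_deref() != Some(key) {
            current_key = Some(key.to_owned()); // key owned once per key
            key_copies += 1;
        }
        sink(current_key.as_deref().unwrap(), time, val, diff);
    }
    key_copies
}
```

For a batch with key `a` at two timestamps and key `b` at one, both functions produce the same records, but the materializing version allocates three group vectors while the streaming version copies only two keys.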
## What this does *not* fix
- The arrangement's **pre-spine batcher** still buffers un-sealed
updates (unavoidable at this layer).
- For a pure-insertion snapshot with no retractions, spine compaction
has limited work to do — savings come primarily from the `Vec`/operator
overhead, not from the spine itself shrinking.
- Persist snapshot forwarding semantics and sink commit-on-frontier
semantics are unchanged.
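
The point about pure-insertion snapshots can be seen in a simplified model of compaction. This is an assumption-laden sketch, not the differential-dataflow spine: `consolidate` is a hypothetical helper that treats all updates as already behind the compaction frontier, sums diffs per `(key, value)`, and drops entries whose diffs cancel.

```rust
use std::collections::BTreeMap;

/// Hypothetical model of compaction: updates to the same (key, value) are
/// merged by summing their diffs, and zero-sum entries are dropped.
fn consolidate(updates: &[(&str, &str, i64)]) -> Vec<(String, String, i64)> {
    let mut acc: BTreeMap<(&str, &str), i64> = BTreeMap::new();
    for &(key, val, diff) in updates {
        *acc.entry((key, val)).or_insert(0) += diff;
    }
    acc.into_iter()
        .filter(|&(_, diff)| diff != 0)
        .map(|((key, val), diff)| (key.to_owned(), val.to_owned(), diff))
        .collect()
}
```

With insert-only input such as `[("a", "v1", 1), ("b", "w1", 1)]`, nothing cancels and the output is as large as the input. With churn such as `[("a", "v1", 1), ("a", "v1", -1), ("a", "v2", 1)]`, the first two updates cancel and only `("a", "v2", 1)` remains, which is the case where compaction actually releases state.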
Follow-ups (not in this PR) worth considering:
- A batcher-only operator that skips the spine entirely for sinks that
don't need a trace.
- An append-mode Iceberg fast-path that skips arrangement altogether
when no key is configured.
## Test plan
- [x] `cargo check --workspace --all-targets`
- [x] `cargo clippy -p mz-interchange -p mz-storage --tests`
- [x] `bin/lint` (check-no-diff fails locally due to jj colocation; all
substantive checks pass)
- [x] `cargo test -p mz-interchange --lib envelopes::` (7 tests, all
pass)
- [ ] Full CI (pending)
- [ ] Kafka sink testdrive (snapshot + steady-state, Upsert and Debezium
envelopes)
- [ ] Iceberg sink testdrive (snapshot + steady-state, Upsert and Append
envelopes)
- [ ] Re-profile a very large snapshot sink to confirm the
`zip_into_diff_pairs` hotspot is gone
🤖 Generated with [Claude Code](https://claude.com/claude-code)

1 parent 8ef14a9 commit 85fe1c5
5 files changed
Lines changed: 637 additions & 258 deletions
File tree
- misc/python/materialize/feature_benchmark/scenarios
- src
- interchange/src
- storage/src
- render
- sink
Lines changed: 113 additions & 0 deletions