Commit 675adc3
feat(sqs): live-queue reaper enumerates partitioned keyspace (Phase 3.D PR 6b) (#736)
## Summary
Phase 3.D PR 6b: live-queue reaper enumerates partitioned keyspace.
Follow-up to PR #735 (6a) — the tombstone-driven sweep already walks
partitioned data on DeleteQueue / PurgeQueue, but the live-queue
retention reaper still only saw the legacy keyspace, so:
- retention-expired messages on partitioned queues leaked their data /
vis / byage / group rows forever (`reapQueue` walked
`sqsMsgByAgePrefixAllGenerations` only),
- expired dedup records on partitioned FIFO queues leaked forever
(`reapExpiredDedup` scanned `SqsMsgDedupPrefix` only — empty for
partitioned queues since `sqsMsgDedupKeyDispatch` routes their writes
under `SqsPartitionedMsgDedupPrefix`).
Closes the live-queue half of the Codex P2 from PR #732 round 0; PR 6a
covered the tombstoned-cohort half.
## What changes
- **`reapQueue`**: legacy byage walk extracted as `reapQueueLegacy`
(byte-identical to pre-PR-6b for non-partitioned queues). Adds a
`reapQueuePartition` step that runs once per partition for queues with
`PartitionCount > 1`. Each partition gets its own budget, per the §6
design ("partitions × budget per cycle"); the 30s tick interval
comfortably absorbs the extra walks.
- **`reapPartitionedPage`**: partitioned twin of `reapPage`. Same
live-vs-orphan classification, but parses each entry with
`parseSqsPartitionedMsgByAgeKey` and routes the dispatch through
`reapOneRecordPartitioned`.
- **`classifyPartitionedByAgeEntry`**: helper extracted from
`reapPartitionedPage` so the loop body stays under the cyclop ceiling.
Returns `(parsedKey, reapable bool)`.
- **`reapExpiredDedup`** (signature changed): now takes `*sqsQueueMeta`
and routes by `PartitionCount`. Legacy meta → `reapExpiredDedupLegacy`
(byte-identical). Partitioned meta → `reapExpiredDedupPartitioned`
(NEW), which iterates each partition's dedup prefix under its own
per-partition budget.
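
The routing in the last bullet can be sketched as follows. This is a simplified model, not the code in this PR: `queueMeta`, `reapDedup`, and `capped` are hypothetical names, the walks are injected as plain functions, and the `PartitionCount <= 1` legacy threshold is assumed to match the `PartitionCount > 1` rule stated for `reapQueue`.

```go
package main

import "fmt"

// queueMeta is a hypothetical stand-in for *sqsQueueMeta; only the
// partition count is modeled.
type queueMeta struct {
	PartitionCount uint32
}

// reapDedup routes by PartitionCount. Unpartitioned queues get a single
// legacy walk; partitioned queues get one walk per partition, and each
// walk receives the full budget rather than a share of it.
func reapDedup(
	meta queueMeta,
	budget int,
	legacy func(budget int) int,
	partitioned func(partition uint32, budget int) int,
) int {
	if meta.PartitionCount <= 1 {
		return legacy(budget)
	}
	total := 0
	for p := uint32(0); p < meta.PartitionCount; p++ {
		total += partitioned(p, budget)
	}
	return total
}

// capped returns min(n, budget); it models a walk that stops at its budget.
func capped(n, budget int) int {
	if n > budget {
		return budget
	}
	return n
}

func main() {
	backlog := []int{5, 1, 0, 9} // expired rows per partition (made-up numbers)
	legacy := func(budget int) int { return 0 }
	partitioned := func(p uint32, budget int) int { return capped(backlog[p], budget) }

	reaped := reapDedup(queueMeta{PartitionCount: 4}, 4, legacy, partitioned)
	fmt.Println(reaped) // 4 + 1 + 0 + 4 = 9
}
```

With a budget of 4, partitions 0 and 3 are left with rows for the next tick, which is the intended effect of bounding each partition separately.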
## Caller audit
- `reapQueue` — one production caller (`reapAllQueues`); signature
unchanged. Non-partitioned queues are byte-identical; partitioned
queues get the extra per-partition pass.
- `reapExpiredDedup` — signature changed to take `*sqsQueueMeta`; one
production caller (`reapAllQueues`), updated. No tests called it
directly.
- New helpers (`reapQueueLegacy` / `reapQueuePartition` /
`reapPartitionedPage` / `reapExpiredDedupLegacy` /
`reapExpiredDedupPartitioned` / `classifyPartitionedByAgeEntry`) each
have exactly one production caller in the new live-queue reap path.
- `reapOneRecordPartitioned` (existing PR 6a helper): previously called
from `reapDeadByAgePartitionPage` (tombstone path); now also from
`reapPartitionedPage` (live-queue path). Same dispatch semantics.
## Tests
- New `TestSQSServer_PartitionedFIFO_LiveQueueDedupReaperPartitions`:
4-partition queue, send across 6 distinct groups, backdate every
partitioned dedup record's `ExpiresAtMillis`, run `reapAllQueues`,
assert every partitioned dedup row across `[0, 4)` is gone. The pre-PR-6b
reaper would leave every row in place.
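
The shape of that test can be modeled in memory. This is a minimal sketch, not the integration test itself: `dedupStore`, `reapExpired`, and `remaining` are hypothetical, groups are placed by `i % partitions` purely for illustration (the real routing is not shown here), and the expiry comparison `ExpiresAtMillis <= now` is an assumption.

```go
package main

import "fmt"

// dedupRecord models only the expiry field named in the test description.
type dedupRecord struct {
	ExpiresAtMillis int64
}

// dedupStore maps partition -> dedup ID -> record (in-memory stand-in).
type dedupStore map[uint32]map[string]dedupRecord

// reapExpired deletes records whose expiry is at or before nowMillis in
// partitions [0, partitions) and returns how many were removed.
func reapExpired(s dedupStore, partitions uint32, nowMillis int64) int {
	removed := 0
	for p := uint32(0); p < partitions; p++ {
		for id, rec := range s[p] {
			if rec.ExpiresAtMillis <= nowMillis {
				delete(s[p], id) // deleting during range is safe in Go
				removed++
			}
		}
	}
	return removed
}

// remaining counts surviving records across partitions [0, partitions).
func remaining(s dedupStore, partitions uint32) int {
	n := 0
	for p := uint32(0); p < partitions; p++ {
		n += len(s[p])
	}
	return n
}

func main() {
	const partitions = 4
	const now = int64(1_000_000)

	// Send across 6 distinct groups; placement by modulo is an assumption.
	s := dedupStore{}
	for i := 0; i < 6; i++ {
		p := uint32(i % partitions)
		if s[p] == nil {
			s[p] = map[string]dedupRecord{}
		}
		s[p][fmt.Sprintf("group-%d", i)] = dedupRecord{ExpiresAtMillis: now + 60_000}
	}

	// Backdate every record so it counts as expired.
	for p := range s {
		for id := range s[p] {
			s[p][id] = dedupRecord{ExpiresAtMillis: now - 1}
		}
	}

	fmt.Println(reapExpired(s, partitions, now), remaining(s, partitions)) // prints 6 0
}
```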
## Self-review (CLAUDE.md)
1. **Data loss** — Closes the live-queue dedup leak + partitioned
retention-expired-message leak. Legacy queues unchanged.
2. **Concurrency / distributed failures** — Reaper still runs only on
the leader. Per-partition pass is sequential; per-partition budget
bounds the pass. OCC semantics on each record reap unchanged.
3. **Performance** — Per-tick partitioned-queue cost grows from O(1
walk) to O(partition_count walks) on byage AND dedup. Each partition
bounded by `sqsReaperPerQueueBudget`. The 30s tick interval comfortably
absorbs 32 partitions × the per-queue budget, per the design.
4. **Data consistency** — Live-vs-orphan classification on partitioned
byage mirrors the legacy branch exactly (`reapPage` /
`reapPartitionedPage` share the rules through
`classifyPartitionedByAgeEntry`). `PartitionCount` immutability means
the meta-driven iteration bound matches the on-disk keys.
5. **Test coverage** — One new wire-level integration test for the
partitioned dedup walk; the partitioned byage walk reuses parsing /
dispatch helpers already covered by PR 6a's tombstone-reap integration
test.
## Test plan
- [x] `make lint` — 0 issues
- [x] Targeted reaper / retention / dedup / HTFIFO / PartitionedFIFO
suites (-race, clean)
- [x] Wider regression on Send/Receive/Delete +
CreateQueue/DeleteQueue/PurgeQueue (-race, clean)
- [ ] CI: full Jepsen + race
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Fixed deduplication record cleanup in partitioned FIFO queues to
properly remove expired records across all partitions.
* Enhanced cleanup mechanism to correctly handle partition-aware
behavior and prevent leaked records in partitioned queue scenarios.
* **Tests**
* Added comprehensive test for deduplication record cleanup across
partitioned FIFO queue partitions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

2 files changed
Lines changed: 408 additions & 7 deletions