Commit 796a42f
backup: SQS encoder for queue meta and messages.jsonl (Phase 0a)
Builds on PR #713. Adds the SQS encoder for the Phase 0 logical-
backup decoder.
Snapshot prefixes handled:
- !sqs|queue|meta|<base64url(queue)> -> sqs/<queue>/_queue.json
(dump-format projection of the live sqsQueueMeta with AWS-style
field names; FormatVersion stamped, throttle / partition / dedup-
scope fields elided -- cluster-internal state, not user-visible
config)
- !sqs|msg|data|<base64url(queue)><gen 8B BE><base64url(msgID)> ->
sqs/<queue>/messages.jsonl (one record per line, sorted at
Finalize-time by (SendTimestampMillis, SequenceNumber, MessageID)).
- !sqs|msg|vis, !sqs|msg|byage, !sqs|msg|dedup, !sqs|msg|group and
!sqs|queue|tombstone: excluded by default;
--include-sqs-side-records routes them to
sqs/<queue>/_internals/side_records.jsonl as a structured bag.
- !sqs|queue|gen, !sqs|queue|seq: not handled by Phase 0 (operational
counters, not user-visible state).
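
The Finalize-time ordering of messages.jsonl described above can be
sketched as follows. This is a minimal illustration, not the encoder's
actual code: the msgRecord type and sortMessages helper are hypothetical
names, and only the three sort-key fields named in this message are
modelled.

```go
package main

import (
	"fmt"
	"sort"
)

// msgRecord is a hypothetical stand-in for one messages.jsonl line.
// Only the fields that take part in the sort key are modelled.
type msgRecord struct {
	MessageID           string
	SendTimestampMillis int64
	SequenceNumber      uint64
}

// sortMessages orders records by
// (SendTimestampMillis, SequenceNumber, MessageID), ascending.
func sortMessages(recs []msgRecord) {
	sort.Slice(recs, func(i, j int) bool {
		a, b := recs[i], recs[j]
		if a.SendTimestampMillis != b.SendTimestampMillis {
			return a.SendTimestampMillis < b.SendTimestampMillis
		}
		if a.SequenceNumber != b.SequenceNumber {
			return a.SequenceNumber < b.SequenceNumber
		}
		// Final tie-break keeps the dump deterministic.
		return a.MessageID < b.MessageID
	})
}

func main() {
	recs := []msgRecord{
		{MessageID: "b", SendTimestampMillis: 100, SequenceNumber: 2},
		{MessageID: "a", SendTimestampMillis: 100, SequenceNumber: 2},
		{MessageID: "c", SendTimestampMillis: 100, SequenceNumber: 1},
		{MessageID: "d", SendTimestampMillis: 50, SequenceNumber: 9},
	}
	sortMessages(recs)
	for _, r := range recs {
		fmt.Println(r.MessageID, r.SendTimestampMillis, r.SequenceNumber)
	}
}
```

Because the comparator is a total order over the three fields, the
output does not depend on the order in which records were buffered.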
Implementation choices:
- Snapshot keys arrive in lexicographic order and m < q, so msg|data
records arrive BEFORE queue|meta. The encoder buffers messages per
encoded-queue prefix and resolves the human-readable queue name at
Finalize from the queue|meta records that arrive later.
- Boundary detection: the encoded queue segment is base64url-no-
padding (alphabet [A-Za-z0-9-_]); the first byte of the 8-byte
BE gen is 0x00 for any production gen value (< 2^56), so the
first non-alphabet byte is the queue/gen boundary. The msgID
segment is also validated by attempting a base64url decode -- a
failed decode is surfaced as ErrSQSMalformedKey rather than
routed to the wrong queue.
- Visibility-state on emitted messages is zeroed by default;
--preserve-visibility passes the live values through.
- Orphan messages (data records with no matching queue meta) emit
a structured warning at Finalize and are dropped from the dump.
Restoring orphans without a queue config would silently create a
default-attribute queue, which is rarely what the operator wants.
- Memory: O(messages-per-queue) buffer at Finalize for the sort
pass. Queues with hundreds of millions of messages will need a
future stream-and-merge variant; documented as a known limit.
Tests: queue meta round-trip, message ordering by (SendTimestampMillis,
SequenceNumber, MessageID) including the MessageID tie-break, default
visibility-state zeroing,
--preserve-visibility round-trip, orphan-message warning,
magic-prefix rejection, JSON-decode rejection, wrong-prefix
rejection, peekMsgDataKey component round-trip, side-records
include/exclude.
Parent: ebe4f36
2 files changed
Lines changed: 979 additions & 0 deletions