[v25.2.x] [CORE-6913] Cloud Storage Scrubber: Fix false positive for replaced segments #30120

Open
vbotbuildovich wants to merge 2 commits into redpanda-data:v25.2.x from vbotbuildovich:backport-pr-30062-v25.2.x-139

@vbotbuildovich
Collaborator

Backport of PR #30062

oleiman added 2 commits April 9, 2026 23:43
The scrub false-positive filter in process_anomalies() only checked
whether a segment with the same offset range existed in the manifest.
A compacted reupload produces a replacement segment at the same
offset range but with a different name, because its size differs.
When GC deleted the old segment from cloud storage while the scrubber
was still referencing a stale manifest, the filter kept the anomaly
because the offset range still matched, even though the current
segment at that range was a different (replacement) object that
existed in cloud storage.

Compare generate_remote_segment_name() for the manifest entry and
the reported-missing segment so that replacements with the same
offset range but different identity are correctly recognized as
false positives.
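
For illustration, the change in the filter can be sketched as below. This is a simplified model, not the Redpanda code: `segment_meta`, `manifest`, `remote_name()`, and both `keep_anomaly*` functions are hypothetical stand-ins, and the name format is invented. The only property carried over from the description above is that a segment's remote name changes when its size changes.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a manifest entry.
struct segment_meta {
    std::int64_t base_offset;
    std::int64_t committed_offset;
    std::uint64_t size_bytes;
};

// Hypothetical name format. Assumption: the size is part of the name,
// so a compacted replacement gets a different name.
std::string remote_name(const segment_meta& s) {
    return std::to_string(s.base_offset) + "-"
           + std::to_string(s.committed_offset) + "-"
           + std::to_string(s.size_bytes) + ".log";
}

// Manifest modeled as a map keyed by base offset.
using manifest = std::map<std::int64_t, segment_meta>;

bool same_range(const segment_meta& a, const segment_meta& b) {
    return a.base_offset == b.base_offset
           && a.committed_offset == b.committed_offset;
}

// Old behavior: keep the anomaly whenever the offset range still exists.
bool keep_anomaly_range_only(const manifest& m, const segment_meta& missing) {
    auto it = m.find(missing.base_offset);
    return it != m.end() && same_range(it->second, missing);
}

// New behavior: also require that the manifest entry names the same
// object as the segment reported missing.
bool keep_anomaly(const manifest& m, const segment_meta& missing) {
    auto it = m.find(missing.base_offset);
    return it != m.end() && same_range(it->second, missing)
           && remote_name(it->second) == remote_name(missing);
}
```

With a manifest that holds a compacted replacement at offsets 100-199, `keep_anomaly_range_only()` keeps a report for the original segment, while `keep_anomaly()` drops it as a false positive.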

Fixes CORE-6913.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
(cherry picked from commit c3d965a)

Test for race between scrubber and compacted segment reupload:
1. Create manifest with 3 segments, remove the middle one
   from cloud storage so the detector reports it missing
2. Replace it in the manifest with a compacted version at
   the same offset range but different size_bytes
3. Assert generate_remote_segment_name() differs for the
   original vs compacted segment (v2/v3 names encode size)
4. Call process_anomalies() and assert the anomaly is
   filtered out as a false positive
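
The four steps can be walked through with a toy model. This sketch does not use the real test fixtures or detector: `seg`, `name_of()`, `detect_missing()`, `filter_anomalies()`, and `remaining_anomalies()` are hypothetical, cloud storage is modeled as a set of object names, and the name format is invented (it only assumes, as in step 3, that names encode size).

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical simplified segment record.
struct seg {
    std::int64_t base;
    std::int64_t last;
    std::uint64_t size_bytes;
};

// Hypothetical name format; assumption: the size is encoded in the name.
std::string name_of(const seg& s) {
    return std::to_string(s.base) + "-" + std::to_string(s.last) + "-"
           + std::to_string(s.size_bytes);
}

using manifest_t = std::map<std::int64_t, seg>; // keyed by base offset
using store_t = std::set<std::string>;          // object names in the bucket

// Toy detector: report every manifest segment whose object is absent.
std::vector<seg> detect_missing(const manifest_t& m, const store_t& store) {
    std::vector<seg> out;
    for (const auto& kv : m) {
        if (store.count(name_of(kv.second)) == 0) {
            out.push_back(kv.second);
        }
    }
    return out;
}

// Toy filter: keep a report only if the current manifest still names
// the exact same object at that base offset.
std::vector<seg>
filter_anomalies(const manifest_t& current, const std::vector<seg>& reported) {
    std::vector<seg> kept;
    for (const auto& r : reported) {
        auto it = current.find(r.base);
        if (it != current.end() && name_of(it->second) == name_of(r)) {
            kept.push_back(r);
        }
    }
    return kept;
}

std::size_t remaining_anomalies(bool reupload_compacted) {
    manifest_t m{{0, seg{0, 99, 4096}},
                 {100, seg{100, 199, 4096}},
                 {200, seg{200, 299, 4096}}};
    store_t store;
    for (const auto& kv : m) {
        store.insert(name_of(kv.second));
    }
    // Step 1: remove the middle object so the detector reports it missing.
    store.erase(name_of(m.at(100)));
    auto reported = detect_missing(m, store);
    // Step 2: replace it in the manifest with a smaller compacted segment.
    if (reupload_compacted) {
        seg compacted{100, 199, 1024};
        m.at(100) = compacted;
        store.insert(name_of(compacted));
    }
    // Step 4: count the anomalies that survive filtering.
    return filter_anomalies(m, reported).size();
}
```

In this model, `remaining_anomalies(true)` leaves no anomaly after filtering, while `remaining_anomalies(false)`, where no replacement happened, still reports the missing segment.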

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
(cherry picked from commit 34185af)
vbotbuildovich added this to the v25.2.x-next milestone Apr 9, 2026
vbotbuildovich added the kind/backport (PRs targeting a stable branch) label Apr 9, 2026
vbotbuildovich requested a review from oleiman April 9, 2026 23:43
oleiman self-assigned this Apr 10, 2026
@vbotbuildovich
Collaborator Author

CI test results

test results on build#82993
| test_status | test_class | test_method | test_arguments | test_kind | job_url | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FLAKY(PASS) | CloudRetentionTest | test_cloud_retention | {"cloud_storage_type": 2, "max_consume_rate_mb": null} | integration | https://buildkite.com/redpanda/redpanda/builds/82993#019d74c6-3edc-4403-bafa-62fa69a9294f | 10/11 | Test PASSES after retries. No significant increase in flaky rate(baseline=0.0000, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=CloudRetentionTest&test_method=test_cloud_retention |

Labels

area/redpanda kind/backport PRs targeting a stable branch
