
Add metric tracking total number of open shards, stop setting ending sequence number once open shards are discovered #6260

Merged: graytaylor0 merged 1 commit into opensearch-project:main from graytaylor0:DdbShardSkipFix on Nov 13, 2025



@graytaylor0 graytaylor0 commented Nov 12, 2025

Member

Description

Adds a totalOpenShards metric, a DistributionSummary that is reported roughly every 10 minutes. It shows how many shards in the DynamoDB stream have no endingSequenceNumber, meaning they are still open. This is useful to track because each Data Prepper instance can process at most 150 open shards in parallel.
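
As a rough illustration of what the metric counts, the sketch below tallies shards that have no endingSequenceNumber. The class and method names (OpenShardCounter, ShardInfo, countOpenShards) are hypothetical and are not taken from the Data Prepper code base; in the actual change the resulting count would be recorded into the totalOpenShards DistributionSummary rather than returned.

```java
import java.util.List;

// Hypothetical sketch: counts open shards, i.e. shards with no endingSequenceNumber.
final class OpenShardCounter {

    // Hypothetical stand-in for a DynamoDB stream shard description.
    static final class ShardInfo {
        final String shardId;
        final String endingSequenceNumber; // null while the shard is still open

        ShardInfo(String shardId, String endingSequenceNumber) {
            this.shardId = shardId;
            this.endingSequenceNumber = endingSequenceNumber;
        }

        boolean isOpen() {
            return endingSequenceNumber == null;
        }
    }

    // Per-instance parallelism limit quoted in the PR description.
    static final int MAX_OPEN_SHARDS_PER_INSTANCE = 150;

    // Number of shards that have not been closed yet.
    static long countOpenShards(List<ShardInfo> shards) {
        return shards.stream().filter(ShardInfo::isOpen).count();
    }

    // True when one instance could not process every open shard in parallel.
    static boolean exceedsPerInstanceLimit(List<ShardInfo> shards) {
        return countOpenShards(shards) > MAX_OPEN_SHARDS_PER_INSTANCE;
    }
}
```

Comparing the count against the 150-shard limit is one way the metric might be used for alerting; that check is an illustration and is not part of this PR.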

This change also fixes a data loss scenario in which shards could be skipped. Previously, the endingSequenceNumber of the shard was used in a call to GetRecords to check AT_SHARD_ITERATOR whether there were any records in the shard; if that call returned no records, the shard was incorrectly assumed to be 100% empty and was skipped. Now, instead of assuming the shard is empty, we paginate fully through it to look for records and filter out records from before the export or start time. This could result in some extra pagination through shards during and right after the export, but it will not result in duplicate processing because of the timestamp-based filtering.
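
The corrected behavior can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: StreamRecord, Page, and readShard are hypothetical names, the getRecords function stands in for the real GetRecords call, and treating a record created exactly at the start time as in range is an assumption. The point it shows is that an empty page does not end the read; only a missing next iterator does, which for DynamoDB Streams happens once a closed shard is exhausted.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: paginate fully through a closed shard, keeping only
// records created at or after the export/start time.
final class ShardPaginationSketch {

    // Hypothetical stand-in for a stream record.
    static final class StreamRecord {
        final String sequenceNumber;
        final Instant createdAt;

        StreamRecord(String sequenceNumber, Instant createdAt) {
            this.sequenceNumber = sequenceNumber;
            this.createdAt = createdAt;
        }
    }

    // Hypothetical stand-in for one GetRecords response.
    static final class Page {
        final List<StreamRecord> records;
        final String nextIterator; // null once the closed shard is exhausted

        Page(List<StreamRecord> records, String nextIterator) {
            this.records = records;
            this.nextIterator = nextIterator;
        }
    }

    static List<StreamRecord> readShard(String firstIterator,
                                        Function<String, Page> getRecords,
                                        Instant startTime) {
        List<StreamRecord> kept = new ArrayList<>();
        String iterator = firstIterator;
        // An empty page is NOT treated as "shard is empty"; keep following iterators.
        while (iterator != null) {
            Page page = getRecords.apply(iterator);
            for (StreamRecord record : page.records) {
                // Drop records from before the export or start time.
                if (!record.createdAt.isBefore(startTime)) {
                    kept.add(record);
                }
            }
            iterator = page.nextIterator;
        }
        return kept;
    }
}
```

Because the filter is applied per record, re-reading a shard that overlaps the export window costs extra GetRecords calls but does not emit records that predate the start time.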

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…ust because there is no record at the ending sequence number of that shard

Signed-off-by: Taylor Gray <tylgry@amazon.com>
@graytaylor0 graytaylor0 merged commit 4528f4e into opensearch-project:main Nov 13, 2025
44 of 47 checks passed
