Add TokenPaginationCrawler for SAAS plugins #6008
Merged
dlvenable merged 4 commits on Aug 25, 2025
Signed-off-by: Brendan Benner <bbenner@amazon.com>
Signed-off-by: Brendan Benner <bbenner@amazon.com>
cb6ddcf to
0e48019
san81
reviewed
Aug 22, 2025
    @JsonProperty("last_token")
    private String lastToken;

    private Instant lastPollTime;
Contributor
Author
Yes, because TokenPaginationCrawlerLeaderProgressState needs to implement LeaderProgressState, which declares the abstract method setLastPollTime. Implementing it keeps the new state class compatible with the existing PaginationCrawler.
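For readers following the thread, the constraint can be sketched roughly as below. This is a simplified illustration, not the actual Data Prepper source: the `LeaderProgressState` interface is reduced to the single method mentioned in this comment, `TokenLeaderStateSketch` is a hypothetical stand-in for `TokenPaginationCrawlerLeaderProgressState`, and the Jackson annotations are omitted so the snippet needs only the JDK.

```java
import java.time.Instant;

// Simplified stand-in for the shared leader-state contract. The real
// interface may declare more methods than the one named in this thread.
interface LeaderProgressState {
    void setLastPollTime(Instant lastPollTime);
}

// Hypothetical token-based leader state. The token drives pagination;
// lastPollTime is kept only so the class satisfies the shared contract.
class TokenLeaderStateSketch implements LeaderProgressState {
    private String lastToken;
    private Instant lastPollTime;

    TokenLeaderStateSketch(String lastToken) {
        this.lastToken = lastToken;
    }

    String getLastToken() {
        return lastToken;
    }

    void setLastToken(String lastToken) {
        this.lastToken = lastToken;
    }

    Instant getLastPollTime() {
        return lastPollTime;
    }

    @Override
    public void setLastPollTime(Instant lastPollTime) {
        this.lastPollTime = lastPollTime;
    }
}
```

Code written against `LeaderProgressState` can then accept the token-based state without changes, which appears to be the compatibility the author is describing.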
…issues Signed-off-by: Brendan Benner <bbenner@amazon.com>
b07991d to
b0b0f09
san81
previously approved these changes
Aug 22, 2025
san81
left a comment
Collaborator
There is a lot of overlap between this and the current crawler based on last poll time. We can think of a way to generalize the two approaches and minimize redundant code by passing a custom state object to the crawler. If we define the behavior as an interface and provide a different implementation of the state object for each case, that should fit both use cases while staying backward compatible.
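One possible shape for that generalization is sketched below. None of these names exist in the PR: `CrawlCursorState`, `TokenCursorState`, `PollTimeCursorState`, `CursorPage`, and `GeneralizedCrawlerSketch` are all hypothetical, and the real crawler has responsibilities (partition creation, checkpointing) that are left out here. The point is only that a single crawl step can stay the same while the state object decides what the cursor means.

```java
import java.time.Instant;
import java.util.List;
import java.util.function.Function;

// Hypothetical behavior contract: each state object owns its cursor.
interface CrawlCursorState {
    String cursor();                // value handed to the source to resume
    void advance(String newCursor); // record progress after a page
}

// Cursor is an opaque pagination token.
class TokenCursorState implements CrawlCursorState {
    private String token;
    TokenCursorState(String token) { this.token = token; }
    public String cursor() { return token; }
    public void advance(String newCursor) { this.token = newCursor; }
}

// Cursor is a timestamp, serialized as an ISO-8601 instant.
class PollTimeCursorState implements CrawlCursorState {
    private Instant lastPollTime;
    PollTimeCursorState(Instant lastPollTime) { this.lastPollTime = lastPollTime; }
    Instant lastPollTime() { return lastPollTime; }
    public String cursor() { return lastPollTime.toString(); }
    public void advance(String newCursor) { this.lastPollTime = Instant.parse(newCursor); }
}

// One page of results plus the cursor to use for the next request.
final class CursorPage {
    final List<String> items;
    final String nextCursor;
    CursorPage(List<String> items, String nextCursor) {
        this.items = items;
        this.nextCursor = nextCursor;
    }
}

final class GeneralizedCrawlerSketch {
    // A single crawl step that works with either state implementation.
    static List<String> crawlOnce(CrawlCursorState state,
                                  Function<String, CursorPage> source) {
        CursorPage page = source.apply(state.cursor());
        state.advance(page.nextCursor);
        return page.items;
    }
}
```

With this split, adding a new pagination style would mean adding a state implementation rather than a new crawler class, at the cost of pushing cursor serialization into the state objects.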
dlvenable
reviewed
Aug 22, 2025
san81
previously approved these changes
Aug 22, 2025
dlvenable
previously approved these changes
Aug 22, 2025
Signed-off-by: Brendan Benner <bbenner@amazon.com>
143eff7 to
bf84302
san81
approved these changes
Aug 23, 2025
dlvenable
approved these changes
Aug 25, 2025
dlvenable
left a comment
Member
Thank you for the contribution, @bbenner7635!
Description
We add token-based pagination, which closely follows the PaginationCrawler logic. It executes partitions based on pages of a certain batch size; however, a token-based API requires pages to be retrieved sequentially before they can be processed. It can continue to rely on PaginationCrawlerWorkerProgressState, since worker state is determined only by the items in the partition, not by a timestamp or token (src). Moreover, LeaderPartition fetches log IDs sequentially, while WorkerPartition fetches the log contents for each batch created by LeaderPartition.
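As a rough illustration of the leader/worker split described above (not the code in this PR), the sketch below walks a token-paginated listing in order, groups the returned IDs into batches, and processes one batch independently. The names `IdPage`, `TokenCrawlFlowSketch`, `buildBatches`, and `processBatch` are hypothetical, and a `null` next token is assumed to mean that no pages remain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// One page of IDs plus the token for the following page (null = last page).
final class IdPage {
    final List<String> ids;
    final String nextToken;
    IdPage(List<String> ids, String nextToken) {
        this.ids = ids;
        this.nextToken = nextToken;
    }
}

final class TokenCrawlFlowSketch {
    // Leader side: pages must be fetched one after another because each
    // request needs the token returned by the previous one.
    static List<List<String>> buildBatches(Function<String, IdPage> listIds,
                                           String startToken,
                                           int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        String token = startToken;
        do {
            IdPage page = listIds.apply(token);
            for (String id : page.ids) {
                current.add(id);
                if (current.size() == batchSize) {
                    batches.add(current);
                    current = new ArrayList<>();
                }
            }
            token = page.nextToken;
        } while (token != null);
        if (!current.isEmpty()) {
            batches.add(current); // final partial batch
        }
        return batches;
    }

    // Worker side: depends only on the IDs in its batch, so no timestamp
    // or token is needed in the worker's state.
    static List<String> processBatch(List<String> batch,
                                     Function<String, String> fetchContent) {
        List<String> contents = new ArrayList<>();
        for (String id : batch) {
            contents.add(fetchContent.apply(id));
        }
        return contents;
    }
}
```

Because each batch carries its own list of IDs, batches could be handed to workers in any order once the leader has created them; only the listing step is inherently sequential.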
Issues Resolved
Resolves #6007
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.