
Check active subscriptions before starting Office365 crawler#6231

Open
bbenner7635 wants to merge 1 commit into
opensearch-project:mainfrom
bbenner7635:bug/office365_list_subscriptions

@bbenner7635 bbenner7635 commented Nov 1, 2025

Contributor

Description

Check active subscriptions before starting Office365 crawler

Occasionally, if a user has multiple pipelines sharing credentials, startSubscriptions calls get throttled on startup:

Caused by: java.lang.SecurityException: Access forbidden: 403 Forbidden: "{"error":{"code":"429","message":"Too many frequent subscription start requests. Please retry again after 8m 46s."}}"
	at org.opensearch.dataprepper.plugins.source.microsoft_office365.RetryHandler.executeWithRetry(RetryHandler.java:44)
	at org.opensearch.dataprepper.plugins.source.microsoft_office365.Office365RestClient.startSubscriptions(Office365RestClient.java:105)

By calling listSubscriptions first, the crawler plugin starts subscriptions only for content types that are not yet enabled, which reduces the risk of startup failure.

Moreover, if listSubscriptions still fails after retrying, the plugin falls back to calling startSubscriptions anyway.
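The check-then-start logic described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `SubscriptionPlanner` class, the `contentTypesToStart` method, and the `Supplier`-based `listActive` parameter are hypothetical names chosen for the example, and the real `Office365RestClient` may structure this differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical sketch; not the actual Office365RestClient implementation. */
class SubscriptionPlanner {

    /**
     * Decides which configured content types still need a start request.
     *
     * @param configured content types the pipeline wants enabled
     * @param listActive stand-in for a listSubscriptions call (assumed to have
     *                   already applied its own retries); returns the content
     *                   types that are currently enabled
     */
    static List<String> contentTypesToStart(List<String> configured,
                                            Supplier<Set<String>> listActive) {
        Set<String> active;
        try {
            active = listActive.get();
        } catch (RuntimeException e) {
            // Listing failed even after retries: fall back to starting
            // every configured content type, as before this change.
            return new ArrayList<>(configured);
        }

        // Only request a start for content types that are not yet enabled.
        List<String> toStart = new ArrayList<>();
        for (String contentType : configured) {
            if (!active.contains(contentType)) {
                toStart.add(contentType);
            }
        }
        return toStart;
    }
}
```

With this shape, a pipeline whose content types are all already active sends no start requests at all, which is what avoids the throttling error shown above when several pipelines share credentials.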

Tested locally by disabling some content types and starting the pipeline:

2025-11-01T19:37:22,282 [test-pipeline-sink-worker-2-thread-1] INFO  org.opensearch.dataprepper.plugins.source.microsoft_office365.Office365RestClient - Content type Audit.Exchange needs to be started
2025-11-01T19:37:23,997 [test-pipeline-sink-worker-2-thread-1] INFO  org.opensearch.dataprepper.plugins.source.microsoft_office365.Office365RestClient - Content type Audit.SharePoint needs to be started
2025-11-01T19:37:35,374 [pool-3-thread-1] INFO  org.opensearch.dataprepper.plugins.aws.AwsSecretsSupplier - Finished retrieving latest secret in aws:secrets:office365-credentials.
2025-11-01T19:37:36,182 [test-pipeline-sink-worker-2-thread-1] INFO  org.opensearch.dataprepper.plugins.source.microsoft_office365.Office365RestClient - Successfully started 2 subscription(s)

Issues Resolved

N/A

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Brendan Benner <bbenner@amazon.com>