CrowdStrike API call and retry mechanism by nsgupta1 · Pull Request #5654 · opensearch-project/data-prepper

nsgupta1 · 2025-04-24T23:50:11Z

Description

Interactions with CrowdStrike Falcon Threat Intel API
Token Refresh mechanism
Validation for paginationLink sent by CrowdStrike Falcon API

Issues Resolved

Resolves #[Issue number to be closed when this PR is merged]

Check List

[Y] New functionality includes testing.
[N] New functionality has a documentation issue. Please link to it in this PR.
[Y] New functionality has javadoc added
[Y] Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: nsgupta1 <nsgupta1@users.noreply.github.com>

san81

Left a few comments but overall nice progress

san81 · 2025-04-25T00:32:08Z

+     *   @param paginationLink  An optional pagination URL suffix (used when fetching next pages).
+     *   @return                A {@link CrowdStrikeApiResponse} containing response body and headers.
+     */
+    public CrowdStrikeApiResponse getAllContent(Long startTime, Long endTime, String paginationLink) {


Please use Instant type for startTime and endTime

nit: See whether using Optional<> for paginationLink improves readability.
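A minimal sketch of what the two suggestions above could look like together, assuming a hypothetical helper named IndicatorRequests (not a class in this PR). It only shows how Instant parameters and an Optional<String> pagination link make the two call modes explicit; it does not reproduce the real getAllContent logic.

```java
import java.time.Instant;
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of the PR. */
final class IndicatorRequests {

    private IndicatorRequests() {
    }

    /**
     * Describes which request would be issued: a follow-up page when a
     * pagination link is present, otherwise a fresh time-window query.
     */
    static String describe(Instant startTime, Instant endTime, Optional<String> paginationLink) {
        return paginationLink
                .map(link -> "next page: " + link)
                .orElseGet(() -> "window: [" + startTime.getEpochSecond()
                        + ", " + endTime.getEpochSecond() + ")");
    }
}
```

With Optional, a caller has to state explicitly that there is no pagination link (Optional.empty()) instead of passing null.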

san81 · 2025-04-25T00:38:44Z

+                response.setHeaders(responseEntity.getHeaders());
+                return response;
+            } catch (Exception e) {
+                log.error("Error fetching CrowdStrike content from URI: {}", uri, e);


Use the NOISY marker to avoid flooding the logs when the same failure repeats.

log.error(NOISY, "Error fetching CrowdStrike content from URI: {}", uri, e);

Also, it looks like the catch block does nothing beyond logging. Do you really want to catch the exception here?

I wanted to log the exact URI that throws the error, but I'm already doing that in the invokeGetApi method, so I'll remove this try-catch block.

san81 · 2025-04-25T00:53:35Z

+
+    // Convenience method to get a specific header
+    public List<String> getHeader(String headerName) {
+        return headers.getOrDefault(headerName, Collections.emptyList());


headers itself could be null. Either handle the null case, or keep a single constructor that takes both arguments so that it is the only way to initialize this instance.

The CrowdStrike API returns headers even when the API call fails with an error. I'll create a constructor that takes headers and body as inputs.
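As a rough illustration of the single-constructor option, here is a sketch using a hypothetical ApiResponse class. It is an assumption for illustration: it uses a plain Map for headers, whereas the PR's class receives Spring's HttpHeaders, which matches header names case-insensitively.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical response holder; the only way to build it is with both body and headers. */
final class ApiResponse<T> {
    private final T body;
    private final Map<String, List<String>> headers;

    ApiResponse(T body, Map<String, List<String>> headers) {
        this.body = body;
        // Fail at construction time so getHeader never sees a null map.
        this.headers = Objects.requireNonNull(headers, "headers must not be null");
    }

    T getBody() {
        return body;
    }

    /** Returns the values for the given header, or an empty list if it is absent. */
    List<String> getHeader(String headerName) {
        return headers.getOrDefault(headerName, Collections.emptyList());
    }
}
```

Because the headers field is final and checked once, getHeader needs no further null handling.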

san81 · 2025-04-25T01:05:05Z

+            //There is still time to renew, or someone else must have already renewed it
+            return;
+        }
+        synchronized (tokenRenewLock) {


Maybe it would be better to have this synchronized block inside the getAuthToken() method? Otherwise, there is no protection if someone calls getAuthToken() concurrently.

@san81 I think we should keep the synchronized block inside the refreshToken method, because getAuthToken is also called from initCredentials. initCredentials is called only once, before we start the crawler, so it doesn't need synchronization. refreshToken is the only path that calls getAuthToken from multiple threads, and if several threads try to refresh the token at the same time we can block them right there.
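The locking arrangement described in this reply can be sketched as follows. This is a simplified illustration with a hypothetical TokenHolder class: the token fetch is faked with a counter, the 30-minute lifetime is an arbitrary placeholder, and the second expiry check inside the lock is an assumption about how waiting threads avoid renewing again.

```java
import java.time.Instant;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical token holder illustrating refresh under a lock. */
final class TokenHolder {
    private final Object tokenRenewLock = new Object();
    private final AtomicInteger fetchCount = new AtomicInteger();
    private volatile Instant expireTime = Instant.EPOCH; // starts out expired
    private volatile String token;

    void refreshToken() {
        if (Instant.now().isBefore(expireTime)) {
            // There is still time to renew, or someone else already renewed it.
            return;
        }
        synchronized (tokenRenewLock) {
            // Re-check: a thread that waited on the lock may find a fresh token.
            if (Instant.now().isBefore(expireTime)) {
                return;
            }
            getAuthToken();
        }
    }

    /** Stand-in for the real token call; only ever invoked while holding the lock here. */
    private void getAuthToken() {
        token = "token-" + fetchCount.incrementAndGet();
        expireTime = Instant.now().plusSeconds(1800); // placeholder lifetime
    }

    String getToken() {
        return token;
    }

    int getFetchCount() {
        return fetchCount.get();
    }
}
```

In this sketch, several threads calling refreshToken at once result in a single fetch; the rest return after the re-check.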

san81 · 2025-04-25T05:53:04Z

+     * @throws UnauthorizedException if the API returns 403 (Forbidden)
+     * @throws RuntimeException if all retries are exhausted or unexpected errors occur
+     */
+    public <T> ResponseEntity<T> invokeGetApi(URI uri, Class<T> responseType) {


I would recommend looking into reusing existing code here. The current RestClient code mostly lives inside the Atlassian commons, but we can pull it out into the source crawler and make it available to any future SaaS sources as well.

By reusing, we will minimize the future code maintenance and make sure every plugin gets the future fixes.

I see the additional headers setup in this method which was done here in the Atlassian case. That is also generalizable.

AddressValidation is something you probably don't need, since you are not even asking the customer for a URL. Apart from that, we can reuse the rest of the logic. Let's try that here.

Thanks @san81, I'll refactor this in the follow-up PR.

san81 · 2025-04-25T05:58:59Z

+                        sub = URLDecoder.decode(encodedSub, StandardCharsets.UTF_8);
+                    } catch (IllegalArgumentException e) {
+                        log.warn("Invalid URL encoding in subfilter: {}", encodedSub);
+                        continue;


Is it OK to continue in this case?

Yes, it is okay to continue here because there is no point in validating that subfilter any further. We just ignore that subfilter completely and move on to the next one. For example, if the URL contains last_updated:>=1745519529+bad%ZZsegment+_marker:<'abc'>, we ignore bad%ZZsegment and sanitize the URL to last_updated:>=1745519529+_marker:<'abc'>.
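The skip-and-continue behaviour in this reply can be reproduced with a small sketch. SubfilterSanitizer is a hypothetical class, and it assumes subfilters are separated by a literal '+'. The PR's validation does more than this; the sketch only drops subfilters whose percent-encoding cannot be decoded.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: keeps only subfilters whose URL encoding is valid. */
final class SubfilterSanitizer {

    private SubfilterSanitizer() {
    }

    static String dropUndecodable(String filter) {
        List<String> kept = new ArrayList<>();
        for (String encodedSub : filter.split("\\+")) {
            try {
                // Throws IllegalArgumentException on malformed escapes such as %ZZ.
                URLDecoder.decode(encodedSub, StandardCharsets.UTF_8);
            } catch (IllegalArgumentException e) {
                continue; // ignore this subfilter and move on to the next one
            }
            kept.add(encodedSub);
        }
        return String.join("+", kept);
    }
}
```

Calling dropUndecodable on the example string from the reply returns last_updated:>=1745519529+_marker:<'abc'>.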

san81 · 2025-04-25T06:03:09Z

+
+    @Test
+    void testValidEncodedCrowdStrikeUrlPreserved() throws MalformedURLException {
+        String url = "https://api.crowdstrike.com//intel/combined/indicators/v1" +


nit: There is a double / after crowdstrike.com. Probably a typo?

san81 · 2025-04-25T06:09:03Z

+        when(restClient.invokeGetApi(eq(sanitizedUri), eq(CrowdStrikeIndicatorResult.class)))
+                .thenReturn(responseEntity);
+
+        CrowdStrikeApiResponse response = service.getAllContent(null, null, paginationLink);


Is null an acceptable value for startTime and endTime?

No, startTime and endTime can no longer be null, because we are passing Instant now and startTime.getEpochSecond() will throw an NPE. I'll add a null check in the method itself. Thanks for catching this.

san81 · 2025-04-25T06:11:47Z

+/**
+ * CrowdStrike service Test
+ */
+class CrowdStrikeServiceTest {


See if you can add any test case to validate the searchCallLatencyTimer metric as well.

eirsep · 2025-04-25T18:33:32Z

+     *   @param paginationLink  An optional pagination URL suffix (used when fetching next pages).
+     *   @return                A {@link CrowdStrikeApiResponse} containing response body and headers.
+     */
+    public CrowdStrikeApiResponse getAllContent(Long startTime, Long endTime, String paginationLink) {


minor: Can we rename the method and add a note to the code comments indicating that this calls the indicator API, if it is not a generic get for all API responses?

srikanthjg · 2025-04-25T22:50:35Z

Having configurable client timeouts will help

I'll consider adding this as part of the source configuration, along with look_back_days, in a follow-up PR.

engechas · 2025-04-25T23:47:42Z

+                String filter1 = URLEncoder.encode(LAST_UPDATED + ":>=" + startTime.getEpochSecond(), StandardCharsets.UTF_8);
+                String filter2 = URLEncoder.encode(LAST_UPDATED + ":<" + endTime.getEpochSecond(), StandardCharsets.UTF_8);


minor: startTimeFilter, endTimeFilter would be better names than filter1, filter2

engechas · 2025-04-25T23:48:55Z

 @Getter
+@Setter


Minor: @Data contains both @Getter and @Setter functionality + has the added bonus of providing a toString implementation for logging

eirsep · 2025-04-26T16:23:14Z

+ * Represents the response returned from a CrowdStrike API call.
+ */
+@Getter @Setter
+ public class CrowdStrikeApiResponse {


nit: Can we rename this to CrowdstrikeIntelApiResponse for disambiguation, since there are other APIs we could be integrating in the future?

eirsep · 2025-04-26T16:29:27Z

+                return new URI(urlString);
+            } else {
+                // Manually construct and encode the query string
+                String filter1 = URLEncoder.encode(LAST_UPDATED + ":>=" + startTime.getEpochSecond(), StandardCharsets.UTF_8);


minor: Please also use constants such as GREATER_THAN and LESSER_THAN, and explain in the code comments, with an example, what the final URL will look like, for readability.
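One way the naming and constant suggestions could come together is sketched below. IndicatorFilterBuilder, the operator constant names, the "last_updated" value for LAST_UPDATED, the "filter=" parameter name, and the '+' separator are all assumptions for illustration, inferred from the example URL earlier in this thread rather than taken from the merged code.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.Instant;

/** Hypothetical builder showing named operators and a documented example result. */
final class IndicatorFilterBuilder {
    static final String LAST_UPDATED = "last_updated";
    static final String GREATER_THAN_OR_EQUAL = ":>=";
    static final String LESS_THAN = ":<";

    private IndicatorFilterBuilder() {
    }

    /**
     * Builds the encoded filter query for a half-open time window.
     * Example for start=1745519529 and end=1745523129:
     *   filter=last_updated%3A%3E%3D1745519529+last_updated%3A%3C1745523129
     */
    static String buildFilterQuery(Instant startTime, Instant endTime) {
        String startTimeFilter = URLEncoder.encode(
                LAST_UPDATED + GREATER_THAN_OR_EQUAL + startTime.getEpochSecond(),
                StandardCharsets.UTF_8);
        String endTimeFilter = URLEncoder.encode(
                LAST_UPDATED + LESS_THAN + endTime.getEpochSecond(),
                StandardCharsets.UTF_8);
        return "filter=" + startTimeFilter + "+" + endTimeFilter;
    }
}
```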

san81

Nice progress 👍

CrowdStrike API call and retry mechanism

bdcb41e

Signed-off-by: nsgupta1 <nsgupta1@users.noreply.github.com>

nsgupta1 requested review from KarstenSchnitter, chenqi0805, dinujoh, dlvenable, engechas, graytaylor0, kkondaka, oeyh, san81, sb2k16 and srikanthjg as code owners April 24, 2025 23:50

san81 reviewed Apr 25, 2025

eirsep reviewed Apr 25, 2025

srikanthjg reviewed Apr 25, 2025

Addressing review comments

cf2f975

Signed-off-by: ngsupta1 <guptaneha.e@gmail.com>

engechas previously approved these changes Apr 25, 2025

eirsep reviewed Apr 26, 2025

Addressing minor comments

d87303e

Signed-off-by: ngsupta1 <guptaneha.e@gmail.com>

nsgupta1 dismissed engechas’s stale review via d87303e April 28, 2025 03:42

nsgupta1 requested review from engechas and san81 April 28, 2025 03:43

nsgupta1 added 2 commits April 27, 2025 21:04

Fixing checkstyle errors

1cbd9bd

Signed-off-by: ngsupta1 <guptaneha.e@gmail.com>

Fixing unit test failure

79a8a43

Signed-off-by: ngsupta1 <guptaneha.e@gmail.com>

engechas approved these changes Apr 28, 2025

san81 approved these changes Apr 28, 2025

san81 merged commit 5340c3d into opensearch-project:main Apr 28, 2025
45 of 47 checks passed
