CrowdStrike client and coordinator implementation#5678
Conversation
Signed-off-by: nsgupta1 <nsgupta1@users.noreply.github.com>
| createPartitionForCrawling(leaderPartition, coordinator, latestModifiedTime); | ||
| long crawlTimeMillis = System.currentTimeMillis() - startTime; | ||
| log.debug("Crawling completed in {} ms", crawlTimeMillis); | ||
| crawlingTimer.record(crawlTimeMillis, TimeUnit.MILLISECONDS); |
There was a problem hiding this comment.
Leader Crawling in this case doesn't involve any network calls, tracking this time may not be much helpful. May be tracking the number of partitions helpful here? Something to think about.
There was a problem hiding this comment.
Yes, we are tracking the partitionCounter, I will remove Crawling time.
| for(int i = remainingDays; i > 0; i--) { | ||
| Instant startDate = todayUtc.minus(Duration.ofDays(i)); | ||
| createWorkerPartition(startDate, startDate.plus(Duration.ofDays(1)), coordinator); | ||
| updateLeaderProgressState(leaderPartition, i-1, leaderProgressState.getLastPollTime(), coordinator); |
There was a problem hiding this comment.
May be updatingLeaderProgressState just once at the end of the loop is also fine I guess as there are not network calls and possibility of failures inside the loop are super minimal.
There was a problem hiding this comment.
I will change this.
san81
left a comment
There was a problem hiding this comment.
Just added 2 suggestions. Nice work 👍
Thank you for quick review |
Signed-off-by: nsgupta1 <nsgupta1@users.noreply.github.com>
| */ | ||
| @Override | ||
| public void executePartition(SaasWorkerProgressState state, Buffer<Record<Event>> buffer, AcknowledgementSet acknowledgementSet) { | ||
| public void executePartition(CrowdStrikeWorkerProgressState state, Buffer<Record<Event>> buffer, AcknowledgementSet acknowledgementSet) { |
|
|
||
| try { | ||
| buffer.writeAll(records, (int) Duration.ofSeconds(bufferWriteTimeoutInSeconds).toMillis()); | ||
| } catch (Exception e) { |
There was a problem hiding this comment.
Its better to catch the expected exception rather than generalising it. This might mask the actual actual code error as service failure.
There was a problem hiding this comment.
WriteAll method doesn't throw a specific exception, that's why I had to catch Exception. Here is the method signature. void writeAll(Collection<T> records, int timeoutInMillis) throws Exception;
| buffer.writeAll(records, (int) Duration.ofSeconds(bufferWriteTimeoutInSeconds).toMillis()); | ||
| } catch (Exception e) { | ||
| log.error("Failed to write {} indicators to buffer", records.size(), e); | ||
| throw new RuntimeException("Buffer write failed for CrowdStrike indicators", e); |
There was a problem hiding this comment.
Convert this to a custom exception rather than throwing runtime exception
There was a problem hiding this comment.
I can address it in follow-up PR, I see all the other clients also have same implementation I can check with @san81 and create a specific exception for all Saas crawlers.
| .registerModule(new JavaTimeModule()); | ||
| private final LeaderProgressState state; | ||
|
|
||
| public LeaderPartition() { |
There was a problem hiding this comment.
No it is not used anymore
|
|
||
| @Named | ||
| public class PaginationCrawler implements Crawler { | ||
| public class PaginationCrawler implements Crawler<AtlassianWorkerProgressState> { |
There was a problem hiding this comment.
Is this crawler specific to Atlassian/Jira? Why use a Atlassian specific state for base crawler/s?
There was a problem hiding this comment.
Yes PaginationCrawler is specific to Atlassian and TimeSlice is specific to CrowdStrikeWorker.
There was a problem hiding this comment.
can we rename PaginationCrawler to AtlassianPaginationCrawler?
Signed-off-by: nsgupta1 <nsgupta1@users.noreply.github.com>
| CrowdStrikeIndicatorResult result = response.getBody(); | ||
| List<ThreatIndicator> indicators = result.getResults(); | ||
| if (indicators == null || indicators.isEmpty()) { | ||
| log.info("No threat indicators found for the time window {} to {}", startTime, endTime); |
There was a problem hiding this comment.
if it's empty should we continue to next page at all?
| @Min(0) | ||
| @Max(90) | ||
| @Valid | ||
| private int lookBackDays = DEFAULT_NUMBER_OF_LOOK_BACK_DAYS; |
| */ | ||
| @JsonProperty("published_date") | ||
| private long publishedDate = 0L; | ||
| private Instant publishedDate; |
There was a problem hiding this comment.
can all these variables be final?
| @JsonSubTypes.Type(value = AtlassianLeaderProgressState.class), | ||
| @JsonSubTypes.Type(value = CrowdStrikeLeaderProgressState.class) | ||
| }) | ||
| public interface LeaderProgressState { |
There was a problem hiding this comment.
does it not store information about failures too?
plz add java docs on what is "pollTime"
| } | ||
|
|
||
| /** | ||
| * Main crawling logic for CrowdStrike using time-based partitioning. |
There was a problem hiding this comment.
plz add code comments/java docs for each method
| double startCount = partitionsCreatedCounter.count(); | ||
| createPartitionForCrawling(leaderPartition, coordinator, latestModifiedTime); | ||
| double partitionsInThisCrawl = partitionsCreatedCounter.count() - startCount; | ||
| log.info("Total partitions created in this crawl: {}", partitionsInThisCrawl); |
There was a problem hiding this comment.
plz have log indicate. it is crowdstrikeClient's partition count
| private Instant endTime; | ||
|
|
||
| @JsonProperty("marker") | ||
| private String marker; |
There was a problem hiding this comment.
can we add java doc on what is a marker
| CrowdStrikeService crowdStrikeService; | ||
| private static final Logger log = LoggerFactory.getLogger(CrowdStrikeClient.class); | ||
| private final CrowdStrikeSourceConfig configuration; | ||
| private final int bufferWriteTimeoutInSeconds = 10; |
There was a problem hiding this comment.
Should this be configurable?
| @JsonProperty("remaining_days") | ||
| private int remainingDays; | ||
|
|
||
| public CrowdStrikeLeaderProgressState(@JsonProperty("last_poll_time") final Instant lastPollTime, @JsonProperty("remaining_days") int remainingDays) { |
There was a problem hiding this comment.
Nit: do we need the @JsonProperty annotation on both the constructor parameters and the member variables? Jackson should be able to deserialize with just a no-args constructor + setters on the member variables
…#5678) * CrowdStrike client and coordinator implementation Signed-off-by: nsgupta1 <nsgupta1@users.noreply.github.com>
Description
Issues Resolved
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.