Metric Centralization through Dependency Injection#6354
Conversation
| */ | ||
| private void waitUntilPeerListPopulated(final AwsCloudMapPeerListProvider objectUnderTest) { | ||
| await().atMost(5, TimeUnit.SECONDS) | ||
| await().atMost(30, TimeUnit.SECONDS) |
There was a problem hiding this comment.
This unit tests fails intermittently.
AwsCloudMapPeerListProviderTest > WithSeveralFailedAttempts > listener_gets_list_after_several_failed_attempts() FAILED
java.lang.AssertionError:
Expected: a value greater than <0>
but: <0> was equal to <0>
AwsCloudMapPeerListProviderTest > WithSeveralFailedAttempts > getPeerList_returns_value_after_several_failed_attempts() FAILED
java.lang.AssertionError:
Expected: a value greater than <0>
but: <0> was equal to <0>
Increasing the wait time appeared to fix the issue. So I am assuming there is some scheduling issue with the tests that doesn't allow the full test loop to complete before the 5 second wait period.
| */ | ||
| private void waitUntilPeerListPopulated(final AwsCloudMapPeerListProvider objectUnderTest) { | ||
| await().atMost(5, TimeUnit.SECONDS) | ||
| await().atMost(30, TimeUnit.SECONDS) |
There was a problem hiding this comment.
This is too long. Do you really need this big of a wait time? It increased overall build time, which is already not in a good state.
There was a problem hiding this comment.
I am not really sure what is a good wait time but I can try reducing it to 10.
There was a problem hiding this comment.
10 is definitely better than 30, but I would first understand why 5 is not enough.
There was a problem hiding this comment.
I can revert if need be, I am not sure why the peer list is sometimes not populated within the 5 second time frame but seems to be more consistent with the 10 second time frame and I don't have the bandwidth to take a deeper look at the moment.
There was a problem hiding this comment.
It appears that it still occasionally fails to populate the list when this is 10 seconds, removed this change.
| try { | ||
| headers.setBearerAuth(authConfig.getAccessToken()); | ||
| apiCallsCounter.increment(); | ||
| provideApiRequestsCounter(pluginMetrics).increment(); |
There was a problem hiding this comment.
I am actually not sure this kind of MetricHeper static method based seggregation is really helping with anything. I would argue that having pluginMetrics.counter(TOTAL_API_REQUESTS).increment being here is more readable than jumping to a method to get that reference. What are we trying to solve with additional method hop?
There was a problem hiding this comment.
The goal is to unify metrics for consistency across plugins—addressing naming inconsistencies, creating a single source of truth for metric definitions, and reducing redundant test burden by centralizing metric logic in one place rather than duplicating it across each plugin. Once this change is merged in, I'll work on migrating the other crawlers over to use this same pattern.
2aad063 to
9da6959
Compare
796fc85 to
8a55dda
Compare
| public Office365RestClient(final Office365AuthenticationInterface authConfig, | ||
| final PluginMetrics pluginMetrics) { | ||
| // TODO: Abstract into a Office365PluginMetrics | ||
| final SearchApiMetricsRecorder searchApiMetrics, |
There was a problem hiding this comment.
I don't see a good reason to have 3 record. This constructor would need to be updated if we add additional metric.
We should combine all of them to one Recorder called VendorAPIMetricsRecorder
The usage is like
VendorAPIMetricsRecorder.searchAPICounter
There was a problem hiding this comment.
Sounds good, combine the recorders into one recorder.
8a55dda to
7b4ab5f
Compare
7b4ab5f to
e714a37
Compare
|
|
||
| public VendorAPIMetricsRecorder error(Exception exception) { | ||
| // Record specific error types | ||
| if (exception instanceof HttpClientErrorException) { |
There was a problem hiding this comment.
Shouldn't we still emit these metrics like previously emit here ? These metrics were agreed upon with ATP team.
private static final String REQUEST_ACCESS_DENIED = "requestAccessDenied";
private static final String REQUEST_THROTTLED = "requestThrottled";
private static final String RESOURCE_NOT_FOUND = "resourceNotFound";
There was a problem hiding this comment.
Good catch! I will update this.
| }, authConfig::renewCredentials); | ||
| }, authConfig::renewCredentials, metricsRecorder::recordAuthFailure); | ||
| } | ||
| return null; // Void return for Supplier |
There was a problem hiding this comment.
nit: returning null is fine and works, but to be cleaner, we can have an overload of recordAuthLatency as a runnable operation in VendorAPIMetricsRecorder this way its a little more explicit for devs later and we can get rid of the redundant return null. Like this:
public void recordAuthLatency(Runnable operation) {
authLatencyTimer.record(operation);
}
|
|
||
| public void recordGetResponseSize(ResponseEntity<?> response) { | ||
| if (response != null && response.getBody() != null) { | ||
| getResponseSizeSummary.record(response.getHeaders().getContentLength()); |
There was a problem hiding this comment.
does getContentLength return -1 if not found? could record -1 as a metric for those cases.
There was a problem hiding this comment.
There was a problem hiding this comment.
+1 on this. We should not record when the value is negative
There was a problem hiding this comment.
Quick offline discussion summary:
The current implementation already emits -1 following the getContentLength() behavior. However, this could cause a small downward skew when calculating the average of this metric. To avoid this, we'll emit 0 instead of -1 when no body is present.
8adf278 to
fe03411
Compare
| HttpClientErrorException httpE = (HttpClientErrorException) exception; | ||
| HttpStatus status = httpE.getStatusCode(); | ||
|
|
||
| if (status == HttpStatus.FORBIDDEN || status == HttpStatus.UNAUTHORIZED) { |
There was a problem hiding this comment.
super-nit: We should use HttpStatus.FORBIDDEN == status instead. Same comment for the other checks here.
There was a problem hiding this comment.
I can update that really quick.
fe03411 to
93df770
Compare
| String.class | ||
| ); | ||
| log.debug("Started subscription for {}: {}", contentType, response.getBody()); | ||
| metricsRecorder.recordAuthSuccess(); |
There was a problem hiding this comment.
I don't think m365 subscription is an auth action. We should not emit metric here.
There was a problem hiding this comment.
renewCredentials function is where auth metric should be emitted.
There was a problem hiding this comment.
Reverted the change, will work on Auth Metrics and API specific metrics in another PR.
2520f76 to
23b6494
Compare
This change centralizes metrics creation and management by implementing dependency injection for metrics in the Office365 source plugin and other remaining components. This ensures consistent metrics handling across the codebase. Signed-off-by: Alexander Christensen <alchrisk@amazon.com>
23b6494 to
8844f10
Compare
…t#6354) This change centralizes metrics creation and management by implementing dependency injection for metrics in the Office365 source plugin and other remaining components. This ensures consistent metrics handling across the codebase. Signed-off-by: Alexander Christensen <alchrisk@amazon.com> Co-authored-by: Alexander Christensen <alchrisk@amazon.com> Signed-off-by: Nathan Wand <wandna@amazon.com>
…t#6354) This change centralizes metrics creation and management by implementing dependency injection for metrics in the Office365 source plugin and other remaining components. This ensures consistent metrics handling across the codebase. Signed-off-by: Alexander Christensen <alchrisk@amazon.com> Co-authored-by: Alexander Christensen <alchrisk@amazon.com> Signed-off-by: Simon ELBAZ <elbazsimon9@gmail.com>
…t#6354) This change centralizes metrics creation and management by implementing dependency injection for metrics in the Office365 source plugin and other remaining components. This ensures consistent metrics handling across the codebase. Signed-off-by: Alexander Christensen <alchrisk@amazon.com> Co-authored-by: Alexander Christensen <alchrisk@amazon.com>
…t#6354) This change centralizes metrics creation and management by implementing dependency injection for metrics in the Office365 source plugin and other remaining components. This ensures consistent metrics handling across the codebase. Signed-off-by: Alexander Christensen <alchrisk@amazon.com> Co-authored-by: Alexander Christensen <alchrisk@amazon.com>
Description
Metrics Available in VendorAPIMetricsRecorder
Search Metrics
Get Metrics
Authentication Metrics
Shared Metrics
Error Metrics
Future Work:
Testing:
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.