refactor(metrics): migrate buffer/retry metrics to unified VendorAPIMetricsRecorder#6428
Conversation
|
|
||
| Runnable leaderScheduler = new LeaderScheduler(coordinator, crawler); | ||
| LeaderScheduler leaderScheduler = new LeaderScheduler(coordinator, crawler); | ||
| leaderScheduler.setLeaseInterval(sourceConfig.getLeaseInterval()); |
There was a problem hiding this comment.
Can you remind me again that 1 min is the current leaseInterval for LeaderScheduler?
There was a problem hiding this comment.
Yes, the LeaderScheduler currently uses a 1-minute leaseInterval, which is why we set the default to 1 minute in CrawlerSourceConfig.
There was a problem hiding this comment.
Extracted this change into a separate PR for independent review: #6432
| bufferWriteFailuresCounter.increment(); | ||
| throw e; | ||
| } | ||
| metricsRecorder.recordBufferWriteLatency(() -> { |
There was a problem hiding this comment.
nit: we don't need to wrap this with the try-catch block for error right?
There was a problem hiding this comment.
Correct! We don't need the try-catch wrapper for two reasons: 1) writeRecordsWithRetry() already calls metricsRecorder.recordBufferWriteFailure() internally when exceptions occur, so wrapping it would double-count buffer write failures, and 2) it's already inside the outer try-catch block in executePartition() which properly handles all exceptions with metrics recording and retryable error categorization.
| * CrawlerClient implementation for Office 365 Management API audit logs. | ||
| * | ||
| * Handles partition execution, audit log retrieval, JSON processing, | ||
| * and buffer management with retry logic and comprehensive metrics recording. |
There was a problem hiding this comment.
am open to discussion, but I have concerns adding so many ReadMe information for each function.It is because one main goal is to add AI automation for future review and even code generation.One challenge is to avoid hitting context window. Do you think it make sense to remove function description on each connector?I am fine to keep it for shareable classes like VendorAPIMetricsRecorder
| * | ||
| * @return Duration for lease interval | ||
| */ | ||
| default Duration getLeaseInterval() { |
There was a problem hiding this comment.
Can we avoid having so many changes in one PR/CR? It would slow down the review.In this case, getLeaseInterval should have its own PR so folks can discuss this independently. It is also higher priority given it is high impact.
There was a problem hiding this comment.
Extracted this change into a separate PR for independent review: #6432
…etricsRecorder Signed-off-by: Alexander Christensen <alchrisk@amazon.com>
826e74d to
d266ff5
Compare
…etricsRecorder (opensearch-project#6428) Signed-off-by: Alexander Christensen <alchrisk@amazon.com>
…etricsRecorder (opensearch-project#6428) Signed-off-by: Alexander Christensen <alchrisk@amazon.com> Signed-off-by: Simon ELBAZ <elbazsimon9@gmail.com>
…etricsRecorder (opensearch-project#6428) Signed-off-by: Alexander Christensen <alchrisk@amazon.com>
…etricsRecorder (opensearch-project#6428) Signed-off-by: Alexander Christensen <alchrisk@amazon.com>
Description
This PR migrates buffer and retry metrics to the unified VendorAPIMetricsRecorder, consolidating metric recording across the codebase.
Changes
Testing
Signed-off-by: Alexander Christensen alchrisk@amazon.com