Skip to content

Add aggregate metrics for rds source#5697

Merged
oeyh merged 3 commits into
opensearch-project:mainfrom
oeyh:rds/add-aggregate-metrics
May 19, 2025
Merged

Add aggregate metrics for rds source#5697
oeyh merged 3 commits into
opensearch-project:mainfrom
oeyh:rds/add-aggregate-metrics

Conversation

@oeyh

@oeyh oeyh commented May 16, 2025

Copy link
Copy Markdown
Collaborator

Description

Issues Resolved

N/A

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@oeyh oeyh marked this pull request as ready for review May 16, 2025 14:43
@oeyh oeyh changed the title Add aggregated metrics for rds source Add aggregate metrics for rds source May 16, 2025
Comment on lines +74 to +88
rdsSourceAggregateMetrics.getStream4xxErrors().increment();
LOG.error("Failed to connect to replication stream: Authentication failed. [{}]", e.getMessage());
} else if (e.getCause() != null && e.getCause().getMessage() != null && e.getCause().getMessage().contains(CONNECTION_REFUSED)) {
rdsSourceAggregateMetrics.getStream4xxErrors().increment();
LOG.error("Failed to connect to replication stream: Cannot connect to MySQL server. [{}]", e.getMessage());
} else if (e.getMessage() != null && e.getMessage().contains(FAILED_TO_DETERMINE_BINLOG_FILENAME)) {
rdsSourceAggregateMetrics.getStream4xxErrors().increment();
LOG.error("Failed to connect to replication stream: Binary logging not enabled on the server. [{}]", e.getMessage());
} else if (e.getMessage() != null && e.getMessage().contains(ACCESS_DENIED)) {
rdsSourceAggregateMetrics.getStream4xxErrors().increment();
LOG.error("Failed to connect to replication stream: Insufficient privileges. [{}]", e.getMessage());

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also emit specific metrics for each cases or add dimension to 4xxErrors metrics.

@oeyh oeyh May 19, 2025

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Added metrics for those specific errors.

}

private void categorizeError(Exception e) {
if (e.getMessage() != null && e.getMessage().contains(AUTHENTICATION_FAILED)) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added metrics for those specific errors.

LOG.error("Failed to create or process PostgreSQL replication stream: Replication slot does not exist. [{}]", e.getMessage());
} else if (e.getMessage() != null && e.getMessage().contains(PERMISSION_DENIED)) {
rdsSourceAggregateMetrics.getStream4xxErrors().increment();
rdsSourceAggregateMetrics.getStreamAccessDeniedErrors().increment();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update the unit test to verify on these new metrics added ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

dinujoh
dinujoh previously approved these changes May 19, 2025
oeyh added 3 commits May 19, 2025 15:11
Signed-off-by: Hai Yan <oeyh@amazon.com>
Signed-off-by: Hai Yan <oeyh@amazon.com>
Signed-off-by: Hai Yan <oeyh@amazon.com>
@oeyh oeyh force-pushed the rds/add-aggregate-metrics branch from fe340ac to d8d656e Compare May 19, 2025 20:12
@oeyh

oeyh commented May 19, 2025

Copy link
Copy Markdown
Collaborator Author

Did a rebase to resolve merge conflicts

rdsSourceAggregateMetrics.getStream4xxErrors().increment();
rdsSourceAggregateMetrics.getStreamServerNotFoundErrors().increment();
LOG.error("Failed to connect to replication stream: Cannot connect to MySQL server. [{}]", e.getMessage());
} else if (e.getMessage() != null && e.getMessage().contains(FAILED_TO_DETERMINE_BINLOG_FILENAME)) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to check through exception type?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are mostly just generic IOException so I didn't use exception type except the AuthenticationException

@oeyh oeyh merged commit 5d04c25 into opensearch-project:main May 19, 2025
45 of 47 checks passed
alparish pushed a commit to alparish/data-prepper that referenced this pull request May 22, 2025
* Add aggregated metrics for rds source

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Add more granular error metrics

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Add more tests

Signed-off-by: Hai Yan <oeyh@amazon.com>

---------

Signed-off-by: Hai Yan <oeyh@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants