Skip to content

Fix requestsTooLarge metric reporting when decompression buffer overflows on armeria in otel logs source#6633

Merged
graytaylor0 merged 2 commits into
opensearch-project:mainfrom
graytaylor0:MetricFix413
Mar 16, 2026
Merged

Fix requestsTooLarge metric reporting when decompression buffer overflows on armeria in otel logs source#6633
graytaylor0 merged 2 commits into
opensearch-project:mainfrom
graytaylor0:MetricFix413

Conversation

@graytaylor0

@graytaylor0 graytaylor0 commented Mar 12, 2026

Copy link
Copy Markdown
Member

Description

This change handles an edge case where the otel logs source does not report the "requestsTooLarge" metric despite throwing a 413 exception.

The edge case happens when the source is receiving compressed requests. If the decompressed request exceeds the max_request_length, the decompression buffer in armeria throws a 413 error, but it does not increment the metric "requestsTooLarge" because the exception handler does not run in this scenario.

Issues Resolved

Related to #5594

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…lows on armeria in otel logs source

Signed-off-by: Taylor Gray <tylgry@amazon.com>

@KarstenSchnitter KarstenSchnitter left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for providing this change. I saw a test earlier. Is it possible to test this addition?

final HttpStatus expectedStatus,
final Throwable throwable) {
assertThat("Http Status", response.status(), equalTo(expectedStatus));
assertThat("Http Status", response.status(), equalTo(expectedStatus));

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unecessary whitespace.

Suggested change
assertThat("Http Status", response.status(), equalTo(expectedStatus));
assertThat("Http Status", response.status(), equalTo(expectedStatus));

@graytaylor0 graytaylor0 Mar 12, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KarstenSchnitter The test I had did not actually cover the new code. I have not been able to find a good way to test this code path with log access writer. I think there needs to be some actual integration tests that don't mock the OtelLogsSource server to cover this. If you see a way to add a unit test for this let me know.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I do not see an easy possibility. Furthermore, it would be more of a test of armeria.

Signed-off-by: Taylor Gray <tylgry@amazon.com>
serverBuilder.http(oTelLogsSourceConfig.getPort());
}

serverBuilder.accessLogWriter(log -> {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this catches all cases?
When buffer.writeBytes(request.toByteArray(), null, bufferWriteTimeoutInMillis); throws an exception, we catch and throw new BufferWriteException(e.getMessage(), e);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this exception already handled? This would indicate to me, that this PR closes a hole in the existing error handling.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this should already be handled. The problem was that armeria was swallowing the exception for decompression overflowing the max request size, and not running the exception handler code that increments this metric.

@kkondaka

Copy link
Copy Markdown
Collaborator

@graytaylor0 we should probably add the siimilar metric for OTEL traces and metrics as well, right?

@KarstenSchnitter

Copy link
Copy Markdown
Collaborator

@graytaylor0 we should probably add the siimilar metric for OTEL traces and metrics as well, right?

I agree. In my experience the issue is most likely with metrics.Note, that this will create a conflict with #6604.

@graytaylor0

Copy link
Copy Markdown
Member Author

@graytaylor0 we should probably add the siimilar metric for OTEL traces and metrics as well, right?

Yes we should. I will follow up with a similar change in those sources.

@graytaylor0

Copy link
Copy Markdown
Member Author

@graytaylor0 we should probably add the siimilar metric for OTEL traces and metrics as well, right?

I agree. In my experience the issue is most likely with metrics.Note, that this will create a conflict with #6604.

Yes I will make a similar change in the other sources as a follow up.

@graytaylor0 graytaylor0 merged commit 2c95e70 into opensearch-project:main Mar 16, 2026
70 of 72 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants