Skip to content

Add Async proxy configuration support for AWS Kinesis/Cloudwatch input#26404

Merged
danotorrey merged 5 commits into
masterfrom
kinesis-proxy-support
Jun 26, 2026
Merged

Add Async proxy configuration support for AWS Kinesis/Cloudwatch input#26404
danotorrey merged 5 commits into
masterfrom
kinesis-proxy-support

Conversation

@Nithin-Kasam

@Nithin-Kasam Nithin-Kasam commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Description

Added async proxy configuration support for AWS Kinesis/CloudWatch input by introducing a new AWSAsyncProxyConfigurationProvider class.

Motivation and Context

The existing AWSProxyConfigurationProvider uses the Apache HTTP client, which cannot be used by AWS async clients (DynamoDB, CloudWatch, Kinesis) that the Kinesis input relies on. The Kinesis Client Library additionally requires HTTP/2, which is only supported by the Netty async client. This change enables the Kinesis/CloudWatch input to work through the configured Graylog HTTP proxy.

closes Graylog2/graylog-plugin-enterprise#14460, closes Graylog2/graylog-plugin-enterprise#14459

How Has This Been Tested?

-> Installed tiny proxy in my system.
-> Configured tiny.cfg file with
port 8888
Listen 0.0.0.0
Allow 127.0.0.1
ConnectPort 443
ConnectPort 80
-> Run the file . tinyproxy -c tiny.cfg
-> Configured http_proxy_uri = http://graylog:password12345@127.0.0.1:8888 in graylog.conf file
-> Verified logs by creating the input.

Added comprehensive unit tests in [AWSAsyncProxyConfigurationProviderTest.java] covering:

  • Proxy configuration without credentials
  • Proxy configuration with username/password credentials
  • Proxy configuration with username only (no password)
  • HTTPS proxy with credentials
  • Default HTTP port fallback (80)
  • Default HTTPS port fallback (443)

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactoring (non-breaking change)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have requested a documentation update.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.

@Nithin-Kasam Nithin-Kasam requested a review from a team June 18, 2026 11:39
@danotorrey

Copy link
Copy Markdown
Contributor

I just launched a test instance to give this one more smoke test with the latest.

@ryan-carroll-graylog

ryan-carroll-graylog commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

I just launched a test instance to give this one more smoke test with the latest.

Just saw this @danotorrey , I just set up a test instance running with proxy setup: https://server-pr-26404.dev.torch.sh/

Working on getting an input set up.

@ryan-carroll-graylog ryan-carroll-graylog left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and tests successfully on test instance: https://server-pr-26404.dev.torch.sh/graylog/search?q=gl2_source_input%3A6a3d72a96839d3dc77198b3a&rangetype=relative&relative=0

Sent logs to out tdir-test kinesis stream that the input is set up with:

aws kinesis put-record --stream-name tdir-test --region us-east-1 \
    --partition-key test \
    --data '{"message": "proxy test 1"}' \
    --cli-binary-format raw-in-base64-out

With proxy running. Verified by ssh-ing into the instance and viewing tinyproxy container logs while input running: sudo docker logs -f ubuntu-tinyproxy-1

Also smoke tested without proxy successfully.

Just left a few comments for nits: adding PR num to CL entry, updating comment.

Comment thread changelog/unreleased/issue-14460.toml Outdated
@danotorrey

Copy link
Copy Markdown
Contributor

I just launched a test instance to give this one more smoke test with the latest.

Just saw this @danotorrey , I just set up a test instance running with proxy setup: https://server-pr-26404.dev.torch.sh/

Working on getting an input set up.

@ryan-carroll-graylog Thanks a lot. I actually added this comment to the wrong PR, sorry. I meant to add it here: https://github.com/Graylog2/graylog-plugin-enterprise/pull/14470#issuecomment-4803170661.

final DynamoDbAsyncClientBuilder dynamoDbClientBuilder = DynamoDbAsyncClient.builder();
awsClientBuilderUtil.initializeBuilder(dynamoDbClientBuilder, request.dynamodbEndpoint(), region, credentialsProvider);
dynamoDbClientBuilder.httpClientBuilder(awsClientBuilderUtil.asyncHttpClientBuilder());
final DynamoDbAsyncClient dynamoClient = dynamoDbClientBuilder.build();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only feedback from Claude that seems pertinent to clean up is some leaked resources:

  graylog2-server/src/main/java/org/graylog/integrations/aws/transports/KinesisConsumer.java:116,120 — DynamoDB and CloudWatch async clients never closed
  dynamoClient and cloudWatchClient are local variables in run() with explicit Netty HTTP client builders. stop() calls only kinesisScheduler.shutdown(); the KCL Scheduler closes zero SDK clients on shutdown (verified in source). Each stop/restart cycle leaks a NioEventLoopGroup + connection pool per client.
  The pre-existing kinesisAsyncClient had the same issue, but that client previously used the SDK default lifecycle; these two now have custom Netty clients that must be explicitly closed.
The simplest fix: wrap all three clients in try-with-resources in run(). Since kinesisScheduler.run() blocks until shutdown completes, all three clients are still alive for the lifetime of the scheduler, and get closed automatically when the block exits — which happens after stop() triggers the graceful
  shutdown and kinesisScheduler.run() returns.

  try (DynamoDbAsyncClient dynamoClient = dynamoDbClientBuilder.build();
       CloudWatchAsyncClient cloudWatchClient = cloudwatchClientBuilder.build();
       KinesisAsyncClient kinesisAsyncClient = kinesisAsyncClientBuilder.build()) {

      // ... ConfigsBuilder, pollingConfig, listeners, scheduler setup ...

      LOG.debug("Starting Kinesis scheduler.");
      kinesisScheduler.run();
      LOG.debug("After Kinesis scheduler stopped.");
  } // all three clients closed here

  The sequence on stop is:
  1. External thread calls stop() → triggers kinesisScheduler.startGracefulShutdown()
  2. stop() waits up to 20s for the future
  3. Scheduler finishes → kinesisScheduler.run() returns in the run() thread
  4. Try-with-resources closes Kinesis, CloudWatch, DynamoDB clients in reverse order

let me know what you guys think @danotorrey @ryan-carroll-graylog

@ryan-carroll-graylog ryan-carroll-graylog left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and tests successfully with the latest!

@danotorrey

Copy link
Copy Markdown
Contributor

Digging into this now...

@danotorrey danotorrey left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and tested successfully. Verified both the standard flow against an existing Kinesis stream and the full auto-setup flow (stream creation, IAM role/policy, and CloudWatch subscription) on:

  • The test instance with the proxy on
  • Locally with the proxy both on and off
  • Tailed the proxy logs to confirm all the AWS traffic (Kinesis, DynamoDB, CloudWatch, IAM, STS) actually routes through it.
Image

@danotorrey danotorrey merged commit 0766b06 into master Jun 26, 2026
30 checks passed
@danotorrey danotorrey deleted the kinesis-proxy-support branch June 26, 2026 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants