Calling KinesisAsyncClient#close() throws BlockingOperationException #6863

@TomaszGaweda

Describe the bug

Hello,

In Hazelcast we use the AWS Java SDK in our Kinesis integration. We recently migrated from v1 to v2 and our tests became flaky: sometimes calling KinesisAsyncClient.close() results in:

[ WARN] [aws-java-sdk-NettyEventLoop-52-10] [i.n.u.c.DefaultPromise]: An exception was thrown by software.amazon.awssdk.http.nio.netty.internal.http2.Http2MultiplexedChannelPool$$Lambda/0x00007f5300911928.operationComplete()
io.netty.util.concurrent.BlockingOperationException: DefaultChannelPromise@4414e1f4(incomplete)
	at io.netty.util.concurrent.DefaultPromise.checkDeadLock(DefaultPromise.java:477)
	at io.netty.channel.DefaultChannelPromise.checkDeadLock(DefaultChannelPromise.java:159)
	at io.netty.util.concurrent.DefaultPromise.awaitUninterruptibly(DefaultPromise.java:283)
	at io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:137)
	at io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:30)
	at io.netty.channel.pool.SimpleChannelPool.close(SimpleChannelPool.java:408)
	at software.amazon.awssdk.http.nio.netty.internal.BetterSimpleChannelPool.close(BetterSimpleChannelPool.java:38)
	at software.amazon.awssdk.http.nio.netty.internal.HonorCloseOnReleaseChannelPool.close(HonorCloseOnReleaseChannelPool.java:80)
	at software.amazon.awssdk.http.nio.netty.internal.http2.Http2MultiplexedChannelPool.lambda$doClose$11(Http2MultiplexedChannelPool.java:419)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:604)
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:571)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:506)
	at io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:199)
	at software.amazon.awssdk.http.nio.netty.internal.http2.Http2MultiplexedChannelPool.lambda$doClose$12(Http2MultiplexedChannelPool.java:418)
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
	at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at java.base/java.lang.Thread.run(Thread.java:1583)

We call close() here: https://github.com/hazelcast/hazelcast/blob/master/extensions/kinesis/src/main/java/com/hazelcast/jet/kinesis/impl/source/KinesisSourcePSupplier.java#L134-L138

In v1 it worked fine. In v2 it is flaky: sometimes it works, sometimes it throws the exception above. Removing the call to close() "fixed" the issue, but that feels like a workaround rather than a proper solution. It is said that calling close() is not needed, but not that it is prohibited (and if it were, it would presumably throw every time).
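
The trace above suggests where the exception comes from: the listener registered in Http2MultiplexedChannelPool.doClose runs on a Netty event loop thread (aws-java-sdk-NettyEventLoop-52-10), and from there SimpleChannelPool.close() calls awaitUninterruptibly() on a promise that is still incomplete. Netty's checkDeadLock rejects a blocking wait made from the promise's own event loop. The following is a minimal JDK-only sketch of that guard, not SDK or Netty code; TinyEventLoop, closeFromLoop, and the IllegalStateException standing in for BlockingOperationException are all hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class EventLoopDeadlockSketch {

    /** Hypothetical stand-in for a single-threaded event loop that knows its own thread. */
    static final class TinyEventLoop {
        private volatile Thread loopThread;
        private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "tiny-event-loop");
            t.setDaemon(true);
            loopThread = t;
            return t;
        });

        boolean inEventLoop() {
            return Thread.currentThread() == loopThread;
        }

        /** Blocking wait with a deadlock guard: refuse to block on the loop thread. */
        void awaitUninterruptibly(CompletableFuture<?> promise) {
            if (promise.isDone()) {
                return; // nothing to wait for, so no guard is needed
            }
            if (inEventLoop()) {
                // Assumption: models the condition under which Netty throws
                // BlockingOperationException; a plain JDK exception is used here.
                throw new IllegalStateException("blocking wait on event loop thread");
            }
            promise.join();
        }

        /** Runs a task on the loop thread and returns its result to the caller. */
        <T> T runOnLoop(Callable<T> task) throws Exception {
            return executor.submit(task).get();
        }

        void shutdown() {
            executor.shutdown();
        }
    }

    /** Simulates a close listener that runs on the loop and waits for closePromise. */
    static String closeFromLoop(TinyEventLoop loop, CompletableFuture<Void> closePromise)
            throws Exception {
        return loop.runOnLoop(() -> {
            try {
                loop.awaitUninterruptibly(closePromise);
                return "closed";
            } catch (IllegalStateException e) {
                return "rejected: " + e.getMessage();
            }
        });
    }

    public static void main(String[] args) throws Exception {
        TinyEventLoop loop = new TinyEventLoop();
        // Promise already complete: the wait returns immediately.
        System.out.println(closeFromLoop(loop, CompletableFuture.completedFuture(null)));
        // Promise still incomplete: the guard rejects the wait.
        System.out.println(closeFromLoop(loop, new CompletableFuture<>()));
        loop.shutdown();
    }
}
```

In this model an already-completed promise passes the guard and only an incomplete one is rejected. If the real code behaves the same way, that could be one reason the failure is intermittent, but this is an assumption and has not been verified against the SDK.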

I see that Flink has similar problem: https://issues.apache.org/jira/browse/FLINK-37949

Our migration commit can be seen here: hazelcast/hazelcast@cb700c3 (if you want to check what has changed in our codebase).

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

close() works every time (not only some of the time) and does not throw BlockingOperationException.

Current Behavior

Intermittently, a warning is logged that an exception was thrown by software.amazon.awssdk.http.nio.netty.internal.http2.Http2MultiplexedChannelPool$$Lambda/0x00007f5300911928.operationComplete(): io.netty.util.concurrent.BlockingOperationException: DefaultChannelPromise

Reproduction Steps

Run KinesisIntegrationTest#restart_dynamicStream_graceful from the Hazelcast codebase in a loop; it will eventually fail.

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

2.42.31

JDK version used

JDK 17 through JDK 25

Operating System and version

Tests run on Linux; the problem also occurs on macOS 26.3.1

Labels: bug, needs-triage