Describe the bug
Using DynamoDbAsyncClient, when I try to scan a table with at least a few thousand elements and use the per-item publisher, the operation sometimes hangs.
Something like this:
client.scanPaginator(ScanRequest.builder()
        .tableName("test")
        .consistentRead(true)
        .build())
    .items()
    .subscribe(_ -> counter.incrementAndGet())
    .join();
This always hangs for me with a table of the right size (see the repro repo).
If I don't use items(), i.e. just inspect the individual pages, I never see a hang:
client.scanPaginator(ScanRequest.builder()
        .tableName("test")
        .consistentRead(true)
        .build())
    .subscribe(response -> response.items().forEach(_ -> counter.incrementAndGet()))
    .join();
I believe the issue is the recursive processing in sendNextElement. If there are enough elements in a single page (in my repro, I could get 7K elements in a page, which was enough), each element makes a recursive call to sendNextElement, which probably overflows the stack. Nothing listens on the CompletionStage returned by whenComplete, so the resulting exception is silently dropped and the subscriber never receives another signal.
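To make the suspected mechanism concrete, here is a minimal sketch that uses only the JDK. It is not the SDK's code: `sendNext`, `simulatedStackLimit` and the class name are hypothetical, and the StackOverflowError is thrown by hand at a fixed depth so that the outcome is deterministic. It relies on two documented CompletableFuture behaviors: a callback attached to an already-completed future runs synchronously on the calling thread, and anything the callback throws is stored in the stage returned by whenComplete rather than rethrown.

```java
import java.util.concurrent.CompletableFuture;

class DroppedCallbackError {
    static int delivered = 0;

    // Hypothetical stand-in for a recursive per-item emitter: every item is
    // delivered one call level deeper than the previous one.
    static CompletableFuture<Void> sendNext(int remaining, int depth, int simulatedStackLimit) {
        if (remaining == 0) {
            return CompletableFuture.completedFuture(null);
        }
        if (depth > simulatedStackLimit) {
            // Assumption: simulates the JVM running out of stack at this depth.
            throw new StackOverflowError("simulated");
        }
        delivered++;
        // The future is already complete, so the callback runs right here,
        // synchronously, on top of the current stack frame.
        return CompletableFuture.<Void>completedFuture(null)
                .whenComplete((r, t) -> sendNext(remaining - 1, depth + 1, simulatedStackLimit));
    }

    public static void main(String[] args) {
        CompletableFuture<Void> top = sendNext(10_000, 1, 500);
        // The error is captured by the innermost whenComplete stage, which nobody
        // holds a reference to. The outermost stage completes normally.
        System.out.println("delivered=" + delivered
                + " topFailed=" + top.isCompletedExceptionally());
    }
}
```

In this sketch the call returns normally after delivering only part of the page, and no caller ever sees the error. A subscriber waiting on such an emitter would receive neither the remaining items nor a terminal signal, which would look like a hang.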
Regression Issue
Expected Behavior
The publisher should not hang; it should publish all elements of the scan and then complete.
Current Behavior
The publisher hangs forever
Reproduction Steps
https://github.com/ravi-signal/scan-publisher-hang/tree/main is a fully self-contained reproduction with a test that always hangs for me.
We noticed this in a production system pointed at a real DynamoDB instance, so I do not think this is a test artifact.
Possible Solution
- Convert the recursive processing loop of the current page's results into an iterative one
- Surround the whenComplete in a try/catch and propagate exceptions to the subscriber
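The two suggestions above could be combined roughly as follows. This is only a sketch under stated assumptions, not a patch against the SDK: it uses the JDK's java.util.concurrent.Flow interfaces as a stand-in for the Reactive Streams types, all class names are hypothetical, and it only models draining a single in-memory page. A work-in-progress counter keeps one drain loop active, so a subscriber that calls request() from inside onNext() adds demand instead of adding stack frames, and anything thrown while draining is routed to onError.

```java
import java.util.Iterator;
import java.util.concurrent.Flow;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

class IterativeDrainDemo {
    // Worst case for a recursive emitter: requests one item from inside onNext.
    static final class Counting implements Flow.Subscriber<Integer> {
        int count; boolean completed; Throwable error;
        private Flow.Subscription subscription;
        public void onSubscribe(Flow.Subscription s) { subscription = s; s.request(1); }
        public void onNext(Integer item) { count++; subscription.request(1); }
        public void onError(Throwable t) { error = t; }
        public void onComplete() { completed = true; }
    }

    static Counting run(int pageSize) {
        Counting subscriber = new Counting();
        Iterator<Integer> page = IntStream.range(0, pageSize).boxed().iterator();
        subscriber.onSubscribe(new IterativeItemsSubscription<>(page, subscriber));
        return subscriber;
    }

    public static void main(String[] args) {
        Counting s = run(1_000_000);
        System.out.println(s.count + " items, completed=" + s.completed + ", error=" + s.error);
    }
}

// Hypothetical subscription that drains one page iteratively.
final class IterativeItemsSubscription<T> implements Flow.Subscription {
    private final Iterator<T> items;
    private final Flow.Subscriber<? super T> subscriber;
    private final AtomicLong demand = new AtomicLong();
    private final AtomicInteger wip = new AtomicInteger(); // work-in-progress guard
    private volatile boolean done;

    IterativeItemsSubscription(Iterator<T> items, Flow.Subscriber<? super T> subscriber) {
        this.items = items;
        this.subscriber = subscriber;
    }

    @Override
    public void request(long n) {
        if (n <= 0) {
            fail(new IllegalArgumentException("request must be positive: " + n));
            return;
        }
        demand.getAndUpdate(d -> (d + n < 0) ? Long.MAX_VALUE : d + n); // saturating add
        drain();
    }

    @Override
    public void cancel() { done = true; }

    private void drain() {
        // Re-entrant or concurrent callers only record that more work arrived.
        if (wip.getAndIncrement() != 0) return;
        int missed = 1;
        for (;;) {
            try {
                while (!done && demand.get() > 0 && items.hasNext()) {
                    demand.decrementAndGet();
                    subscriber.onNext(items.next());
                }
                if (!done && !items.hasNext()) {
                    done = true;
                    subscriber.onComplete();
                }
            } catch (Throwable t) {
                fail(t); // propagate instead of dropping the failure
            }
            missed = wip.addAndGet(-missed);
            if (missed == 0) return;
        }
    }

    private void fail(Throwable t) {
        if (!done) {
            done = true;
            subscriber.onError(t);
        }
    }
}
```

With this shape the stack depth stays constant regardless of page size, so a page of several thousand items is delivered from a plain loop. Fetching the next page when the iterator runs dry is deliberately left out of the sketch.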
Additional Information/Context
No response
AWS Java SDK version used
2.33.5
JDK version used
openjdk version "24.0.1" 2025-04-15 OpenJDK Runtime Environment Temurin-24.0.1+9 (build 24.0.1+9) OpenJDK 64-Bit Server VM Temurin-24.0.1+9 (build 24.0.1+9, mixed mode, sharing)
Operating System and version
macOS 15.6.1