Add Debouncer — coalesce rapid workflow calls into one execution #390

Merged: devhawk merged 33 commits into dbos-inc:main from easmith:feature/debouncer on May 28, 2026
Conversation

@easmith
Contributor

@easmith easmith commented May 19, 2026

Implements a debounce mechanism for DBOS workflows analogous to dbos-transact-py _debouncer.py. Multiple calls with the same key within a period are collapsed into a single user-workflow execution that runs with the most recently supplied arguments.

Architecture:

  • DebouncerServiceImpl: internal @workflow that runs a recv-loop, absorbing messages until the debounce period times out or the absolute debounceTimeout elapses, then starts the user workflow.
  • Debouncer: public fluent API. Enqueues the service workflow on _dbos_internal_queue with a deduplicationId derived from (workflowName, debounceKey). On DBOSQueueDuplicatedException, forwards a message to the running debouncer and waits for an ack.
  • DBOSExecutor.captureInvocation(): extracted from startWorkflow so Debouncer can capture a lambda's workflow call without executing it.
  • Auto-registration of DebouncerService in DBOS constructor so users need no boilerplate setup.
  • Internal system workflows filtered from getRegisteredWorkflows / getRegisteredWorkflowInstances to keep public counts clean.
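
The recv-loop described in the first bullet can be sketched in memory. This is a rough model under stated assumptions, not the code in this PR: `DebounceLoopSketch`, `Message`, and `awaitLatestArgs` are hypothetical names, a `BlockingQueue` stands in for DBOS `recv`, and the durable workflow machinery is left out entirely. It only shows how the per-message debounce period and the absolute `debounceTimeout` interact, with the latest arguments winning.

```java
import java.time.Duration;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical in-memory model of the debouncer's receive loop; not the DBOS implementation.
final class DebounceLoopSketch {

  // One debounce request: the arguments to run with and the quiet period that follows it.
  record Message(Object[] args, Duration debouncePeriod) {}

  // Absorbs messages until a quiet period passes with no new message, or until the
  // absolute timeout (measured from loop start) elapses. Returns the latest arguments.
  static Object[] awaitLatestArgs(
      Message first, BlockingQueue<Message> mailbox, Duration debounceTimeout) {
    long deadlineNanos = System.nanoTime() + debounceTimeout.toNanos();
    Message latest = first;
    while (true) {
      long remaining = deadlineNanos - System.nanoTime();
      if (remaining <= 0) {
        return latest.args(); // absolute timeout reached
      }
      // Never wait past the absolute deadline, even if the debounce period is longer.
      long waitNanos = Math.min(latest.debouncePeriod().toNanos(), remaining);
      Message next;
      try {
        next = mailbox.poll(waitNanos, TimeUnit.NANOSECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return latest.args();
      }
      if (next == null) {
        return latest.args(); // no newer call arrived within the wait
      }
      latest = next; // a newer call replaces the pending arguments
    }
  }

  public static void main(String[] args) {
    BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>();
    mailbox.add(new Message(new Object[] {"second"}, Duration.ofMillis(50)));
    mailbox.add(new Message(new Object[] {"third"}, Duration.ofMillis(50)));
    Object[] winner =
        awaitLatestArgs(
            new Message(new Object[] {"first"}, Duration.ofMillis(50)),
            mailbox,
            Duration.ofSeconds(5));
    System.out.println(winner[0]); // prints "third"
  }
}
```

In the real service the wait is a durable `recv` and the returned arguments are used to start the user workflow.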

Usage:

var handle = dbos.<String>debouncer()
    .withDebounceTimeout(Duration.ofMinutes(5))
    .debounce("key", Duration.ofSeconds(2), () -> svc.process(arg));
String result = handle.getResult();

Tests: 6 integration tests via Testcontainers Postgres covering single-call, multi-call coalescing, absolute timeout, independent keys, concurrent callers, and queue-based user workflow.

easmith added 8 commits May 19, 2026 17:56
Bug 1: userWorkflowId and messageId were generated as UUID.randomUUID()
outside any durable step. When debounce() is called from inside a workflow,
these values differ on every replay — the returned handle points to a
nonexistent workflow and the ack getEvent waits on the wrong key forever.
Fix: wrap UUID generation in a runStep when called from a workflow context.
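
To illustrate why the step wrapper matters, here is a toy model of step recording. It is an assumption-laden sketch: `ReplaySketch`, `runStep`, and `restartFromBeginning` are hypothetical, and an in-memory list stands in for the step results DBOS persists. The point is only that a value generated inside a recorded step comes back unchanged on replay, whereas a bare `UUID.randomUUID()` would not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical miniature of step recording; DBOS persists step results durably instead.
final class ReplaySketch {

  // Stands in for the durable log of step outputs for one workflow.
  private final List<Object> recorded = new ArrayList<>();
  private int nextStep = 0;

  // First execution: run the body and record its result.
  // Replay: return the recorded result without running the body again.
  @SuppressWarnings("unchecked")
  <T> T runStep(Supplier<T> body) {
    if (nextStep < recorded.size()) {
      return (T) recorded.get(nextStep++);
    }
    T result = body.get();
    recorded.add(result);
    nextStep++;
    return result;
  }

  // Simulates recovery: the workflow function runs again from the top.
  void restartFromBeginning() {
    nextStep = 0;
  }

  public static void main(String[] args) {
    ReplaySketch wf = new ReplaySketch();
    String firstRun = wf.runStep(() -> UUID.randomUUID().toString());
    wf.restartFromBeginning();
    String replayed = wf.runStep(() -> UUID.randomUUID().toString());
    System.out.println(firstRun.equals(replayed)); // true: the replay reuses the recorded id
  }
}
```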

Bug 2: Retrieving the existing debouncer's userWorkflowId via
status.input()[1] instanceof DebouncerContextOptions always fails on replay.
Java records are implicitly final; DBOSJavaSerializer uses NON_FINAL
DefaultTyping, so no @class type metadata is written for them. On
deserialisation from Object.class the element comes back as LinkedHashMap,
not as DebouncerContextOptions, causing an IllegalStateException.
Fix: publish userWorkflowId as a named event (DEBOUNCER_CHILD_ID_KEY) at
the start of the debouncer-workflow; callers read it via getEvent instead.

Object[] args round-trip through dbos.send/recv as generic JSON types:
long 5L serialises to JSON 5 and back to Integer(5) when the target
is Object.class, causing IllegalArgumentException when the method
expects a primitive long.

Adds JsonUtility.coerceArguments() call before startRegisteredWorkflow,
mirroring the coercion already applied in executeWorkflowById (line 1344).
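
A minimal sketch of what such a coercion step might look like follows. It is not the actual `JsonUtility.coerceArguments` implementation, whose signature and coverage are not shown in this PR; `CoerceSketch` and `coerce` are hypothetical names, and only a few numeric types are handled.

```java
import java.util.Arrays;

// Hypothetical stand-in for the coercion step; JsonUtility.coerceArguments may differ.
final class CoerceSketch {

  // Converts boxed numbers to the numeric type each parameter declares.
  // Non-numeric arguments and unhandled target types pass through unchanged.
  static Object[] coerce(Object[] args, Class<?>[] parameterTypes) {
    Object[] out = args.clone();
    for (int i = 0; i < out.length; i++) {
      Class<?> target = parameterTypes[i];
      if (out[i] instanceof Number n) {
        if (target == long.class || target == Long.class) {
          out[i] = n.longValue();
        } else if (target == int.class || target == Integer.class) {
          out[i] = n.intValue();
        } else if (target == double.class || target == Double.class) {
          out[i] = n.doubleValue();
        } else if (target == float.class || target == Float.class) {
          out[i] = n.floatValue();
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // What a generic JSON round trip may hand back for the original values 5L and 2.0.
    Object[] decoded = {Integer.valueOf(5), Integer.valueOf(2)};
    Object[] coerced = coerce(decoded, new Class<?>[] {long.class, double.class});
    System.out.println(Arrays.toString(coerced));
    System.out.println(coerced[0].getClass().getSimpleName()); // Long
    System.out.println(coerced[1].getClass().getSimpleName()); // Double
  }
}
```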

Adds numericArgsRoundTripCorrectly test that exercises long/double
parameters through the full debounce + coalesce path.

lookupExistingDebouncerId previously called listWorkflows and iterated
all active debouncer entries in Java to find the one matching the
deduplication id. When called from inside a workflow this result was
also serialised as a step, making it a potential OOM bomb under load.

Add WorkflowDAO.findWorkflowIdByDeduplicationId that issues a direct
point-lookup on the UNIQUE (queue_name, deduplication_id) index:

  SELECT workflow_uuid FROM workflow_status
   WHERE queue_name = ? AND deduplication_id = ?

Expose through SystemDatabase → DBOSExecutor → DBOSIntegration so
Debouncer.lookupExistingDebouncerId becomes a single delegation call.

Two review findings investigated:

Reviewed bug: "SQL without status filter -> livelock"
Finding: NOT real. updateWorkflowOutcome clears deduplication_id to NULL
on completion (WorkflowDAO line 329). By default, PostgreSQL UNIQUE constraints
treat NULL values as distinct, so the unique slot is freed and a new enqueue succeeds without
conflict. findWorkflowIdByDeduplicationId also returns null for completed
debouncers since the WHERE deduplication_id = ? predicate never matches NULL.
Added regression test reDebouncAfterWindowCloses that confirms two sequential
debounce windows on the same key both execute correctly.

Reviewed bug: "lookupExistingDebouncerId not a durable step"
Finding: REAL when debounce() is called from inside a workflow. If the parent
workflow crashes after DBOSQueueDuplicatedException but before the first step
(send) is recorded, recovery would re-execute lookupExistingDebouncerId
against the live DB rather than replaying a recorded result. This can produce
a different debouncer id and break the determinism of the subsequent send and
getEvent steps. Python wraps the equivalent call in call_function_as_step.
Fix: when DBOS.inWorkflow() && !DBOS.inStep(), record the lookup result as a
durable step "lookupDebouncer" so recovery replays it deterministically.

- Propagate caller workflow context (priority, appVersion, deduplicationId,
  timeout) to the user workflow via DebouncerContextOptions.
  Add these fields to DBOSContext, populate from ExecutionOptions.
- Change debouncerWorkflow return type String → void: return value was
  unused, Python returns None.
- Guard send with messageSent flag: only one message per debounce() call
- Replace unreachable childIdOpt.isEmpty() continue with IllegalStateException
…rkflows

- Add debounceVoid, absoluteTimeoutUsesLatestArgs, priorityPropagatedFromCallerContext tests
- executeWorkflowById now restores priority/appVersion from workflow_status so
  DBOSContext.currentPriority() is non-null inside dequeued workflows
- Skip queue-option validation for dequeued/recovered workflows
- DebouncerServiceImpl: skip priority/deduplicationId when no user queue
@easmith easmith marked this pull request as ready for review May 19, 2026 18:40
@devhawk
Collaborator

devhawk commented May 20, 2026

Hey @easmith, thanks for the contribution! I was traveling today so I haven't had a chance to review this, but I kicked off the GH actions run. I'll take a look tomorrow. (USA Pacific time zone)

@devhawk devhawk linked an issue May 20, 2026 that may be closed by this pull request
Comment thread transact/src/main/java/dev/dbos/transact/workflow/internal/DebouncerOptions.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/workflow/internal/DebouncerMessage.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/DBOS.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/internal/DBOSIntegration.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/workflow/Debouncer.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/workflow/Debouncer.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/workflow/Debouncer.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/workflow/Debouncer.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/workflow/Debouncer.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/DBOS.java
Comment thread transact/src/main/java/dev/dbos/transact/workflow/Debouncer.java Outdated
Comment thread transact/src/test/java/dev/dbos/transact/workflow/DebouncerTest.java Outdated
Comment thread transact/src/test/java/dev/dbos/transact/client/DebouncerClientTest.java Outdated
@devhawk
Collaborator

devhawk commented May 27, 2026

getting close. Still need to use runDbosFunctionAsStep in Debouncer. Also added new comments about how to handle missing workflow being debounced and regarding debouncer constants. We also need a few more tests:

  • Recovery/replay: No test restarts DBOS mid-debounce and verifies that replay produces the same IDs and doesn't start a second user workflow.
  • withDeduplicationId on the user workflow: The DebouncerOptions.deduplicationId field is forwarded to the user workflow, but no test exercises it.
  • DebouncerClient missing a "re-debounce after window closes" test: DebouncerTest has reDebouncAfterWindowCloses but DebouncerClientTest does not (the client's deduplication path is slightly different).

@easmith
Contributor Author

easmith commented May 28, 2026

  • DebouncerClient missing a "re-debounce after window closes" test: DebouncerTest has reDebouncAfterWindowCloses but DebouncerClientTest does not (the client's deduplication path is slightly different).

DebouncerClientTest.java:111 - already has that test

@easmith
Contributor Author

easmith commented May 28, 2026

Fixed the latest review comments and added the missing tests.
Ready to review again!

@easmith easmith requested a review from devhawk May 28, 2026 12:12
Comment thread transact/src/main/java/dev/dbos/transact/DBOS.java
Comment thread transact/src/main/java/dev/dbos/transact/internal/DBOSIntegration.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/internal/DBOSIntegration.java Outdated
Comment thread transact/src/main/java/dev/dbos/transact/workflow/internal/InternalWorkflows.java Outdated
@devhawk
Collaborator

devhawk commented May 28, 2026

I think the remaining asks from me are:

  • remove DBOSIntegration.registerInternalWorkflow. It's only called from DBOS constructor and it can call workflowRegistry.registerInternalWorkflow directly
  • remove DBOSIntegration.recordErrorForUnstartedWorkflow. Instead, inject a Supplier<DBOSExecutor> executorSupplier into InternalWorkflows and invoke DBOSExecutor.recordErrorForUnstartedWorkflow directly.

…nt integration wrappers

- DBOS constructor registers internal workflow via workflowRegistry directly
- InternalWorkflows uses Supplier<DBOSExecutor> for executor-level calls
- Remove unused Debouncer fail-fast check (error surfaced in debouncerWorkflow)
@easmith
Contributor Author

easmith commented May 28, 2026

Thanks for HIGH quality review. Ready to go =)

@easmith easmith requested a review from devhawk May 28, 2026 17:43
@devhawk devhawk requested a review from kraftp May 28, 2026 17:54
@devhawk
Collaborator

devhawk commented May 28, 2026

Thanks for HIGH quality review. Ready to go =)

I made a small change to debounceWorkflow - pulled the executorSupplier.get call outside the while loop and checked it for null. It was minor enough that I just made the change myself.
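
For readers following along, the shape of that change is roughly the following. This is a sketch with hypothetical stand-in types (`Executor`, `recordError`, `drain`), not the actual `InternalWorkflows` or `DBOSExecutor` code: the supplier is resolved once before the loop and checked for null, rather than being called on every iteration.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical shapes for illustration; the real DBOSExecutor and InternalWorkflows differ.
final class SupplierInjectionSketch {

  // Stand-in for the executor-level call that the internal workflow needs.
  interface Executor {
    void recordError(String workflowId, String message);
  }

  static final class InternalWorkflows {
    private final Supplier<Executor> executorSupplier;

    InternalWorkflows(Supplier<Executor> executorSupplier) {
      this.executorSupplier = Objects.requireNonNull(executorSupplier);
    }

    // Resolves the executor once, before looping, and fails fast if it is unavailable.
    int drain(Iterable<String> failedWorkflowIds) {
      Executor executor = executorSupplier.get();
      if (executor == null) {
        throw new IllegalStateException("executor is not available");
      }
      int count = 0;
      for (String id : failedWorkflowIds) {
        executor.recordError(id, "user workflow could not be started");
        count++;
      }
      return count;
    }
  }

  public static void main(String[] args) {
    InternalWorkflows workflows =
        new InternalWorkflows(() -> (id, msg) -> System.out.println(id + ": " + msg));
    System.out.println(workflows.drain(List.of("wf-1", "wf-2")));
  }
}
```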

I'm signed off but I want @kraftp to also take a look since he wrote the original debouncer code.

* <p>Not part of the public API — the debouncer infrastructure consumes this directly.
*/
public record DebouncerMessage(
@NonNull String messageId, @NonNull Object[] args, @NonNull Duration debouncePeriod) {}
Member

How does typing work here? As far as I can tell, all tests have primitive types, what happens if a more complex type (user-defined object?) is routed through this?

Collaborator

DebouncerMessage is used as a parameter for debouncerWorkflow as well as the message sent to debouncerWorkflow. Both of those code paths use the standard DBOSSerializer which is tested in JavaSerializerTest.

Member

@kraftp kraftp left a comment

This looks good to me, other than one comment about typing

@devhawk devhawk merged commit a8039cf into dbos-inc:main May 28, 2026
17 checks passed
Development

Successfully merging this pull request may close these issues.

Debouncing support

3 participants