Flaky component test: BulkDispatchWorkflows should wait for all child workflows to complete

## Summary

The component test `BulkDispatchWorkflows should wait for all child workflows to complete` in `test/component/Elsa.Workflows.ComponentTests/Scenarios/Activities/Composition/BulkDispatchWorkflows/BulkDispatchWorkflowsTests.cs` is **non-deterministic**. Two runs of the same commit on the same branch yield different outcomes.

## Evidence

Running on an internal fork of `main` @ commit `5e97e9130` (main + a workflow-only additive change, no source modification):

| Run | Workflow | Outcome for `BulkDispatchWorkflows_should_wait...` |
| --- | --- | --- |
| A | Internal `ci-mediawan.yml` (unit + component, `dotnet test` per-project, net10.0) | **PASS** — 188 passed / 3 skipped / 191 total in `Elsa.Workflows.ComponentTests.dll` |
| B | Upstream `packages.yml` (`Test with coverage` job, same per-project pattern, net10.0) | **FAIL** — `Assert.Equal() Failure: Values differ, Expected: 4, Actual: 3` |

Both runs executed on the **same commit SHA**, on `ubuntu-latest`, with .NET SDK 10.0.1xx, Release config, and `/p:CollectCoverage=true`. The only difference is which other test projects run alongside in the same job, and therefore the overall test duration and resource contention.

## Hypothesis

The assertion reads `Expected: 4 / Actual: 3`, which suggests that one of the four dispatched child workflows had not yet reported completion at the moment the parent asserted. That is the classic signature of a race on a shared completion signal, a missing await, or a premature read of a counter that child callbacks increment asynchronously.

Possible culprits (to be confirmed by someone with the runtime context):

1. The test polls/awaits on a fixed timeout rather than a deterministic signal
2. A child workflow completion event is dispatched fire-and-forget and may be observed out of order
3. Shared state (counter, dictionary) accessed without a memory barrier or other synchronization. This is reminiscent of the pattern that [#7284](https://github.com/elsa-workflows/elsa-core/pull/7284) fixed (`Fix BulkDispatchWorkflows sharing input dictionary across dispatches`), but in a different surface
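
To make the suspected failure mode concrete, here is a small shell analogy. It is not Elsa code: `dispatch_children` and `count_after_barrier` are hypothetical names, and the four background jobs merely stand in for the four child workflows. Each "child" records its completion asynchronously, and the "parent" counts completions only after an explicit barrier.

```shell
#!/bin/sh
# Hypothetical stand-in for the test: four "children" report completion
# asynchronously by appending a line to a shared results file ($1).
dispatch_children() {
  : > "$1"
  for n in 1 2 3 4; do
    ( sleep 1; echo "child $n completed" >> "$1" ) &
  done
}

# Deterministic variant: block on an explicit barrier before counting.
count_after_barrier() {
  results=$(mktemp)
  dispatch_children "$results"
  wait   # returns only once every background child has exited
  grep -c completed "$results"
  rm -f "$results"
}

count_after_barrier
```

If the `wait` were replaced by a fixed `sleep`, the count would depend on scheduling and machine load, which is the kind of behaviour culprit 1 describes.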

## Suggested next steps

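- Reproduce locally with `dotnet test --filter "BulkDispatchWorkflows should wait for all child workflows to complete"` in a loop (`for i in {1..30}; do ...; done`) to quantify the flake rate. Note that a `--filter` value without an operator is matched as a substring of `FullyQualifiedName`; if the spaced name is only a display name, `DisplayName~...` may be needed instead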
- Consider an explicit barrier / completion waiter instead of a sleep-then-assert
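- If the root cause is time-sensitive, a retry attribute such as `[Retry(3)]` (which, as far as I know, needs an add-on package or custom attribute under xUnit) is a temporary mitigation but not a fix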
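
A minimal sketch of the repro loop from the first bullet, assuming a POSIX shell. `flake_rate` is a hypothetical helper name: it runs whatever command it is given a fixed number of times and reports how many runs exited non-zero.

```shell
#!/bin/sh
# flake_rate RUNS COMMAND [ARGS...]
# Runs COMMAND RUNS times and prints how many runs exited non-zero.
flake_rate() {
  runs=$1
  shift
  failures=0
  i=1
  while [ "$i" -le "$runs" ]; do
    # Output is discarded here; redirect to a log file instead to keep failure details.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures/$runs runs failed"
}
```

Usage would look like `flake_rate 30 dotnet test ... --filter "..."`, with the project path and filter expression filled in from the first bullet. Since the CI failure appeared under `/p:CollectCoverage=true` and with other projects in the same job, running the loop under similar load may be needed to see the flake at all.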

I'm happy to file a follow-up PR if you have a direction in mind. Opening this primarily to flag the flake before it masks a future regression.

cc @sfmskywalker

