Skip to content

Replay Outgoing Messages When Inbox Receives Duplicate#4067

Open
iancooper wants to merge 5 commits intomasterfrom
replay_on_seen
Open

Replay Outgoing Messages When Inbox Receives Duplicate#4067
iancooper wants to merge 5 commits intomasterfrom
replay_on_seen

Conversation

@iancooper
Copy link
Copy Markdown
Member

When an Inbox receives a duplicate message, we may choose to replay any outgoing messages that it triggered when it was originally called. This makes us idempotent, in that our behavior matches that of the original call. It also allows us to recreate messages for downstream consumers who might have failed, by triggering from upstream, thus allowing us to replay a whole workflow until we reach the point where it failed.

Closes #2541

iancooper and others added 2 commits April 17, 2026 18:40
Introduces the design for CausationId tracking and OnceOnlyAction.Replay,
enabling automatic replay of downstream outbox messages when the inbox
detects a duplicate command. Accepted after 7 rounds of adversarial review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cate

Create and approve implementation task list (23 tasks) for the replay
outbox messages feature. Tasks cover structural prerequisites, core
behavior (CausationId propagation, UseInboxHandler replay logic,
pipeline validation), telemetry, DI registration, and all 18 persistent
store implementations. Minor ADR clarification for ReplayCausation
semantics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@iancooper iancooper self-assigned this Apr 17, 2026
@iancooper iancooper added 2 - In Progress feature request .NET Pull requests that update .net code V10.X labels Apr 17, 2026
codescene-delta-analysis[bot]

This comment was marked as outdated.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 17, 2026

Code Review — PR #4067: Replay Outgoing Messages When Inbox Receives Duplicate

This PR contains design and specification documents only — no implementation code yet. The changes are: PROMPT.md (root), docs/adr/0057-replay-outbox-on-inbox-duplicate.md, and 12 spec files under specs/0027-replay-matching-outbox-events-when-inbox-has-already-seen/.


🔴 Critical: PROMPT.md should not be committed

PROMPT.md at the repo root is an AI session state artifact — it tracks which round of ADR review was completed, which spec phases are approved, and what "next step" the AI assistant should take. This is an internal working document for the development workflow tool, not project documentation.

Recommendation: Remove PROMPT.md from this PR and add it to .gitignore. Its content is already captured in the spec files and ADR that are correctly committed.


✅ ADR Quality

docs/adr/0057-replay-outbox-on-inbox-duplicate.md is thorough and follows the project's ADR conventions well:

  • Problem statement is clear: no mechanism to replay a failed workflow from an upstream trigger
  • Decision rationale is solid — using CausationId as a scoped, single-invocation concept (vs. CorrelationId for request-reply, vs. future WorkflowId) avoids conceptual collision
  • The SupportsCausationTracking() runtime schema check is a correct defensive choice — fail fast at startup rather than at runtime
  • The tidy-first prerequisite list (tasks 1–4) correctly separates structural refactors from behavioral changes per project conventions

Design Observations

Interface duplication on SupportsCausationTracking()

Both IAmACausationTrackingInbox and IAmACausationTrackingOutbox expose both sync and async variants:

bool SupportsCausationTracking();
Task<bool> SupportsCausationTrackingAsync(CancellationToken cancellationToken = default);

This is a schema check that typically only needs to run once at startup (or can be cached). Consider whether the async variant adds real value, or if a single sync check (cached after first call) is sufficient. Most schema-existence checks against relational databases complete in microseconds on a live connection. The async overload adds interface surface that every implementing store (all 18 of them) must provide.

ReplayCausation args parameter

void ReplayCausation(string causationId, RequestContext? requestContext,
    Dictionary<string, object>? args = null);

The args dictionary follows the existing Brighter convention for extensibility, which is fine. However, the ADR does not document what keys/values are valid. Adding a brief note about expected args (or confirming none are currently used) would help implementors.

UseInboxHandler prerequisite: switching from InitRequestContext() to this.Context

The ADR correctly flags this as a tidy-first change (Task 1), but notes it "affects the existing Throw and Warn paths as well." This is a subtle behavioral change: existing tests may have been written against the old private context. The task spec calls this out with "needs focused regression tests" — make sure the existing When_Handling_A_Command_Once_Only_With_Throw_Enabled and Warn tests still pass after the refactor, not just after the full implementation.

Describe() global inbox injection

Task 3 adds global inbox attributes to Describe() output to match Build(). The ADR notes this is a shared infrastructure change that "should be treated as a tidy-first structural change with its own focused test coverage." This is the right call — the existing pipeline validation test suite should be run as a regression gate before any behavioral tasks proceed.

Relay vs. Replay enum name

The ADR (round 4 finding #3) notes that the requirements use "e.g." so Replay is not mandated. The name Replay is clear and consistent with the feature concept — this is fine as-is.


Spec Completeness

The tasks.md (23 tasks) is very well structured with clear TDD commands, file locations, and dependency ordering. A few observations:

  • Task 4 (expose Outbox from IAmAnOutboxProducerMediator) is a public API surface change. The ADR should explicitly document this interface change as a breaking/additive decision, not just as an implementation prerequisite.
  • Tasks 19–22 (all 18 persistent stores) are large in scope. The task list correctly gates these behind base test classes (Task 18), but there should be an explicit note about whether the persistent store migration SQL is expected in this PR or arrives separately. The ADR says "a separate PR provides migration tooling" — the tasks.md should reference that dependency.
  • Task 16 (add telemetry to existing Throw/Warn/Add paths) is labeled a "tidy improvement" but is included in the same tasks.md. Consider whether this should be a separate spec/PR to keep scope clear.

Summary

Area Status
PROMPT.md in repo root ❌ Should be removed
ADR quality and completeness ✅ Well-written
Interface design ⚠️ SupportsCausationTracking async variant worth revisiting
Tidy-first discipline ✅ Correctly scoped
Task list TDD discipline ✅ Follows project /test-first conventions
Scope creep (Task 16) ⚠️ Consider separating telemetry tidy from feature work
Persistent store migration dependency ⚠️ Needs explicit cross-PR reference in tasks.md

The core design is sound. The main ask is removing PROMPT.md before merge; the interface and scope observations are lower priority suggestions.

iancooper and others added 2 commits April 20, 2026 17:26
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
codescene-delta-analysis[bot]

This comment was marked as outdated.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 20, 2026

Code Review — PR #4067: Replay Outgoing Messages When Inbox Receives Duplicate

Overview

This PR is a specification and design phase PR only — it contains requirements, ADR 0057, and task definitions for the Replay-on-Inbox-Duplicate feature. No production code has been written yet (the PROMPT.md explicitly states "Implementation — not started"). The 1606 additions are almost entirely documentation and spec artefacts.


Issues

🔴 PROMPT.md Should Not Be Merged

PROMPT.md is a Claude Code session-resume artefact containing internal workflow state (task review history, implementation notes). It is simultaneously committed and gitignored in this PR, which means it will persist in the repository's git history even after being ignored for future commits.

Recommendation: Remove PROMPT.md from the branch before merging. The .gitignore entry is the right call, but the file itself should be deleted so it does not land in master's history.

🟡 README.md for Spec 0027 Shows Tasks as Incomplete

specs/0027-.../README.md shows:

- [ ] Tasks (`tasks.md`)

...but .tasks-approved is committed in the same PR, meaning tasks are approved.

Recommendation: Update the README checkbox to [x] Tasks.

🟡 Inconsistent specs/.current-specspecs/.current_spec Rename

The PR deletes specs/.current-spec (hyphen) and updates specs/.current_spec (underscore). This is a housekeeping fix but worth a brief mention in the PR description so reviewers understand the intent.


ADR 0057 Quality

The ADR is thorough and well-reasoned. Specific strengths:

  • Non-breaking opt-in design via role interfaces (IAmACausationTrackingInbox, IAmACausationTrackingOutbox) is the right approach. Third-party implementations continue to work unchanged.
  • Pipeline validation at startup using Specification<HandlerPipelineDescription> is consistent with ADR 0053 and catches misconfiguration before any message is processed.
  • Tidy-first prerequisites are clearly separated from behavioral changes — switching UseInboxHandler to use the pipeline's this.Context before adding the Replay path is correct.
  • Race condition acknowledged (sweeper dispatches while ReplayCausation clears DispatchedAt): the decision to accept this in favour of downstream inbox deduplication is sound and correctly documented.

One design consideration worth flagging:

SupportsCausationTracking() I/O cost at startup: The ADR describes this as a "runtime schema check" — for relational stores, this almost certainly requires a round-trip to the database (e.g. querying INFORMATION_SCHEMA). Pipeline validation runs at startup, and if the check is synchronous and hits the DB, it adds latency to application startup. For async stores this is worse as there will be a sync-over-async call path. The ADR should document whether this check is cached after the first call, and whether the async variant (SupportsCausationTrackingAsync) is used during async pipeline validation.


Spec / Requirements Quality

  • Requirements are clear and well-scoped.
  • Out-of-scope items (saga orchestration, immediate-send replay, migration tooling) are explicitly listed — good.
  • AC-8 covers all persistent stores, which aligns with the ADR's 18-store list.
  • Minor typo in requirements.md line 507: "Brigter consumer" and "te re-execution" / "re never actioned" — these don't affect correctness but should be fixed before merge.

Summary

Area Status
PROMPT.md in commit history ❌ Remove before merge
specs/0027/README.md tasks checkbox ⚠️ Update to [x]
ADR 0057 design quality ✅ Solid
SupportsCausationTracking() startup I/O cost ⚠️ Document caching/async behaviour
Requirements completeness ✅ Good — minor typos only
Non-breaking / opt-in design ✅ Correct approach

The design is sound and the specification artefacts are high quality. The main blocker before merging is removing PROMPT.md from the branch history.


Review generated by Claude Code

Copy link
Copy Markdown

@codescene-delta-analysis codescene-delta-analysis Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No application code in the PR — skipped Code Health checks.

See analysis details in CodeScene

Quality Gate Profile: Clean Code Collective
Install CodeScene MCP: safeguard and uplift AI-generated code. Catch issues early with our IDE extension and CLI tool.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 20, 2026

Code Review — PR #4067: Replay Outgoing Messages When Inbox Receives Duplicate

Note: This PR is a specification/planning PR — it contains no implementation code. All changes are documentation, ADR, and spec artifacts. The review below is therefore focused on the design quality and planning artifacts.


What's in this PR

  • .gitignore updated to exclude PROMPT.md
  • PROMPT.md — a Claude Code session resume file (accidentally committed)
  • docs/adr/0057-replay-outbox-on-inbox-duplicate.md — comprehensive ADR (624 lines)
  • specs/0027-replay-matching-outbox-events-when-inbox-has-already-seen/ — full spec: requirements, tasks (23 tasks), design review, approval markers
  • Review/approval artifacts for specs 0022 and 0026
  • .current_spec pointer updated; duplicate .current-spec (hyphen) file deleted

Issues

PROMPT.md should not be committed

The .gitignore update adds PROMPT.md, which is correct going forward, but the file itself is now tracked and is part of this PR's diff. It contains Claude Code internal session state (task history, round-by-round review notes, implementation roadmap). This should be removed from git tracking:

git rm --cached PROMPT.md

Without this, the session file becomes permanent history and will appear in everyone's working trees on checkout.


ADR Quality — docs/adr/0057-replay-outbox-on-inbox-duplicate.md

The ADR is well-structured and thorough. The following points are worth addressing before or during implementation:

1. DescribePipelines() also needs InboxConfiguration

The ADR correctly identifies that ValidatePipelines() creates PipelineBuilder without InboxConfiguration. However, DescribePipelines() (line 89 in BrighterPipelineValidationExtensions.cs) has the same gap. The review-design.md notes this at score 35. Since DescribePipelines() is a diagnostic tool used by operators to inspect pipeline state, giving it an incomplete view of the pipeline (missing global inbox attributes) makes its output misleading. Worth fixing as part of Task 3.

2. SupportsCausationTracking() synchronous schema check at startup

SupportsCausationTracking() is described as a "permanent runtime schema check." For persistent stores this likely means a DB query (e.g. checking if the CausationId column exists). When called during ValidatePipelines() at startup, this blocks application startup for every registered pipeline. For stores with slow connections (cloud-hosted DBs, cold containers), this could meaningfully extend startup time. The ADR should document the expected performance characteristics of this check, or note that implementations should cache the result.

3. Race condition framing could be clearer

The ADR describes the race condition — sweeper dispatches while ReplayCausation is clearing — and correctly notes it's acceptable because downstream inboxes handle duplicates. However, the framing implies only a single delivery. In practice, a message could be delivered twice: once by the in-flight sweep, and once after the dispatch state is cleared. The mitigation should explicitly say "downstream consumers with OnceOnly inboxes handle the duplicate." Without inboxes downstream, this becomes a real correctness concern. Worth adding a note about the requirement for downstream idempotency.

4. CausationId defaults to request.Id

Using the command's Id as the default CausationId is convenient, but the implications should be documented: if request.Id is not truly unique across handler invocations (e.g. deterministic IDs in tests or retried commands with the same ID), two different handler invocations could share the same CausationId, causing one invocation's replay to clear another's outbox messages. The ADR should note this assumption.


Requirements (requirements.md)

Minor: two typos in the Problem Statement — "Brigter consumer" → "Brighter consumer" and "those that we re never actioned" → "were never actioned". Small but this document is the canonical source of truth for the feature.


Tasks (tasks.md)

The task structure is clean with clear TDD enforcement. A few observations:

Task 3 — DescribePipelines() not in scope

As noted above, DescribePipelines() has the same InboxConfiguration gap as ValidatePipelines(). Consider adding it to Task 3 to fix both paths together.

Task 14 — 7 validation scenarios in one test file

The review-tasks.md (score 50) flags that Task 14 packs 7 scenarios into one test file and recommends [Theory]/[InlineData]. This is the right call — enforce it explicitly in the task description so the implementing agent follows it, rather than leaving it as a "recommendation."

Task 23 — "run all core tests" is underspecified

Task 23 says "Build + run all core tests" without listing which projects constitute "core." Given the breadth of this change (UseInboxHandler, PipelineBuilder, validation, InMemoryInbox/Outbox), consider enumerating the minimum test suite: Paramore.Brighter.Core.Tests, the OnceOnly suite, and the Validation suite at minimum.

Dependency on schema migration PR

The ADR notes a separate schema migration PR must land on master before this feature. This external dependency is not tracked in the task list. Suggest adding a prerequisite note or a blocking PR link so the implementation isn't merged ahead of its schema dependency.


Summary

Area Status
PROMPT.md in repo Must fix — remove from git tracking
ADR structure & completeness Good — minor gaps noted
DescribePipelines() gap Low priority — fix in Task 3
SupportsCausationTracking() startup cost Needs documentation
Race condition clarity Needs a note on downstream idempotency requirement
Requirements typos Trivial — worth fixing
Task 14 scenario structure Clarify [Theory] expectation in task
Task 23 test scope Enumerate specific suites
Schema migration PR dependency Track explicitly

The design itself is sound — the role interface approach (IAmACausationTrackingInbox/Outbox) is non-breaking and consistent with Brighter's existing patterns. The decision to use the outbox sweeper for re-dispatch rather than building a new dispatch path is the right call. The CausationId concept is well-scoped and distinct from CorrelationId/JobId.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2 - In Progress feature request .NET Pull requests that update .net code V10.X

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] Force Replay of Outbox on seeing duplicate Processed Inbox Message

1 participant