fix(webapp): stop replica lag from double-triggering session runs and 404ing fresh sessions by ericallam · Pull Request #3914 · triggerdotdev/trigger.dev

ericallam · 2026-06-12T09:56:48Z

Summary

Two read-replica races on the session APIs could break chats whose first activity lands inside the replication window (or any time the replica lags):

A session's first .in append or .out subscribe could fail with a 404 for a session that exists on the writer, because the route resolved the Session row on the replica only.
ensureRunForSession probed run liveness on the replica, so a probe miss on a run triggered moments earlier was judged "run is dead" and a second live run was spawned for the same session. Both runs then consumed the same input stream, producing duplicated turns and doubled responses (and doubled LLM cost).

Fix

Liveness now re-probes the writer before declaring the current run dead (the old code already fell back to the writer, but only to recover the friendlyId, after the wrong verdict was made). Session resolution on the append and subscribe/init routes goes through a new resolveSessionWithWriterFallback, which stays replica-first on the hot path and only touches the writer on a miss.

Reproduced and verified against a local streaming replica with an artificial apply delay: pre-fix, a send immediately after session creation reliably produced either the 404 or two executing runs with a doubled response; post-fix, the same flow produces exactly one run and one response.

Also rides along: the local docker replica's default apply delay drops from 150ms to a realistic 20ms (override via REPLICA_APPLY_DELAY when you want to deliberately widen the race window).

… 404ing fresh sessions ensureRunForSession probed run liveness on the read replica, so a probe miss on a just-triggered run was judged dead and a second live run was spawned for the same session, duplicating turns and responses. The append and subscribe/init session routes also resolved the Session row on the replica only, failing a fresh session's first append or subscribe inside the replication window. Liveness now re-probes the writer before declaring a run dead, and session resolution on those routes falls back to the writer on a replica miss.

changeset-bot · 2026-06-12T09:56:54Z

⚠️ No Changeset found

Latest commit: c0ccb24

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

coderabbitai · 2026-06-12T09:57:05Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3f9ab13a-0640-4504-a3c3-cb0ae8545441

📥 Commits

Reviewing files that changed from the base of the PR and between 8856bc6 and c0ccb24.

📒 Files selected for processing (1)

docker/docker-compose.yml

📜 Recent review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)

GitHub Check: webapp / 🧪 Unit Tests: Webapp (5, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (7, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (4, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (9, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (3, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (10, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (1, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (8, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (2, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (6, 10)
GitHub Check: typecheck / typecheck
GitHub Check: e2e-webapp / 🧪 E2E Tests: Webapp
GitHub Check: 🛡️ E2E Auth Tests (full)

🔇 Additional comments (2)

docker/docker-compose.yml (2)

61-62: LGTM!

76-76: LGTM!

Walkthrough

This PR fixes read-replica race conditions affecting session APIs by introducing a replica-first resolver with writer fallback and updating run liveness detection. The core change adds resolveSessionWithWriterFallback to session resolution logic, allowing realtime routes to tolerate replica lag when looking up sessions. Run probe logic is also enhanced to explicitly re-probe the writer when the replica misses, preventing incorrect "dead run" detection. The changes resolve two behavioral issues: initial session appends no longer return 404, and session runs created just before replica updates are no longer treated as dead, preventing duplicate run triggers and chat response duplication.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description covers the problem statement, implementation details, and verification approach, but is missing the required checklist items, testing steps section, and changelog section from the template.	Add the missing template sections: checklist with contribution guide verification, explicit testing steps section, changelog section, and screenshots section if applicable.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately and concisely summarizes the main changes: fixing replica lag issues that cause duplicate session runs and 404 errors on fresh sessions.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/session-replica-races

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

150ms is useful for deliberately shaking out replica races but is an order of magnitude above typical streaming-replication lag; 20ms keeps the local replica honest by default. Override via REPLICA_APPLY_DELAY.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

chore(docker): default the dev replica apply delay to a realistic 20ms

c0ccb24

150ms is useful for deliberately shaking out replica races but is an order of magnitude above typical streaming-replication lag; 20ms keeps the local replica honest by default. Override via REPLICA_APPLY_DELAY.

ericallam marked this pull request as ready for review June 12, 2026 11:36

devin-ai-integration Bot reviewed Jun 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(webapp): stop replica lag from double-triggering session runs and 404ing fresh sessions#3914

fix(webapp): stop replica lag from double-triggering session runs and 404ing fresh sessions#3914
ericallam wants to merge 2 commits into
mainfrom
fix/session-replica-races

ericallam commented Jun 12, 2026 •

edited

Loading

Uh oh!

changeset-bot Bot commented Jun 12, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

ericallam commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Fix

Uh oh!

changeset-bot Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

coderabbitai Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

❌ Failed checks (1 warning)

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

✅ Devin Review: No Issues Found

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ericallam commented Jun 12, 2026 •

edited

Loading

changeset-bot Bot commented Jun 12, 2026 •

edited

Loading

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading