fix: remove await from afterSaveMessage callbacks execution#37503
fix: remove await from afterSaveMessage callbacks execution#37503
Conversation
|
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Looks like this PR is not ready to merge, because of the following issues:
Please fix the issues and try again If you have any trouble, please check the PR guidelines |
d9fe526 to
fd84604
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #37503 +/- ##
===========================================
- Coverage 69.81% 69.80% -0.01%
===========================================
Files 3296 3296
Lines 119173 119173
Branches 21479 21503 +24
===========================================
- Hits 83197 83193 -4
- Misses 32668 32672 +4
Partials 3308 3308
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
|
@copilot based on the errors in this workflow run, document the blast radius of the proposed change and create a plan to adapt the data flows to accept the |
Blast Radius of
|
| Priority | Callback ID | File | What it does | Fire-and-forget safe? |
|---|---|---|---|---|
| HIGH | native-federation-after-room-message-sent |
ee/server/hooks/federation/index.ts |
Sends message to Matrix via FederationMatrix.sendMessage |
|
| HIGH | federation-matrix-after-room-message-updated |
ee/server/hooks/federation/index.ts |
Sends message edits to Matrix | |
| MEDIUM | notifyUsersOnMessage |
app/lib/server/lib/notifyUsersOnMessage.ts |
Updates user subscriptions (unread counts, highlights, mention increments) via roomUpdater |
❌ State mutation; many tests read subscription state immediately after send |
| MEDIUM | engagementDashboard.afterSaveMessage |
ee/server/lib/engagementDashboard/startup.ts |
Increments engagement analytics counters | ✅ Analytics; eventual consistency is fine |
| MEDIUM | autotranslate |
app/autotranslate/server/autotranslate.ts |
Auto-translates messages via external provider | ✅ Can be eventually consistent |
| MEDIUM | message-read-receipt-afterSaveMessage |
ee/app/message-read-receipt/server/hooks/afterSaveMessage.ts |
Marks message as read by its sender | ✅ Can be eventually consistent |
| MEDIUM | after-omnichannel-save-message |
app/livechat/server/hooks/afterSaveOmnichannelMessage.ts |
Runs the full afterOmnichannelSaveMessage callback chain (livechat room updates) |
❌ State mutation; livechat API tests check room state right after send |
| MEDIUM | search-events |
app/search/server/events/index.ts |
Emits search index update event | ✅ Can be eventually consistent |
| LOW | sendNotificationsOnMessage |
app/lib/server/lib/sendNotificationsOnMessage.ts |
Sends push/email notifications | ✅ Inherently async |
| LOW | SlackBridge_Out |
app/slackbridge/server/RocketAdapter.ts |
Relays message to Slack | ✅ External call; already effectively async |
| LOW | integrations-sendMessage |
app/integrations/server/triggers.ts |
Triggers outgoing webhook integrations | ✅ External call; can be eventually consistent |
| LOW | threads-after-save-message |
app/threads/server/hooks/aftersavemessage.ts |
Updates thread metadata (tcount, replies, tlm), thread subscriptions, sends thread notifications |
❌ State mutation; many API tests check tcount / parent message state immediately after posting a reply |
| LOW | PropagateDiscussionMetadata |
app/discussion/server/hooks/propagateDiscussionMetadata.ts |
Updates parent room's linked system-message metadata | ❌ State mutation; discussion API tests check the parent message right after posting |
| LOW | mentionUserNotInChannel |
app/lib/server/startup/mentionUserNotInChannel.ts |
Validates @mentions against channel membership | ✅ Post-send validation; can be async |
| — | EmailInbox (no ID) |
server/features/EmailInbox/EmailInbox_Outgoing.ts |
Sends email reply via inbox | ✅ External; can be async |
Additionally, after the callback chain runs, afterSaveMessage calls:
Rooms.updateFromUpdater(...)— applies any changes accumulated inroomUpdater(e.g. roomlm,usersCount). Now happens asynchronously.Message.afterSave({ message: data })— notifies core services of message save. Now happens asynchronously.
Why the CI tests fail
Test API (CE & EE) / Test API Livechat (CE & EE)
These Mocha API tests follow the pattern:
- POST
chat.sendMessage(or livechat equivalent) - Immediately GET the parent message / room / subscription
The failing assertions are things like:
- Thread parent message has
tcount > 0— set byprocessThreads()inthreads-after-save-message - Discussion parent message has updated
msgs/lm— set byPropagateDiscussionMetadata - Livechat room has updated metadata — set by
after-omnichannel-save-message - User subscription unread count incremented — set by
notifyUsersOnMessage
With fire-and-forget, these DB writes happen in a later event-loop turn (via setTimeout(0) in runAsync, or in a floating Promise for the void case), so the GET immediately after POST can race with the callback.
Test Federation Matrix
Federation tests send a message then poll Matrix's /_matrix/client/v3/sync to verify it arrived. The native-federation-after-room-message-sent callback now starts after sendMessage returns, so the Matrix side sees the message slightly later. This is a timing race that can flip flaky at different VM speeds.
Test UI (CE/EE) shards 2
Playwright tests that send a message in a thread or discussion and then assert on thread counts / parent message states in the UI face the same timing issue — the DB isn't updated by the time the UI re-renders.
Plan to adapt data flows
The goal is to keep afterSaveMessage fire-and-forget in sendMessage while ensuring correctness. There are two tracks:
Track 1 — Code: split "must complete before return" from "can be deferred"
The cleanest approach is to stop using a single afterSaveMessage chain for everything and instead split it into two named hooks:
afterSaveMessageSync— run withawaitinsidesendMessageandupdateMessage. Only state-mutating callbacks live here.afterSaveMessage— run withvoid(orcallbacks.runAsync). Only side-effect callbacks live here.
Alternatively, keep one hook but change each callback's registration point:
| Callback | Action |
|---|---|
notifyUsersOnMessage |
Move to synchronous execution (run it directly in sendMessage before returning, or keep in a sync hook) |
after-omnichannel-save-message |
Same — move to synchronous, or call afterOmnichannelSaveMessage directly in the livechat send path before returning |
threads-after-save-message |
Move thread metadata updates (reply(), Messages.setThreadMessagesCount) to be awaited before sendMessage returns; only thread notifications can be fire-and-forget |
PropagateDiscussionMetadata |
Move updateAndNotifyParentRoomWithParentMessage to be awaited before sendMessage returns |
engagementDashboard.afterSaveMessage |
No change — keep async |
autotranslate |
No change — keep async |
message-read-receipt-afterSaveMessage |
No change — keep async |
search-events |
No change — keep async |
sendNotificationsOnMessage |
No change — keep async |
SlackBridge_Out |
No change — keep async |
integrations-sendMessage |
No change — keep async |
native-federation-* |
Keep async; tests need to add explicit waits (already done in most Federation test utils) |
EmailInbox |
No change — keep async |
Also: updateMessage.ts still uses await afterSaveMessage(...) — it should be made consistent with whatever approach is chosen for sendMessage.
Track 2 — Tests: add retry/polling where assertions race with async callbacks
For any test that cannot wait for a synchronous callback (e.g., federation), replace direct-assert-after-send patterns with a poll helper. Example (pseudo-code):
// Before
await sendMessage(...);
const parent = await getParentMessage();
expect(parent.tcount).to.equal(1); // can race
// After
await sendMessage(...);
const parent = await poll(() => getParentMessage(), { until: (m) => m.tcount === 1 });
expect(parent.tcount).to.equal(1);A poll utility (exponential backoff, configurable timeout) should be introduced in the test helpers if not already available.
Immediate next steps (prioritised)
- Decide scope of the sync hook — confirm which callbacks must be synchronous (proposal above). The minimum is
notifyUsersOnMessage, thread metadata inprocessThreads,PropagateDiscussionMetadata, andafter-omnichannel-save-message. - Refactor those 4 callbacks out of the async chain into direct awaited calls in
sendMessage/ livechat send paths. - Update
updateMessage.tsto match (either both sync or both async). - Add polling helpers to the test framework for assertions that are inherently eventually-consistent (federation, notification delivery).
- Re-run CI and verify the remaining failures are only in the inherently-async categories.
Proposed changes (including videos or screenshots)
Issue(s)
Steps to test or reproduce
Further comments