Fix race: client receives broadcast updates before sync handshake completes#138
Open
xrl wants to merge 2 commits into
Open
Fix race: client receives broadcast updates before sync handshake completes#138xrl wants to merge 2 commits into
xrl wants to merge 2 commits into
Conversation
…completes YRoom.serve() added the client to self.clients before the initial sync exchange (SYNC_STEP1/SYNC_STEP2) finished. If the server-side document was mutated concurrently (e.g. by an MCP tool adding notebook cells), _broadcast_updates would send a SYNC_UPDATE to the un-synced client. The client-side yjs then throws "Unexpected case" in findIndexSS because the update references a struct at a clock the client doesn't have yet (always exactly one tick past the client's last known struct). The fix defers self.clients.add(channel) until after the first sync message exchange completes, so broadcast updates only reach clients that have a complete initial state. Also changes self.clients.remove() to self.clients.discard() in the finally block, since a client that disconnected before syncing was never added to the set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
saruulkhasar
approved these changes
Apr 9, 2026
| channel.path, | ||
| ) | ||
| tg.start_soon(channel.send, reply) | ||
| if not synced: |
Contributor
There was a problem hiding this comment.
I think that we are synchronized when we receive a SYNC_STEP2 message:
Suggested change
| if not synced: | |
| if not synced and message[1] == YSyncMessageType.SYNC_STEP2: |
Contributor
There was a problem hiding this comment.
Sorry, that's not correct. The client is synchronized when the server sends a SYNC_STEP2 after receiving the client's SYNC_STEP1, so we don't need the synced variable:
Suggested change
| if not synced: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
YRoom.serve()adds the client toself.clientsimmediately upon connection — before the SYNC_STEP1/SYNC_STEP2 handshake finishes. If the server-side document is mutated concurrently,_broadcast_updatessends aSYNC_UPDATEto the un-synced client. The client-side yjs then throws "Unexpected case" infindIndexSSbecause the update references a struct at a clock the client doesn't have yet.How to trigger
This happens reliably in JupyterLab with jupyter-collaboration when an AI agent (via MCP tools) rapidly adds cells to a notebook while a new browser tab opens the same notebook. The new tab's WebSocket connects and starts the sync handshake, but before SYNC_STEP2 is received, a broadcast update arrives referencing a struct one clock tick past what the client has.
Debug data from instrumented yjs findIndexSS
The client has structs through clock 1222. The server sent an update referencing clock 1223. Always exactly one tick past the end — one broadcast update leaked through before the sync completed.
The race
Fix
Defer
self.clients.add(channel)until after the first sync message exchange completes. The client only joins the broadcast list once it has a complete initial state.Also changes
self.clients.remove()toself.clients.discard()in thefinallyblock, since a client that disconnects before syncing was never added to the set.Test
Added
test_broadcast_not_sent_before_sync_completeswhich:Without the fix, the client would crash on "Unexpected case" during step 3.