CBL-8071: Do not retry replicator connections if WebSocket being closed abnormally by jianminzhao · Pull Request #2475 · couchbase/couchbase-lite-core

jianminzhao · 2026-04-28T22:47:50Z

The Abnormal Close occurs after the client is already decided to close the socket but fails to receive response from the server. We don't want to retry replicating on this condition.

We fix it by a change in WebSocket, that it now treats a connection closure that is not fully complete only due to a pending final server acknowledgement (i.e. CLOSE frame was already sent).

…ed abnormally The Abnormal Close occurs after the client is already decided to close the socket but fails to receive response from the server. We don't want to retry replicating on this condition. We fix it by a change in WebSocket, that it now treats a connection closure that is not fully complete only due to a pending final server acknowledgement (i.e. CLOSE frame was already sent).

github-actions · 2026-04-28T22:48:00Z

This is a release branch and commits are restricted.

Please confirm this PR is one of the following:

A response to a customer ask
A change per our security policy
A non-functional change (i.e. changes needed for building an older version)
A change that has been granted an exception (please comment)

cbl-bot · 2026-04-28T23:02:46Z

Code Coverage Results:

Type	Percentage
branches	67.9
functions	79.29
instantiations	35.13
lines	79.38
regions	75.22

Copilot

Pull request overview

Adjusts WebSocket close-status mapping to avoid treating an in-progress client-initiated close (CLOSE sent, no peer ACK) as an “abnormal” close that would trigger replicator retry behavior.

Changes:

Update WebSocketImpl::onClose to report kCodeNormal (instead of kCodeAbnormal) when the close was initiated by the client (_closeSent) but not fully acknowledged (!_closeReceived) and the underlying disconnect is otherwise considered clean.
Add a replicator API test covering “peer going away” both before CLOSE is sent and after CLOSE is sent.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
Replicator/tests/ReplicatorAPITest.cc	Adds a regression test for peer “going away” behavior before/after a client-initiated close.
Networking/WebSockets/WebSocketImpl.cc	Changes close code mapping to avoid reporting abnormal close when CLOSE was already sent by the client.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-29T01:12:11Z

+                        // if (!expected) then _closeSent => !_closeReceived
+                        if ( !expected ) status.code = _closeSent ? kCodeNormal : kCodeAbnormal;
                        else if ( !_closeMessage )


The inline comment is logically incorrect: !expected does not imply _closeSent => !_closeReceived (it can also mean neither CLOSE was sent nor received). Consider updating the comment and making the intent explicit in code (e.g., branch specifically on _closeSent && !_closeReceived for the "client already initiated close" case, otherwise keep abnormal).

Suggested change

// if (!expected) then _closeSent => !_closeReceived

if ( !expected ) status.code = _closeSent ? kCodeNormal : kCodeAbnormal;

else if ( !_closeMessage )

if ( !expected ) {

// Treat only the "client already initiated close, but no CLOSE was received"

// case as normal; all other unexpected close states remain abnormal.

status.code = (_closeSent && !_closeReceived) ? kCodeNormal : kCodeAbnormal;

} else if ( !_closeMessage )

I agreed with the copilot comment that the comment is a bit confusing. I have also mistakenly think that it's an obsolute code comment that got left behind.

Copilot · 2026-04-29T01:12:11Z

+        // Give some time for Replicator::_stop to be called, but before timeout in WebSocketImpl
+        // to not get Timeout error.
+        std::this_thread::sleep_for(1s);
+        // WebSocket will treat it as Normal Close
+        c4socket_closed(c4socket, {WebSocketDomain, websocket::kCodeGoingAway});


The test relies on a fixed sleep_for(1s) to ensure the CLOSE frame has been sent before calling c4socket_closed. This is timing-dependent and can be flaky on slow/loaded CI machines. Prefer synchronizing on an observable event (e.g., have factory.write signal when it sees the CLOSE write after c4repl_stop, then trigger c4socket_closed), or poll with a bounded timeout until the expected write happens.

I could agree with this comment. For a quick mitigate, I think we could bump the timeout a little bit. However, if we are not seeing the test to be flaky anywhere, we can leave it as is.

pasin · 2026-04-29T23:44:23Z

+        // Give some time for Replicator::_stop to be called, but before timeout in WebSocketImpl
+        // to not get Timeout error.
+        std::this_thread::sleep_for(1s);
+        // WebSocket will treat it as Normal Close
+        c4socket_closed(c4socket, {WebSocketDomain, websocket::kCodeGoingAway});


I could agree with this comment. For a quick mitigate, I think we could bump the timeout a little bit. However, if we are not seeing the test to be flaky anywhere, we can leave it as is.

pasin · 2026-04-29T23:45:42Z

+                        // if (!expected) then _closeSent => !_closeReceived
+                        if ( !expected ) status.code = _closeSent ? kCodeNormal : kCodeAbnormal;
                        else if ( !_closeMessage )


I agreed with the copilot comment that the comment is a bit confusing. I have also mistakenly think that it's an obsolute code comment that got left behind.

github-actions · 2026-04-30T01:00:28Z

This is a release branch and commits are restricted.

Please confirm this PR is one of the following:

A response to a customer ask
A change per our security policy
A non-functional change (i.e. changes needed for building an older version)
A change that has been granted an exception (please comment)

jianminzhao requested review from Copilot and pasin April 29, 2026 01:06

Copilot started reviewing on behalf of jianminzhao April 29, 2026 01:07 View session

Copilot AI reviewed Apr 29, 2026

View reviewed changes

pasin approved these changes Apr 29, 2026

View reviewed changes

pasin changed the title ~~CBL-8071: Do not retry replicator connections if WebSocket being clos…~~ CBL-8071: Do not retry replicator connections if WebSocket being closed abnormally Apr 29, 2026

addressing review comment.

37d3840

jianminzhao merged commit c8b2b18 into release/3.2 Apr 30, 2026
9 of 10 checks passed

jianminzhao deleted the cbl-8071 branch April 30, 2026 01:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

CBL-8071: Do not retry replicator connections if WebSocket being closed abnormally#2475

CBL-8071: Do not retry replicator connections if WebSocket being closed abnormally#2475
jianminzhao merged 2 commits into
release/3.2from
cbl-8071

jianminzhao commented Apr 28, 2026 •

edited by pasin

Loading

Uh oh!

github-actions Bot commented Apr 28, 2026 •

edited by pasin

Loading

Uh oh!

cbl-bot commented Apr 28, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 29, 2026

Uh oh!

pasin Apr 29, 2026

Uh oh!

Copilot AI Apr 29, 2026

Uh oh!

pasin Apr 29, 2026

Uh oh!

pasin Apr 29, 2026

Uh oh!

pasin Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

-                        // if (!expected) then _closeSent => !_closeReceived
-                        if ( !expected ) status.code = _closeSent ? kCodeNormal : kCodeAbnormal;
-                        else if ( !_closeMessage )
+                        if ( !expected ) {
+                            // Treat only the "client already initiated close, but no CLOSE was received"
+                            // case as normal; all other unexpected close states remain abnormal.
+                            status.code = (_closeSent && !_closeReceived) ? kCodeNormal : kCodeAbnormal;
+                        } else if ( !_closeMessage )

Uh oh!

Conversation

jianminzhao commented Apr 28, 2026 • edited by pasin Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Apr 28, 2026 • edited by pasin Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cbl-bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

pasin Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

pasin Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

pasin Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

pasin Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Apr 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jianminzhao commented Apr 28, 2026 •

edited by pasin

Loading

github-actions Bot commented Apr 28, 2026 •

edited by pasin

Loading

cbl-bot commented Apr 28, 2026 •

edited

Loading