net/tcp: close write-worker fd on async write error (fix #3711)#3874
Open
dondetir wants to merge 1 commit intoOpenSIPS:3.4from
Open
net/tcp: close write-worker fd on async write error (fix #3711)#3874dondetir wants to merge 1 commit intoOpenSIPS:3.4from
dondetir wants to merge 1 commit intoOpenSIPS:3.4from
Conversation
In the IO_WATCH_WRITE handler, when the stream write handler returns <0 (e.g. TLS async write fails after the 2s handshake timeout), the error branch breaks out of the switch before reaching the close(s) call that the success/pending paths hit at line 327. The fd s is delivered to the write worker via SCM_RIGHTS and is an independent copy in the worker's fd table — it must be explicitly closed on every exit path. tcpconn_release_error() signals main to close main's copy; close(s) closes the worker's copy. Without it one fd leaks per failed async write, confirmed: 7 leaked fds from 51 TLS OPTIONS (handshake timeout path) vs. 0 with the fix. Cherry-pick of bffcf00 from master. Fixes: OpenSIPS#3711
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Issue #3711 reports fd leaks when TLS connections fail to establish. PR #3634 fixed the immediate failure path (n < 0 — cert verify error returned synchronously), but a second path still leaks: the async handshake timeout path (n = 0).
Root cause
When tls_async_connect returns 0 (handshake timer fired), the connection is queued via ASYNC_WRITE_GENW and handed to the TCP write worker. The worker receives fd s via SCM_RIGHTS — an independent fd copy — then calls tls_async_write, which fails and returns -1.
In handle_io's IO_WATCH_WRITE handler (net/net_tcp_proc.c), the resp < 0 branch breaks before reaching the close(s) that success/pending paths hit:
tcpconn_release_error(con, 1,"Write error");
break; // ← jumps past close(s) below
// ...
close(s); // only success/pending paths reach here
tcpconn_release_error() signals main to close main's copy. The worker's copy s is never closed → one fd leaked per failed async write.
Fix
Single line, cherry-pick of master commit bffcf00:
tcpconn_release_error(con, 1,"Write error");
close(s); /* we always close the socket received for writing */
break;
Verification on 3.4.14
Mock server accepts TCP but withholds TLS data (forces tls_async_handshake_timeout to fire):
A server presenting an untrusted cert (immediate SSL error) hits the n < 0 path covered by #3634 and does not reproduce this leak. The mock server must hold TCP open without completing TLS to trigger the timeout path.
Scope : Master already has this via bffcf00. Backport to 3.4. Same fix likely needed in 3.5.x/3.6.x.