Skip to content

network: propagate peer full close as RemoteClose for user-space io_handle when half-close is enabled#45198

Open
PetrMc wants to merge 1 commit into
envoyproxy:mainfrom
PetrMc:petrmc/internal-listener-peer-destroyed
Open

network: propagate peer full close as RemoteClose for user-space io_handle when half-close is enabled#45198
PetrMc wants to merge 1 commit into
envoyproxy:mainfrom
PetrMc:petrmc/internal-listener-peer-destroyed

Conversation

@PetrMc
Copy link
Copy Markdown

@PetrMc PetrMc commented May 20, 2026

Commit Message

network: propagate peer full close as RemoteClose for user-space io_handle when half-close is enabled

Description

Bug

In tcp_proxy tunneling mode (HTTP CONNECT) routed through an internal_listener, upstream tunneled streams are not reset when the downstream side fully closes. They remain pinned until the upstream idle timeout fires (envoy default 1 hour), accumulating across requests. This topology is used by Istio ambient mesh on the gateway side; the bug was originally reported and reproduced there (see istio/istio#60074).

The downstream upstream side correctly detects the close at the user-space IoHandle level (the peer io_handle is destroyed, peer_handle_ is nulled, setEof() and onPeerDestroy() both fire). But tcp_proxy enables half-close semantics on its downstream connection via enableHalfClose(true), and ConnectionImpl is configured to ignore the Closed file event in that mode (source/common/network/connection_impl.cc:215). A subsequent read returning EOF becomes a half-close, not a RemoteClose. The tcp_proxy filter keeps the upstream CONNECT stream open waiting for response data while downstream is fully gone.

For real OS sockets this is fine: kernel signals like EPIPE on write or RST detection in doRead produce the proper RemoteClose event. For user-space IoHandle pairs there is no kernel; the only signal of "peer fully gone" is onPeerDestroy() setting peer_handle_ to nullptr.

Fix

Add a new Network::IoHandle::wasPeerFullyClosed() method that:

  • Returns false by default. Real OS socket implementations inherit this; behavior unchanged.
  • User-space IoHandleImpl overrides to return a new peer_fully_closed_ bool, set inside onPeerDestroy().

In ConnectionImpl::onReadReady(), after the existing half-close translation, check whether enable_half_close_ && result.end_stream_read_ && ioHandle().wasPeerFullyClosed(). If so, treat the event as RemoteClose and set DetectedCloseType::RemoteReset, just like the RST handling above. tcp_proxy's onDownstreamEvent then propagates this to HttpUpstream::onDownstreamEvent → resetEncoder → resetStream, producing the expected RST_STREAM frame upstream.

A peer shutdown(WR) (legitimate half-close, peer keeps its io_handle alive) only triggers setEof(), not onPeerDestroy(). peer_fully_closed_ stays false. Half-close semantics are preserved.

Risk and behavior change

  • Real OS sockets: zero change. wasPeerFullyClosed() returns false; the new branch never fires.
  • User-space IoHandle users with half-close disabled: zero change. Existing full-close path (!enable_half_close_ && end_stream_read_) already converts EOF to close.
  • User-space IoHandle users with half-close enabled AND a legitimate half-close pattern: half-close still works. Only full peer destroy triggers the new path.
  • User-space IoHandle users with half-close enabled AND full peer close (tcp_proxy with tunneling_config over internal_listener, possibly other patterns): the connection now fires RemoteClose immediately, releasing upstream tunneled streams. This is the fix.

Guarded behind envoy.reloadable_features.internal_listener_peer_destroyed_propagation, default on. Set to false to restore legacy behavior.

Reproducing the bug

Set up a tcp_proxy listener with tunneling_config whose upstream is reached via an internal_listener, with half-close enabled on the tcp_proxy downstream (the default). Open multiple downstream connections that get the upstream tunnel established, then close the downstream side. Observe upstream tunneled streams pin until the upstream idle timeout (default 1 hour) clears them.

The unit tests in this PR exercise the connection-level half-close translation directly. A real-world reproduction using Istio ambient is documented at istio/istio#60074.

Risk Level

Low. New behavior is gated behind a runtime feature flag. Affects only the user-space IoHandle half-close-enabled path. Real OS sockets unchanged. Legitimate half-close clients through internal_listener unchanged.

Testing

Unit tests (test/extensions/io_socket/user_space/io_handle_impl_test.cc): verify wasPeerFullyClosed() returns false by default, stays false after peer shutdown(WR), becomes true after peer close().

Connection-level tests (test/extensions/io_socket/user_space/connection_compatbility_test.cc):

  • HalfCloseEnabledPeerFullClosePropagatesRemoteClose: with feature on, full peer close fires RemoteClose.
  • HalfCloseEnabledPeerFullCloseLegacyBehavior: with feature off, legacy behavior preserved.
  • HalfCloseEnabledPeerShutdownWritePreservesHalfClose: with feature on, peer shutdown(WR) is still observed as half-close.

Docs Changes

Release note in changelogs/current/bug_fixes/tcp_proxy__fix-stream-leak-on-internal-listener-peer-full-close.rst.

Release Notes

Added.

Platform Specific Features

N/A. Pure user-space path.

Runtime guard

envoy.reloadable_features.internal_listener_peer_destroyed_propagation

…andle when half-close is enabled

Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
@PetrMc PetrMc requested a review from kyessenov as a code owner May 20, 2026 20:24
@repokitteh-read-only
Copy link
Copy Markdown

Hi @PetrMc, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #45198 was opened by PetrMc.

see: more, trace.

@PetrMc PetrMc requested a deployment to external-contributors May 20, 2026 20:24 — with GitHub Actions Waiting
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).

🐱

Caused by: #45198 was opened by PetrMc.

see: more, trace.

@NetRanger-222
Copy link
Copy Markdown

1.36 branch also needs to be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants