Stop stack-capture UAF in ConnectionPoolImpl timeout timer#222
Merged
gophergogo merged 2 commits intomainfrom Apr 21, 2026
Merged
Stop stack-capture UAF in ConnectionPoolImpl timeout timer#222gophergogo merged 2 commits intomainfrom
gophergogo merged 2 commits intomainfrom
Conversation
added 2 commits
April 20, 2026 20:26
) ConnectionPoolImpl::createNewConnection constructed PendingConnection on the caller's stack, handed its address to the timeout-timer lambda, and only then std::move-pushed the entry onto pending_connections_. The captured pointer dangled as soon as createNewConnection returned, so the timer callback touched freed memory when it fired. Emplace an empty slot on pending_connections_ first and bind the timer capture to that list element. std::list guarantees per-element address stability across push/erase of other elements, which is the invariant the capture relies on. onConnectionTimeout previously compacted the list via std::remove_if plus list::erase. The non-member remove_if moves elements to shuffle the erased one to the end — which would move the entry whose timer lambda is currently executing on top of itself. Switch to std::list::remove_if (member) so the node is unlinked in place, and identify it by address rather than unique_ptr equality to avoid accidentally matching an unrelated entry.
Drive the pool's timeout path with real IO: bind to TEST-NET-1 (192.0.2.1:1, RFC 5737 non-routable) and set a short 150ms connection_timeout so the timer fires long before the kernel's SYN retransmit budget gives up. On the pre-fix code this segfaults in the timer callback (the lambda dereferences a destroyed stack slot); with the fix applied the dispatcher delivers onPoolFailure(Timeout) cleanly in ~150ms. Use RealIoTestBase so pool construction, newConnection, and teardown all run inside the dispatcher thread — the only context libevent and ConnectionImpl will accept for creating file events and timers.
2131456 to
5b792e3
Compare
caleb2h
approved these changes
Apr 21, 2026
gophergogo
pushed a commit
that referenced
this pull request
Apr 21, 2026
) ConnectionPoolImpl::createNewConnection constructed PendingConnection on the caller's stack, handed its address to the timeout-timer lambda, and only then std::move-pushed the entry onto pending_connections_. The captured pointer dangled as soon as createNewConnection returned, so the timer callback touched freed memory when it fired. Emplace an empty slot on pending_connections_ first and bind the timer capture to that list element. std::list guarantees per-element address stability across push/erase of other elements, which is the invariant the capture relies on. onConnectionTimeout previously compacted the list via std::remove_if plus list::erase. The non-member remove_if moves elements to shuffle the erased one to the end — which would move the entry whose timer lambda is currently executing on top of itself. Switch to std::list::remove_if (member) so the node is unlinked in place, and identify it by address rather than unique_ptr equality to avoid accidentally matching an unrelated entry.
2 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
ConnectionPoolImpl::createNewConnectionkept itsPendingConnectionon the stack and handed the address to the timeout-timer lambda beforestd::move-pushing the entry ontopending_connections_. Once the function returned, the lambda's capture dangled — so when the timer fired, the callback dereferenced freed memory. Fix: emplace an empty slot on the list first and bind the capture to the stable list-element address.onConnectionTimeoutpreviously usedstd::remove_if+list::erase, which moves elements around to compact them — including the entry whose timer lambda is currently executing on top of itself. Switched tostd::list::remove_if(member) so the node is unlinked in place, identified by address rather thanunique_ptrequality.ConnectionPoolPendingTimeoutTest.TimeoutFiresCleanlydrivesnewConnectionagainst TEST-NET-1 (192.0.2.1:1, RFC 5737) with a 150 ms pool timeout, so the timer wins the race vs. the kernel SYN retransmit budget. Against the pre-fix code it segfaults (exit 139); with the fix it deliversonPoolFailure(Timeout)in ~150 ms.Test plan
test_connection_pool_pending_timeout— new test passes with fix (152 ms)test_connection_manager_impl— passtest_connection_impl— passtest_connection_manager_real_io— pass (7/7)test_connection_pool_callbacks— pass (13/13)test_connection_manager_callback_fixes— pass (14/14)