Skip to content

fix(listener): dedup AcceptKCP for handshake retransmit from same client (#340)#341

Draft
luongs3 wants to merge 1 commit into
xtaci:masterfrom
luongs3:fix/340-accept-dedup-handshake-retransmit
Draft

fix(listener): dedup AcceptKCP for handshake retransmit from same client (#340)#341
luongs3 wants to merge 1 commit into
xtaci:masterfrom
luongs3:fix/340-accept-dedup-handshake-retransmit

Conversation

@luongs3
Copy link
Copy Markdown

@luongs3 luongs3 commented May 28, 2026

Draft for direction — bug is fresh (filed today) and not maintainer-validated yet, so I'd love your read before going further.

Problem

Listener.packetInput (in sess.go) looks up l.sessions[addr.String()] under an RLock, and if the entry is missing it falls through to create a new UDPSession and writes it into the map under a separate Lock. There is a TOCTOU window between the two locks: two packets from the same remote address (e.g. a handshake retransmit, a duplicate from recvmmsg, or any case where the application invokes packetInput from more than one goroutine) can both miss on the read lock, each construct a UDPSession, and each l.chAccepts <- s. The second write into l.sessions[addr] silently overwrites the first.

Net effect, matching the symptom in #340:

  • AcceptKCP() returns two *UDPSession for the same RemoteAddr.
  • The session held in l.sessions[addr] is whichever one wrote last, so closeSession(remote) removes the map binding regardless of which UDPSession the caller closes — and from the caller's perspective that "closes" the other session too (no more packets get routed to it).

Repro

dedup_test.go (added in this PR) drives Listener.packetInput directly with a fake net.PacketConn and fires 32 copies of the same PUSH packet from the same remote address, then drains chAccepts. On master I see 2–3 sessions handed out; the assert wants exactly 1.

$ go test -race -run TestListenerDedupSameAddr -count=5 -v
# master:
    dedup_test.go:89: Listener handed out 2 sessions for a single (addr, conv); want 1 (issue #340)
    dedup_test.go:89: Listener handed out 3 sessions for a single (addr, conv); want 1 (issue #340)
    ...
# with this patch:
--- PASS: TestListenerDedupSameAddr (0.00s)  (x5)

Candidate fix

Take sessionLock.Lock() before allocating the new session and re-check the map under the write lock. If a concurrent goroutine already inserted a session for this remote, feed the incoming packet into it (when conv matches) and skip the duplicate.

Diff is ~20 lines in sess.go. Insertion order is also tightened so the session is in l.sessions before chAccepts <- s — that way any later packet for the same remote that races the accept handler can still find it.

Open questions

  1. Is the dedup key right? I keyed on addr.String() (same key already used by the map). Should the dedup also consider conv, or is the existing stale-conv reset path above (s.Close() when conv != s.kcp.conv && sn == 0) intended to be the canonical reconnect mechanism?
  2. The duplicate-packet branch silently drops the packet when the conv mismatches and sn != 0. That matches the existing behavior in the "session already exists" branch above, but I want to make sure I'm not papering over a different bug.
  3. I considered using sync.Map or a per-addr lock map but didn't want to widen the patch surface. Happy to follow whichever direction you prefer.

Reproducer is small and standalone; happy to iterate on style, structure, or rewrite the whole fix in a different direction if I read this wrong.

…ent (xtaci#340)

When two packets from the same remote address race the Listener's
session-tracking map, both can miss on the RLock lookup, each spawn a
new UDPSession, push it to chAccepts, and the second insert overwrites
the first in l.sessions. AcceptKCP then returns two UDPSession objects
for the same RemoteAddr; closing one causes the listener's map to lose
the binding, which 'kills' the other from the application's perspective.

The fix re-checks the map under the write lock before inserting. If a
concurrent goroutine already created the session for this remote, we
feed the data into it (when the conv matches) and skip the duplicate.
A regression test in dedup_test.go drives Listener.packetInput
concurrently with N=32 copies of the same handshake packet over a fake
PacketConn and asserts exactly one session is handed out.

Fixes xtaci#340

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant