| 1 | +# Fix QUIC Full Flow |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +Three bugs prevent QUIC/HTTP/3 from working end-to-end in production. The approval flow works (the Telegram prompt appears and the user approves), but the data never completes the round trip. |
| 6 | + |
| 7 | +## Context |
| 8 | + |
| 9 | +- UDP dispatch loop: `internal/proxy/server.go` (handleAssociate at ~line 1459, QUIC dispatch at ~line 1590) |
| 10 | +- QUIC proxy: `internal/proxy/quic.go` (QUICProxy, handles TLS termination and HTTP/3) |
| 11 | +- Policy engine: `internal/policy/engine.go` (EvaluateQUICDetailed) |
| 12 | +- QUIC packet detection: `internal/proxy/protocol.go` (IsQUICPacket) |
| 13 | +- Response relay: `internal/proxy/server.go:relayQUICResponses` |
| 14 | +- DNS interceptor reverse cache: `internal/proxy/dns.go` (ReverseLookup for IP -> hostname) |
| 15 | +- Existing SNI extraction: `internal/proxy/sni.go` (works on raw TLS records, not QUIC) |
| 16 | + |
| 17 | +## Development Approach |
| 18 | + |
| 19 | +- **Testing approach**: Regular (code first, then tests) |
| 20 | +- Complete each task fully before moving to the next |
| 21 | +- CRITICAL: every task MUST include new/updated tests |
| 22 | +- CRITICAL: all tests must pass before starting next task |
| 23 | +- CRITICAL: update this plan file when scope changes during implementation |
| 24 | +- Run tests after each change |
| 25 | +- Uses gofumpt for Go formatting |
| 26 | +- Deploy to knuth after each fix and test with quictest binary |
| 27 | + |
| 28 | +## Testing Strategy |
| 29 | + |
| 30 | +- **Unit tests**: test hostname recovery, pending session dedup, relay forwarding |
| 31 | +- **Production test**: quictest binary on knuth (full tun2proxy -> sluice -> upstream chain) |
| 32 | + |
| 33 | +## Solution Overview |
| 34 | + |
| 35 | +1. **Hostname recovery via DNS reverse cache** (not QUIC packet parsing). QUIC Initial packets encrypt the TLS ClientHello (RFC 9001 Section 5.2), so extracting SNI requires decrypting the packet with Initial keys derived from the client's Destination Connection ID. That path is complex and fragile. Since tun2proxy resolves DNS before sending UDP, sluice's DNS interceptor already has the IP -> hostname mapping in its reverse cache. Use that as the primary strategy. |
| 36 | + |
| 37 | +2. **Pending session dedup with bounded buffer**. Before calling `resolveQUICPolicy` (which blocks on the broker), check whether an approval is already pending for this session key. If so, buffer the packet instead of issuing another broker request, up to 32 packets per session. When the approval resolves, flush the buffer if allowed or discard it if denied. |
| 38 | + |
| 39 | +3. **Response relay fix**. The QUIC proxy's `quic-go` listener receives the Initial packets forwarded from the `upstream` PacketConn and sends responses through its own listener socket back to the `upstream` address. `relayQUICResponses` reads from `upstream`, so it should receive these responses. The open question to investigate: does `quic-go` actually send responses back to the `upstream.LocalAddr()` that forwarded the Initial, or does it send to the original source address from the QUIC packet header? |
| 40 | + |
| 41 | +## Technical Details |
| 42 | + |
| 43 | +**Hostname recovery flow:** |
| 44 | +``` |
| 45 | +1. QUIC packet arrives at dispatch: dest = "104.16.132.229", port = 443 |
| 46 | +2. Call dnsInterceptor.ReverseLookup("104.16.132.229") -> "cloudflare.com" |
| 47 | +3. Use "cloudflare.com" for policy eval and approval message |
| 48 | +4. Fall back to raw IP if reverse lookup misses |
| 49 | +``` |
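
The flow above can be sketched in Go roughly as follows. This is a minimal illustration, not the actual sluice code: `reverseCache`, `Record`, and `policyDest` are hypothetical names, and only the `ReverseLookup(ip string) (hostname string, ok bool)` signature comes from this plan. The real DNS interceptor may store entries differently (for example with TTLs or multiple hostnames per IP).

```go
package main

import "sync"

// reverseCache is a hypothetical stand-in for the DNS interceptor's
// IP -> hostname cache; the real type in internal/proxy/dns.go may differ.
type reverseCache struct {
	mu    sync.RWMutex
	hosts map[string]string // IP string -> most recently resolved hostname
}

func newReverseCache() *reverseCache {
	return &reverseCache{hosts: make(map[string]string)}
}

// Record stores a mapping observed while handling a DNS answer.
func (c *reverseCache) Record(ip, hostname string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.hosts[ip] = hostname
}

// ReverseLookup uses the signature proposed in Task 1.
func (c *reverseCache) ReverseLookup(ip string) (hostname string, ok bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	hostname, ok = c.hosts[ip]
	return hostname, ok
}

// policyDest returns the name to use for policy evaluation and the approval
// prompt: the recovered hostname on a cache hit, the raw IP on a miss.
func policyDest(c *reverseCache, ip string) string {
	if host, ok := c.ReverseLookup(ip); ok {
		return host
	}
	return ip
}
```

In the dispatch loop, the result of `policyDest` would feed both the session key and the policy call, so a hit and a miss follow the same code path.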
| 50 | + |
| 51 | +**Pending session dedup:** |
| 52 | +``` |
| 53 | +pendingQUICSessions map[string]*pendingQUICSession |
| 54 | + |
| 55 | +type pendingQUICSession struct { |
| 56 | + mu sync.Mutex |
| 57 | + packets [][]byte // buffered payloads (max 32) |
| 58 | + done chan struct{} // closed when approval resolves |
| 59 | + allowed bool // true if approved, false if denied |
| 60 | +} |
| 61 | +``` |
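
A minimal sketch of how the buffering and resolution around this struct could work, assuming a single table keyed by session key. `pendingTable`, `enqueue`, `resolve`, and `maxPendingPackets` are hypothetical names introduced for illustration; only the struct fields and the 32-packet cap come from this plan. Locks are always taken in the order table, then session, to avoid losing a packet that arrives while the approval resolves.

```go
package main

import "sync"

// maxPendingPackets is the per-session cap from the plan.
const maxPendingPackets = 32

type pendingQUICSession struct {
	mu      sync.Mutex
	packets [][]byte      // buffered payloads (max 32)
	done    chan struct{} // closed when approval resolves
	allowed bool          // true if approved, false if denied
}

// pendingTable is a hypothetical wrapper around the pendingQUICSessions map.
type pendingTable struct {
	mu       sync.Mutex
	sessions map[string]*pendingQUICSession
}

func newPendingTable() *pendingTable {
	return &pendingTable{sessions: make(map[string]*pendingQUICSession)}
}

// enqueue buffers payload for key. first is true when this call created the
// pending entry, meaning the caller should issue the one broker request.
// buffered is false when the packet was dropped because the buffer is full.
func (t *pendingTable) enqueue(key string, payload []byte) (first, buffered bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	p, exists := t.sessions[key]
	if !exists {
		p = &pendingQUICSession{done: make(chan struct{})}
		t.sessions[key] = p
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.packets) >= maxPendingPackets {
		return !exists, false
	}
	// Copy the payload: the caller's read buffer is typically reused.
	p.packets = append(p.packets, append([]byte(nil), payload...))
	return !exists, true
}

// resolve removes the pending entry, wakes any waiters on done, and returns
// the packets to flush (nil when denied or when no entry exists).
func (t *pendingTable) resolve(key string, allowed bool) [][]byte {
	t.mu.Lock()
	defer t.mu.Unlock()
	p, ok := t.sessions[key]
	if !ok {
		return nil
	}
	delete(t.sessions, key)
	p.mu.Lock()
	defer p.mu.Unlock()
	p.allowed = allowed
	close(p.done)
	flushed := p.packets
	p.packets = nil
	if !allowed {
		return nil
	}
	return flushed
}
```

Only the caller that sees `first == true` would go on to call `resolveQUICPolicy`; every other packet for the same key is buffered or dropped without touching the broker.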
| 62 | + |
| 63 | +**Response relay architecture:** |
| 64 | +``` |
| 65 | +Client -> tun2proxy -> SOCKS5 UDP ASSOCIATE -> bindLn |
| 66 | + -> dispatch loop reads from bindLn |
| 67 | + -> sess.upstream.WriteTo(payload, quicAddr) // forward to QUIC proxy |
| 68 | + -> QUIC proxy processes, sends response |
| 69 | + -> relayQUICResponses reads from upstream, writes to bindLn |
| 70 | + -> tun2proxy receives response, forwards to client |
| 71 | +``` |
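
The relay path above depends on responses arriving on the same `upstream` socket that forwarded the packet. The sketch below checks that assumption at the plain UDP level using only the standard `net` package. It does not exercise `quic-go`; `replyOnce` stands in for any listener that answers to the source address returned by `ReadFrom`, and all function names are hypothetical. QUIC headers carry connection IDs rather than IP addresses (RFC 9000 Section 17), so a reply addressed this way is the expected behavior, but that still needs to be confirmed against `quic-go` itself.

```go
package main

import (
	"net"
	"time"
)

// listenLoopback opens a UDP socket on an ephemeral loopback port.
func listenLoopback() (net.PacketConn, error) {
	return net.ListenPacket("udp", "127.0.0.1:0")
}

// replyOnce reads one datagram and echoes it to the datagram's source
// address, which is how a UDP listener normally addresses its responses.
func replyOnce(pc net.PacketConn) error {
	buf := make([]byte, 1500)
	n, src, err := pc.ReadFrom(buf)
	if err != nil {
		return err
	}
	_, err = pc.WriteTo(buf[:n], src)
	return err
}

// forwardAndRead mimics the dispatch loop plus relay: it writes payload from
// the upstream socket to proxyAddr, then waits for the response on that same
// socket. It returns the response and the address it came from.
func forwardAndRead(upstream net.PacketConn, proxyAddr net.Addr, payload []byte) ([]byte, net.Addr, error) {
	if _, err := upstream.WriteTo(payload, proxyAddr); err != nil {
		return nil, nil, err
	}
	if err := upstream.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return nil, nil, err
	}
	buf := make([]byte, 1500)
	n, from, err := upstream.ReadFrom(buf)
	if err != nil {
		return nil, nil, err
	}
	return buf[:n], from, nil
}
```

If the same check against the real QUIC proxy times out in `forwardAndRead`, that would point at the response being sent somewhere other than the forwarding socket.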
| 72 | + |
| 73 | +## Implementation Steps |
| 74 | + |
| 75 | +### Task 1: Recover hostname from DNS reverse cache |
| 76 | + |
| 77 | +**Files:** |
| 78 | +- Modify: `internal/proxy/server.go` |
| 79 | +- Modify: `internal/proxy/dns.go` (if ReverseLookup doesn't exist, add it) |
| 80 | +- Modify: `internal/proxy/server_test.go` or create `internal/proxy/dns_test.go` |
| 81 | + |
| 82 | +- [ ] Add `ReverseLookup(ip string) (hostname string, ok bool)` to the DNS interceptor if it doesn't exist (check the reverse cache that's populated during DNS query handling) |
| 83 | +- [ ] In the UDP dispatch loop, after `IsQUICPacket` returns true, call `dnsInterceptor.ReverseLookup(dest)`. If hostname found, replace `dest` with it for both `sessionKey` and `resolveQUICPolicy` |
| 84 | +- [ ] Update the approval message: when hostname is recovered, the Telegram prompt shows `cloudflare.com:443` instead of `104.16.132.229:443` |
| 85 | +- [ ] Write tests: reverse lookup hit replaces IP, reverse lookup miss keeps IP, hostname used in session key |
| 86 | +- [ ] Run tests |
| 87 | + |
| 88 | +### Task 2: Deduplicate broker requests with bounded buffer |
| 89 | + |
| 90 | +**Files:** |
| 91 | +- Modify: `internal/proxy/server.go` |
| 92 | +- Modify: `internal/proxy/server_test.go` |
| 93 | + |
| 94 | +- [ ] Add `pendingQUICSessions` map (mutex-protected) to track in-flight approvals |
| 95 | +- [ ] Before calling `resolveQUICPolicy`, check if sessionKey is pending. If so, buffer the payload (max 32 packets, drop beyond). Skip broker call. |
| 96 | +- [ ] When approval resolves: if allowed, create session, flush buffered payloads through it, start relay goroutine. If denied, discard buffer. |
| 97 | +- [ ] Remove pending entry after resolution (both allow and deny paths) |
| 98 | +- [ ] Write tests: concurrent packets to same dest trigger one broker request, buffer overflow drops packets, denied approval discards buffer |
| 99 | +- [ ] Run tests |
| 100 | + |
| 101 | +### Task 3: Fix response relay path |
| 102 | + |
| 103 | +**Files:** |
| 104 | +- Modify: `internal/proxy/server.go` (relayQUICResponses) |
| 105 | +- Modify: `internal/proxy/quic.go` (if response routing is wrong) |
| 106 | + |
| 107 | +- [ ] Verify that quic-go's listener sends responses to the address that forwarded the Initial packet (upstream.LocalAddr). Check quic-go's source or test empirically. |
| 108 | +- [ ] If quic-go sends to the original client address (from QUIC packet header) instead of the forwarding address, fix by using a connected UDP socket or adjusting the relay. |
| 109 | +- [ ] Ensure relayQUICResponses wraps response payloads in SOCKS5 UDP headers with the original destination (not the QUIC proxy address) |
| 110 | +- [ ] Write test: forward a QUIC-like packet to a UDP echo server through the relay, verify response returns via relayQUICResponses |
| 111 | +- [ ] Run tests |
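
For the header-wrapping step in this task, the SOCKS5 UDP request header defined in RFC 1928 Section 7 is `RSV(2) FRAG(1) ATYP(1) DST.ADDR DST.PORT(2)` followed by the payload. The sketch below builds that header for an IP destination. `wrapSOCKS5UDP` is a hypothetical helper, not an existing function in sluice, and it assumes the original destination is an IPv4 or IPv6 literal (domain-name ATYP 0x03 is not handled).

```go
package main

import (
	"errors"
	"net"
)

// wrapSOCKS5UDP prepends an RFC 1928 UDP request header carrying the
// original destination (not the QUIC proxy address) to payload.
func wrapSOCKS5UDP(dstIP string, dstPort int, payload []byte) ([]byte, error) {
	ip := net.ParseIP(dstIP)
	if ip == nil {
		return nil, errors.New("destination is not an IP literal")
	}
	if dstPort < 0 || dstPort > 65535 {
		return nil, errors.New("destination port out of range")
	}
	var atyp byte
	var addr []byte
	if ip4 := ip.To4(); ip4 != nil {
		atyp, addr = 0x01, ip4 // ATYP 0x01: IPv4, 4 bytes
	} else {
		atyp, addr = 0x04, ip.To16() // ATYP 0x04: IPv6, 16 bytes
	}
	out := make([]byte, 0, 4+len(addr)+2+len(payload))
	out = append(out, 0x00, 0x00, 0x00, atyp)         // RSV, RSV, FRAG=0, ATYP
	out = append(out, addr...)                        // DST.ADDR
	out = append(out, byte(dstPort>>8), byte(dstPort)) // DST.PORT, big-endian
	out = append(out, payload...)
	return out, nil
}
```

A relay test can decode this header on the receiving side and assert that the address is the original destination rather than the QUIC proxy's listen address.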
| 112 | + |
| 113 | +### Task 4: Verify acceptance criteria |
| 114 | + |
| 115 | +- [ ] QUIC approval shows hostname (not IP) in Telegram message |
| 116 | +- [ ] Single broker request per destination during approval wait |
| 117 | +- [ ] Full QUIC flow: quictest binary gets HTTP/3 response |
| 118 | +- [ ] Run full test suite: `go test ./... -v -timeout 120s` |
| 119 | +- [ ] Deploy to knuth and test with quictest binary |
| 120 | +- [ ] Run tests - must pass before next task |
| 121 | + |
| 122 | +### Task 5: [Final] Update documentation |
| 123 | + |
| 124 | +- [ ] Update CLAUDE.md if QUIC handling details changed |
| 125 | +- [ ] Move this plan to `docs/plans/completed/` |
| 126 | + |
| 127 | +## Post-Completion |
| 128 | + |
| 129 | +**Manual verification on knuth:** |
| 130 | +```bash |
| 131 | +# Always recreate sluice + tun2proxy first, then openclaw |
| 132 | +docker compose up -d --force-recreate sluice tun2proxy && sleep 5 |
| 133 | +docker compose up -d --force-recreate openclaw && sleep 5 |
| 134 | +docker cp /tmp/quictest openclaw:/tmp/quictest |
| 135 | +docker compose exec openclaw /tmp/quictest https://cloudflare.com |
| 136 | +``` |
| 137 | +- Verify Telegram shows `cloudflare.com:443` |
| 138 | +- Verify single approval prompt |
| 139 | +- Verify HTTP/3 response is received |