
WireGuard proxy: Refactor #6287

Merged
RPRX merged 13 commits into XTLS:main from LjhAUMEM:wg on Jun 16, 2026

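@LjhAUMEM (Collaborator) commented Jun 7, 2026

Better WireGuard:

  • Switched to the standard wg bind approach; one tun device now uses exactly one socket, even with multiple peers
  • Replaced the previous UDP transport layer with DialSystem and ListenSystemPacket; the server now behaves like a tun inbound
  • Removed the workers config option
  • Added a built-in remote resolution method to the client
  • udphop is supported in theory but not implemented; even if it were, the server side would need matching iptables rules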
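The "one socket, many peers" point can be illustrated with a minimal sketch that uses only the Go standard library. `sharedBind` and `sourcesSeenByPeers` are hypothetical names invented for this illustration; they are not Xray-core or wireguard-go APIs, and the real bind interface is richer than this.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// sharedBind is a hypothetical stand-in for a bind: one UDP socket for all peers.
type sharedBind struct {
	conn net.PacketConn
}

// send writes one datagram to a peer endpoint through the shared socket.
func (b *sharedBind) send(payload []byte, peer net.Addr) error {
	_, err := b.conn.WriteTo(payload, peer)
	return err
}

// sourcesSeenByPeers sends one datagram to each of n local peers and returns
// the set of source addresses those peers observed.
func sourcesSeenByPeers(n int) (map[string]bool, error) {
	c, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	defer c.Close()
	b := &sharedBind{conn: c}

	seen := make(map[string]bool)
	buf := make([]byte, 64)
	for i := 0; i < n; i++ {
		peer, err := net.ListenPacket("udp", "127.0.0.1:0")
		if err != nil {
			return nil, err
		}
		if err := b.send([]byte("hello"), peer.LocalAddr()); err != nil {
			peer.Close()
			return nil, err
		}
		// Bound the wait so a lost datagram cannot hang the demo.
		peer.SetReadDeadline(time.Now().Add(2 * time.Second))
		_, from, err := peer.ReadFrom(buf)
		peer.Close()
		if err != nil {
			return nil, err
		}
		seen[from.String()] = true
	}
	return seen, nil
}

func main() {
	seen, err := sourcesSeenByPeers(3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("distinct source sockets:", len(seen))
}
```

Every peer sees the same source address because all datagrams leave through the single shared socket, which is the property the first bullet describes.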

Based on #6275.

Server config (WireGuard inbound):

{
  "log": { "loglevel": "debug" },
  "inbounds": [
    {
      "tag": "wg-in",
      "listen": "127.0.0.1",
      "port": 1081,
      "protocol": "wireguard",
      "settings": {
        "secretKey": "QGTBbjHOXH7qPACfhW0a4qViSWz9AzlY4LG+jHaYMFY=",
        "peers": [
          {
            "publicKey": "HVRq1Lj8MJkTZAI3HFL/ca0E7Dh40b5g8yzIVXQuZhU="
          }
        ]
      },
      "streamSettings": {
        "finalmask": {
          "udp": [
            {
              "type": "salamander",
              "settings": {
                "password": "1234"
              }
            }
          ]
        }
      }
    }
  ],
  "outbounds": [
    {
      "protocol": "freedom"
    }
  ]
}

Client config (WireGuard outbound):

{
  "log": { "loglevel": "debug" },
  "inbounds": [
    {
      "listen": "127.0.0.1",
      "port": 1080,
      "protocol": "socks",
      "settings": {
        "auth": "noauth",
        "udp": true
      }
    }
  ],
  "outbounds": [
    {
      "protocol": "wireguard",
      "settings": {
        "secretKey": "8C36wsrFq5MLAZW9Y6l/IEzs362iiRNlm0GOA4S34kc=",
        "peers": [
          {
            "publicKey": "5kuqMvvPx8NuGMDGJ/naRK/yRjtnmEIMByRTCF8kCXg=",
            "endpoint": "127.0.0.1:1081"
          }
        ]
      },
      "streamSettings": {
        "finalmask": {
          "udp": [
            {
              "type": "salamander",
              "settings": {
                "password": "1234"
              }
            }
          ]
        }
      }
    }
  ]
}

@bytecategory (Contributor) commented

@LjhAUMEM So previously it was one socket per peer?

@LjhAUMEM (Collaborator, Author) commented Jun 7, 2026

> @LjhAUMEM So previously it was one socket per peer?

Yes. Previously a conn was bound to every conn.Endpoint.

@bytecategory (Contributor) commented

@LjhAUMEM Is your built-in remote resolution method resolveRemote? Will udphop be added later? I don't fully understand this change; are there any performance tests?

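@LjhAUMEM (Collaborator, Author) commented Jun 7, 2026

> @LjhAUMEM Is your built-in remote resolution method resolveRemote? Will udphop be added later? I don't fully understand this change; are there any performance tests?

Specifically, it is *Net's LookupHost, which queries over UDP port 53 and falls back to TCP on failure. Going through Xray's built-in DNS without routing configured risks leaks; sending every lookup to the remote side avoids that. Now only peer domains go through Xray DNS. If you want control over resolution, you can also resolve at the inbound.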
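The UDP-first, TCP-fallback behaviour described above can be sketched as follows. This is an illustration only: `queryFunc`, `lookupHost`, and `flakyUDP` are hypothetical names, the actual DNS exchange is stubbed out, and the real LookupHost may differ in how it decides to fall back.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// queryFunc is a hypothetical transport-specific DNS query ("udp" or "tcp").
type queryFunc func(network, host string) ([]netip.Addr, error)

// lookupHost tries UDP first and falls back to TCP when UDP fails or is empty.
func lookupHost(q queryFunc, host string) ([]netip.Addr, error) {
	addrs, udpErr := q("udp", host)
	if udpErr == nil && len(addrs) > 0 {
		return addrs, nil
	}
	addrs, tcpErr := q("tcp", host)
	if tcpErr != nil {
		// Report both failures so neither cause is lost.
		return nil, errors.Join(udpErr, tcpErr)
	}
	if len(addrs) == 0 {
		return nil, errors.New("no addresses for " + host)
	}
	return addrs, nil
}

// flakyUDP is a stub resolver whose UDP path always fails.
func flakyUDP(network, host string) ([]netip.Addr, error) {
	if network == "udp" {
		return nil, errors.New("udp query timed out")
	}
	return []netip.Addr{netip.MustParseAddr("192.0.2.1")}, nil
}

func main() {
	addrs, err := lookupHost(flakyUDP, "example.com")
	fmt.Println(addrs, err)
}
```

Note that the function never returns an empty slice together with a nil error, so a caller that checks the error does not also need to check the length.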

@Exclude0122 (Contributor) commented

There is a bug:

n, addr, err := c.ReadFrom(bufs[0])
if err != nil {
	if goerrors.Is(err, io.EOF) {
		return 0, net.ErrClosed
	}
	var netErr net.Error
	if goerrors.As(err, &netErr) {
		return 0, err
	}
	continue
}

If a read fails, the conn should be discarded and a new one opened, because this conn may belong to a chained proxy: a UDP PacketConn emulated over TCP.

This is how it was written before:

n, err := buff.ReadFrom(c)
if err != nil {
	buff.Release()
	endpoint.conn = nil
	c.Close()
	return
}

@bytecategory (Contributor) left a comment

Is it possible that addr is an invalid address?
There are three paths that return nil:

if len(addrs) == 0 && lastErr != nil {
		return nil, lastErr
}

This path is a DNS resolution failure. The other two are an invalid domain name and an empty domain name.

@bytecategory (Contributor) commented Jun 8, 2026

> Is it possible that addr is an invalid address? There are three paths that return nil

I mean LookupHost in netstack: in f66031b the result is parsed directly, without checking whether addr is nil.

@LjhAUMEM (Collaborator, Author) commented Jun 8, 2026

> I mean LookupHost in netstack: in f66031b the result is parsed directly, without checking whether addr is nil.

The string comes from netip.Addr.String(), so no check is needed.
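A small sketch of the property this reply relies on: the string form of a valid, zone-less netip.Addr parses back successfully. The use of net.ParseIP as the parser is an assumption made for illustration (the comment does not say which parser the code uses), and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
)

// roundTrips reports whether a's string form is accepted by net.ParseIP.
func roundTrips(a netip.Addr) bool {
	return net.ParseIP(a.String()) != nil
}

// parsedRoundTrips parses s as a netip.Addr first, then checks the round trip.
func parsedRoundTrips(s string) bool {
	a, err := netip.ParseAddr(s)
	return err == nil && roundTrips(a)
}

// zeroRoundTrips checks the zero Addr, whose String() is "invalid IP".
func zeroRoundTrips() bool {
	return roundTrips(netip.Addr{})
}

func main() {
	fmt.Println(parsedRoundTrips("192.0.2.1"))   // valid IPv4
	fmt.Println(parsedRoundTrips("2001:db8::1")) // valid IPv6
	fmt.Println(zeroRoundTrips())                // zero value does not round-trip
}
```

The guarantee therefore holds only as long as every address in the slice is a valid one, which is the case for addresses decoded from DNS answers.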

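> There is a bug:
>
> n, addr, err := c.ReadFrom(bufs[0])
> if err != nil {
> 	if goerrors.Is(err, io.EOF) {
> 		return 0, net.ErrClosed
> 	}
> 	var netErr net.Error
> 	if goerrors.As(err, &netErr) {
> 		return 0, err
> 	}
> 	continue
> }
>
> If a read fails, the conn should be discarded and a new one opened, because this conn may belong to a chained proxy: a UDP PacketConn emulated over TCP.
>
> This is how it was written before:
>
> n, err := buff.ReadFrom(c)
> if err != nil {
> 	buff.Release()
> 	endpoint.conn = nil
> 	c.Close()
> 	return
> }

This is exactly why I called the previous bind non-standard: when the socket becomes invalid, ReceiveFunc should return net.ErrClosed to inform the caller. Chained proxies are also the reason I check for io.EOF in the first place. The previous code was garbage, and I didn't expect anyone to review against it.

If you had cited wg-go's bind as the example, I would have had no objection.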
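The contract described here (a dead socket is reported to the caller as net.ErrClosed) can be sketched with the same error classification as the quoted snippet. `receive`, `eofConn`, and `eofBecomesClosed` are hypothetical names for this illustration, not code from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
)

// receive mirrors the quoted logic: EOF becomes net.ErrClosed, net.Error values
// are returned as they are, and any other error is skipped.
func receive(c net.PacketConn, buf []byte) (int, net.Addr, error) {
	for {
		n, addr, err := c.ReadFrom(buf)
		if err == nil {
			return n, addr, nil
		}
		if errors.Is(err, io.EOF) {
			return 0, nil, net.ErrClosed
		}
		var netErr net.Error
		if errors.As(err, &netErr) {
			return 0, nil, err
		}
		// Any other error falls through and the loop reads again.
	}
}

// eofConn is a fake PacketConn whose reads always fail with io.EOF.
type eofConn struct{ net.PacketConn }

func (eofConn) ReadFrom([]byte) (int, net.Addr, error) { return 0, nil, io.EOF }

// eofBecomesClosed reports whether an EOF read surfaces as net.ErrClosed.
func eofBecomesClosed() bool {
	_, _, err := receive(eofConn{}, make([]byte, 1500))
	return errors.Is(err, net.ErrClosed)
}

func main() {
	fmt.Println("EOF mapped to net.ErrClosed:", eofBecomesClosed())
}
```

One consequence of this classification is that an error which is neither io.EOF nor a net.Error, such as io.ErrClosedPipe, lands in the final branch and is retried indefinitely rather than reported.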

@Exclude0122 (Contributor) commented

Client config
{
  "log": {
    "loglevel": "debug"
  },
  "inbounds": [
    {
      "port": 10000,
      "listen": "127.0.0.1",
      "protocol": "socks",
      "settings": {
        "udp": true
      }
    }
  ],
  "outbounds": [
    {
      "protocol": "wireguard",
      "settings": {
        "secretKey": "***",
        "address": ["***"],
        "peers": [
          {
            "endpoint": "PROTON VPN SERVER",
            "publicKey": "***"
          }
        ]
      },
      "streamSettings": {
        "sockopt": {
          "dialerProxy": "vless"
        }
      }
    },
    {
      "protocol": "vless",
      "tag": "vless",
      "settings": {
        "vnext": [
          {
            "address": "127.0.0.1",
            "port": 20000,
            "users": [
              {
                "id": "your-uuid-here",
                "encryption": "none"
              }
            ]
          }
        ]
      }
    }
  ]
}
Server config
{
  "log": {
    "loglevel": "debug"
  },
  "inbounds": [
    {
      "port": 20000,
      "listen": "127.0.0.1",
      "protocol": "vless",
      "settings": {
        "clients": [
          {
            "id": "your-uuid-here",
            "decryption": "none"
          }
        ],
        "decryption": "none"
      }
    }
  ],
  "outbounds": [
    {
      "protocol": "freedom"
    }
  ]
}

The bug is real and reproducible.
After the first successful wg handshake, restart the Xray server: the client never establishes a new vless connection.

[Debug] peer(***) - Sending handshake initiation
[Error] proxy/wireguard: bind send err > io: read/write on closed pipe
[Error] peer(***) - Failed to send handshake initiation: io: read/write on closed pipe

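@bytecategory (Contributor) commented

> > I mean LookupHost in netstack: in f66031b the result is parsed directly, without checking whether addr is nil.
>
> The string comes from netip.Addr.String(), so no check is needed.
>
> > There is a bug:
> >
> > n, addr, err := c.ReadFrom(bufs[0])
> > if err != nil {
> > 	if goerrors.Is(err, io.EOF) {
> > 		return 0, net.ErrClosed
> > 	}
> > 	var netErr net.Error
> > 	if goerrors.As(err, &netErr) {
> > 		return 0, err
> > 	}
> > 	continue
> > }
> >
> > If a read fails, the conn should be discarded and a new one opened, because this conn may belong to a chained proxy: a UDP PacketConn emulated over TCP.
> > This is how it was written before:
> >
> > n, err := buff.ReadFrom(c)
> > if err != nil {
> > 	buff.Release()
> > 	endpoint.conn = nil
> > 	c.Close()
> > 	return
> > }
>
> This is exactly why I called the previous bind non-standard: when the socket becomes invalid, ReceiveFunc should return net.ErrClosed to inform the caller. Chained proxies are also the reason I check for io.EOF in the first place. The previous code was garbage, and I didn't expect anyone to review against it.
>
> If you had cited wg-go's bind as the example, I would have had no objection.

Let me be the one who is out of the loop. Do you mean that the TCP-emulated UDP PacketConn Exclude0122 mentioned returns EOF, so it does not need to be reopened? Or am I misunderstanding?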

@LjhAUMEM (Collaborator, Author) commented Jun 8, 2026

So it seems a failed chain returns an error other than io.EOF, or no error at all? I need to set up a test environment; one moment.

@bytecategory (Contributor) commented Jun 8, 2026

> So it seems a failed chain returns an error other than io.EOF, or no error at all? I need to set up a test environment; one moment.

I think it is because of the continue at the end:

n, addr, err := c.ReadFrom(bufs[0])
if err != nil {
	if goerrors.Is(err, io.EOF) {
		return 0, net.ErrClosed
	}
	var netErr net.Error
	if goerrors.As(err, &netErr) {
		return 0, err
	}
	continue
}

@Exclude0122 (Contributor) commented

A chained proxy failure does not necessarily return EOF. On Android, for some reason, it returns "read/write on closed pipe"; I have been bitten by this before.
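The error text quoted here matches the message of the standard library's io.ErrClosedPipe. The sketch below reproduces that error with a plain io.Pipe and shows that it is distinct from io.EOF. Whether the chained-proxy conn is built on the same kind of pipe is not established by this example; the helper names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// readAfterClose closes the read half of a pipe and then reads from it.
func readAfterClose() error {
	pr, pw := io.Pipe()
	defer pw.Close()
	pr.Close()
	_, err := pr.Read(make([]byte, 1))
	return err
}

// isEOF reports whether err is io.EOF.
func isEOF(err error) bool { return errors.Is(err, io.EOF) }

// isClosedPipe reports whether err is io.ErrClosedPipe.
func isClosedPipe(err error) bool { return errors.Is(err, io.ErrClosedPipe) }

func main() {
	err := readAfterClose()
	fmt.Println("error:", err)
	fmt.Println("is io.EOF:", isEOF(err))
	fmt.Println("is io.ErrClosedPipe:", isClosedPipe(err))
}
```

A check written only as `errors.Is(err, io.EOF)` therefore does not catch this case.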

@Exclude0122 (Contributor) commented

conn, err := internet.DialSystem(ctx, dest, h.sockopt)

internet.DialSystem is now called only once. I think that as soon as send or receive returns any error, the conn can no longer be used and internet.DialSystem has to be called again.
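The suggestion above (drop the conn on any error and dial again on the next use) could look roughly like the sketch below. It is not the approach the PR ended up taking. `redialer`, `brokenConn`, and `dialsAfterTwoWrites` are hypothetical names, and the dial function is a stub standing in for a call such as internet.DialSystem.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
)

// redialer lazily dials a conn and discards it after any write error.
type redialer struct {
	mu   sync.Mutex
	dial func() (net.Conn, error) // stub for the real dialer
	conn net.Conn
}

// Write dials if needed, writes, and drops the conn when the write fails.
func (r *redialer) Write(p []byte) (int, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.conn == nil {
		c, err := r.dial()
		if err != nil {
			return 0, err
		}
		r.conn = c
	}
	n, err := r.conn.Write(p)
	if err != nil {
		r.conn.Close()
		r.conn = nil // force a fresh dial on the next call
	}
	return n, err
}

// brokenConn is a fake conn whose writes always fail.
type brokenConn struct{ net.Conn }

func (brokenConn) Write([]byte) (int, error) { return 0, io.ErrClosedPipe }
func (brokenConn) Close() error              { return nil }

// dialsAfterTwoWrites counts how often the dialer runs across two failed writes.
func dialsAfterTwoWrites() int {
	dials := 0
	r := &redialer{dial: func() (net.Conn, error) {
		dials++
		return brokenConn{}, nil
	}}
	r.Write([]byte("a"))
	r.Write([]byte("b"))
	return dials
}

func main() {
	fmt.Println("dials:", dialsAfterTwoWrites())
}
```

Each failed write costs one extra dial on the following call, which is the trade-off of recovering from the send path.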

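@LjhAUMEM (Collaborator, Author) commented Jun 8, 2026

> A chained proxy failure does not necessarily return EOF. On Android, for some reason, it returns "read/write on closed pipe"; I have been bitten by this before.

That one is in the write direction, so it does not matter.

> internet.DialSystem is now called only once. I think that as soon as send or receive returns any error, the conn can no longer be used and internet.DialSystem has to be called again.

It is not called only once: after ReceiveFunc returns net.ErrClosed, the upper layer calls listenFunc again. I just stepped out to pick up a delivery; let me check what happens in the read direction first.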

@bytecategory (Contributor) commented Jun 8, 2026

https://gist.github.com/bytecategory/c6d1a8551907a387a55a1ba0c29ccfe6
The full client log looks roughly like this. It is an error log, and after the restart there is no new "UDP bind has been updated" line, which shows that listenFunc was called only once. I tested for a few minutes using https://github.com/XTLS/Xray-core/actions/runs/27116764302/artifacts/7471894972
So I think closing the connection and reopening it is a good idea. @LjhAUMEM I'm signing off for now.

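@LjhAUMEM (Collaborator, Author) commented Jun 8, 2026

I got held up for a while by freedom's finalRule.

@Exclude0122 You are right: the read side also receives io.ErrClosedPipe. I also found that even returning net.ErrClosed does not trigger a re-bind. It seems the only options are calling dev.BindUpdate manually or using a chan as before, but going back to a chan just for chaining does not seem worth it. I'll keep looking.

> The full client log looks roughly like this. It is an error log, and after the restart there is no new "UDP bind has been updated" line, which shows that listenFunc was called only once.

Those log lines are all from the write direction, so there is no need to analyze them.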

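@Fangliding (Member) commented

> A chained proxy failure does not necessarily return EOF. On Android, for some reason, it returns "read/write on closed pipe"; I have been bitten by this before.

That is because close calls fly around everywhere in core, and a normal close cannot be told apart from an interrupt.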

@bytecategory (Contributor) left a comment

Only the read direction was changed.

@LjhAUMEM (Collaborator, Author) commented Jun 9, 2026

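@LjhAUMEM (Collaborator, Author) commented Jun 9, 2026

> Only the read direction was changed.

That is the expected behavior. Detecting the failure in send lags behind; recv is where a close is noticed first. Strictly speaking, neither recv nor send should affect the bind's lifecycle. Adding a single check for a non-standard conn such as a fake connection is already the biggest concession, and that check must also be able to tell a bind close apart from a close of only the fake conn.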
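One way to tell the two cases apart is to record when the bind itself was closed and treat any other terminal read error as unexpected. The sketch below assumes that approach; `bind`, `handleReadErr`, `onDown`, and `downCalls` are hypothetical names, and the merged code may distinguish the cases differently.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"sync/atomic"
)

// bind is a hypothetical bind that remembers whether Close was called on it.
type bind struct {
	closed atomic.Bool
	onDown func() // called only when the underlying conn died unexpectedly
}

// Close marks the bind as intentionally closed.
func (b *bind) Close() { b.closed.Store(true) }

// handleReadErr maps a read error to what the receive function should return.
func (b *bind) handleReadErr(err error) error {
	if b.closed.Load() {
		// Regular shutdown: report it and do not trigger recovery.
		return net.ErrClosed
	}
	if errors.Is(err, io.EOF) || errors.Is(err, io.ErrClosedPipe) {
		// Only the underlying (fake) conn closed: ask the owner to recover.
		b.onDown()
		return net.ErrClosed
	}
	return err
}

// downCalls counts recovery triggers for an unexpected close and a regular one.
func downCalls() (unexpected, afterClose int) {
	n := 0
	b := &bind{onDown: func() { n++ }}
	b.handleReadErr(io.ErrClosedPipe)
	unexpected = n
	b.Close()
	b.handleReadErr(io.ErrClosedPipe)
	afterClose = n - unexpected
	return
}

func main() {
	fmt.Println(downCalls())
}
```

The flag keeps a deliberate shutdown from being mistaken for a failure, so recovery is not started while the device is going down on purpose.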

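@bytecategory (Contributor) commented Jun 9, 2026

> Only the read direction was changed.

> That is the expected behavior. Detecting the failure in send lags behind; recv is where a close is noticed first. Strictly speaking, neither recv nor send should affect the bind's state. Adding a single check for a non-standard conn such as a TCP-emulated UDP PacketConn is already the biggest concession, and that check must also be able to tell a bind close (errors.LogDebug(context.Background(), "bind closed")) apart from a close of only the TCP-emulated UDP PacketConn (fakecon) (errors.LogErrorInner(context.Background(), err, "unexpected closed"))

Just add a short code comment for this.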

@LjhAUMEM (Collaborator, Author) commented Jun 9, 2026

If you look at the wg logs, there is already a log line when recv exits...

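@LjhAUMEM (Collaborator, Author) commented Jun 9, 2026

Chaining itself also has a problem: dial returns a fake connection whether or not it succeeds. Traffic counting is probably broken right now as well. You skip the important things and prefer to focus on odd details @bytecategory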

@Exclude0122 (Contributor) commented

> @Exclude0122 Give this a try: https://github.com/XTLS/Xray-core/actions/runs/27188615755

It works now.

@bytecategory (Contributor) left a comment

It would be nice to send multiple UDP packets at once, reducing the number of syscalls, instead of sending the bufs one by one.

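@bytecategory (Contributor) commented Jun 9, 2026

> Chaining itself also has a problem: dial returns a fake connection whether or not it succeeds. Traffic counting is probably broken right now as well. You skip the important things and prefer to focus on odd details @bytecategory

Fair enough, I really did miss the point. It is what it is; let's merge.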

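@LjhAUMEM (Collaborator, Author) commented Jun 9, 2026

> It would be nice to send multiple UDP packets at once, reducing the number of syscalls, instead of sending the bufs one by one.

That means GSO plus ReadBatch/WriteBatch, which is Linux-specific code. You can tell it will not be supported from the fact that I did not split udp4 and udp6.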

@RPRX (Member) commented Jun 9, 2026

@LjhAUMEM Remember to rebase.

@LjhAUMEM (Collaborator, Author) commented Jun 9, 2026

> @LjhAUMEM Remember to rebase.

Done. It should be ready now.

@almatv54 commented Jun 9, 2026

As I read it, recovery is now driven entirely from the receive path: a fatal ReadFrom error → downFunc() (dev.Down) → teardown → re-init on the next Process(). Send only logs the error and breaks, so the write path never triggers a teardown on its own.

That works for the chained-proxy / TCP-backed PacketConn case discussed above, because there a broken upstream surfaces as a read error ("closed pipe") and the receive goroutine unblocks.

But the original report was about a plain connected UDP socket during a silent uplink outage (host stays up, no ICMP). In that situation ReadFrom may just keep blocking indefinitely and never return an error — so downFunc is never called, the device is never torn down, and writes keep going into a dead socket. From the code path alone it's not obvious what would break that goroutine out and trigger recovery in this case.

So the question is: in the pure connected-UDP, no-ICMP outage scenario, what is expected to eventually error on the receive side and kick off the teardown? If nothing does, would it make sense to also trigger teardown from a Send error (or from a WireGuard handshake/keepalive timeout), instead of relying solely on the read path?
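For illustration only, one standard-library mechanism that makes a silent socket produce an error on the receive side is a read deadline. The sketch below shows that mechanism in isolation; it is not something the PR does, and `readWithIdleTimeout`, `isTimeout`, and `silentSocketTimesOut` are hypothetical names. Choosing a sensible idle period (for example relative to keepalive timing) is left open.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// readWithIdleTimeout reads one datagram but gives up after the idle period.
func readWithIdleTimeout(c net.PacketConn, buf []byte, idle time.Duration) (int, net.Addr, error) {
	if err := c.SetReadDeadline(time.Now().Add(idle)); err != nil {
		return 0, nil, err
	}
	return c.ReadFrom(buf)
}

// isTimeout reports whether err is a network timeout.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// silentSocketTimesOut reads from a socket that nobody ever writes to.
func silentSocketTimesOut() bool {
	c, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return false
	}
	defer c.Close()
	_, _, err = readWithIdleTimeout(c, make([]byte, 1500), 50*time.Millisecond)
	return isTimeout(err)
}

func main() {
	fmt.Println("silent socket timed out:", silentSocketTimesOut())
}
```

A timeout on its own does not prove the path is dead, since an idle tunnel looks the same, so a real implementation would need an additional signal before tearing anything down.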

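@LjhAUMEM (Collaborator, Author) commented Jun 9, 2026

@almatv54 Please test it. If sending after the network drops really does produce send errors while read produces none, please provide the full logs. Also, have you tested whether other implementations recover after the kind of outage you describe?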

@almatv54 commented Jun 9, 2026

The technical crux of my worry is exactly the asymmetry you mention. On a connected UDP socket, a send error usually only appears when there's an ICMP signal (e.g. port-unreachable). During a full uplink outage there's typically no ICMP at all, so neither send returns an error nor read unblocks — both sides just go silent. That's the case I'm unsure about: if neither path errors, nothing seems to trigger downFunc/teardown.

@LjhAUMEM (Collaborator, Author) commented Jun 9, 2026

@almatv54 Tests and logs are required.
https://github.com/XTLS/Xray-core/actions/runs/27208011476

RPRX changed the title from "refactor wireguard" to "WireGuard proxy: Refactor" on Jun 16, 2026
RPRX merged commit 0b5b87a into XTLS:main on Jun 16, 2026
39 of 40 checks passed
RPRX pushed a commit that referenced this pull request on Jun 17, 2026
Maolaohei pushed a commit to Maolaohei/Bray-Core that referenced this pull request on Jun 17, 2026