Commit e5305e0
refactor(ray.sub): drop NETWORK_INIT_CMDS — MC_TCP_BIND_ADDRESS suffices
The NETWORK_INIT_CMDS block (pkill avahi-autoipd / ifconfig usb0 down /
ip addr flush + a 2-second relaunch loop) was a workaround for an
outdated diagnosis in data-plane-bench/DEBUG_TQ_BACKENDS.md (Issue 1):
"MC_TCP_BIND_ADDRESS controls server_name (registration) but NOT the
RPC listener bind address."
Re-reading current Mooncake main (commit fast-forwarded today):
- mooncake-transfer-engine/src/transfer_engine_impl.cpp:159-170
  If MC_TCP_BIND_ADDRESS is set, it goes directly into
  desc.ip_or_host_name, which is the address registered via
  addRpcMetaEntry, i.e. the address peers receive from the
  metadata service. This was added by PR #226 (caef1ef, merged
  2025-04-10) and IS in the pinned wheel 0.3.10.post2 (bumped
  2026-04-22).
- mooncake-transfer-engine/src/transfer_metadata_plugin.cpp:1292
  The TCP listener binds INADDR_ANY and accepts on all interfaces.
  Bind itself was never the bug; the announce was.
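To illustrate the bind/announce distinction described above, here is a minimal Python sketch, independent of Mooncake. The helper names (listen_on_all_interfaces, reachable_via) are hypothetical. It only shows that a listener bound to INADDR_ANY is reachable through any local address, so the address a peer is told to dial is what decides whether a connection succeeds.

```python
import socket


def listen_on_all_interfaces() -> socket.socket:
    """Open a TCP listener bound to INADDR_ANY on an ephemeral port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # "0.0.0.0" is INADDR_ANY; port 0 lets the OS pick a free port.
    srv.bind(("0.0.0.0", 0))
    srv.listen(1)
    return srv


def reachable_via(srv: socket.socket, host: str) -> bool:
    """Return True if a TCP connection to the listener succeeds via `host`."""
    port = srv.getsockname()[1]
    try:
        with socket.create_connection((host, port), timeout=2.0):
            return True
    except OSError:
        return False
```

The listener itself never needs to know which interface peers will use; only the announced address has to be one the peer can route to.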
So per-process MC_TCP_BIND_ADDRESS in TQDataPlaneClient.__init__
(unchanged in this commit, runs on every process) gives Mooncake the
routable announce address and peer connections work cross-node
without OS-level interface stripping.
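As a rough sketch of what "per-process MC_TCP_BIND_ADDRESS" could look like, assuming the process already has a list of candidate local addresses: the helpers below (pick_announce_address, export_bind_address) are hypothetical and are not the code in TQDataPlaneClient.__init__. They skip link-local (169.254.x), loopback and unspecified addresses and export the first remaining one.

```python
import ipaddress
import os


def pick_announce_address(candidates):
    """Return the first candidate that is not link-local, loopback or unspecified."""
    for raw in candidates:
        addr = ipaddress.ip_address(raw)
        if addr.is_link_local or addr.is_loopback or addr.is_unspecified:
            continue
        return str(addr)
    raise ValueError("no routable address among candidates")


def export_bind_address(candidates, env=os.environ):
    """Set MC_TCP_BIND_ADDRESS in `env` to the chosen address and return it."""
    addr = pick_announce_address(candidates)
    env["MC_TCP_BIND_ADDRESS"] = addr
    return addr
```

A call such as export_bind_address(["169.254.3.1", "10.0.0.5"]) would set the variable to 10.0.0.5. The variable has to be in the environment before the transfer engine in that process reads it.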
The pkill+sleep loop fought a symptom (avahi-autoipd respawning the
APIPA address). With the announce now correct regardless of usb0,
that fight is unnecessary. Removing the block also makes ray.sub a
no-op for non-mooncake_cpu backends (simple, mooncake_rdma) — they
were paying the host-process-kill cost for no reason.
If the multi-node smoke test regresses with peers connecting to
169.254.x, revert only this commit; the (A) codec/adapter cleanup stays.
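One way to spot the regression described above is to scan the peer endpoints a run reports for addresses in 169.254.0.0/16. The sketch below assumes endpoints are available as "host:port" strings with IPv4 hosts; that format and the helper name apipa_peers are assumptions, not something taken from the smoke test.

```python
import ipaddress

# The IPv4 link-local (APIPA) range that avahi-autoipd assigns from.
APIPA = ipaddress.ip_network("169.254.0.0/16")


def apipa_peers(endpoints):
    """Return the "host:port" endpoints whose host lies in 169.254.0.0/16."""
    bad = []
    for endpoint in endpoints:
        host = endpoint.rsplit(":", 1)[0]
        try:
            if ipaddress.ip_address(host) in APIPA:
                bad.append(endpoint)
        except ValueError:
            # Not an IP literal (e.g. a hostname); it cannot be classified here.
            continue
    return bad
```

A non-empty result would be the signal to revert.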
Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
1 parent ddb9a02 commit e5305e0
1 file changed
Lines changed: 0 additions & 31 deletions
Deleted lines (original file numbering): 208-236, 241, 350.