Commit 6e1e935

Don't block IceAdapter.start() on telemetry websocket connect (fixes #42)
TelemetryDebugger.startupComplete() called websocketClient.connectBlocking() synchronously on the IceAdapter startup thread, with no timeout. Telemetry is a debug-only observability channel: its data flow is strictly outgoing (see ice-adapter/src/main/java/com/faforever/iceadapter/telemetry/, where all messages implement OutgoingMessageV1) and no game logic depends on it being connected, so it should never gate peer connectivity.

When the telemetry server is silently unreachable (TCP layer up but application layer hung, for example a live load balancer fronting a dead backend), the WebSocket Upgrade handshake never receives an HTTP 101 response. connectBlocking() then waits until TCP keepalive eventually kills the socket, about two hours later. During that whole window IceAdapter.start() is blocked, the FAF client's adapter-ready orchestration trips its own timeouts, and the user-visible symptom is "the ice adapter doesn't connect to other players".

Fix: run the connect on a virtual thread so startup returns immediately. The pre-existing reconnect loop in sendingLoop() already handles transient drops, and queued messages catch up once the socket opens because the messageQueue is unbounded.

Also adds the missing return after the InterruptedException catch (the old code fell through to sendMessage on an interrupted thread) and re-asserts the interrupt flag, per Java best practice.

Fixes #42
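The fix described above can be illustrated with a minimal, self-contained sketch. It is not the real TelemetryDebugger: the class `AsyncConnectSketch`, its simulated `connectBlocking()` (a plain `Thread.sleep`), and the `String` message type are all hypothetical stand-ins, and Java 21+ is assumed for virtual threads. It only shows that the startup call returns while the connect is still in progress and that messages sent in the meantime stay in the unbounded queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the telemetry client; not the real TelemetryDebugger.
class AsyncConnectSketch {
    // Unbounded queue: messages produced before the socket opens are kept, not dropped.
    private final BlockingQueue<String> messageQueue = new LinkedBlockingQueue<>();
    private final CountDownLatch connected = new CountDownLatch(1);
    private final long connectDelayMillis;

    AsyncConnectSketch(long connectDelayMillis) {
        this.connectDelayMillis = connectDelayMillis;
    }

    // Simulates a slow blocking connect (assumed stand-in for connectBlocking()).
    private boolean connectBlocking() throws InterruptedException {
        Thread.sleep(connectDelayMillis);
        return true;
    }

    // Returns immediately; the blocking connect runs on a virtual thread.
    Thread startupComplete() {
        return Thread.ofVirtual().name("telemetry-connect").start(() -> {
            try {
                if (!connectBlocking()) {
                    return;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            connected.countDown();
        });
    }

    void sendMessage(String message) {
        messageQueue.add(message);
    }

    boolean awaitConnected(long timeout, TimeUnit unit) throws InterruptedException {
        return connected.await(timeout, unit);
    }

    int queuedMessages() {
        return messageQueue.size();
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncConnectSketch sketch = new AsyncConnectSketch(500);
        Thread connector = sketch.startupComplete();
        // The caller is free to continue while the connect is still pending.
        sketch.sendMessage("queued-before-connect");
        System.out.println("connected yet? " + sketch.awaitConnected(0, TimeUnit.MILLISECONDS));
        connector.join();
        System.out.println("connected after join? " + sketch.awaitConnected(0, TimeUnit.MILLISECONDS)
                + ", queued=" + sketch.queuedMessages());
    }
}
```

In this sketch nothing drains the queue; in the commit, that role is played by the existing sendingLoop().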
1 parent d49f9fe commit 6e1e935

1 file changed

Lines changed: 19 additions & 8 deletions

ice-adapter/src/main/java/com/faforever/iceadapter/debug/TelemetryDebugger.java

@@ -115,18 +115,29 @@ private void sendingLoop() {
 
     @Override
     public void startupComplete() {
-        try {
-            if (!websocketClient.connectBlocking()) {
+        // Connect to the telemetry server asynchronously: telemetry is a debug-only channel and
+        // must NOT block IceAdapter.start(). Previously connectBlocking() ran on the startup
+        // thread with no timeout, so when the telemetry server's TCP layer was reachable but
+        // its application layer was hung, the WebSocket Upgrade response never arrived and
+        // startup waited indefinitely (until TCP keepalive — ~2 hours). The peer-connectivity
+        // path was held the whole time, which is why the adapter "didn't connect" when
+        // telemetry was down. See https://github.com/FAForever/java-ice-adapter/issues/42
+        Thread.ofVirtual().name("telemetry-connect").start(() -> {
+            try {
+                if (!websocketClient.connectBlocking()) {
+                    Debug.remove(this);
+                    return;
+                }
+            } catch (InterruptedException e) {
                 Debug.remove(this);
+                log.error("Failed to connect to telemetry websocket", e);
+                Thread.currentThread().interrupt();
                 return;
             }
-        } catch (InterruptedException e) {
-            Debug.remove(this);
-            log.error("Failed to connect to telemetry websocket", e);
-        }
 
-        sendMessage(new RegisterAsPeer(
-                UUID.randomUUID(), "java-ice-adapter/" + IceAdapter.getVersion(), IceAdapter.getLogin()));
+            sendMessage(new RegisterAsPeer(
+                    UUID.randomUUID(), "java-ice-adapter/" + IceAdapter.getVersion(), IceAdapter.getLogin()));
+        });
     }
 
     @Override
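The interrupt handling in the new catch block (restore the flag, then return instead of falling through) can be shown in isolation. This is a minimal sketch under stated assumptions: `InterruptSketch` and `connectThenRegister` are hypothetical names, and `Thread.sleep` stands in for the blocking connect. It relies only on documented JDK behaviour: a blocking call that throws InterruptedException clears the thread's interrupt flag, so the handler has to set it again if callers are meant to see it.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of the catch-block pattern; not code from the adapter.
class InterruptSketch {
    // Returns true if the follow-up action ran, false if the wait was interrupted.
    static boolean connectThenRegister(Runnable register) {
        try {
            Thread.sleep(10_000); // stand-in for a blocking connect
        } catch (InterruptedException e) {
            // Throwing InterruptedException cleared the flag; set it again for callers.
            Thread.currentThread().interrupt();
            // Return here so the follow-up action does not run on an interrupted thread.
            return false;
        }
        register.run();
        return true;
    }

    public static void main(String[] args) {
        AtomicBoolean registered = new AtomicBoolean(false);
        // Pre-set the flag so the sleep throws immediately and the demo stays fast.
        Thread.currentThread().interrupt();
        boolean completed = connectThenRegister(() -> registered.set(true));
        // Thread.interrupted() reads and clears the flag.
        boolean flagRestored = Thread.interrupted();
        System.out.println("completed=" + completed
                + ", registered=" + registered.get()
                + ", flagRestored=" + flagRestored);
    }
}
```

Without the `return`, the follow-up action would run even though the wait was interrupted, which is the fall-through the commit message describes in the old code.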
