Describe the bug
Our baseline was stable on livekit-client-js 2.17.x.
After moving to 2.18.1 (server still on 1.9.12), we started seeing errors like:
- could not handle new participant
- request canceled
We then tested server 1.10.1, but the problem did not clearly disappear in our environment. Rolling the client back to 2.17.x appeared to restore normal behavior on our side.
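For reference, the rollback amounts to pinning the client range in package.json (assuming the npm package name livekit-client; we have intentionally not written a specific patch version, since we only verified "2.17.x" as a range):

```json
{
  "dependencies": {
    "livekit-client": "2.17.x"
  }
}
```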
Because of that, I'm not sure whether this is:
- a regression/change in client behavior in 2.18.1,
- a compatibility problem with server 1.9.12,
- or a transport/proxy path issue that 2.18.1 is exposing more easily.
Observed behavior:
- Intermittent participant join/restart failures
- Some sessions appear to connect and then stall
- Media freeze
- We also observed DTLS/data-channel related warnings during the bad periods
Reproduction
livekit-client-js 2.18.1 against livekit-server v1.9.12.
Logs
Earlier participant join failure:
ERROR livekit routing/localrouter.go:111 could not handle new participant
{"room": "sporting-c_0407193619645", "participant": "552a637b-d178-471b-86a8-b4ce714b25ab", "connID": "CO_ANHZbD7fTVmz", "error": "request canceled"}
During later testing we also saw transport/proxy related errors:
caddy-1 | {"level":"error","logger":"layer4.handlers.proxy","msg":"upstream connection","local_address":"127.0.0.1:57986","remote_address":"127.0.0.1:7880","error":"local error: tls: bad record MAC"}
livekit-1 | WARN livekit.transport rtc/transport.go:924 error reading data channel
{"transport":"PUBLISHER","label":"_reliable","error":"dtls timeout: read/write timeout: context deadline exceeded"}
livekit-1 | ERROR livekit service/signal.go:186 could not handle new participant
{"room":"sporting-c_0407014153277","participant":"7f0e7c25-6097-42e4-86c3-901d6e6f4f72","connID":"CO_gCxdsGVcZZSc","error":"could not restart participant"}
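To keep the failure modes apart while triaging, we bucket the log lines by subsystem. This is just a sketch; the match strings are taken verbatim from the logs above, and the bucket names are our own labels, not LiveKit terminology:

```typescript
// Triage helper: bucket server/proxy log lines by the subsystem they point at.
// Match strings are copied verbatim from the logs in this report.
type LogBucket = "signal" | "proxy-tls" | "dtls-datachannel" | "other";

function classifyLogLine(line: string): LogBucket {
  // Participant join/restart failures surface in the signal/routing path.
  if (
    line.includes("could not handle new participant") ||
    line.includes("could not restart participant")
  ) {
    return "signal";
  }
  // TLS record corruption reported by the caddy layer4 proxy.
  if (line.includes("tls: bad record MAC")) {
    return "proxy-tls";
  }
  // DTLS timeouts on the data channel transport.
  if (line.includes("dtls timeout")) {
    return "dtls-datachannel";
  }
  return "other";
}
```

Running the three error lines above through this gives one hit per bucket, which is why we are unsure whether the root cause is signal-level or transport-level.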
System Info
- Ubuntu 24.04
- Chrome (latest version)
- Deployment: Linux + Docker Compose, per the official documentation
Severity
annoyance
Additional Information
We also saw warnings like "could not get packet from bucket" and "received packet too old" on VP8 video tracks during the same bad periods. Notably, reported packet loss in the stats was 0 at the time, while jitter and PLI counts were high.
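The anomaly above can be stated as a simple predicate over inbound video stats. The field names mirror WebRTC RTCInboundRtpStreamStats (packetsLost, jitter, pliCount); the thresholds are arbitrary illustration values we have not tuned, not values from LiveKit:

```typescript
// Sketch of the anomaly observed on VP8 tracks: reported loss stays at 0
// while jitter and/or PLI counts climb. Thresholds are illustrative only.
interface InboundVideoStats {
  packetsLost: number; // cumulative lost packets as reported
  jitter: number;      // seconds, per RTCInboundRtpStreamStats
  pliCount: number;    // cumulative Picture Loss Indications sent
}

function looksLikeHiddenLoss(
  s: InboundVideoStats,
  jitterThreshold = 0.1, // hypothetical cutoff, not a tuned value
  pliThreshold = 50      // hypothetical cutoff, not a tuned value
): boolean {
  return s.packetsLost === 0 && (s.jitter > jitterThreshold || s.pliCount > pliThreshold);
}
```

During the bad periods our sessions matched this predicate, which is why we suspect the loss is happening somewhere the receiver-side counters do not see (e.g. the proxy path) rather than on the last hop.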
Questions
- Is there any known behavior change in livekit-client-js 2.18.1 that could explain more fragile joins/restarts against livekit-server 1.9.12?
- Is 2.18.1 expected to be fully safe with server 1.9.12, especially behind caddyl4?
- Do the logs above suggest a known transport/signal issue rather than a pure client regression?
These errors start appearing a few seconds after the room is active. I don't know if it was a coincidence, but both rooms that had problems had more than 400 subscribers.