Skip to content

Commit 6f108fd

Browse files
harden reconnect behaviour (#1148)
### Before you submit your PR Make sure the following is true before submitting your PR: - [ ] I have read the [contributing guidelines](https://github.com/livekit/rust-sdks/blob/main/CONTRIBUTING.md) and validated that this PR will be accepted. - [ ] I have read and followed the principles regarding breaking changes, testing, and code quality. ### PR description Describe the changes in this PR. Explain what the PR is meant to solve and how to reproduce the issue in the first place. ### Breaking changes If this PR introduces breaking changes, list them here and document the rationale for introducing such a change. ### MSRV If the PR modifies the crate's MSRV (Minimum Supported Rust Version), document it here. ### Testing Ideally, unit test the code you add, but ensure you're not repeating existing test cases. Use as many already written scaffolding, utilities as possible; write your own, when needed. If external services, APIs, tokens are required (e.g., running an LK server instance), provide the necessary information. Make sure your tests perform useful, context-aware assertions and do not simply emulate "happy paths". ### Async We want the project to be runtime-agnostic, so please reuse what's already in [livekit-runtime](https://github.com/livekit/rust-sdks/blob/main/livekit-runtime/) and feel free to add anything missing. It's ok to use Tokio directly, when writing unit tests, if necessary. When testing, do not use artificial delays for the state to "catch up"; instead, respect the event flow and subscribe properly using channels or other mechanisms. --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
1 parent af369aa commit 6f108fd

14 files changed

Lines changed: 1736 additions & 64 deletions

File tree

.changeset/lukas_reconnect.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
---
2+
livekit: patch
3+
livekit-api: patch
4+
livekit-ffi: patch
5+
livekit-uniffi: patch
6+
---
7+
8+
harden reconnect behaviour - #1148 (@lukasIO)

Cargo.lock

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

livekit-api/src/signal_client/mod.rs

Lines changed: 38 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -215,18 +215,48 @@ impl SignalClient {
215215
if matches!(&err, SignalError::WsError(WsError::Http(e)) if e.status() != 403) {
216216
log::error!("unexpected signal error: {}", err.to_string());
217217
}
218-
let urls = RegionUrlProvider::fetch_region_urls(url, token).await?;
219-
let mut last_err = err;
220218

221-
for url in urls.iter() {
222-
log::info!("fallback connection to: {}", url);
223-
match SignalInner::connect(url, token, options.clone(), publisher_offer.clone())
224-
.await
219+
// Fetching region URLs is best-effort. `fetch_region_urls`
220+
// already returns an empty list for non-cloud (direct /
221+
// self-hosted) URLs, so those skip the fallback entirely. If the
222+
// fetch itself fails (e.g. the region endpoint is unreachable),
223+
// that must NOT be fatal: log a warning and fall back to the
224+
// original connection error rather than masking it with the
225+
// fetch error.
226+
let urls = match RegionUrlProvider::fetch_region_urls(url, token).await {
227+
Ok(urls) => urls,
228+
Err(region_err) => {
229+
log::warn!(
230+
"failed to fetch region urls: {region_err}; surfacing original connection error"
231+
);
232+
return Err(err);
233+
}
234+
};
235+
236+
// With no region URLs to try, this surfaces the original error.
237+
// Otherwise we keep the most recent region attempt error, so that
238+
// if every region fails the caller sees why the last region
239+
// connection failed.
240+
let mut last_err = err;
241+
for region_url in urls.iter() {
242+
log::info!("fallback connection to: {}", region_url);
243+
match SignalInner::connect(
244+
region_url,
245+
token,
246+
options.clone(),
247+
publisher_offer.clone(),
248+
)
249+
.await
225250
{
226251
Ok((inner, join_response, stream_events)) => {
227252
return Ok(handle_success(inner, join_response, stream_events))
228253
}
229-
Err(err) => last_err = err,
254+
Err(region_conn_err) => {
255+
// This region is unreachable; drop it from the cache
256+
// so the next attempt doesn't hand it out again.
257+
RegionUrlProvider::mark_failed(url, region_url);
258+
last_err = region_conn_err;
259+
}
230260
}
231261
}
232262

@@ -1279,7 +1309,7 @@ mod tests {
12791309
let endpoint = format!("http://127.0.0.1:{}/settings/regions", addr.port());
12801310
let result = region::fetch_from_endpoint(&endpoint, "fake-token").await;
12811311

1282-
let urls = result.unwrap();
1312+
let (urls, _max_age) = result.unwrap();
12831313
assert_eq!(
12841314
urls,
12851315
vec![

0 commit comments

Comments
 (0)