
Feature/server outage screen #3199

Open
MaartenD wants to merge 7 commits into lichess-org:main from MaartenD:feature/server-outage-screen

Conversation

@MaartenD
Contributor

Summary: Server outage detection via WebSocket

Below is a description of what changed from the previous version. See comment

What changed
The previous implementation in server_status.dart sent an HTTP HEAD request to lichessUri('/') every 30 seconds to check if the server was reachable. This was unnecessary overhead. The app already maintains a permanent WebSocket connection to lichess that continuously monitors server health through a ping/pong protocol.

New approach
The polling timer has been replaced by a listener on the existing WebSocket connection. The socket pool tracks connection health via averageLag; a value of Duration.zero means the socket is not connected.

When averageLag drops to zero, a 30-second timer starts. If the connection is restored before the timer fires, the timer is cancelled and the server stays marked as online. If the timer fires, the server is marked as offline and the outage screen is shown. When the connection is restored, the server is immediately marked as online again.
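
The timer behaviour described above can be sketched in plain Dart. This is only an illustration under stated assumptions, not the code in this PR: OutageDebouncer, onLagChanged and hasPendingTimer are hypothetical names, and the real implementation lives in server_status.dart and listens to the socket pool.

```dart
import 'dart:async';

/// Hypothetical helper (not from the PR): marks the server offline only
/// after the socket has stayed disconnected for a full grace period.
class OutageDebouncer {
  OutageDebouncer({this.gracePeriod = const Duration(seconds: 30)});

  final Duration gracePeriod;
  Timer? _timer;
  bool _isOnline = true;

  bool get isOnline => _isOnline;
  bool get hasPendingTimer => _timer?.isActive ?? false;

  /// Feed every new average-lag reading here. Duration.zero is taken to
  /// mean "socket not connected", as described in the PR.
  void onLagChanged(Duration averageLag) {
    if (averageLag == Duration.zero) {
      // Start the countdown once; repeated zero readings do not restart it.
      _timer ??= Timer(gracePeriod, () {
        _isOnline = false;
        _timer = null;
      });
    } else {
      // Connection restored: cancel any countdown and go back online at once.
      _timer?.cancel();
      _timer = null;
      _isOnline = true;
    }
  }

  /// Cancels a pending countdown so no callback fires after disposal.
  void dispose() {
    _timer?.cancel();
    _timer = null;
  }
}
```

Keeping the countdown in one small class makes both cases from the description easy to check: a zero reading leaves the state online with a pending timer, and any non-zero reading clears the timer without ever flipping to offline.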

Better.messaging.to.communicate.server.outage_ws.mp4

I tested it locally by stopping my lila_ws-1 Docker container process. When the outage screen was displayed, I started the process again.

Fixes #1016

@HaonRekcef
Collaborator

@MaartenD There are offline features in the app, so the "Lichess is down" indicator should be less invasive in the UI.

How do we make sure to distinguish between the server actually being down and the player simply being offline or experiencing network issues?

@MaartenD
Contributor Author

@HaonRekcef I missed that one. I will do some research and let you know.

@MaartenD
Contributor Author

@HaonRekcef Is the Over the board game an offline feature? Are there more?

@HaonRekcef
Collaborator

@MaartenD There are multiple. You can disable the network on your device and see which buttons stay interactive and are not greyed out.

@MaartenD
Contributor Author

MaartenD commented May 16, 2026

@HaonRekcef What about this version? This is with the device in flight mode.

Better.messaging.to.communicate.server.outage_ws_with_offline.mp4

In the video you will see that Puzzle Themes is available, but when you click on it you can't select anything. I tested this on my iPhone against the production version, and the behaviour is the same there. In my opinion this isn't correct, but I'm not sure.

I will upload a video of the case where the WebSocket connection isn't working later today or tomorrow. I need to figure some things out first.

@MaartenD
Contributor Author

MaartenD commented May 17, 2026

Behaviour during outage

Below is what works so far. It's still a work in progress, but before I continue I would like some answers about my approach (see Question below).

Video when the WebSocket isn't available

Better.messaging.to.communicate.server.outage_ws_with_offline_wsgone.mp4
  • Home tab: outage screen is shown, the Play button remains accessible
  • Puzzles, Learn, More: accessible and functional as normal (offline features work)
  • Watch tab: shows the outage screen instead of "No internet connection" (offline)
  • Play button: The options that require a server connection (such as Create lobby game, Challenge a friend, Correspondence and Arena tournaments) were already disabled when the network was unavailable. By replacing onlineStatusProvider with lichessOnlineProvider in the Play button components, these options are now also correctly disabled during a server outage

A new provider in lib/src/network/lichess_online.dart combines both checks:

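// True only when the device has a network connection and the lichess server is reachable.
final lichessOnlineProvider = Provider.autoDispose<bool>((ref) {
  // While the connectivity state has no value yet, assume offline.
  final isNetworkOnline = ref.watch(onlineStatusProvider).value ?? false;
  final isServerReachable = ref.watch(serverStatusProvider);
  return isNetworkOnline && isServerReachable;
});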

Currently applied to play_menu.dart, quick_game_matrix.dart and create_game_widget.dart. There are other places in the codebase that still use onlineStatusProvider directly; these would benefit from the same treatment.

Question: to fully implement this feature, I would like to know if you agree with my approach. Additionally, would you prefer lichessOnlineProvider to live in connectivity.dart alongside onlineStatusProvider rather than in a separate file?

Tests added / updated

  • test/network/server_status_test.dart (new): two tests using fakeAsync, verifying that the 30-second timer fires after continuous disconnection and that the timer is cancelled when the connection is restored within 30 seconds
  • test/view/home/home_tab_screen_test.dart (updated):
      • outage page shown → extended to also verify that the Play button (FloatingActionButton) remains visible
      • New test: Watch tab shows ServerOutage when offline

@HaonRekcef
Collaborator

Hi @MaartenD, thanks for the work on this!
Correct me if I am wrong, but looking at the code, it seems the "Lichess is undergoing Maintenance" screen will still show up if the user simply loses their local internet connection (at least on the Watch tab).
A local network drop is far more common than an actual server outage.
I think the cleanest way to handle this is to introduce an enum for the connection state rather than relying on booleans:

enum ConnectionStatus {
  online,
  networkDown,
  serverDown,
} 

To answer your question:
> Question: to fully implement this feature i would like to know if you agree with my approach. Additionally, would you prefer lichessOnlineProvider to live in connectivity.dart alongside onlineStatusProvider rather than in a separate file?

I am not the final authority, but I would say it doesn't matter much as long as the code is well written and works. Personally, I would lean towards putting it in the same file.

@MaartenD
Contributor Author

Hi @HaonRekcef,

Thanks for your reply and the great suggestion regarding the ConnectionStatus enum. I was also leaning towards putting everything in the same file, so I will take that route.

@MaartenD
Contributor Author

Reason for change
The initial implementation used two separate booleans (onlineStatusProvider and serverStatusProvider) to determine whether the app could reach lichess. This made it impossible to distinguish between a local network drop and an actual server outage. Both situations were treated the same way. @HaonRekcef correctly pointed out that showing "Lichess is undergoing technical difficulties" when the user simply has no internet is misleading.

New approach: ConnectionStatus enum
Added to connectivity.dart:

enum ConnectionStatus {
  online,       // network available and server reachable
  networkDown,  // no network connection
  serverDown,   // network available but lichess server unreachable
}
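
As a rough illustration of how the two existing checks could collapse into this enum, here is a minimal sketch in plain Dart. The function resolveConnectionStatus and its parameter names are hypothetical and not taken from the PR; the actual code in connectivity.dart presumably wires this through providers. The enum is repeated only so the snippet stands on its own.

```dart
enum ConnectionStatus {
  online, // network available and server reachable
  networkDown, // no network connection
  serverDown, // network available but lichess server unreachable
}

/// Hypothetical helper: derives the status from the two checks.
/// The network check wins, so a device without connectivity is never
/// reported as a server outage.
ConnectionStatus resolveConnectionStatus({
  required bool isNetworkOnline,
  required bool isServerReachable,
}) {
  if (!isNetworkOnline) return ConnectionStatus.networkDown;
  if (!isServerReachable) return ConnectionStatus.serverDown;
  return ConnectionStatus.online;
}
```

Checking the network first is the design choice that matters here: with no network the server is necessarily unreachable too, so the order of the two checks decides which message the user sees.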

Behaviour per status

| Status | Home tab | Watch tab | Play button online options |
| --- | --- | --- | --- |
| online | normal | normal | enabled |
| networkDown | normal + offline banner | "No internet connection." | disabled |
| serverDown | outage screen | outage screen | disabled |
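
The per-status behaviour above can also be written as an exhaustive switch, which lets the compiler flag any status that is left unhandled. This is a sketch only, assuming Dart 3 (records and switch expressions); TabBehaviour and behaviourFor are hypothetical names, and the strings simply mirror the descriptions above rather than real widgets.

```dart
enum ConnectionStatus { online, networkDown, serverDown }

/// Hypothetical record type describing what each part of the UI shows.
typedef TabBehaviour = ({
  String homeTab,
  String watchTab,
  bool playOnlineEnabled,
});

/// Maps each status to its behaviour. The switch has no default branch,
/// so adding a new enum value without handling it is a compile error.
TabBehaviour behaviourFor(ConnectionStatus status) => switch (status) {
      ConnectionStatus.online => (
          homeTab: 'normal',
          watchTab: 'normal',
          playOnlineEnabled: true,
        ),
      ConnectionStatus.networkDown => (
          homeTab: 'normal + offline banner',
          watchTab: 'No internet connection.',
          playOnlineEnabled: false,
        ),
      ConnectionStatus.serverDown => (
          homeTab: 'outage screen',
          watchTab: 'outage screen',
          playOnlineEnabled: false,
        ),
    };
```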

Tests

  • test/network/connectivity_test.dart (new): three unit tests covering all three enum values:
      1. Network available + server reachable → online
      2. Network unavailable → networkDown
      3. Network available + server unreachable → serverDown
  • test/view/home/home_tab_screen_test.dart (updated):
      1. Watch tab shows no internet message when network is down → verifies "No internet connection." text for networkDown
      2. Watch tab shows outage screen when server is down → verifies ServerOutage widget for serverDown

  • Final implementation of Offline scenario (no internet connection):
    Better.messaging.to.communicate.server.outage_ws_with_offline_II.mp4
  • Final implementation of Server outage scenario:
    Better.messaging.to.communicate.server.outage_ws_with_offline_wsgone_II.mp4

To be clear: given the scope of this PR, I used Claude as an AI assistant during development (final implementation).

@veloce
Contributor

veloce commented May 19, 2026

@MaartenD I have not read the comments, only the PR description (which I hope you updated to match the latest code).

I have not read the code either, but based on the description I don't see how this can work. How do you distinguish a server outage from a network disconnection? Even if the socket is disconnected for more than 30s, that does not mean the lichess WS server is down.

We certainly don't want to display a message indicating that the lichess server is down if that is not the case. And I don't see how you can know that by just monitoring the WS connection.

I am pretty sure this feature cannot be implemented as is, or am I missing something?

I invite you to reach out to the lichess server devs on Discord to see how this is implemented on the website.

Development

Successfully merging this pull request may close these issues.

Better messaging to communicate server outage

3 participants