Skip to content

fix: Start listening after schema cache load#4880

Open
mkleczek wants to merge 1 commit into
PostgREST:mainfrom
mkleczek:push-tynrmqwlwwus
Open

fix: Start listening after schema cache load#4880
mkleczek wants to merge 1 commit into
PostgREST:mainfrom
mkleczek:push-tynrmqwlwwus

Conversation

@mkleczek
Copy link
Copy Markdown
Collaborator

@mkleczek mkleczek commented May 5, 2026

DISCLAIMER:
This commit was authored entirely by a human without the assistance of LLMs.

This change ensures PostgREST starts listening on a server socket only after it loaded the schema cache and is ready to handle requests. It is no longer going to return 503 errors during startup until the schema cache is loaded.

Comment thread CHANGELOG.md Outdated
Comment thread src/PostgREST/Admin.hs
Comment thread src/PostgREST/App.hs
Comment on lines +99 to +100
mainSocket <- initServerSocket conf
atomicWriteIORef mainSocketRef $ Just mainSocket
Copy link
Copy Markdown
Member

@steve-chavez steve-chavez May 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Can this change help lift the limitation we have with the --ready check? (#4269 (comment)) i.e. can the admin server know the address that is resolved for the API server when using special hosts? (like !4)

@steve-chavez
Copy link
Copy Markdown
Member

steve-chavez commented May 5, 2026

Previous discussion on the motivation of the change on #4703 (comment).

@steve-chavez steve-chavez requested a review from laurenceisla May 5, 2026 16:38
@steve-chavez
Copy link
Copy Markdown
Member

@mkleczek As per #4703 (comment), this would clearly benefit the case of SO_REUSEPORT given 2 PostgREST instances running.

But let's consider the scenario of a single instance managed by systemd behind a proxy:

  • The service restarts (for any reason, could be manual restart because somehow the schema cache failed reloading).
  • Right now our startup is fast (milliseconds) and we start responding with 503s. During this time clients will get the 503s with a clear error message that says we're "Retrying.." plus a Retry-After header.

With this change, now we'll not respond and clients will get a connection refused with no error message. And this state can last multiple seconds now that we wait for the scache to load.

So under this scenario, it looks like this new behavior is worse?


Thinking more what we need is zero-downtime restarts, which I guess is easier under this new behavior since we could rely on systemd socket activation?

@mkleczek
Copy link
Copy Markdown
Collaborator Author

mkleczek commented May 5, 2026

So under this scenario, it looks like this new behavior is worse?

Not really.

From the point of view of the client there is not much gain from these 503 errors comparing to some connect timeout or similar. The client has to handle connection issues anyway because there are many more cases when they can happen (for example the whole server might have crashed). In case of reverse proxies in front of PostgREST (ie. always) - the client will get some 50x anyway.
The only reasonable way for the client to handle network issues is to retry. Well behaving clients will use some exponential backoff with jitter retry policy to avoid overwhelming freshly started server (ie. to avoid thundering herd).
Retry-After is not very useful because it is not reliable. What's worse: if all clients retry according to this header then... boom - thundering herd - I would even say Retry-After is more harmful than good.

Comment thread test/io/test_io.py
Comment on lines -1788 to +1769
response = postgrest.session.get("/projects")
assert response.status_code == 503
with pytest.raises(requests.ConnectionError):
postgrest.session.get("/projects")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, the test description does not fit anymore.

But anyway, with the change in this PR, the test case does not make sense anymore at all, I think. A connection error does not even make it to PostgREST, so testing its non-effect is useless. I'd remove that test-case as well, if we decide to go on with this PR.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really sure about that - we want to show that schema cache is loaded once during startup.
We also verify the behavior during schema cache load (before it was 503 now it is connection error).
I also added another assertion that once schema cache is loaded we return 200 from /projects request.

Test description updated - it was indeed wrong.

@mkleczek mkleczek force-pushed the push-tynrmqwlwwus branch from 5bb48c5 to 99554cb Compare May 5, 2026 18:51
@steve-chavez
Copy link
Copy Markdown
Member

What's worse: if all clients retry according to this header then... boom - thundering herd - I would even say Retry-After is more harmful than good.

Thought about removing the Retry-After, but its docs say:

[...] Retry-After indicates the minimum time that the user agent is asked to wait

So it's a minimum, not exact time. I think it should be fine to be clear about this on the docs and recommend jitter.

@steve-chavez
Copy link
Copy Markdown
Member

@mkleczek The direction here is good, make sure to address the comments and then we can merge this.

@mkleczek mkleczek force-pushed the push-tynrmqwlwwus branch from 99554cb to 6a927aa Compare May 7, 2026 08:45
@mkleczek mkleczek marked this pull request as draft May 8, 2026 06:05
@mkleczek
Copy link
Copy Markdown
Collaborator Author

mkleczek commented May 8, 2026

I am marking this PR as draft to address concerns related to handling schema cache loading errors during start-up.

It seems the right course of action cannot be any of these two extremes:

  • always return 503 during schema cache loading on startup
  • start listening only after successful schema cache load

The first one forces clients to handle normal conditions as errors.
The second one makes the clients unaware of errors that might happen during schema cache loading which makes diagnostics more difficult.

It seems like the best (ultimate?) startup sequence should be:

  1. Start admin server.
  2. Try to load schema cache once.
  3. Start listening on main socket
  4. If there was an error in 2, enter retry loop.

That way we achieve both:

  • happy path (ie. successful startup sequence) does not cause any error responses
  • errors are properly reported to clients

The above requires wider refactoring - today the whole schema cache loading loop is implemented in a single function without any means to introspect the state of the loading process. Clients can only trigger asynchronous schema load and wait for it to finish.
It makes it related to #4856, which in turn is a prerequisite to implement loading the schema cache using listener connection to fix #4842.

@steve-chavez WDYT?

@mkleczek mkleczek force-pushed the push-tynrmqwlwwus branch 2 times, most recently from f577ea6 to 3eed89b Compare May 8, 2026 13:24
@wolfgangwalther
Copy link
Copy Markdown
Member

  • Start admin server.
  • Try to load schema cache once.
  • Start listening on main socket
  • If there was an error in 2, enter retry loop.

I wrote up two different proposals but threw them away, because I always came to the conclusion that this is the sensible thing to do.

So 👍

@steve-chavez
Copy link
Copy Markdown
Member

It seems like the best (ultimate?) startup sequence should be:

Looks much better. Also 👍 from me.

@laurenceisla
Copy link
Copy Markdown
Member

  1. Try to load schema cache once.
  2. Start listening on main socket

So between these two steps, we'd still return the connection error, however after that we'd retry and get the 503. I agree with this.

@steve-chavez Not sure if it was discussed elsewhere, but this would mean that the proposal to wait until the schema cache is loaded on startup is no longer desired, right?

@steve-chavez
Copy link
Copy Markdown
Member

@laurenceisla The waiting is being discussed on #4873 (comment). #4129 won't be solved here.

@mkleczek mkleczek force-pushed the push-tynrmqwlwwus branch 3 times, most recently from e653c00 to 55c3e8c Compare May 12, 2026 05:01
@mkleczek mkleczek marked this pull request as ready for review May 12, 2026 06:07
@mkleczek mkleczek requested a review from steve-chavez May 12, 2026 06:07
@mkleczek
Copy link
Copy Markdown
Collaborator Author

It seems like the best (ultimate?) startup sequence should be:

  1. Start admin server.
  2. Try to load schema cache once.
  3. Start listening on main socket
  4. If there was an error in 2, enter retry loop.

That way we achieve both:

  • happy path (ie. successful startup sequence) does not cause any error responses
  • errors are properly reported to clients

Updated the code to implemented the above.

@mkleczek mkleczek force-pushed the push-tynrmqwlwwus branch from 55c3e8c to e8964e8 Compare May 12, 2026 06:11
Comment on lines 161 to -173
@@ -168,9 +171,6 @@ querySchemaCache conf@AppConfig{..} = do
tzones <- if configDbTimezoneEnabled
then sqlTimedStmt gucTzones mempty timezones
else pure S.empty
_ <-
let sleepCall = SQL.Statement "select pg_sleep($1 / 1000.0)" (param HE.int4) HD.noResult True in
for_ configInternalSCQuerySleep (`SQL.statement` sleepCall) -- only used for testing
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this moved here? Change looks like it should be done independently. I remember we had some tests relying on the fact this sleep was after all the schema cache queries.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is needed - otherwise test_log_error_when_schema_cache_load_error_on_startup_to_stderr fails. The reason is that this test sets two env variables: PGRST_INTERNAL_SCHEMA_CACHE_QUERY_SLEEP=1000 and PGRST_DB_SCHEMAS=non_existent_schema_aaaa (ie. we want to fail schema cache query but after waiting 1s.
If waiting is after queries then it fails immediately.

Copy link
Copy Markdown
Member

@steve-chavez steve-chavez May 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I remember some tests depend on this sleep being last, so if we change it those tests would be ineffective and misleading. We should do a git blame for finding those tests and tune them (or remove) if necessary.

Or as an alternative, introduce another sleep before the schema cache queries and leave this one in place.

Copy link
Copy Markdown
Collaborator Author

@mkleczek mkleczek May 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I've searched through the tests for usage of this variable and cannot see anything that would require it after queries. It would only be relevant for tests that don't want to wait in case of errors in queries - but we don't have such tests, I think.

Copy link
Copy Markdown
Member

@steve-chavez steve-chavez May 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, the original test was modified on 92ac7e5. This leaves us with this current test:

postgrest/test/io/test_io.py

Lines 1336 to 1367 in ed0d9dc

def test_schema_cache_concurrent_notifications(slow_schema_cache_env):
"schema cache should be up-to-date whenever a notification is sent while another reload is in progress, see https://github.com/PostgREST/postgrest/issues/2791"
internal_sleep = (
int(slow_schema_cache_env["PGRST_INTERNAL_SCHEMA_CACHE_QUERY_SLEEP"]) / 1000
)
with run(env=slow_schema_cache_env, wait_for_readiness=False) as postgrest:
time.sleep(2 * internal_sleep + 0.1) # wait for readiness manually
# first request, create a function and set a schema cache reload in progress
response = postgrest.session.post("/rpc/create_function")
assert response.text == ""
assert response.status_code == 204
time.sleep(
internal_sleep / 2
) # wait to be inside the schema cache reload process
# second request, change the same function and do another schema cache reload
response = postgrest.session.post("/rpc/migrate_function")
assert response.text == ""
assert response.status_code == 204
time.sleep(
2 * internal_sleep
) # wait enough time to get the final schema cache state
# confirm the schema cache is up-to-date and the 2nd reload wasn't lost
response = postgrest.session.get("/rpc/mult_them?c=3&d=4")
assert response.text == "12"
assert response.status_code == 200

Which IIRC it's kind of tricky because it involves concurrent notifications.

@mkleczek Can you ensure this test is still effective? I would also be fine if it's modified to not depend on this sleep if possible (if done should be another PR).

Comment thread CHANGELOG.md
@mkleczek mkleczek force-pushed the push-tynrmqwlwwus branch 3 times, most recently from 9c21cbc to 93d2726 Compare May 12, 2026 19:00
Comment thread test/io/test_io.py Outdated
@mkleczek mkleczek force-pushed the push-tynrmqwlwwus branch from 93d2726 to 0077e6f Compare May 12, 2026 19:40
Comment thread test/io/test_io.py
Comment thread src/PostgREST/AppState.hs
Comment thread test/io/test_io.py
@mkleczek mkleczek force-pushed the push-tynrmqwlwwus branch from 0077e6f to 37e2338 Compare May 13, 2026 06:22
This change ensures PostgREST starts listening on a server socket only after it loaded the schema cache and is ready to handle requests. It is no longer going to return 503 errors during startup until the schema cache is loaded.
@mkleczek mkleczek force-pushed the push-tynrmqwlwwus branch from 37e2338 to 9016fd8 Compare May 13, 2026 09:19
Comment thread CHANGELOG.md
Comment on lines 24 to +38
@@ -33,6 +35,7 @@ All notable changes to this project will be documented in this file. From versio
- Build the minimal docker image for aarch64-linux by @wolfgangwalther in #4193
- The name of an embedded table can no longer be used in filters if it has an alias by @laurenceisla in #4075
+ e.g. `?select=alias:table(*)&table.id=eq.1` is not possible anymore, use `?select=alias:table(*)&alias.id=eq.1` instead.
- Do not listen on main socket until schema cache is loaded by @mkleczek in #4880
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, both these entries should be switched places. We've always put the more descriptive change in Changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

4 participants