fix: implement connection tracking in metrics by mkleczek · Pull Request #4672 · PostgREST/postgrest

mkleczek · 2026-02-25T18:39:21Z

DISCLAIMER:
This commit was authored entirely by a human without the assistance of LLMs.

Right now the metrics observation handler does not track database connections but updates a single Gauge based on HasqlPoolObs events.

This is problematic because the Hasql pool reports events for connections in various states, which makes it impossible to predict the state change from the received event alone. The connection state machine is not simple, and to precisely report the number of connections in each state, it is necessary to track their lifecycles.
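
To make the failure mode concrete, here is a minimal sketch of a gauge that is adjusted per event without remembering connection state. It is not the actual PostgREST handler: the `Status` type, its constructors, and the increment/decrement rules in `naiveStep` are assumptions made for illustration only.

```haskell
import Data.List (foldl')

-- Hypothetical connection statuses, standing in for pool observations.
data Status = Connecting | ReadyForUse | InUse | Terminated
  deriving (Eq, Show)

-- Naive handler: adjust one counter per event, with no memory of which
-- state the connection was in before the event arrived.
naiveStep :: Int -> Status -> Int
naiveStep n ReadyForUse = n + 1
naiveStep n InUse       = n - 1
naiveStep n Terminated  = n - 1
naiveStep n Connecting  = n

-- Replay a sequence of events against a gauge that starts at zero.
naiveAvailable :: [Status] -> Int
naiveAvailable = foldl' naiveStep 0

-- One connection that is terminated while it is still in use.
terminatedWhileInUse :: [Status]
terminatedWhileInUse = [Connecting, ReadyForUse, InUse, Terminated]

-- naiveAvailable terminatedWhileInUse == -1
```

Under these assumed rules the connection is subtracted twice, once when it is taken and once when it is terminated, so the gauge drops below zero. Whether the decrement on termination is correct depends on the previous state, which a single gauge cannot know.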

Fixes #4622

steve-chavez · 2026-02-25T22:57:17Z

> Fixes #4622

The fix should have an entry in the CHANGELOG.

steve-chavez · 2026-04-08T18:24:48Z

@mkleczek Since this is a fix, let's change the commit, PR title and add a fix entry in the CHANGELOG.

steve-chavez · 2026-04-14T16:01:15Z

@mkleczek Could you please address the feedback here? I've repeated the same twice now 😩

mkleczek · 2026-04-14T19:44:55Z

> @mkleczek Could you please address the feedback here? I've repeated the same twice now 😩

Done

mkleczek · 2026-05-14T07:31:25Z

@steve-chavez @wolfgangwalther - what do you think is required to push this forward?

#4622 is now being reported by our support team, and fixing it is becoming urgent.

wolfgangwalther · 2026-05-14T15:19:08Z

> what do you think is required to push this forward?

This does not introduce any new test infrastructure that I'd be opposed to, so I won't block on the long term vision of how our tests should be structured. It uses the existing infrastructure. I might not like the way the test is written, but I don't see a need to block on that either.

To be clear, the question in #4672 (comment) was asked to get a feeling of how things could be done differently, if we had better test infrastructure elsewhere - and not to block this PR's progress.

Imho the only previously blocking comment is #4672 (comment). Now, since I wrote that comment, I started a major discussion on how we should test in general, which is blocking the other, test-infra related, issues/PRs. I don't think we should hold this PR hostage to that either.

TLDR: No blockers for me.

steve-chavez · 2026-05-14T17:16:08Z

> what do you think is required to push this forward?

It's gonna be really confusing when we look back in history and we say we fix #4622 and there isn't a precise test proving it (the current test does not).

Let's not set a precedent here that can later hurt us with tech debt, so we should first clear the above thread.

wolfgangwalther · 2026-05-15T07:44:42Z

> It's gonna be really confusing when we look back in history and we say we fix #4622 and there isn't a precise test proving it (the current test does not).
>
> Let's not set a precedent here that can later hurt us with tech debt, so we should first clear the above thread.

The thread you linked is about the currently existing test, so that doesn't quite match the first sentence about a missing test. If you're concerned about a missing test, then we need to clear #4672 (comment).

I'd still say the situation now is different compared to when #1766 happened - we are actively working on improving the test situation and we have an open PR to track the addition of the herein-missing test. Thus, I'd say the risk of this getting forgotten is much smaller than earlier.

I'd say we should go ahead with this.

steve-chavez · 2026-05-15T20:34:57Z

@wolfgangwalther Let's not merge because I have a much simpler test almost ready for PR, let's merge this after that.

wolfgangwalther · 2026-05-15T20:37:13Z

No worries, I don't intend to merge. I just wanted to make my implicit approval explicit. I am well aware that you still have a thread open (this is now actually blocking the merge as well) - and I'm not just going to override you and resolve that thread. That's for you to decide :)

Right now the metrics observation handler does not track database connections but updates a single Gauge based on HasqlPoolObs events. This is problematic because the Hasql pool reports various connection events in multiple phases. The connection state machine is not simple, and to precisely report the number of connections in various states, it is necessary to track their lifecycles.

This change adds a ConnTrack data structure and logic to track database connection lifecycles. At the moment it supports precise "connected" and "inUse" connection counts. The "pgrst_db_pool_available" metric is implemented on top of ConnTrack instead of a simple Gauge.
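
The idea of deriving counts from tracked lifecycles can be sketched as follows. This is an illustration under stated assumptions, not the ConnTrack implementation from this PR: the `ConnState` and `ConnEvent` types, the `Int` connection identifiers, and the definition of `available` as "connected minus in use" are all hypothetical.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (foldl')

-- Hypothetical per-connection states; the names are illustrative only.
data ConnState = Pending | Idle | Busy
  deriving (Eq, Show)

-- Hypothetical events, each keyed by a connection identifier.
data ConnEvent
  = ConnStarted Int
  | ConnReady Int
  | ConnTaken Int
  | ConnClosed Int

-- The tracker remembers the last known state of every live connection.
newtype ConnTrack = ConnTrack (Map.Map Int ConnState)

emptyTrack :: ConnTrack
emptyTrack = ConnTrack Map.empty

-- Each event overwrites or removes that connection's entry, so repeated
-- or unexpected events cannot skew the counts.
applyEvent :: ConnTrack -> ConnEvent -> ConnTrack
applyEvent (ConnTrack m) ev = ConnTrack $ case ev of
  ConnStarted c -> Map.insert c Pending m
  ConnReady c   -> Map.insert c Idle m
  ConnTaken c   -> Map.insert c Busy m
  ConnClosed c  -> Map.delete c m

countIn :: ConnState -> ConnTrack -> Int
countIn s (ConnTrack m) = Map.size (Map.filter (== s) m)

-- "connected" covers every established connection, idle or in use.
connected :: ConnTrack -> Int
connected t = countIn Idle t + countIn Busy t

inUse :: ConnTrack -> Int
inUse = countIn Busy

-- Assumed definition of availability: established but not in use.
available :: ConnTrack -> Int
available t = connected t - inUse t
```

Because every count is computed from the current map rather than accumulated from increments and decrements, `available` cannot go negative in this sketch: in-use connections are always a subset of connected ones. For example, `foldl' applyEvent emptyTrack [ConnStarted 1, ConnReady 1, ConnTaken 1, ConnClosed 1]` leaves all three counts at 0.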

steve-chavez · 2026-05-18T23:39:57Z

 ### Fixed

 - Fix unnecessary connection pool flushes during schema cache reloading by @mkleczek in #4645
+- Fix race condition in pool_available metric causing negative values during network instability by @mkleczek in #4622


@mkleczek This was added to the Fixed section of an old version: https://github.com/PostgREST/postgrest/blob/main/CHANGELOG.md#fixed-2 😕

mkleczek · 2026-05-19

Ehh... rebasing the changelog is inherently tricky. My bad.

Raised #4942

steve-chavez · 2026-05-19T14:49:15Z

I think we should backport this, it will require:

mkleczek · 2026-05-19T15:54:52Z

> I think we should backport this, it will require:
>
> - 70327cf
> - 6220ab3
> - 1eba985

Hmm... most probably the test in the observability test suite is a problem for backporting.

Maybe we should split it into 2 PRs then (separate the test in MetricsSpec)?

steve-chavez · 2026-05-19T15:57:19Z

> Maybe we should split it into 2 PRs then (separate the test in MetricsSpec)?

Yup, sounds good.

taimoorzaeem · 2026-05-20T07:05:03Z

I think we just follow the order in the git log. It involves 5 commits in order:

Let me try doing this in 1 PR, so it's easier to revert, just in case.

This was referenced Feb 25, 2026

refactor(test): provide means to validate metrics and observations #4671

Merged

fix: Limit concurrent schema cache loads #4643

Draft

steve-chavez reviewed Feb 25, 2026

Comment thread src/PostgREST/Metrics.hs Outdated

steve-chavez reviewed Feb 25, 2026

Comment thread src/PostgREST/Metrics.hs

mkleczek force-pushed the refactor/connection-tracking branch 5 times, most recently from 0f475e3 to 592e5a3 Compare March 10, 2026 14:53

mkleczek force-pushed the refactor/connection-tracking branch 2 times, most recently from 1f3b4bc to bce1f6e Compare March 13, 2026 06:49

mkleczek force-pushed the refactor/connection-tracking branch 8 times, most recently from 7383b59 to 9b3c004 Compare April 2, 2026 14:45

mkleczek force-pushed the refactor/connection-tracking branch from 9b3c004 to 6af7b30 Compare April 8, 2026 12:33

steve-chavez reviewed Apr 8, 2026

Comment thread test/observability/Observation/MetricsSpec.hs

mkleczek force-pushed the refactor/connection-tracking branch 5 times, most recently from 27d9a1d to e438eed Compare April 14, 2026 07:02

wolfgangwalther reviewed May 5, 2026

Comment thread test/observability/Observation/MetricsSpec.hs

mkleczek force-pushed the refactor/connection-tracking branch 4 times, most recently from 90134d4 to c196a70 Compare May 13, 2026 13:21

mkleczek force-pushed the refactor/connection-tracking branch 3 times, most recently from c389f35 to 202f84e Compare May 14, 2026 08:30

mkleczek force-pushed the refactor/connection-tracking branch from 202f84e to 8e12a9a Compare May 15, 2026 07:59

wolfgangwalther approved these changes May 15, 2026

steve-chavez requested changes May 15, 2026

mkleczek force-pushed the refactor/connection-tracking branch from 8e12a9a to bf18e9a Compare May 16, 2026 05:25

wolfgangwalther mentioned this pull request May 16, 2026

test: negative pgrst_db_pool_available in metrics #4928

Merged

mkleczek force-pushed the refactor/connection-tracking branch from bf18e9a to 7785329 Compare May 18, 2026 12:16

mkleczek requested a review from steve-chavez May 18, 2026 12:25

steve-chavez approved these changes May 18, 2026

steve-chavez merged commit a297391 into PostgREST:main May 18, 2026
33 checks passed

steve-chavez reviewed May 18, 2026

taimoorzaeem mentioned this pull request May 20, 2026

v14: fix: implement connection tracking in metrics #4945

Merged

4 participants
