Skip to content

Keep connections alive after BEGIN statement for non-FATAL errors#2589

Open
VashDD wants to merge 6 commits into
jackc:masterfrom
VashDD:begin-no-die-on-recoverable-error
Open

Keep connections alive after BEGIN statement for non-FATAL errors#2589
VashDD wants to merge 6 commits into
jackc:masterfrom
VashDD:begin-no-die-on-recoverable-error

Conversation

@VashDD

@VashDD VashDD commented Jun 23, 2026

Copy link
Copy Markdown

Problem

For more context, see: #2583

Conn.BeginTx calls c.die() on any error from BEGIN, unconditionally destroying the connection:

_, err := c.Exec(ctx, txOptions.beginSQL())
if err != nil {
    c.die()
    return nil, err
}

The assumption ("begin should never fail unless the connection is broken") is true for connecting directly to postgres, but with a connection pooler in between, there are errors that should not kill the connection, as it is reusable, for instance errors due to load shedding.

Change

Only die() when the connection is actually compromised due to:

  • non-PgError (transport error / context cancellation)
  • FATAL/PANIC PgError (server tearing the session down)

Tests

  • TestBeginTxNonFatalErrorKeepsConnAlive — a BEGIN failing with a non-FATAL error leaves IsClosed() == false and the connection still answers select 1.
  • TestBeginTxFatalErrorKillsConn — a BEGIN that terminates its own backend (pg_terminate_backend(pg_backend_pid())) returns a FATAL error and the connection is closed.
  • pgxpool.TestConnReleaseWhenBeginFail — updated: a recoverable BEGIN failure now releases the connection back to the pool alive (TotalConns stays 1, reusable) instead of destroying it.
  • pgxpool.TestConnDestroyedWhenBeginFailsFatally — a BEGIN that terminates its own backend fails fatally, so the pool destroys the connection (TotalConns drops to 0) instead of reusing it.

Conn.BeginTx called c.die() on any BEGIN error, unconditionally destroying
the connection. That is wrong when the server returns a normal (non-FATAL)
ErrorResponse followed by ReadyForQuery, leaving the connection reusable:

  - a pooler/proxy shedding load on BEGIN with SQLSTATE 53000, which keeps the
    connection alive; destroying it amplifies load under retry-heavy clients.
  - a malformed custom BeginQuery / invalid isolation level, which fails with a
    plain syntax error and leaves the session intact.

Only die() when the connection is actually compromised: a non-PgError
(transport error / context cancellation) or a FATAL/PANIC server error.
Otherwise return the error and keep the connection.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@VashDD VashDD changed the title fix(tx): keep the connection on a recoverable BEGIN error Keep connections alive on a recoverable BEGIN error Jun 23, 2026
@VashDD VashDD changed the title Keep connections alive on a recoverable BEGIN error Keep connections alive after BEGIN statement for non-FATAL errors Jun 23, 2026
VashDD and others added 3 commits June 23, 2026 11:43
Complements TestConnReleaseWhenBeginFail (recoverable error -> conn kept) with
the fatal case: a BEGIN that terminates its own backend fails fatally, so the
pool destroys the connection (TotalConns -> 0) instead of reusing it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread pgxpool/pool_test.go Outdated
assert.NoError(t, err)
}

assert.EqualValues(t, 1, db.Stat().TotalConns())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this a requre?

Comment thread pgxpool/pool_test.go Outdated
defer db.Close()

tx, err := db.BeginTx(ctx, pgx.TxOptions{BeginQuery: "select pg_terminate_backend(pg_backend_pid())"})
assert.Error(t, err)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this a requre?

Comment thread pgxpool/pool_test.go Outdated
assert.Error(t, err)
if !assert.Zero(t, tx) {
err := tx.Rollback(ctx)
assert.NoError(t, err)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

require?

Comment thread tx.go
if severity == "" {
severity = pgErr.Severity
}
return severity == "FATAL" || severity == "PANIC"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably unrelated to this PR, but I wish these magic strings were constants that could be traced through the source.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I don't mind doing a follow up for that, but should take part in a different PR.

Address review: switch the begin-failure preconditions to require, and
simplify the fatal pool test to require.Error + require.Zero (FailNow still
runs deferred db.Close, so no cleanup is leaked).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@VashDD

VashDD commented Jun 25, 2026

Copy link
Copy Markdown
Author

Thanks for the review, addressed comments, ready for another pass.

Mirror TestConnDestroyedWhenBeginFailsFatally: require.Error + require.Zero
instead of the assert-guarded defensive rollback (FailNow still runs the
deferred db.Close, so nothing leaks).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants