Skip to content

Close read txn at request end (afterCompletion) + idle-in-xact safety net (#118)#58

Merged
jensens merged 7 commits into
mainfrom
fix/idle-in-xact
Apr 13, 2026
Merged

Close read txn at request end (afterCompletion) + idle-in-xact safety net (#118)#58
jensens merged 7 commits into
mainfrom
fix/idle-in-xact

Conversation

@jensens
Copy link
Copy Markdown
Member

@jensens jensens commented Apr 13, 2026

Summary

Stops `PGJsonbStorageInstance` leaking REPEATABLE READ transactions across requests, so they no longer block `CREATE INDEX CONCURRENTLY` and similar maintenance operations.

  • `afterCompletion` on `PGJsonbStorageInstance` (root-cause fix). ZODB calls this hook on every transaction commit/abort and on `Connection.close()`. Previously the class did not define it, so the hook was a silent no-op and the read tx lingered until the connection was eventually evicted. Method is idempotent (no-op when no read tx is open) and never propagates exceptions (a killed conn is logged, not raised).
  • `idle_in_transaction_session_timeout = 60_000` on every connection the instance pool hands out — defense in depth for any future leak path (SIGKILL-ed worker, third-party plugin). Tunable via `ZODB_PGJSONB_IDLE_IN_XACT_TIMEOUT_MS` (set to `0` to disable).

Closes bluedynamics/plone-pgcatalog#118.

Why mid-request transactions don't appear to benefit

ZODB's `Connection.afterCompletion` calls `storage.afterCompletion()` then immediately calls `newTransaction()` → `poll_invalidations()` for the next implicit transaction — which re-opens the read tx. So intra-request, the read tx stays effectively open. The actual win is at `Connection.close()` (pool return / request end), where there's no follow-up `newTransaction`. End-to-end test pins this down via `pg_stat_activity` showing `state = 'idle'` after close.

Tests

  • 4 `afterCompletion` integration tests: closes-read-txn, idempotent, no-op-after-tpc-begin, safe-when-killed.
  • 1 end-to-end test: ZODB read-only transaction → `pg_stat_activity` shows `state = 'idle'` (not `idle in transaction`).
  • 3 idle-timeout tests: default 60_000 ms, env override, disabled-when-zero.

Full suite: 456 passed (was 449), no regressions.

Spec

`docs/superpowers/specs/2026-04-13-idle-in-xact-design.md` — full root-cause analysis, trade-off table (Fix 1 / 2 / 3 / 4 / E), test plan.

🤖 Generated with Claude Code

jensens and others added 6 commits April 13, 2026 12:51
…etion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… end (#118)

ZODB calls _storage.afterCompletion() after every transaction
commit/abort and on Connection.close().  Previously
PGJsonbStorageInstance did not define it, so the hook was a silent
no-op and the REPEATABLE READ snapshot opened by poll_invalidations()
persisted across requests.  Each lingering read tx held a virtualxid
that blocks CREATE INDEX CONCURRENTLY indefinitely under load.

The new method delegates to _end_read_txn() (idempotent) and never
propagates exceptions — a connection killed externally just gets
logged; the next operation rebuilds via the pool.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nn, end-to-end)

The end-to-end test verifies that after Connection.close(), the
backend's pg_stat_activity state is 'idle' (not 'idle in transaction')
— exactly the conditions that previously blocked CIC.

Note: mid-request transaction.commit/abort fires our afterCompletion,
but ZODB.Connection.afterCompletion immediately follows up with
newTransaction() → poll_invalidations() which re-opens the read tx
for the next implicit transaction.  The fix's actual win is at
Connection.close() (pool return / request end), which the end-to-end
test pins down.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…out (#118)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Defense in depth for any leaked virtualxid that bypasses
afterCompletion (SIGKILL-ed worker, third-party plugin, etc.).
Default 60_000 ms, env-overridable via ZODB_PGJSONB_IDLE_IN_XACT_TIMEOUT_MS.
Set to 0 to disable.

Replaces the inline `lambda` configure with a named helper that also
applies the timeout setting alongside autocommit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jensens added a commit to bluedynamics/plone-pgcatalog that referenced this pull request Apr 13, 2026
release_request_connection() now issues an explicit conn.rollback()
before pool.putconn(conn).  Pool conns aren't autocommit by default,
so any prior SELECT left an implicit transaction open, holding a
virtualxid that blocks CREATE INDEX CONCURRENTLY.

rollback() on a conn with no open txn is a cheap no-op; on autocommit
conns it's also a no-op.  Wrapped in contextlib.suppress so a dead
conn doesn't break pool return.

Companion to bluedynamics/zodb-pgjsonb#58 (storage-conn path).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jensens jensens merged commit a9dd2e3 into main Apr 13, 2026
5 checks passed
@jensens jensens deleted the fix/idle-in-xact branch April 13, 2026 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CREATE INDEX CONCURRENTLY blocks indefinitely: read connections held in REPEATABLE READ keep virtualxids alive

1 participant