Skip to content

Latest commit

 

History

History
347 lines (232 loc) · 9.59 KB

File metadata and controls

347 lines (232 loc) · 9.59 KB

Section 16 — Transactions

What you'll practice: BEGIN, COMMIT, ROLLBACK, savepoints, and isolation levels. Some exercises in this section ask you to open two psql sessions side-by-side.

Mental model

A transaction is a group of statements that either all take effect or none do. This is Postgres' version of "all-or-nothing" and it's the foundation of every bank transfer, every order checkout, every "don't half-create the row" story you've ever heard.

  • BEGIN — start a transaction
  • COMMIT — make it permanent
  • ROLLBACK — throw away everything since BEGIN
  • SAVEPOINT — a named point inside a transaction you can roll back to without ending the whole transaction

Learning objectives

  • Use BEGIN / COMMIT to bundle two statements into one atomic unit
  • Use ROLLBACK to undo mistakes
  • Create and roll back to savepoints
  • Observe isolation: what one transaction sees of another
  • Set an isolation level (READ COMMITTED, REPEATABLE READ, SERIALIZABLE)
  • Hold a row lock with SELECT ... FOR UPDATE

How to run these exercises

psql -U postgres -d sql_exercise -f 04-advanced-databases/16-transactions/schema.sql
psql -U postgres -d sql_exercise -f 04-advanced-databases/16-transactions/seed.sql
psql -U postgres -d sql_exercise

Inside psql:

SET search_path TO adv_tx;

Heads up: exercises 9 – 12 need you to open a second terminal and run psql -U postgres -d sql_exercise in it. The two sessions let you see what one transaction observes of another transaction's work.


Exercises

Exercise 1 — Your first explicit transaction

Move £200 (20000 cents) from Ada (id 1) to Alan (id 2). Do it inside an explicit transaction and COMMIT.

BEGIN;
UPDATE account SET balance = balance - 20000 WHERE id = 1;
UPDATE account SET balance = balance + 20000 WHERE id = 2;
INSERT INTO ledger (account_id, amount, description) VALUES (1, -20000, 'Transfer to Alan');
INSERT INTO ledger (account_id, amount, description) VALUES (2,  20000, 'Transfer from Ada');
COMMIT;

Verify: SELECT holder, balance FROM account; — Ada should have £800, Alan should have £700.


Exercise 2 — ROLLBACK on an error

Try to move £100,000 out of Grace (she only has £50).

BEGIN;
UPDATE account SET balance = balance - 10000000 WHERE id = 3;
-- ERROR: new row for relation "account" violates check constraint "account_balance_check"

Question: after the error, run SELECT * FROM account WHERE id = 3;. Did anything change? Now run ROLLBACK; explicitly and confirm the transaction is over.

Key fact: in PostgreSQL, once a statement in a transaction errors, the whole transaction is in an "aborted" state and can only be ROLLBACK'd. Every subsequent command until ROLLBACK gets: ERROR: current transaction is aborted, commands ignored until end of transaction block.


Exercise 3 — Deliberate ROLLBACK

Start a transaction, change Ada's balance to 0, then decide you didn't mean it and roll back.

BEGIN;
UPDATE account SET balance = 0 WHERE id = 1;
SELECT balance FROM account WHERE id = 1;  -- see 0 inside the tx
ROLLBACK;
SELECT balance FROM account WHERE id = 1;  -- back to whatever it was

Exercise 4 — Savepoints

Inside a single transaction, make two updates, create a savepoint between them, then roll back only the second one:

BEGIN;
UPDATE account SET balance = balance + 100 WHERE id = 1;
SAVEPOINT before_second;
UPDATE account SET balance = balance + 999999 WHERE id = 1;
ROLLBACK TO SAVEPOINT before_second;
-- Only the +100 survives
COMMIT;

Verify: Ada's balance increased by exactly 100, not 1,000,099.


Exercise 5 — SELECT inside a transaction

Start a transaction and run a SELECT that reads Ada and Alan's balances. Don't commit yet. What happens if another session (pretend for now) also wants to read them?

The short answer: SELECT without FOR UPDATE doesn't lock rows in Postgres. You'll see this more concretely in Exercise 9.


Exercise 6 — SELECT FOR UPDATE

Lock Ada's row inside a transaction:

BEGIN;
SELECT * FROM account WHERE id = 1 FOR UPDATE;
-- do some computation...
UPDATE account SET balance = balance - 5000 WHERE id = 1;
COMMIT;

Question: what does FOR UPDATE do that a plain SELECT doesn't?


Exercise 7 — Transfer function: a composable pattern

Imagine you're writing a real app. Write the "transfer £X from A to B" logic as a single multi-statement transaction that fails safely if either the withdraw or the deposit fails. (You're not writing a PL/pgSQL function yet — just the statements inside a BEGIN ... COMMIT block.)

Transfer £50 (5000 cents) from Margaret (id 4) to Grace (id 3).


Exercise 8 — Atomic transfer with a failing side

Now try to transfer £100 out of Grace (she doesn't have enough) to Margaret, inside a single transaction. Make sure Margaret does NOT get the money if Grace's withdrawal fails.

BEGIN;
UPDATE account SET balance = balance - 10000 WHERE id = 3;   -- this will fail
UPDATE account SET balance = balance + 10000 WHERE id = 4;
-- Because the transaction is aborted after the first error, this second
-- UPDATE never runs.
ROLLBACK;

Verify: neither balance changed.


Exercise 9 — Two sessions: observing READ COMMITTED

Open a second psql session now. In both sessions, run:

SET search_path TO adv_tx;

Session A (your first window):

BEGIN;
UPDATE account SET balance = balance - 10000 WHERE id = 1;
-- Don't commit yet

Session B (the new window):

SELECT balance FROM account WHERE id = 1;

Question: does Session B see the new balance (A's uncommitted change) or the old balance?

Answer preview: Session B sees the OLD balance. PostgreSQL's default isolation level is READ COMMITTED, which means you never see another transaction's uncommitted writes.

Now in Session A, run COMMIT;. Re-run the SELECT in Session B.


Exercise 10 — Non-repeatable read under READ COMMITTED

Session A:

BEGIN;
SELECT balance FROM account WHERE id = 1;  -- note this value

Session B:

BEGIN;
UPDATE account SET balance = balance + 1 WHERE id = 1;
COMMIT;

Session A (still open):

SELECT balance FROM account WHERE id = 1;  -- now it's different!
COMMIT;

Question: Session A ran two identical SELECTs inside one transaction and got different answers. What's this called? (Hint: "non-repeatable read".)


Exercise 11 — REPEATABLE READ prevents that

Session A:

BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM account WHERE id = 1;

Session B:

BEGIN;
UPDATE account SET balance = balance + 1 WHERE id = 1;
COMMIT;

Session A:

SELECT balance FROM account WHERE id = 1;  -- same as the first select!
COMMIT;
SELECT balance FROM account WHERE id = 1;  -- NOW you see the new value

Question: how does REPEATABLE READ manage that? (It takes a snapshot at BEGIN time.)


Exercise 12 — SERIALIZABLE conflict

SERIALIZABLE is the strictest level. If two transactions would conflict, one of them is aborted.

Session A:

BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT SUM(balance) FROM account;
-- Pretend we use that sum to decide something, then:
UPDATE account SET balance = balance - 100 WHERE id = 1;

Session B (before Session A commits):

BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT SUM(balance) FROM account;
UPDATE account SET balance = balance - 100 WHERE id = 2;
COMMIT;

Session A:

COMMIT;  -- One of them (probably Session A) gets:
-- ERROR: could not serialize access due to read/write dependencies

Lesson: under SERIALIZABLE, if Postgres can't prove the transactions ran "as if" they were one after the other, it kills one. The app is expected to retry.


Exercise 13 — Tiny deadlock demo

Two sessions acquire two row locks in the opposite order.

Session A:

BEGIN;
UPDATE account SET balance = balance + 1 WHERE id = 1;

Session B:

BEGIN;
UPDATE account SET balance = balance + 1 WHERE id = 2;

Session A:

UPDATE account SET balance = balance + 1 WHERE id = 2;   -- blocks waiting on B

Session B:

UPDATE account SET balance = balance + 1 WHERE id = 1;   -- deadlock!

Postgres detects the deadlock and kills one transaction with: ERROR: deadlock detected.

Lesson: always acquire locks in a consistent order to avoid deadlocks.


Exercise 14 — DDL inside a transaction

Most databases can't do this but Postgres can. Wrap a CREATE TABLE and a DROP TABLE in a transaction and ROLLBACK:

BEGIN;
CREATE TABLE scratch (id INT);
INSERT INTO scratch VALUES (1), (2), (3);
SELECT * FROM scratch;
ROLLBACK;
SELECT * FROM scratch;  -- ERROR: relation "scratch" does not exist

Question: why is "transactional DDL" important for migrations?


Bonus challenges

Bonus 1 — RETURNING inside a transaction

Rewrite the transfer from Exercise 7 so the UPDATEs use RETURNING balance and you can print the new balances at the end of the transaction.

Bonus 2 — DO block

Read \h DO. Wrap the Exercise 7 transfer in an anonymous PL/pgSQL DO block. (This is a preview of the next section.)

Bonus 3 — Idle transaction detection

Run a BEGIN and don't commit. Open another psql and run SELECT pid, state, query FROM pg_stat_activity WHERE state = 'idle in transaction';. What does that tell you? Why is "idle in transaction" an antipattern to watch for in production?


Cleanup

DROP SCHEMA IF EXISTS adv_tx CASCADE;