guides/migration_anatomy.md
Lines changed: 2 additions & 101 deletions
@@ -250,58 +250,6 @@ end
250
250
>
251
251
> Be aware that these callbacks are not called when `@disable_ddl_transaction true` is configured because they rely on the transaction being present.
252
252
253
-
## Inspecting Locks In a Query
254
-
255
-
Before we dive into safer migration practices, we should cover how to check whether a migration could block your application. Postgres exposes a `pg_locks` view that lists the locks currently held in the system. If we query it inside the same transaction as the migration's changes, the result shows which locks those changes obtained.
256
-
257
-
```sql
258
-
BEGIN;
259
-
-- Put your actions in here. For example, validating a constraint
-- end your transaction with a SELECT on pg_locks so you can see the locks
263
-
-- that occurred during the transaction
264
-
SELECT locktype, relation::regclass, mode, transactionid AS tid, virtualtransaction AS vtid, pid, granted FROM pg_locks;
265
-
COMMIT;
266
-
```
267
-
268
-
This query returns the locks obtained during the database transaction. Let's see an example: we'll add a unique index without concurrency so we can see the locks it obtains:
1. `relation | pg_locks | AccessShareLock` - This is us querying `"pg_locks"` inside the transaction so we can see which locks are taken. `AccessShareLock` is the weakest lock mode; it conflicts only with `AccessExclusiveLock`, which should never be taken on the internal `"pg_locks"` relation itself.
294
-
1. `relation | schema_migrations | RowExclusiveLock` - This is because we're inserting a row into the `"schema_migrations"` table. Reads are still allowed, and so are ordinary writes from other sessions; this mode only conflicts with `ShareLock` and stronger modes, so operations that need those wait until the transaction is done.
295
-
1. `virtualxid | _ | ExclusiveLock` - Querying `pg_locks` created a virtual transaction on the `SELECT` query. We can ignore this.
296
-
1. `relation | weather_city_index | AccessExclusiveLock` - We're creating the index, so this new index will be completely locked to any reads and writes until this transaction is complete.
297
-
1. `relation | schema_migrations | ShareUpdateExclusiveLock` - Ecto acquires this lock to ensure that only one migrator operates on the table at a time. This is what allows multiple nodes to run migrations at the same time safely. The mode conflicts with itself, so a second migrator waits for the first, while other processes can still read the `"schema_migrations"` table.
298
-
1. `transactionid | _ | ExclusiveLock` - This lock is on the running transaction itself. Every transaction holds an `ExclusiveLock` on its own ID, so if another concurrent transaction conflicts with this one, the other transaction requests a lock on this transaction's ID and thereby learns when it is done. I call this "lockception".
299
-
1. `relation | weather | ShareLock` - Finally, the reason why we're here. Remember, we're creating a unique index on the `"weather"` table without concurrency. This lock is our red flag: it is a `ShareLock` on the table, which blocks writes. That's not good if we deploy this and have processes or web requests that regularly write to this table, because `UPDATE`, `DELETE`, and `INSERT` acquire a `RowExclusiveLock`, which conflicts with the `ShareLock`.
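The reasoning in the list above comes down to looking up which lock modes conflict. As a rough sketch, the table-level conflicts for just the modes that appear in the list can be written as a small lookup. The `LockConflicts` module and its `conflict?/2` function are hypothetical names used for illustration only; they are not part of Ecto or Postgres, and the map deliberately omits the modes not mentioned above.

```elixir
defmodule LockConflicts do
  # Hypothetical helper: a subset of PostgreSQL's table-level lock conflict
  # matrix, limited to the modes that appear in the list above.
  @conflicts %{
    "AccessShareLock" => ["AccessExclusiveLock"],
    "RowExclusiveLock" => ["ShareLock", "AccessExclusiveLock"],
    "ShareUpdateExclusiveLock" => [
      "ShareUpdateExclusiveLock",
      "ShareLock",
      "AccessExclusiveLock"
    ],
    "ShareLock" => [
      "RowExclusiveLock",
      "ShareUpdateExclusiveLock",
      "AccessExclusiveLock"
    ],
    "AccessExclusiveLock" => [
      "AccessShareLock",
      "RowExclusiveLock",
      "ShareUpdateExclusiveLock",
      "ShareLock",
      "AccessExclusiveLock"
    ]
  }

  # Returns true when a session requesting `requested` would have to wait
  # for a session that already holds `held` on the same table.
  def conflict?(held, requested) do
    requested in Map.fetch!(@conflicts, held)
  end
end

# A plain CREATE INDEX holds ShareLock, so an INSERT/UPDATE/DELETE
# (RowExclusiveLock) has to wait:
LockConflicts.conflict?("ShareLock", "RowExclusiveLock")
# => true

# CREATE INDEX CONCURRENTLY holds ShareUpdateExclusiveLock, which lets
# writes through:
LockConflicts.conflict?("ShareUpdateExclusiveLock", "RowExclusiveLock")
# => false
```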
300
-
301
-
To avoid this lock, we change the command to `CREATE INDEX CONCURRENTLY ...`. Unfortunately, `CONCURRENTLY` cannot run inside a database transaction, so we can no longer easily inspect the locks the command obtains. We still know it is safer, because `CREATE INDEX CONCURRENTLY` acquires a `ShareUpdateExclusiveLock`, which does not conflict with `RowExclusiveLock` (see Reference Material in the [Safe Migrations guide](safe_migrations.html)).
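In an Ecto migration, the concurrent variant might look like the following minimal sketch. The module name and the `:city` column are assumptions, inferred only from the `weather_city_index` name in the lock output; adjust them to your schema.

```elixir
defmodule MyApp.Repo.Migrations.AddUniqueCityIndexToWeather do
  use Ecto.Migration

  # CREATE INDEX CONCURRENTLY cannot run inside a transaction, so both the
  # DDL transaction and the transaction-based migration lock are disabled.
  @disable_ddl_transaction true
  @disable_migration_lock true

  def change do
    # Assumed column: [:city]. Ecto's default index name for this call is
    # "weather_city_index", matching the lock output above.
    create index("weather", [:city], unique: true, concurrently: true)
  end
end
```

Because there is no surrounding transaction, a failure partway through is not rolled back automatically, so it is worth checking the index afterwards.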
302
-
303
-
This scenario is revisited later in [Safe Migrations](safe_migrations.html).
304
-
305
253
## Safeguards in the database
306
254
307
255
It's a good idea to add safeguards so no developer on the team accidentally locks up the database for too long. Even if you know all about databases and locks, you might have a forgetful day and try to add an index non-concurrently and bring down production. Safeguards are good.
@@ -332,7 +280,7 @@ There are two ways to apply this lock:
332
280
333
281
Let's go through those options:
334
282
335
-
#### Transaction lock_timeout
283
+
#### Transaction `lock_timeout`
336
284
337
285
In SQL:
338
286
@@ -421,7 +369,7 @@ ALTER ROLE myuser SET lock_timeout = '10s';
421
369
422
370
If you have a different user that runs migrations, this could be a good option for that migration-specific Postgres user. The trade-off is that Elixir developers won't see this timeout as they write migrations and explore the call stack, since role settings live in the database, which developers don't usually inspect.
423
371
424
-
####Statement Timeout
372
+
### Statement Timeout
425
373
426
374
Another way to ensure safety is to configure your Postgres database with statement timeouts. These timeouts apply to all statements, including migrations and the locks they obtain.
427
375
@@ -441,22 +389,6 @@ Now any statement automatically times out if it runs for more than 10 minutes; o
441
389
442
390
Setting this `statement_timeout` requires discipline from the team; if there are runaway queries that fail (for example) at 10 minutes, an exception will likely occur somewhere. You will want to equip your application with sufficient logging, tracing, and reporting so you can replicate the query and the parameters it took to hit the timeout, and ultimately optimize the query. Without this discipline, you risk creating a culture that ignores exceptions.
443
391
444
-
#### Timeouts for Non-Transactional Migrations
445
-
446
-
When `@disable_ddl_transaction true` is set, the `after_begin/0` callback is not called, so you cannot rely on it to set timeouts. Instead, set the timeout directly in your migration:
447
-
448
-
```elixir
449
-
@disable_ddl_transaction true
450
-
@disable_migration_lock true
451
-
452
-
def change do
453
-
  execute "SET lock_timeout TO '5s'"
454
-
  create index("posts", [:slug], concurrently: true)
455
-
end
456
-
```
457
-
458
-
Note that `SET` without `LOCAL` sets the timeout for the session. Since there's no transaction, `SET LOCAL` would have no effect.
459
-
460
392
### Handling Failed Concurrent Operations
461
393
462
394
When `CREATE INDEX CONCURRENTLY` fails (due to timeout, deadlock, or other errors), PostgreSQL leaves behind an **invalid index**. This index:
@@ -489,37 +421,6 @@ end
489
421
>
490
422
> Always check for invalid indexes after a failed concurrent migration. They won't go away on their own and can silently degrade write performance.
491
423
492
-
### Monitoring Locks During Migrations
493
-
494
-
When running migrations, especially on large tables, it's helpful to monitor for lock contention. You can run this query in a separate session to see blocked queries:
495
-
496
-
```sql
497
-
SELECT
498
-
  blocked.pid AS blocked_pid,
499
-
  blocked.query AS blocked_query,
500
-
  blocked.wait_event_type,
501
-
  blocking.pid AS blocking_pid,
502
-
  blocking.query AS blocking_query,
503
-
  now() - blocked.query_start AS blocked_duration
504
-
FROM pg_stat_activity blocked
505
-
JOIN pg_locks blocked_locks ON blocked.pid = blocked_locks.pid AND NOT blocked_locks.granted
This shows you which queries are waiting for locks and what's blocking them. If you see your migration blocking many queries, you may want to cancel it and use a safer approach.
522
-
523
424
---
524
425
525
426
This guide was originally published on [Fly.io Phoenix Files](https://fly.io/phoenix-files/anatomy-of-an-ecto-migration/).