Commit 647b502

Address feedback
1 parent e6f2017 commit 647b502

2 files changed

Lines changed: 10 additions & 30 deletions

guides/backfilling_data.md

Lines changed: 2 additions & 7 deletions
@@ -1,6 +1,6 @@
 # Backfilling Data

-When I say "backfilling data", I mean that as any attempt to change data in bulk. This can happen in code through migrations, application code, UIs that allow multiple selections and updates, or in a console connected to a running application. Since bulk changes affect a lot of data, it's always a good idea to have the code reviewed before it runs. You also want to check that it runs efficiently and does not overwhelm the database. Ideally, it's nice when the code is written to be safe to re-run. For these reasons, please don't change data in bulk through a console!
+Backfilling data refers to any attempt to change data in bulk. This can happen in code through migrations, application code, UIs that allow multiple selections and updates, or in a console connected to a running application. Since bulk changes affect a lot of data, it's always a good idea to have the code reviewed before it runs. You also want to check that it runs efficiently and does not overwhelm the database. Ideally, it's nice when the code is written to be safe to re-run. For these reasons, please don't change data in bulk through a console!

 We're going to focus on bulk changes executed through Ecto migrations, but the same principles are applicable to any case where bulk changes are being made. Typical scenarios where you might need to run data migrations are when you need to fill in data for records that already exist (hence, backfilling data). This usually comes up when table structures are changed in the database.

@@ -288,12 +288,7 @@ defmodule MyApp.Repo.DataMigrations.BackfillWeather do
 )
 results = results |> Enum.map(& &1.id) |> Enum.sort()

-not_updated =
-  mutations
-  |> Enum.map(& &1[:id])
-  |> MapSet.new()
-  |> MapSet.difference(MapSet.new(results))
-  |> MapSet.to_list()
+not_updated = Enum.map(mutations, & &1[:id]) -- results

 Enum.each(not_updated, &handle_non_update/1)
 repo().delete_all(from(r in @temp_table_name, where: r.id in ^results))
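
The hunk above swaps a `MapSet` pipeline for the list difference operator `--`. A minimal sketch of why the two can be interchangeable, using hypothetical sample values for `mutations` and `results` (the real ones come from the migration). It assumes the mutation ids are unique: with duplicate ids, `--` removes only one occurrence per match, while the `MapSet` version deduplicates.

```elixir
# Hypothetical sample data standing in for the migration's real values.
mutations = [%{id: 1}, %{id: 2}, %{id: 3}, %{id: 4}]
results = [2, 4]

# Removed form: set difference, then back to a list (order not guaranteed).
old_style =
  mutations
  |> Enum.map(& &1[:id])
  |> MapSet.new()
  |> MapSet.difference(MapSet.new(results))
  |> MapSet.to_list()

# Added form: list difference, which keeps the left-hand list's order.
new_style = Enum.map(mutations, & &1[:id]) -- results

# Same ids either way when the ids are unique.
same_ids? = Enum.sort(old_style) == Enum.sort(new_style)

# Where the two diverge: `--` drops a single occurrence per right-hand element.
with_duplicates = [1, 1, 2] -- [1]

IO.inspect(new_style)       # [1, 3]
IO.inspect(same_ids?)       # true
IO.inspect(with_duplicates) # [1, 2]
```

If duplicate ids could ever appear in `mutations`, piping the mapped ids through `Enum.uniq/1` before `--` would restore the deduplicating behavior of the removed code.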

guides/safe_migrations.md

Lines changed: 8 additions & 23 deletions
@@ -9,14 +9,14 @@ A guide on common migration recipes and how to avoid trouble.
 | Add index | Blocks writes | Use `concurrently: true` and disable transactions |
 | Drop index | Blocks writes | Use `concurrently: true` and disable transactions |
 | Add foreign key | Blocks writes on both tables | Use `validate: false`, then validate separately |
-| Add column with default | Table rewrite (pre-PG11) | Add column first, then set default |
+| Add column with default | Table rewrite (volatile defaults) | Add column first, then set default |
 | Add NOT NULL | Full table scan | Use check constraint, validate, then add NOT NULL |
 | Add check constraint | Full table scan | Create with `validate: false`, then validate separately |
 | Change column type | Table rewrite | Create new column, migrate data, swap reads, drop old column |
 | Remove column | Query failures | Remove from schema first, then drop column |
 | Rename column | Query failures | Use `source:` option in schema instead |
 | Rename table | Query failures | Rename schema module instead |
-| Add enum value | Transaction error (pre-PG12) | disable transactions |
+| Add enum value | Transaction error | disable transactions |
 | Add extension | Transaction error | disable transactions |

 ## All Scenarios
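
As an illustration of the "Add index" row in the table above, here is a minimal sketch of a migration that builds the index concurrently with the DDL transaction disabled. The module, table, and column names (`MyApp.Repo.Migrations.AddPostsUserIdIndex`, `posts`, `user_id`) are hypothetical, and the snippet assumes a project that already depends on `ecto_sql` and uses PostgreSQL.

```elixir
defmodule MyApp.Repo.Migrations.AddPostsUserIdIndex do
  use Ecto.Migration

  # CREATE INDEX CONCURRENTLY cannot run inside a transaction,
  # so the migration's wrapping transaction is turned off.
  @disable_ddl_transaction true

  # Also skip the migration lock, which would otherwise be held
  # in a transaction for the duration of the index build.
  @disable_migration_lock true

  def change do
    # Hypothetical table and column, for illustration only.
    create index("posts", [:user_id], concurrently: true)
  end
end
```

Because the transaction is disabled, a failure partway through is not rolled back automatically, so keeping such a migration limited to the single index operation is the safer choice.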
@@ -92,7 +92,7 @@ def change do
 end
 ```

-If you're using Phoenix and PhoenixEcto, you will likely appreciate disabling
+If you're using Phoenix, you will likely need to disable
 the migration lock in the CheckRepoStatus plug during dev to avoid hitting and
 waiting on the advisory lock with concurrent web processes. You can do this by
 adding `migration_lock: false` to the CheckRepoStatus plug in your
@@ -273,8 +273,7 @@ end
 ```

 Note: we cannot use `Ecto.Migration.modify/3` as it will include updating the column type as
-well unnecessarily, causing Postgres to rewrite the table. For more information,
-[see this example](https://github.com/fly-apps/safe-ecto-migrations/issues/10).
+well unnecessarily, causing Postgres to rewrite the table.

 Schema change to read the new column:


@@ -308,8 +307,7 @@ end
 ```

 The issue is that we cannot use `Ecto.Migration.modify/3` as it will include updating the column type as
-well unnecessarily, causing Postgres to rewrite the table. For more information,
-[see this example](https://github.com/fly-apps/safe-ecto-migrations/issues/10).
+well unnecessarily, causing Postgres to rewrite the table.

 ### Good

@@ -638,7 +636,7 @@ def change do
 end
 ```

-If you're using Postgres 12+, you can add the NOT NULL to the column after validating the constraint. From the Postgres 12 docs:
+You can then add the NOT NULL to the column after validating the constraint. From the Postgres docs:

 > SET NOT NULL may only be applied to a column provided
 > none of the records in the table contain a NULL value
@@ -649,11 +647,9 @@ If you're using Postgres 12+, you can add the NOT NULL to the column after valid

 **However** we cannot use `Ecto.Migration.modify/3`
 as it will include updating the column type as well unnecessarily, causing
-Postgres to rewrite the table. For more information, [see this example](https://github.com/fly-apps/safe-ecto-migrations/issues/10).
+Postgres to rewrite the table.

 ```elixir
-# **Postgres 12+ only**
-
 def change do
   execute "ALTER TABLE products VALIDATE CONSTRAINT active_not_null",
           ""
@@ -695,18 +691,7 @@ end

 ## Adding a value to a PostgreSQL enum

-Adding enum values inside a transaction fails in PostgreSQL < 12.
-
-### Bad
-
-```elixir
-def change do
-  # Fails in PostgreSQL < 12: cannot run inside a transaction
-  execute "ALTER TYPE status ADD VALUE 'archived'"
-end
-```
-
-### Good
+Since PostgreSQL 12, enum values can be added inside a transaction, although the new value cannot be used until that transaction commits. If you need to support older versions or want to be safe, disable the DDL transaction.

 ```elixir
 @disable_ddl_transaction true
