7 changes: 7 additions & 0 deletions docs/upgrade-testing.md
@@ -238,6 +238,13 @@ what the upgrade script handles, and any backward compatibility considerations.
- **Scenario B1 considerations:** The schema change is to table constraints only; the new `.so` issues the same column lists against `df.nodes`/`df.instances`, now with `ON CONFLICT ... DO NOTHING RETURNING id`. The instance reserve arbitrates on `id` (the primary key in both old and new schemas) and the node insert arbitrates on `(instance_id, id)` — an index that exists in both the pre-0.2.4 schema (the `nodes_instance_node_key` composite UNIQUE) and the new schema (the composite primary key) — so both statements stay valid against a schema that has not run `ALTER EXTENSION UPDATE`. The pre-generated-`root_id` reserve is also old-schema-safe: `instances_root_node_same_instance_fkey` is `DEFERRABLE INITIALLY DEFERRED` in every shipped schema, so `root_node` is not checked until commit, by which point the forced-ID root node row has been inserted within the same transaction. No `UPDATE df.instances` is issued, so the change relies only on the `INSERT (..., root_node, ...)` privilege every shipped `df.grant_usage()` already grants, not on any `UPDATE (root_node)` grant. One benign residual exists against the *old* schema only: a node ID that is globally duplicated but per-instance-unique would clash with the surviving single-column `nodes_pkey (id)`, which `ON CONFLICT (instance_id, id)` does not arbitrate, so it raises just as it did before this change — astronomically rare, strictly no worse than prior behavior, and eliminated once `ALTER EXTENSION UPDATE` swaps in the composite primary key. This covers **schema** compatibility only — the SQL stays valid against the old table shape. The separate in-flight *replay* break introduced by the changed activity-input shape is documented under "In-flight orchestration compatibility" above and requires draining before upgrade.
- **Scenario B2 considerations:** `ADD PRIMARY KEY (instance_id, id)` sets `NOT NULL` on both columns and builds a unique index over existing rows. `id` was already the old primary key (implicitly `NOT NULL`). `instance_id` carries a `nodes_instance_id_present_chk CHECK (instance_id IS NOT NULL)` constraint, but it was added `NOT VALID`, so it only guarantees rows written on 0.2.2+; in the unlikely event a database still holds pre-0.2.2 node rows with a NULL `instance_id`, the `ADD PRIMARY KEY` (and the explicit `ALTER COLUMN instance_id SET NOT NULL` that precedes it) will abort and the operator must backfill or remove those rows before retrying the upgrade. On an empty database the restructure is metadata-only; on a populated one PostgreSQL rebuilds the `df.nodes` primary-key index in place. Because `ADD PRIMARY KEY` / `ALTER COLUMN ... SET NOT NULL` take an `ACCESS EXCLUSIVE` lock on `df.nodes` and rebuild the index, on a large `df.nodes` the upgrade blocks concurrent access for a period that scales with the table's size; run `ALTER EXTENSION UPDATE` inside a maintenance window and consider `SET lock_timeout` for the session so the migration fails fast instead of queuing behind (or stalling in front of) long-running transactions. Combined with the in-flight replay break noted above, the recommended upgrade sequence is: stop new `df.start()` calls, drain or cancel in-flight instances, then run the upgrade.
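
The B2 precautions above can be sketched as a short operator session. This is a minimal sketch, not a tested runbook: the extension name `pg_durable` is inferred from the upgrade script's file name, and the `5s` timeout is an illustrative value rather than a project recommendation.

```sql
-- Pre-flight: any row counted here would make
-- ALTER COLUMN instance_id SET NOT NULL / ADD PRIMARY KEY (instance_id, id) abort.
SELECT count(*) AS null_instance_rows
FROM df.nodes
WHERE instance_id IS NULL;

-- Fail fast instead of queuing behind long-running transactions.
-- '5s' is an assumed value; choose one that suits the maintenance window.
SET lock_timeout = '5s';

-- Extension name assumed from sql/pg_durable--0.2.3--0.2.4.sql.
ALTER EXTENSION pg_durable UPDATE TO '0.2.4';

RESET lock_timeout;
```

If the pre-flight count is non-zero, backfill or remove those rows before running the update. If the update fails with a lock timeout, the transaction rolls back and the schema stays at the previous version, so it can be retried later.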

#### Indexes on df.instances for ordered/paginated listing (issues #167/#87/#146)
- **DDL change (df schema):** `df.list_instances()` lists rows newest-first (`ORDER BY created_at DESC`), optionally filtered by status. The pre-0.2.4 `idx_instances_status(status)` covered only the status equality, so a status-filtered listing still required a sort and an unfiltered listing had no supporting index. Fresh installs (`src/lib.rs`) now create `idx_instances_status(status, created_at DESC, id)` and a new `idx_instances_created_at(created_at DESC, id)`. The upgrade script `sql/pg_durable--0.2.3--0.2.4.sql` drops any existing copies (`DROP INDEX IF EXISTS`) then recreates both indexes with the same definitions. The trailing `id` prepares the access path for the keyset pagination planned for `df.list_instances` (`ORDER BY created_at DESC, id ASC`); `df.list_instances()` does not yet order by `id`, so PR2 does not change the current result ordering. It only positions the index to serve that future deterministic order directly from the index, without a separate sort step.
- **Design note (RLS):** `df.instances` has a row-level-security policy (`instances_user_isolation`) filtering `submitted_by = current_user::regrole`, so a per-user index leading with `submitted_by` would be more selective for an individual session. The `created_at`-leading design is intentional: it is optimal for the admin / external-client global-listing path (#146) that reads across submitters, and it still removes the per-query sort for the common case. A `submitted_by`-leading refinement can be revisited if profiling shows the per-user path dominates.
- **Scenario A considerations:** The upgrade script recreates the indexes with column lists and `DESC`/tiebreaker ordering identical to the fresh-install DDL, so `pg_get_indexdef()` for `idx_instances_status` and `idx_instances_created_at` is byte-identical on both paths and the Scenario A snapshot matches.
- **Scenario B1 considerations:** The new `.so` works against all previous schemas. The `df.list_instances()` queries (`ORDER BY created_at DESC LIMIT`, optionally `WHERE status = $1`) reference only the `created_at`/`status` columns, which exist in every shipped `df.instances` schema; against a schema that has not run `ALTER EXTENSION UPDATE` the queries stay valid and correct — they simply fall back to a sort without the new index until the upgrade is applied. This is a performance-only change with no correctness impact.
- **Scenario B2 considerations:** No data migration. `DROP INDEX` / `CREATE INDEX` rebuild only the index structures; row data is untouched. `DROP INDEX` takes a brief `ACCESS EXCLUSIVE` lock on `df.instances`, and each `CREATE INDEX` holds a `SHARE` lock while it builds, which blocks writes (but not reads) for a period that scales with the table's size. On a large `df.instances`, run `ALTER EXTENSION UPDATE` in a maintenance window, as recommended for the `df.nodes` restructure above.
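
The Scenario A and B1 points above can be spot-checked by hand. The following is a rough sketch, assuming a session that can read the catalogs and `df.instances`; the `'pending'` status value and the `LIMIT 50` are illustrative assumptions, and the query only approximates what `df.list_instances()` issues.

```sql
-- Scenario A: run on a fresh install and on an upgraded database,
-- then compare the two outputs; they should match exactly.
SELECT c.relname AS index_name,
       pg_get_indexdef(i.indexrelid) AS indexdef
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE i.indrelid = 'df.instances'::regclass
  AND c.relname IN ('idx_instances_status', 'idx_instances_created_at')
ORDER BY c.relname;

-- Scenario B1 / performance: inspect the plan for a status-filtered listing.
-- 'pending' and LIMIT 50 are assumed values for illustration only.
EXPLAIN
SELECT id, status, created_at
FROM df.instances
WHERE status = 'pending'
ORDER BY created_at DESC
LIMIT 50;
```

On a small table the planner may still prefer a sequential scan plus sort, so the absence of an index scan in the plan does not by itself indicate a missing index.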

### v0.2.2 → v0.2.3

#### Rename duroxide provider schema to `_duroxide` for fresh installs
20 changes: 20 additions & 0 deletions sql/pg_durable--0.2.3--0.2.4.sql
@@ -236,3 +236,23 @@ ALTER TABLE df.instances
FOREIGN KEY (id, root_node)
REFERENCES df.nodes (instance_id, id)
DEFERRABLE INITIALLY DEFERRED NOT VALID;

-- ============================================================================
-- Indexes for efficient instance listing (monitoring redesign, issues #167/#87/#146).
--
-- df.list_instances() returns rows newest-first (ORDER BY created_at DESC),
-- optionally filtered by status. The previous single-column
-- idx_instances_status(status) did not cover created_at, so a status-filtered
-- listing still required a sort, and an unfiltered listing had no supporting
-- index at all. Replace the single-column index with a composite
-- (status, created_at DESC, id) and add (created_at DESC, id) for the unfiltered
-- path. The trailing id prepares the access path for the keyset pagination planned
-- for df.list_instances (ORDER BY created_at DESC, id ASC); df.list_instances() does
-- not order by id yet, so this does not change the current result ordering. These
-- definitions are byte-identical to the fresh-install DDL in src/lib.rs, so the
-- Scenario A index snapshot matches.
-- ============================================================================
DROP INDEX IF EXISTS df.idx_instances_status;
CREATE INDEX idx_instances_status ON df.instances(status, created_at DESC, id);
DROP INDEX IF EXISTS df.idx_instances_created_at;
CREATE INDEX idx_instances_created_at ON df.instances(created_at DESC, id);
17 changes: 15 additions & 2 deletions src/lib.rs
@@ -217,8 +217,21 @@ CREATE TABLE df.instances (
COMMENT ON COLUMN df.instances.submitted_by IS
'Effective role (current_user) at df.start() time - used for connection authentication and SQL execution';
-- Index for finding pending instances
CREATE INDEX idx_instances_status ON df.instances(status);
-- Index for status-filtered listing, newest-first
-- (df.list_instances() WHERE status = $1 ORDER BY created_at DESC). Also serves the
-- pending-instance scan via the leading status column. The trailing id prepares the
-- access path for the keyset pagination planned for df.list_instances
-- (ORDER BY created_at DESC, id ASC); df.list_instances() does not order by id yet,
-- so this does not change the current result ordering.
-- NOTE: keep these two index definitions byte-identical to the 0.2.3->0.2.4 upgrade
-- script (sql/pg_durable--0.2.3--0.2.4.sql) until 0.2.4 is released -- Scenario A
-- compares pg_get_indexdef() across the fresh-install and upgrade paths.
CREATE INDEX idx_instances_status ON df.instances(status, created_at DESC, id);
-- Index for unfiltered listing, newest-first (df.list_instances() ORDER BY created_at DESC).
-- The trailing id prepares the access path for the same future keyset pagination
-- (ORDER BY created_at DESC, id ASC); it does not affect the current ordering.
CREATE INDEX idx_instances_created_at ON df.instances(created_at DESC, id);
-- Index for finding nodes by instance
CREATE INDEX idx_nodes_instance ON df.nodes(instance_id);