microsoft · crprashant · Jun 26, 2026
diff --git a/docs/upgrade-testing.md b/docs/upgrade-testing.md
@@ -238,6 +238,13 @@ what the upgrade script handles, and any backward compatibility considerations.
 - **Scenario B1 considerations:** The schema change is to table constraints only; the new `.so` issues the same column lists against `df.nodes`/`df.instances`, now with `ON CONFLICT ... DO NOTHING RETURNING id`. The instance reserve arbitrates on `id` (the primary key in both old and new schemas) and the node insert arbitrates on `(instance_id, id)` — an index that exists in both the pre-0.2.4 schema (the `nodes_instance_node_key` composite UNIQUE) and the new schema (the composite primary key) — so both statements stay valid against a schema that has not run `ALTER EXTENSION UPDATE`. The pre-generated-`root_id` reserve is also old-schema-safe: `instances_root_node_same_instance_fkey` is `DEFERRABLE INITIALLY DEFERRED` in every shipped schema, so `root_node` is not checked until commit, by which point the forced-ID root node row has been inserted within the same transaction. No `UPDATE df.instances` is issued, so the change relies only on the `INSERT (..., root_node, ...)` privilege every shipped `df.grant_usage()` already grants, not on any `UPDATE (root_node)` grant. One benign residual exists against the *old* schema only: a node ID that is globally duplicated but per-instance-unique would clash with the surviving single-column `nodes_pkey (id)`, which `ON CONFLICT (instance_id, id)` does not arbitrate, so it raises just as it did before this change — astronomically rare, strictly no worse than prior behavior, and eliminated once `ALTER EXTENSION UPDATE` swaps in the composite primary key. This covers **schema** compatibility only — the SQL stays valid against the old table shape. The separate in-flight *replay* break introduced by the changed activity-input shape is documented under "In-flight orchestration compatibility" above and requires draining before upgrade.
 - **Scenario B2 considerations:** `ADD PRIMARY KEY (instance_id, id)` sets `NOT NULL` on both columns and builds a unique index over existing rows. `id` was already the old primary key (implicitly `NOT NULL`). `instance_id` carries a `nodes_instance_id_present_chk CHECK (instance_id IS NOT NULL)` constraint, but it was added `NOT VALID`, so it only guarantees rows written on 0.2.2+; in the unlikely event a database still holds pre-0.2.2 node rows with a NULL `instance_id`, the `ADD PRIMARY KEY` (and the explicit `ALTER COLUMN instance_id SET NOT NULL` that precedes it) will abort and the operator must backfill or remove those rows before retrying the upgrade. On an empty database the restructure is metadata-only; on a populated one PostgreSQL rebuilds the `df.nodes` primary-key index in place. Because `ADD PRIMARY KEY` / `ALTER COLUMN ... SET NOT NULL` take an `ACCESS EXCLUSIVE` lock on `df.nodes` and rebuild the index, on a large `df.nodes` the upgrade blocks concurrent access for a period that scales with the table's size; run `ALTER EXTENSION UPDATE` inside a maintenance window and consider `SET lock_timeout` for the session so the migration fails fast instead of queuing behind (or stalling in front of) long-running transactions. Combined with the in-flight replay break noted above, the recommended upgrade sequence is: stop new `df.start()` calls, drain or cancel in-flight instances, then run the upgrade.
 
+#### Indexes on df.instances for ordered/paginated listing (issues #167/#87/#146)
+- **DDL change (df schema):** `df.list_instances()` lists rows newest-first (`ORDER BY created_at DESC`), optionally filtered by status. The pre-0.2.4 `idx_instances_status(status)` covered only the status equality, so a status-filtered listing still required a sort and an unfiltered listing had no supporting index. Fresh installs (`src/lib.rs`) now create `idx_instances_status(status, created_at DESC, id)` and a new `idx_instances_created_at(created_at DESC, id)`. The upgrade script `sql/pg_durable--0.2.3--0.2.4.sql` drops any existing copies (`DROP INDEX IF EXISTS`) and then recreates both indexes with the same definitions. The trailing `id` prepares the access path for the keyset pagination planned for `df.list_instances` (`ORDER BY created_at DESC, id ASC`); `df.list_instances()` does not yet order by `id`, so this change does not alter the current result ordering; it only positions the index to serve that future deterministic order from an ordered index scan, with no separate sort step.
+- **Design note (RLS):** `df.instances` has a row-level-security policy (`instances_user_isolation`) filtering `submitted_by = current_user::regrole`, so a per-user index leading with `submitted_by` would be more selective for an individual session. The `created_at`-leading design is intentional: it is optimal for the admin / external-client global-listing path (#146) that reads across submitters, and it still removes the per-query sort for the common case. A `submitted_by`-leading refinement can be revisited if profiling shows the per-user path dominates.
+- **Scenario A considerations:** The upgrade script recreates the indexes with column lists and `DESC`/tiebreaker ordering identical to the fresh-install DDL, so `pg_get_indexdef()` for `idx_instances_status` and `idx_instances_created_at` is byte-identical on both paths and the Scenario A snapshot matches.
+- **Scenario B1 considerations:** The new `.so` works against all previous schemas. The `df.list_instances()` queries (`ORDER BY created_at DESC LIMIT`, optionally `WHERE status = $1`) reference only the `created_at`/`status` columns, which exist in every shipped `df.instances` schema; against a schema that has not run `ALTER EXTENSION UPDATE` the queries stay valid and correct — they simply fall back to a sort without the new index until the upgrade is applied. This is a performance-only change with no correctness impact.
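+- **Scenario B2 considerations:** No data migration. `DROP INDEX` / `CREATE INDEX` replace index structures only; row data is untouched. Each `CREATE INDEX` takes a `SHARE` lock on `df.instances` while it builds, which blocks writes but not reads, and each `DROP INDEX` takes an `ACCESS EXCLUSIVE` lock on the table; because the upgrade script runs inside a single transaction, those locks are held until it commits. On a large `df.instances`, run `ALTER EXTENSION UPDATE` in a maintenance window, as recommended for the `df.nodes` restructure above.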
+
 ### v0.2.2 → v0.2.3
 
 #### Rename duroxide provider schema to `_duroxide` for fresh installs

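The mixed-direction order planned for keyset pagination (`created_at DESC, id ASC`) cannot be written as a single row comparison, so the page predicate has to be spelled out. A minimal sketch, assuming a prepared statement named `next_instances_page` (hypothetical, not part of pg_durable) and a caller that remembers the `created_at` and `id` of the last row it received:

```sql
-- Hypothetical keyset page query; df.list_instances() does not do this yet.
-- $1 = created_at of the last row on the previous page
-- $2 = id of that same row
-- $3 = page size
-- Parameter types are left for PostgreSQL to infer from the column types.
PREPARE next_instances_page AS
SELECT id, status, created_at
FROM df.instances
WHERE created_at <= $1                 -- bound usable as an index range condition
  AND (created_at < $1 OR id > $2)     -- skip rows at or before the cursor row
ORDER BY created_at DESC, id ASC       -- same order as (created_at DESC, id)
LIMIT $3;
```

Because `id` breaks ties between rows that share a `created_at`, no row is skipped or repeated across pages when timestamps collide. A status-filtered variant would add a `status = $4` predicate and lean on `idx_instances_status` in the same way.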
diff --git a/sql/pg_durable--0.2.3--0.2.4.sql b/sql/pg_durable--0.2.3--0.2.4.sql
@@ -236,3 +236,23 @@ ALTER TABLE df.instances
         FOREIGN KEY (id, root_node)
         REFERENCES df.nodes (instance_id, id)
         DEFERRABLE INITIALLY DEFERRED NOT VALID;
+
+-- ============================================================================
+-- Indexes for efficient instance listing (monitoring redesign, issues #167/#87/#146).
+--
+-- df.list_instances() returns rows newest-first (ORDER BY created_at DESC),
+-- optionally filtered by status. The previous single-column
+-- idx_instances_status(status) did not cover created_at, so a status-filtered
+-- listing still required a sort, and an unfiltered listing had no supporting
+-- index at all. Replace the single-column index with a composite
+-- (status, created_at DESC, id) and add (created_at DESC, id) for the unfiltered
+-- path. The trailing id prepares the access path for the keyset pagination planned
+-- for df.list_instances (ORDER BY created_at DESC, id ASC); df.list_instances() does
+-- not order by id yet, so this does not change the current result ordering. These
+-- definitions are byte-identical to the fresh-install DDL in src/lib.rs, so the
+-- Scenario A index snapshot matches.
+-- ============================================================================
+DROP INDEX IF EXISTS df.idx_instances_status;
+CREATE INDEX idx_instances_status ON df.instances(status, created_at DESC, id);
+DROP INDEX IF EXISTS df.idx_instances_created_at;
+CREATE INDEX idx_instances_created_at ON df.instances(created_at DESC, id);
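One way to apply this script along the lines the upgrade notes recommend is to bound lock waits for the upgrade transaction and then confirm the resulting index definitions. A sketch, assuming the extension is named `pg_durable` (inferred from the script file name) and an illustrative 5-second timeout:

```sql
BEGIN;
-- Fail fast instead of queuing behind long-running transactions.
-- The 5s value is illustrative only; pick one that suits the deployment.
SET LOCAL lock_timeout = '5s';
ALTER EXTENSION pg_durable UPDATE TO '0.2.4';
COMMIT;

-- Compare these definitions with a fresh 0.2.4 install (the Scenario A check).
SELECT c.relname AS index_name,
       pg_get_indexdef(c.oid) AS definition
FROM pg_class AS c
JOIN pg_namespace AS n ON n.oid = c.relnamespace
WHERE n.nspname = 'df'
  AND c.relname IN ('idx_instances_status', 'idx_instances_created_at')
ORDER BY c.relname;
```

If the timeout fires, the transaction aborts and the schema stays at 0.2.3, so the upgrade can be retried later without cleanup.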
diff --git a/src/lib.rs b/src/lib.rs
@@ -217,8 +217,21 @@ CREATE TABLE df.instances (
 COMMENT ON COLUMN df.instances.submitted_by IS
     'Effective role (current_user) at df.start() time - used for connection authentication and SQL execution';
 
--- Index for finding pending instances
-CREATE INDEX idx_instances_status ON df.instances(status);
+-- Index for status-filtered listing, newest-first
+-- (df.list_instances() WHERE status = $1 ORDER BY created_at DESC). Also serves the
+-- pending-instance scan via the leading status column. The trailing id prepares the
+-- access path for the keyset pagination planned for df.list_instances
+-- (ORDER BY created_at DESC, id ASC); df.list_instances() does not order by id yet,
+-- so this does not change the current result ordering.
+-- NOTE: keep these two index definitions byte-identical to the 0.2.3->0.2.4 upgrade
+-- script (sql/pg_durable--0.2.3--0.2.4.sql) until 0.2.4 is released -- Scenario A
+-- compares pg_get_indexdef() across the fresh-install and upgrade paths.
+CREATE INDEX idx_instances_status ON df.instances(status, created_at DESC, id);
+
+-- Index for unfiltered listing, newest-first (df.list_instances() ORDER BY created_at DESC).
+-- The trailing id prepares the access path for the same future keyset pagination
+-- (ORDER BY created_at DESC, id ASC); it does not affect the current ordering.
+CREATE INDEX idx_instances_created_at ON df.instances(created_at DESC, id);
 
 -- Index for finding nodes by instance
 CREATE INDEX idx_nodes_instance ON df.nodes(instance_id);
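To check that the two new indexes actually remove the sort from the listing queries, the plans can be inspected directly. A sketch, assuming `'pending'` is a valid `status` value (an assumption based on the comment this change replaces) and an illustrative page size of 50:

```sql
-- Unfiltered listing: expected to walk idx_instances_created_at in order.
EXPLAIN (COSTS OFF)
SELECT id, status, created_at
FROM df.instances
ORDER BY created_at DESC
LIMIT 50;

-- Status-filtered listing: expected to use idx_instances_status.
-- 'pending' is an assumed status value; substitute a real one.
EXPLAIN (COSTS OFF)
SELECT id, status, created_at
FROM df.instances
WHERE status = 'pending'
ORDER BY created_at DESC
LIMIT 50;
```

On a nearly empty table the planner may still prefer a sequential scan followed by a sort, so this check is most informative against a populated `df.instances`.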