Skip to content

Follow-ups from #238 (#129): node-status scoping test, friendlier NULL-instance_id upgrade abort, short_id format test #264

Description

@crprashant

Follow-up to #238 (issue #129)

While hardening instance/node ID collisions in #238, the reviewers and I agreed to defer three small, independent items so they didn't expand that PR. Filing them together here. None is a correctness defect in the shipped change — they are test-coverage and operability improvements. They can be picked up independently.

Related: #129 (origin), #238 (the change), #241 (schema-snapshot gate follow-up), #261 (ID pruning, future).


1. Focused regression test for update_node_status instance scoping + placeholder/bind ordering (SF5)

Where: src/activities/update_node_status.rs

Problem. The activity builds three dynamic SQL shapes with conditional placeholder numbering — result-present uses $4, while running-no-result and non-running use $3 — and assembles the binds afterward. #238 added a single-row assertion (expected exactly 1) that now catches a wrong cardinality (0 or >1 rows), but it does not catch a wrong placeholder/bind that still updates exactly one row with the wrong column value. The only workflow that exercises this path (tests/e2e/sql/51_node_composite_pk.sql) hits just the happy path and never drives the no-result / non-running branches, so a placeholder↔bind regression in those branches could land silently.

Suggested action. Add a focused regression test that exercises all three branches and asserts the resulting row values (status, result, updated_at), plus behavior over duplicate (instance_id, id) fixtures (one matching instance, one decoy) to prove the instance_id scope is honored. Equivalently, extract the SQL/param-shape construction into a pure helper and unit-test the exact SQL string + ordered param roles for each branch.


2. Friendlier upgrade abort on a pre-0.2.2 NULL instance_id row (C4)

Where: sql/pg_durable--0.2.3--0.2.4.sql (the ALTER COLUMN instance_id SET NOT NULL in the nodes_pkey promotion block, ~L219–220)

Problem. nodes_instance_id_present_chk is NOT VALID, so it does not cover rows written before 0.2.2. If such a database still holds a NULL-instance_id node row, SET NOT NULL aborts with the bare native error column "instance_id" ... contains null values. The abort is safe — it rolls back atomically and the extension stays at 0.2.3 — but the remediation (backfill or remove the offending rows) is only described in a SQL comment and in docs/upgrade-testing.md, so the operator sees a cryptic message with no inline guidance.

Suggested action. Precede the SET NOT NULL with a DO block that counts offending rows and RAISE EXCEPTION with the offending count and the backfill instruction, preserving the atomic rollback. Add an upgrade-test fixture (a seeded NULL-instance_id row) asserting the clear error is raised and the transaction rolls back.


3. Residual: assert the real short_id() format invariant (SF6 — mostly done in #238)

Where: src/types.rs (short_id), src/dsl.rs (pick_id_with_retry + its unit tests)

Status. The core of the original SF6 finding was already addressed in #238: ID selection was extracted into the shared pick_id_with_retry helper, which now has unit tests for first-try success, re-roll on collision, exhaustion-without-returning-an-unverified-id, and claim-error propagation. This item is the small leftover only.

Problem. Those unit tests inject fake ID generators, so they never assert that the real short_id() actually produces the ^[0-9a-f]{8}$ shape the schema and retry logic assume. A future change to short_id() (length/charset) would not be caught.

Suggested action. Add a unit test asserting short_id() matches ^[0-9a-f]{8}$. Optionally, a light e2e assertion on the format of a freshly generated instance ID.


Acceptance criteria

  • update_node_status has a focused test covering all three SQL branches, asserting correct row values, plus correct scoping over duplicate (instance_id, id) fixtures. (SF5)
  • Upgrading a database that holds a NULL-instance_id node row raises a clear, actionable error (offending count + backfill instruction) and rolls back atomically; covered by an upgrade-test fixture. (C4)
  • short_id()'s ^[0-9a-f]{8}$ format invariant is unit-tested. (SF6 residual)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions