Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

14 changes: 13 additions & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,14 @@ serde_json = { version = "1.0", features = ["preserve_order"] }
uuid = { version = "1.0", features = ["v4", "serde"] }

# duroxide integration
duroxide = "=0.1.29"
# Using git dependency on pinodeca/continue-parent-link branch which preserves the
# parent link when a sub-orchestration calls continue_as_new, unblocking the
# sub-orchestration approach for df.loop.
# Compatibility: duroxide-pg 0.1.34 has been verified to compile and run correctly
# against this branch (same public API as 0.1.29 — PR #31 is a runtime-only change).
# Once the branch is merged and a new duroxide release is published, revert both
# entries back to crates.io version pins as a compatible pair.
duroxide = { git = "https://github.com/microsoft/duroxide.git", branch = "pinodeca/continue-parent-link" }
duroxide-pg = "=0.1.34"
tokio = { version = "1", features = ["rt-multi-thread", "sync", "time"] }

Expand All @@ -55,6 +62,11 @@ tracing-subscriber = { version = "0.3", features = ["env-filter"] }
[dev-dependencies]
pgrx-tests = "=0.16.1"

# Override the crates.io duroxide with the git branch for both the direct dependency
# and duroxide-pg's transitive dependency on duroxide.
[patch.crates-io]
duroxide = { git = "https://github.com/microsoft/duroxide.git", branch = "pinodeca/continue-parent-link" }

[profile.dev]
panic = "unwind"

Expand Down
6 changes: 6 additions & 0 deletions USER_GUIDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -946,6 +946,12 @@ SELECT df.start(

Use `@>` operator or `df.loop()` to create functions that run forever. Each iteration creates a new execution with fresh state (via continue-as-new).

> **Note:** Each `df.loop()` runs as its own child orchestration. The loop's iterations
> (the continue-as-new generations) are scoped to that child, so any nodes *before* the
> loop in the graph run exactly once and any nodes *after* the loop run once the loop
> exits. When inspecting `df.instances`, a looping function therefore shows a parent
> instance plus a separate child instance for the loop.

```sql
-- Simple heartbeat every 30 seconds (using @> operator)
SELECT df.start(
Expand Down
134 changes: 134 additions & 0 deletions docs/exec-id-plan.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,134 @@
# Plan: execution_id node state (TEMPORARY — do not commit)

> This plan file is scratch. Delete `docs/exec-id-plan.md` before the feature
> branch is finalized (last step below). It must not land in history.

## Goal

Replace the `node-state-model` PR's stored `status_reason` approach with an
`execution_id`-based model where **physical** statuses are written by the owning
orchestration and **implicit** statuses (`skipped`, `cancelled`) are *derived at
read time* in `df.instance_nodes` from each node's ancestors — with no duroxide
dependency in the read path.

---

## Step 1 — Schema: `status_details JSONB` on `df.nodes`

- Add `status_details JSONB` (nullable) to `df.nodes` in [src/lib.rs](src/lib.rs)
`create_tables` DDL.
- Stores the writer's `execution_id` (the ordered path; see Step 4) plus room
for small read-path payloads we already need (the IF decision; the RACE
winner — see Step 6).
- Incidental cleanup agreed earlier: **drop the dead `error` column** and stop
overloading `result` for non-result data.
- Keep `nodes_status_chk` limited to the **physical** statuses
(`pending`/`running`/`completed`/`failed`). `skipped`/`cancelled` are never
stored, so they must NOT be added to the check constraint.
- Grants: add `status_details` to nothing in the user INSERT column list (worker
writes it); remove `error` from any column lists in `df.grant_usage` /
`df.revoke_usage`.
- **Upgrade script** `sql/pg_durable--0.2.4--0.2.5.sql` (next version): `ADD
COLUMN status_details`, `DROP COLUMN error`, constraint + grant adjustments.
- **Binary backward-compat (B1):** the new `.so` must still run against older
schemas that lack `status_details`/still have `error`. The write path (Step 2)
and read path (Step 3) must degrade gracefully when the column is absent
(runtime column detection, or a guarded query) — verify with
`scripts/test-upgrade.sh`.

## Step 2 — Write path: fence status updates on `execution_id`

- In [src/activities/update_node_status.rs](src/activities/update_node_status.rs):
accept the caller's `execution_id`, write it to `status_details`, and make the
`UPDATE` **conditional**: apply only when the incoming `execution_id` is **not
superseded** by the stored one (`incoming >= stored`, equal allowed for
running→terminal within one execution).
- Ordering: JSONB does not compare the way we need. Store/compare via a
**comparable form** (e.g. an `int[]` of the execution numbers) so Postgres'
native lexicographic array order gives "first differing segment, larger number
wins". The JSONB can carry the human-readable path; the gate compares the
array form (column, generated column, or extracted in the `WHERE`).
- **Fire-and-forget / determinism:** the activity returns `Ok` regardless of
whether the row was updated; the orchestration must never branch on the
outcome (keep `let _ = ctx.schedule_activity(...)`).

## Step 3 — Read path: `df.instance_nodes` infers status from `df.nodes` only

- Rewrite [src/monitoring.rs](src/monitoring.rs) `instance_nodes` to:
1. **Drop the duroxide dependency** — read solely from `df.nodes` (one
`SELECT` of the instance's rows). Remove `list_executions` / the fabricated
`execution_id` cross join.
2. For each node, **leave `status` and the stored `execution_id` untouched**
and instead **add inferred fields to the returned `status_details`**:
- `inferred_status` — the *effective* state at read time (`skipped`,
`cancelled`, `pending`, or the node's own physical status when no
ancestor overrides it).
- `inferred_status_from_ancestor_id` — the node whose state *determined*
`inferred_status` (the first ancestor whose `execution_id` is not a
prefix of this node's). Makes the derivation explainable per row.
- We deliberately do NOT add `inferred_status_execution_id`: it is just the
`execution_id` of `inferred_status_from_ancestor_id`'s row, so a reader
can join to that node if needed.
- Rationale: the read path **must not compete with the write path**. By
augmenting (not overwriting) `status_details`, the helper never races the
orchestration's fenced writes. (If we later materialize this at the DB
level it would compete — defer that decision.)
3. **Derivation (leaf→root climb):** find the first ancestor whose
`execution_id` is not a prefix of the node's; that ancestor's state sets
`inferred_status`. Ancestor `failed` ⇒ `skipped`; `running`/newer execution
⇒ `pending`; `completed` but branch/iteration not taken ⇒ `skipped`; RACE
loser ⇒ `cancelled`. Use the IF decision and RACE winner recorded in
`status_details` (Step 6) to know which branch/iteration is "taken".
4. Children are reachable via `left_node`/`right_node`; the tree is immutable
after `df.start()`, so no parent column is required (optional safe
denormalization only).
- Output columns: `status` and `status_details` keep their stored values; the
inferred fields live inside the returned `status_details` JSONB only, never in
the base table.

## Step 4 — Document the `execution_id`

- Fold the structure from [docs/execution-id.md](docs/execution-id.md) into the
permanent doc (see Step 7): ordered path `root:n₀ : seg₁:n₁ : … : segₖ:nₖ`;
segment added only at sub-orchestration boundaries (loop iteration, join/race
branch); branch numbers = 1, loop numbers = 1..N; same node across executions
shares segment identities+length, differing only in numbers ⇒ total order.

## Step 5 — Document the state-transition graph + helper

- Add a diagram of node states with the **physical** transitions
(`pending→running→completed|failed`) drawn solid and the **implicit, never
materialized** ones (`→skipped`, `→cancelled`, re-entry `→pending` on a new
execution) drawn dashed, with a clear note that the dashed ones exist only in
`df.instance_nodes` output.
- Document the inference helper used by `instance_nodes` (the ancestor-climb /
derivation function): inputs, the prefix rule, each case, and that it emits
`inferred_status` + `inferred_status_from_ancestor_id` into `status_details`
rather than mutating the stored `status`/`execution_id`.

## Step 6 — RACE winner recording (small behavioral add)

- The root→leaves / branch-taken inference needs the RACE node to record its
**winning branch** in `status_details`. Add this write in the race
orchestration if not already present.

## Step 7 — Docs cleanup

- Delete [docs/execution-id.md](docs/execution-id.md) and
[docs/node-state-model.md](docs/node-state-model.md); their content is folded
into the permanent doc/diagram (Steps 4–5).
- Delete this plan file `docs/exec-id-plan.md` (must not be committed on the
feature branch).

---

## Open dependency you may be assuming implicitly

**Loop boundary (prerequisite, not in your list).** Clean per-iteration
`execution_id` segments require a nested loop to run as a **sub-orchestration
rooted at the loop node**. Today the loop uses `ctx.continue_as_new` from
`graph.root_node_id` (the only `continue_as_new` caller), which both produces the
existing nested-loop "restarts wrong root" bug and prevents a nested loop from
owning its own segment. This must be addressed (or explicitly deferred with
top-level-loops-only scope) for Steps 2–3 to be correct. Track it as a separate
work item.
42 changes: 42 additions & 0 deletions docs/execution-id.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
# Execution IDs for Node State

## Principles

Each node row is written by a **single writer** (the orchestration that owns it). We keep changes **minimal**, and we **never materialize** implicit states (`skipped`, `cancelled`): they are *derived* at read time. This avoids extra writes, write contention, and conflict resolution.

## 1. `execution_id`

An `execution_id` is the **sub-orchestration path** from the root to the orchestration that last wrote a node, with an **execution number** per segment:

```
root:3:loopA:4:branchC:1
```

- A segment is added **only at a sub-orchestration boundary** — a loop iteration, or a join/race branch. Plain nodes (SQL, IF, sequencing, …) inherit their owning orchestration's `execution_id`; the path is therefore *coarser* than the full graph.
- Execution numbers: **1** for join/race branches; **1..N** for loop iterations (one per `continue_as_new`); the root counts root-level loop iterations.
- Two `execution_id`s that can ever write the **same** node share identical segment *identities* and length — they differ only in the **numbers**. This makes them **totally ordered**: at the first segment whose number differs, the larger number is more recent (“supersedes”).

## 2. `status_detail` column

Add `status_detail` to `df.nodes`. On every status write, the owning orchestration stamps the node's current `execution_id` there.

*(Incidental cleanup, not core to this proposal: `status_detail` lets us drop the unused `error` column and stop overloading `result` for non-result data.)*

## 3. Most-recent-execution wins

The status update becomes a **fenced conditional write**: apply only if the incoming `execution_id` is **not superseded** by the one already stored (`incoming >= stored`). A stale writer — e.g. a race loser still draining, or a previous loop iteration — cannot clobber a newer execution's state, and the row converges to the most-recent execution regardless of arrival order. The write stays single-writer-*effective* and order-independent; the orchestration never branches on whether the write landed.

## 4. Monitoring: it's enough for both tree walks

With `status` + `status_detail`, plus what we already store (the IF decision in `result`, and the race winner on the race node), a single `SELECT` of an instance's nodes supports both directions:

- **Leaf → root:** climb a node's ancestors; the first ancestor whose `execution_id` is not a prefix of the node's tells you the node's *effective* state (ancestor failed ⇒ skipped; running ⇒ pending in a new execution; completed ⇒ branch/iteration not taken ⇒ skipped).
- **Root → leaves:** descend carrying the live lineage; decisions at IF/RACE/LOOP paint whole sub-graphs at once (not-taken ⇒ skipped, race loser ⇒ cancelled, under-failure ⇒ skipped, unreached ⇒ pending).

Intuitively: structure tells you *what could run*, and `execution_id` tells you *which execution each node's status belongs to* — together enough to color the current state of **every sub-graph**, with no materialized transitions.

## Out of scope / dependencies

- **Loop boundary**: requires loops to run as a sub-orchestration rooted at the loop node so iterations get a clean `execution_id` segment — see separate note.
- **Race winner**: the root→leaves walk needs the race node to record its winning branch (small add if not already present).
- **Representation & upgrade**: `execution_id` ordering can be stored in a comparable form (implementation detail); `status_detail` is a new nullable column and the fenced `UPDATE` is compatible with pre-existing rows.
Loading