340 changes: 340 additions & 0 deletions docs/content/deep-dives/compiler-magic-swc-plugin-blog.mdx


452 changes: 452 additions & 0 deletions docs/content/deep-dives/compiler-magic-swc-plugin-reference.mdx


49 changes: 49 additions & 0 deletions docs/content/deep-dives/compiler-magic-swc-plugin-social.mdx
@@ -0,0 +1,49 @@
---
title: "One File In. Three Bundles Out."
description: A concise explainer on how the Workflow DevKit SWC compiler plugin transforms a single source file into three execution targets — making durable execution feel like writing normal JavaScript.
type: conceptual
summary: The Workflow DevKit SWC plugin takes a single file with "use workflow" and "use step" directives and produces three bundles — step (bodies preserved), workflow (step bodies replaced with WORKFLOW_USE_STEP proxy calls), and client (safe references with workflowId). Stable IDs derived from file path and function name ensure all three modes agree on function identity.
prerequisites:
- /docs/foundations/workflows-and-steps
related:
- /docs/how-it-works/code-transform
- /deep-dives/step-execution-model-reference
- /deep-dives/durability-replay-reference
- /deep-dives/durable-streaming-reference
---

You write one file with two directives. The compiler produces three bundles. That's the trick behind Workflow DevKit's programming model — and it eliminates an entire category of infrastructure code you'd otherwise write by hand.

## The Split

Mark functions with `"use step"` (side effects) or `"use workflow"` (orchestration). The SWC plugin generates three outputs from the same source:

```ts
export async function handleUserSignup(email: string) {
  'use workflow';
  const user = await createUser(email);
  await sendWelcomeEmail(user);
  await sleep('5s');
  const webhook = createWebhook();
  await sendOnboardingEmail(user, webhook.url);
  await webhook;
  console.log('Webhook Resolved');
  return { userId: user.id, status: 'onboarded' };
}

async function createUser(email: string) {
  'use step';
  console.log(`Creating a new user with email: ${email}`);
  return { id: crypto.randomUUID(), email };
}
```

The compiler produces three bundles from this file. The **Step bundle** preserves `createUser`'s body and registers it via `registerStepFunction()`. The **Workflow bundle** replaces `createUser`'s body with a `WORKFLOW_USE_STEP` proxy call — so `await createUser(email)` checks the Event log for a cached `step_completed` result or suspends the workflow if the step hasn't run yet. The **Client bundle** attaches `workflowId` to `handleUserSignup` so `start()` knows which workflow to queue.
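As a rough sketch of what the Workflow-bundle transform amounts to, consider the toy stub below. It is illustrative only: `WORKFLOW_USE_STEP`'s real signature, the Event log plumbing, and the generated code shape are internal to the plugin.

```ts
// Illustrative only: a toy WORKFLOW_USE_STEP mimicking the behavior
// described above. The real runtime's types and signatures differ.
class WorkflowSuspension extends Error {
  constructor(public stepId: string, public args: unknown[]) {
    super(`suspended on ${stepId}`);
  }
}

// Stand-in for the Event log's cache of step_completed results.
const completedSteps = new Map<string, unknown>();

function WORKFLOW_USE_STEP(stepId: string, args: unknown[]): unknown {
  if (completedSteps.has(stepId)) {
    return completedSteps.get(stepId); // replay: return cached result
  }
  throw new WorkflowSuspension(stepId, args); // first run: suspend
}

// Roughly what `createUser` becomes in the Workflow bundle (body removed):
async function createUser(email: string) {
  return WORKFLOW_USE_STEP('step//./workflows/user//createUser', [email]);
}
```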

Each function gets a stable ID derived from the file path and function name — like `step//./workflows/user//createUser`. Same ID in all three modes. Rename the function, and the ID changes everywhere in the next build.
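The format shown above is just path plus function name. A hypothetical helper (not the plugin's actual derivation code) makes the scheme concrete:

```ts
// Hypothetical helper mirroring the ID format shown above;
// the plugin's real derivation logic is internal.
function stepId(filePath: string, fnName: string): string {
  return `step//${filePath}//${fnName}`;
}

// stepId('./workflows/user', 'createUser')
// -> 'step//./workflows/user//createUser'
```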

Here's the key insight: step bodies are excluded from the Workflow bundle because replay must never re-run side effects. When a step hasn't completed yet, the proxy throws a `WorkflowSuspension`. The suspension handler writes a `step_created` event to the Event log (persisting the input) and sends a Queue message to trigger execution. The step handler runs the real function, writes `step_completed`, and re-queues the workflow. On replay, the proxy finds the cached result and returns it immediately. Workflow state is reconstructed by replaying code against the Event log — the VM's deterministic `Math.random()`, `Date.now()`, `crypto.getRandomValues()`, and `crypto.randomUUID()` ensure every replay sees identical values.
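To see why the deterministic intrinsics matter, consider a toy seeded PRNG standing in for the VM's `Math.random()`. This is our own illustration; the runtime's actual seeding strategy is not described here.

```ts
// Toy seeded PRNG (mulberry32) illustrating replay determinism:
// the same seed yields the same sequence on every replay, so code
// paths that branch on "random" values replay identically.
function mulberry32(seed: number): () => number {
  return () => {
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const firstRun = mulberry32(1234);
const replay = mulberry32(1234);
// firstRun() === replay() for every call in sequence
```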

## Why It Matters

No manual file separation. No hand-maintained ID registries. No server-side imports leaking into client bundles. You write ordinary JavaScript, add two directive strings, and the compiler handles the three-way split that makes durable execution possible. One file in, three bundles out.
358 changes: 358 additions & 0 deletions docs/content/deep-dives/cost-model-fluid-compute-blog.mdx


320 changes: 320 additions & 0 deletions docs/content/deep-dives/cost-model-fluid-compute-reference.mdx
@@ -0,0 +1,320 @@
---
title: Cost Model and Fluid Compute
description: How Workflow DevKit eliminates always-on workers by using queue-driven execution, idle-free suspension, and delayed re-enqueue to run workflow logic only when there is work to do.
type: conceptual
summary: A technical reference tracing the runtime control flow that makes workflow execution pay-per-use — from run creation through queue dispatch, suspension, step execution, delayed re-enqueue, and completion.
prerequisites:
- /docs/foundations/workflows-and-steps
related:
- /docs/how-it-works/event-sourcing
- /deep-dives/durability-replay-reference
- /deep-dives/step-execution-model-reference
- /deep-dives/durable-streaming-reference
---

<Callout>
Traditional long-running background jobs keep a process alive for the entire duration of a workflow — even when the workflow is waiting for a step to finish, an external webhook, or a timer to expire. Workflow DevKit inverts this model: the orchestrator runs only when there is a decision to make, suspends by returning from the handler, and wakes up again via queue delivery when new data arrives. The compute cost is proportional to the work performed, not the wall-clock time of the workflow.
</Callout>

## Overview

The cost model comes from three distinct re-entry paths in the runtime:

1. **Queue-driven invocation** — `start()` persists `run_created`, queues the workflow message, and returns a `Run` handle immediately.
2. **External-work suspension** — when orchestration hits a step, the handler persists `step_created`, queues the step, and exits. When orchestration hits a hook, the runtime records hook state durably and exits. The workflow stays suspended until external delivery records `hook_received`.
3. **Timed re-entry** — waits persist `wait_created` and resume through `timeoutSeconds`.

<Callout type="info">
`timeoutSeconds` is not the generic suspension mechanism. In the current runtime, waits return a positive `timeoutSeconds`, and `hook_conflict` returns `timeoutSeconds: 0` for an immediate replay. Step completion resumes the workflow because the step handler writes `step_completed` and explicitly re-queues the workflow.
</Callout>

These three properties mean a workflow that sleeps for a week consumes compute only during the brief moments when it replays state and dispatches or collects results — not for the seven days in between.

## Lifecycle

The following diagram traces a single workflow run from creation through suspension, step execution, timed waits, and completion:

```mermaid
flowchart TD
A["Client calls start()"] --> B["run_created event persisted"]
B --> C["Workflow message queued"]
C --> D["Workflow handler invoked"]
D --> E["run_started event — status: running"]
E --> F["Event log loaded, VM replays workflow code"]
F --> G{"Next pending operation?"}
G -->|"step / hook"| H["step_created / hook_created events persisted"]
H --> I["Step messages queued"]
I --> J["Handler returns — compute released"]
J --> K["Step handler or external delivery completes"]
K --> L["step_completed / hook_received event persisted"]
L --> M["Workflow message re-enqueued"]
M --> D
G -->|"wait / sleep"| N["wait_created event persisted"]
N --> O["Handler returns timeoutSeconds"]
O --> P["Queue delays next delivery"]
P --> D
F -->|"No pending work"| Q["Workflow runs to completion"]
Q --> R["run_completed event persisted"]
```

## Code Walkthrough

### Run creation and initial queue dispatch

When client code calls `start()`, two things happen in sequence: a `run_created` event is persisted to the event log, and a message is placed on the workflow queue. The function returns immediately with a `Run` handle — it does not wait for the workflow to execute.

```ts title="packages/core/src/runtime/start.ts" lineNumbers {29-30,33-39}
// Generate runId client-side so we have it before serialization
const runId = `wrun_${ulid()}`;

// ...

// Create run via run_created event (event-sourced architecture)
const result = await world.events.create(
  runId,
  {
    eventType: 'run_created',
    specVersion,
    eventData: {
      deploymentId: deploymentId,
      workflowName: workflowName,
      input: workflowArguments,
      executionContext: { traceCarrier, workflowCoreVersion },
    },
  },
  { v1Compat }
);

// ...

await world.queue(
  getWorkflowQueueName(workflowName),
  {
    runId,
    traceCarrier,
  } satisfies WorkflowInvokePayload,
  {
    deploymentId,
  }
);

return new Run<TResult>(runId);
```

The call to `world.queue()` is asynchronous but non-blocking from the caller's perspective — `start()` returns the `Run` object as soon as the message is accepted. The workflow handler that processes this message runs in a separate invocation, potentially on a different compute instance.

### Suspension: recording pending work and exiting

When the workflow VM encounters a step, hook, or sleep, it throws a `WorkflowSuspension` — a structured control-flow signal, not an error. The workflow handler in `runtime.ts` catches this, delegates to `handleSuspension`, and returns the result:

```ts title="packages/core/src/runtime.ts" lineNumbers
// WorkflowSuspension is normal control flow — not an error
if (WorkflowSuspension.is(err)) {
  const result = await handleSuspension({
    suspension: err,
    world,
    run: workflowRun,
    span,
    requestId,
  });

  if (result.timeoutSeconds !== undefined) {
    return { timeoutSeconds: result.timeoutSeconds };
  }

  // Suspension handled, no further work needed
  return;
}
```

Inside `handleSuspension`, each pending item is recorded as an event and dispatched:

- **Steps** get a `step_created` event and a queue message to the step handler
- **Hooks** get a `hook_created` event — the workflow stays suspended until external delivery records `hook_received`
- **Waits** get a `wait_created` event with a `resumeAt` timestamp

<Callout type="info">
Hooks are durable suspension points, not queued jobs. The create phase records `hook_created`; the receive phase records `hook_received`. Unlike a step, hook creation does not itself queue executable work.
</Callout>

The handler then calculates the minimum timeout from any pending waits:

```ts title="packages/core/src/runtime/suspension-handler.ts" lineNumbers
// Calculate minimum timeout from waits
const now = Date.now();
const minTimeoutSeconds = waitItems.reduce<number | null>(
  (min, queueItem) => {
    const resumeAtMs = queueItem.resumeAt.getTime();
    const delayMs = Math.max(1000, resumeAtMs - now);
    const timeoutSeconds = Math.ceil(delayMs / 1000);
    if (min === null) return timeoutSeconds;
    return Math.min(min, timeoutSeconds);
  },
  null
);

// ...

if (hasHookConflict) {
  return { timeoutSeconds: 0 };
}

if (minTimeoutSeconds !== null) {
  return { timeoutSeconds: minTimeoutSeconds };
}

return {};
```

Timed waits are the normal source of delayed workflow wake-ups. A step-driven suspension usually returns `{}` from `handleSuspension`; the later wake-up happens when the step handler persists `step_completed` and explicitly re-enqueues the workflow. One edge case exists: hook conflicts force an immediate replay via `timeoutSeconds: 0` so the next invocation can surface `hook_conflict` deterministically.

When `timeoutSeconds` is returned, the queue infrastructure uses it to schedule the next delivery. **The handler exits and the compute is freed.** No process sleeps or polls during the delay.
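The decision logic above can be condensed into a small pure function. The names here are ours, for illustration only; the real handler also persists events and dispatches queue messages along the way.

```ts
// Condensed sketch of the suspension handler's timeout decision.
// Hook conflicts force an immediate replay; waits produce the minimum
// delay; a pure step suspension returns {} and relies on the step
// handler's explicit re-enqueue.
function decideTimeout(opts: {
  waitResumesAt: Date[];
  hasHookConflict: boolean;
  now: number;
}): { timeoutSeconds?: number } {
  if (opts.hasHookConflict) return { timeoutSeconds: 0 };

  let min: number | null = null;
  for (const resumeAt of opts.waitResumesAt) {
    const delayMs = Math.max(1000, resumeAt.getTime() - opts.now);
    const secs = Math.ceil(delayMs / 1000);
    min = min === null ? secs : Math.min(min, secs);
  }
  return min === null ? {} : { timeoutSeconds: min };
}
```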

### Delayed re-enqueue in production (Vercel Queue Service)

On Vercel, the queue handler uses `delaySeconds` to schedule a new message rather than holding the current one:

```ts title="packages/world-vercel/src/queue.ts" lineNumbers
if (typeof result?.timeoutSeconds === 'number') {
  // When timeoutSeconds is 0, skip delaySeconds entirely for immediate re-enqueue.
  // Otherwise, clamp to max delay (23h) - for longer sleeps, the workflow will chain
  // multiple delayed messages until the full sleep duration has elapsed.
  const delaySeconds =
    result.timeoutSeconds > 0
      ? Math.min(result.timeoutSeconds, MAX_DELAY_SECONDS)
      : undefined;

  // Send new message BEFORE acknowledging current message.
  // This ensures crash safety: if process dies after send but before ack,
  // we may get a duplicate invocation but won't lose the scheduled wakeup.
  await queue(queueName, payload, { deploymentId, delaySeconds });
}
```

For sleeps longer than 23 hours (the maximum single-message delay), the system chains messages automatically. Each time the delayed message fires, the workflow handler checks whether `now >= resumeAt`. If the sleep has not elapsed, it returns another `timeoutSeconds` and the cycle repeats.
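The chaining arithmetic can be sketched as a loop over clamped delays. This is our own illustration of the math, not the runtime's code:

```ts
// Sketch of how a long sleep decomposes into chained delayed messages
// under a 23-hour per-message cap, as described above.
const MAX_DELAY_SECONDS = 23 * 60 * 60;

function chainDelays(totalSeconds: number): number[] {
  const delays: number[] = [];
  let remaining = totalSeconds;
  while (remaining > 0) {
    const delay = Math.min(remaining, MAX_DELAY_SECONDS);
    delays.push(delay);
    remaining -= delay;
  }
  return delays;
}

// A 7-day sleep becomes 8 messages: seven 23h delays plus one 7h remainder.
```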

### Delayed re-enqueue in local development

The local queue implements the same contract using `setTimeout`:

```ts title="packages/world-local/src/queue.ts" lineNumbers
if (response.ok) {
  try {
    const timeoutSeconds = Number(JSON.parse(text).timeoutSeconds);
    if (Number.isFinite(timeoutSeconds) && timeoutSeconds >= 0) {
      if (timeoutSeconds > 0) {
        const timeoutMs = Math.min(
          timeoutSeconds * 1000,
          MAX_SAFE_TIMEOUT_MS
        );
        await setTimeout(timeoutMs);
      }
      continue;
    }
  } catch {}
  return;
}
```

The local queue keeps the message in-process and uses a `setTimeout` to delay the next loop iteration. This simulates the production behavior where the message is invisible for the delay period.

### Step completion triggers workflow re-invocation

When a step finishes — whether successfully or with a terminal failure — the step handler re-enqueues the workflow so it can replay with the new result:

```ts title="packages/core/src/runtime/step-handler.ts" lineNumbers
if (EntityConflictError.is(err)) {
  runtimeLogger.debug(
    'Step in terminal state, re-enqueuing workflow',
    {
      stepName,
      stepId,
      workflowRunId,
      error: err.message,
    }
  );

  await queueMessage(world, getWorkflowQueueName(workflowName), {
    runId: workflowRunId,
    traceCarrier: await serializeTraceCarrier(),
    requestedAt: new Date(),
  });

  return;
}
```

This is the mechanism that drives the workflow forward without a persistent orchestrator process. Each step completion is a discrete event that triggers exactly one workflow re-invocation.

### Run completion

When the workflow VM runs to the end of the function without throwing a `WorkflowSuspension`, the handler persists a `run_completed` event and returns without re-enqueueing:

```ts title="packages/core/src/runtime.ts" lineNumbers
await world.events.create(
  runId,
  {
    eventType: 'run_completed',
    specVersion: SPEC_VERSION_CURRENT,
    eventData: {
      output: workflowResult,
    },
  },
  { requestId }
);
```

No further messages are queued. The workflow is done, and no compute resources remain allocated.

## Why This Matters

The queue-driven execution model means workflow compute is consumed only during active processing:

- **No always-on worker loop** — workflow handlers are invoked on demand by queue messages. Between invocations, no process exists. This is fundamentally different from traditional job runners that poll a database or maintain long-lived connections.
- **Wall-clock time is free** — a `sleep('7d')` call costs the same as a `sleep('5s')` call in terms of compute. Both produce a delayed queue message and release all resources immediately.
- **Replay is cheap** — re-executing the workflow VM to reconstruct state takes milliseconds for typical workflows. Step results are cached in the event log, so replayed steps return instantly without re-executing their bodies.
- **Parallel steps share nothing** — `Promise.all([stepA(), stepB()])` dispatches both steps as independent queue messages. Each step runs in its own invocation with full Node.js access, and both can execute concurrently on separate compute instances.
- **Crash safety without cost** — if a handler crashes mid-execution, the queue automatically re-delivers the message. The event log ensures that any work already persisted is not repeated. Recovery is a normal re-invocation, not a special monitoring process.
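The parallel-dispatch point can be simulated in a few lines. Everything below is an illustrative stand-in, not the runtime's real API: the key property is that each uncached step call is recorded synchronously, so a single suspension can queue both steps at once.

```ts
// Minimal simulation of parallel step dispatch: each uncached step call
// registers a pending item synchronously, so one replay records both.
const pendingSteps: string[] = [];
const stepCache = new Map<string, unknown>();

function useStep(id: string): Promise<unknown> {
  if (stepCache.has(id)) return Promise.resolve(stepCache.get(id));
  pendingSteps.push(id); // will become its own queue message
  return new Promise(() => {}); // never resolves: the workflow suspends
}

function replayWorkflow(): Promise<unknown[]> {
  // Mirrors `Promise.all([stepA(), stepB()])` in a workflow body.
  return Promise.all([useStep('stepA'), useStep('stepB')]);
}

replayWorkflow();
// pendingSteps now holds both IDs: two independent queue messages.
```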

<Callout type="info">
The cost characteristics described here are a consequence of the queue and suspension mechanics, not a separately configurable feature. Any deployment target that provides queue-based message delivery with delay support (such as Vercel Queue Service in production, or the local filesystem queue in development) inherits these properties automatically.
</Callout>

<Callout type="info">
Workflow execution is event-driven. The `Run.returnValue` convenience getter is separate client-side behavior and currently polls run state once per second.
</Callout>

```ts title="packages/core/src/runtime/run.ts" lineNumbers
private async pollReturnValue(): Promise<TResult> {
  while (true) {
    try {
      const run = await this.world.runs.get(this.runId);

      if (run.status === 'completed') {
        const encryptionKey = await this.getEncryptionKey();
        return await hydrateWorkflowReturnValue(
          run.output,
          this.runId,
          encryptionKey
        );
      }

      if (run.status === 'cancelled') {
        throw new WorkflowRunCancelledError(this.runId);
      }

      if (run.status === 'failed') {
        throw new WorkflowRunFailedError(this.runId, run.error);
      }

      throw new WorkflowRunNotCompletedError(this.runId, run.status);
    } catch (error) {
      if (WorkflowRunNotCompletedError.is(error)) {
        await new Promise((resolve) => setTimeout(resolve, 1_000));
        continue;
      }
      throw error;
    }
  }
}
```