opaque() — Workflow-side handles for non-serializable step-side values #1957
TooTallNate announced in RFCs
Summary
This RFC proposes a new Workflow SDK primitive, opaque(factory), that lets workflow code construct handles to values that don't yet exist: values whose actual instantiation happens later, on the step side, when the handle is passed into a step. The factory closure runs in the step bundle, can use full Node.js, and produces a value whose return type is allowed to be non-serializable (DB connections, AI SDK model instances, NestJS @Injectable() providers, file handles, …).
The workflow VM never executes the factory and never observes the materialized value; it only forwards the handle. Existing closure-variable plumbing is reused for the factory's serializable inputs; existing WORKFLOW_USE_STEP-style globals and registries are reused for the factory's per-bundle dispatch.
This is a new primitive, not a change to existing semantics. It composes with "use step" rather than replacing it.
Motivation
Workflow today has a strict invariant: anything that crosses the workflow→step boundary must be serializable. This is correct and load-bearing — it's what makes step input/output reproducible across replays, encryptable at rest, and durable.
But there's a recurring class of values that should exist on the step side and should be parameterized by workflow-side data, yet have no useful serialized representation. Today, supporting them requires hand-written wrapper hacks. Three concrete examples:
Example 1: AI SDK model providers (status quo)
packages/ai/src/providers/google.ts is literally a function that returns a thunk. The user's workflow code calls google('gemini-2.5-flash') and receives that thunk. They thread it into DurableAgent, which is hand-coded to know it received a thunk and to call it on the step side. The "type" is () => Promise<LanguageModel> rather than LanguageModel, leaking the implementation detail to every consumer. Every new provider re-implements the same shape.
Example 2: NestJS @Injectable() tools (#1865)
The user wants to define a tool on a Nest provider class such as ReadFileTool, with a step body that uses this. #1935 fixed the lexical-this capture so that this pattern compiles correctly, but the runtime requirement (that the ReadFileTool instance survive the workflow→step boundary) forces the user to either (a) write WORKFLOW_SERIALIZE/WORKFLOW_DESERIALIZE for every Nest provider used as a tool host (often impossible, since those instances hold DB connections, request-scoped state, etc.) or (b) write a ModuleRef bridge by hand. What they actually want is "on the step side, give me whatever Nest's container would hand me right now."
Example 3: request-scoped resources
A workflow needs a pg.Client per step invocation: not durable, not serializable, but parameterized by a connection string from the workflow context. Today the user writes a step that opens the connection inside its body and threads results back as serializable scalars. That works for one-call patterns but breaks down when the resource needs to be passed to multiple cooperating steps as a single logical session.
The proposal
User-facing API
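A minimal sketch of what a call site might look like. The opaque() defined here is a local stand-in, not the real primitive: under the proposal it would be imported from workflow and the compiler would strip the factory body from the workflow bundle. ApiClient and makeClient are hypothetical names used only for illustration.

```typescript
// Local stand-in for the proposed primitive (assumption). Under the RFC,
// opaque() would be imported from "workflow" and the compiler would strip the
// factory body from the workflow bundle; here the handle simply carries it.
interface Opaque<T> {
  readonly __factory: () => T;
}

function opaque<T>(factory: () => T): Opaque<T> {
  // Workflow side: wrap the factory, never call it.
  return { __factory: factory };
}

// Hypothetical non-serializable resource.
class ApiClient {
  apiKey: string;
  userId: string;
  constructor(apiKey: string, userId: string) {
    this.apiKey = apiKey;
    this.userId = userId;
  }
}

let factoryRuns = 0;

// Workflow-side code: apiKey and userId are ordinary serializable values
// captured by the factory closure.
function makeClient(apiKey: string, userId: string): Opaque<ApiClient> {
  return opaque(() => {
    factoryRuns += 1;
    return new ApiClient(apiKey, userId);
  });
}

const handle = makeClient("key-123", "user-42");
console.log(factoryRuns); // 0: creating the handle did not run the factory

// Step side (simulated): only here is the factory invoked.
const client = handle.__factory();
console.log(client.userId, factoryRuns); // user-42 1
```

The point of the sketch is the ordering: the workflow holds a handle built from serializable inputs, and the non-serializable ApiClient only comes into existence once something on the step side asks for it.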
The compiler:
- Hoists the factory body to module scope (as with 'use step' hoisting today).
- Replaces the opaque(() => …) call site with a workflow-bundle handle constructor: __createOpaque(opaqueId, () => ({ apiKey, userId })). The factory body is not included in the workflow bundle; the same DCE pass that already strips step bodies from the workflow bundle covers this for free.
- Registers the factory under Symbol.for("@workflow/core//registeredOpaques") (mirroring registeredSteps).
The runtime:
- __createOpaque returns a sentinel object { __opaqueId, __closureVarsFn }. Calling it has no side effects, no event log entry, no replay risk.
- An Opaque reducer on the workflow side reads __opaqueId and the materialized closure vars, and emits { opaqueId, closureVars }. A matching reviver on the step side looks up the factory by opaqueId, calls it with the closure vars in scope, and substitutes the materialized value.
Type story
A step function declared as (model: LanguageModel) => Promise<string> accepts both LanguageModel (from non-workflow code) and Opaque<LanguageModel> (from workflow code) as its argument, because the runtime substitutes the materialized value at the deserialization boundary. Concrete typing of this, likely via a mapped type that recurses over the args list, is a follow-up; the basic shape is "the workflow VM holds Opaque<T>, the step body sees T."
Compiler changes (@workflow/swc-plugin)
This reuses existing infrastructure rather than adding a parallel path:
| 'use step' arrow (today) | opaque(() => ...) (proposed) |
| --- | --- |
| … | === opaque (imported from workflow) |
| _anonymousStep<N> at module scope | _opaque<N> at module scope |
| ClosureVariableCollector over the body | … |
| globalThis[Symbol.for("WORKFLOW_USE_STEP")](id, closureFn) | globalThis[Symbol.for("WORKFLOW_CREATE_OPAQUE")](id, closureFn) |
| Symbol.for("@workflow/core//registeredSteps") | Symbol.for("@workflow/core//registeredOpaques") |
| steps map | opaques map |

Runtime changes (@workflow/core)
Two new pieces:
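One plausible reading of those two pieces, sketched under assumptions: a step-side registerOpaque that stores hoisted factories under the registry symbol, and a workflow-side createOpaque that returns the sentinel. The function names registerOpaque, createOpaque, reduceOpaque and reviveOpaque, the example ID, and the convention of passing closure vars to the factory as an argument are all hypothetical; only the registry symbol key and the sentinel field names come from the text above. A simplified reducer/reviver is inlined so the round trip is visible.

```typescript
type ClosureVars = Record<string, unknown>;
type Factory = (vars: ClosureVars) => unknown;

// Registry key taken from the RFC text; everything else is illustrative.
const REGISTRY = Symbol.for("@workflow/core//registeredOpaques");
const store = globalThis as unknown as Record<symbol, Map<string, Factory> | undefined>;

// Piece 1 (step bundle): register a hoisted factory under its ID.
function registerOpaque(id: string, factory: Factory): void {
  const map = store[REGISTRY] ?? new Map<string, Factory>();
  store[REGISTRY] = map;
  map.set(id, factory);
}

// Piece 2 (workflow VM): build the sentinel; the factory is never touched.
interface OpaqueSentinel {
  __opaqueId: string;
  __closureVarsFn: () => ClosureVars;
}

function createOpaque(id: string, closureVarsFn: () => ClosureVars): OpaqueSentinel {
  return { __opaqueId: id, __closureVarsFn: closureVarsFn };
}

// Simplified reducer: sentinel -> serializable payload.
function reduceOpaque(h: OpaqueSentinel): { opaqueId: string; closureVars: ClosureVars } {
  return { opaqueId: h.__opaqueId, closureVars: h.__closureVarsFn() };
}

// Simplified reviver: payload -> materialized value, looked up in the registry.
function reviveOpaque(payload: { opaqueId: string; closureVars: ClosureVars }): unknown {
  const factory = store[REGISTRY]?.get(payload.opaqueId);
  if (!factory) throw new Error(`Unknown opaque "${payload.opaqueId}"`);
  return factory(payload.closureVars);
}

// Round trip: the wire format is plain JSON; the Map only ever exists step-side.
registerOpaque("example//_opaque0", (vars) => new Map([["apiKey", vars.apiKey]]));
const sentinel = createOpaque("example//_opaque0", () => ({ apiKey: "key-123" }));
const wire = JSON.stringify(reduceOpaque(sentinel));
const revived = reviveOpaque(JSON.parse(wire)) as Map<string, unknown>;
console.log(revived.get("apiKey")); // key-123
```

Only the ID and the closure variables cross the boundary here, which is why the materialized value itself never needs a serialized form.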
Plus a reducer/reviver pair that mirrors getStepFunctionReducer/getStepFunctionReviver exactly: same closureVars propagation, same property-presence-vs-truthiness handling.
The workflow-VM-side createOpaque is implemented in core and registered on vmGlobalThis[WORKFLOW_CREATE_OPAQUE] next to useStep/createHook/sleep. It returns a sentinel and never invokes the factory.
Worked examples
AI SDK provider — what it'd look like after migration
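A minimal sketch of the migrated shape, under several assumptions: LanguageModel is reduced to a two-member interface, opaque() is a local stand-in, and wrapStep is a hypothetical helper that plays the role of the runtime's deserialization boundary. It handles a single step argument for brevity; the type story above suggests a mapped type over the whole argument list instead.

```typescript
// Simplified stand-in (assumption): the real LanguageModel comes from the AI SDK.
interface LanguageModel {
  modelId: string;
  generate(prompt: string): Promise<string>;
}

// Local stand-in for the proposed primitive and its handle type.
const OPAQUE = Symbol("opaque");
type Opaque<T> = { readonly [OPAQUE]: () => T };

function opaque<T>(factory: () => T): Opaque<T> {
  return { [OPAQUE]: factory };
}

function isOpaque(value: unknown): value is Opaque<unknown> {
  return typeof value === "object" && value !== null && OPAQUE in value;
}

// Hypothetical boundary helper: the caller may pass T or Opaque<T>,
// the step body always receives T.
function wrapStep<T, R>(step: (arg: T) => Promise<R>) {
  return (arg: T | Opaque<T>): Promise<R> => {
    const value = (isOpaque(arg) ? arg[OPAQUE]() : arg) as T;
    return step(value);
  };
}

// Provider after migration: returns a handle, not a () => Promise<LanguageModel> thunk.
function google(modelId: string): Opaque<LanguageModel> {
  return opaque(() => ({
    modelId,
    generate: async (prompt: string) => `[${modelId}] ${prompt}`,
  }));
}

// The step is declared against LanguageModel only.
const greet = wrapStep(async (model: LanguageModel) => model.generate("hello"));

const greeting = greet(google("gemini-2.5-flash"));
greeting.then((text) => console.log(text)); // [gemini-2.5-flash] hello
```

The step signature mentions only LanguageModel, so the same step would also accept a plain model object passed from non-workflow code.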
DurableAgent no longer needs to know about thunks. Its model argument's type is LanguageModel, full stop, and any Opaque<LanguageModel> flows in transparently.
NestJS: what @workflow/nest could ship
@workflow/nest could ship an injectProvider(token) helper backed by ModuleRef (see Phase 4), and user code would call it from the workflow. The @nestjs/core import never enters the workflow bundle (the factory body lives in the step bundle); the provider instance is never serialized.
Replay determinism
The factory body executes in the step environment with full Node.js access; it is not seeded, frozen-time, or otherwise constrained. This is the same posture as 'use step' bodies, but there's a subtle difference worth calling out: an opaque(...) call site in workflow code looks like an ordinary expression (it returns synchronously, it's not awaited), so users may not realize the factory body is non-deterministic.
The actual determinism guarantee is: the handle's identity in the workflow event log is fully determined by (opaqueId, materialized closure vars), both of which are deterministic on the workflow side. Whatever the factory does with that input on the step side is the user's responsibility, in the same way that 'use step' body content is.
We will document this with a recommended pattern: use opaque() for resource construction, not for state. If you need determinism over the value itself, return it from a regular 'use step' instead.
Identity and caching
Each evaluation of an opaque(...) call site in the workflow VM produces a new handle. Each handle, when crossed into a step bundle, is materialized fresh; there's no cross-step memoization. Within a single step invocation, if the same handle appears more than once in the step's inputs, the materialization is cached for that step's lifetime (so two Opaque<DbClient> arguments backed by the same handle don't open two connections).
The workflow VM never sees the materialized value, so handle equality is defined as object identity of the sentinel: a === b is true iff they came from the same opaque(...) evaluation. Two separate opaque(() => x) calls produce distinct handles even if the factory bodies look identical.
Lifecycle and cleanup
An Opaque<T> materialized inside a step is alive for that step's invocation only. The step bundle does not retain materialized values across steps; each step that receives the handle gets its own materialization.
If the factory needs cleanup (closing a DB connection, releasing a file handle), the factory can return an object with a cleanup method that the step body calls, or we can extend the API with an explicit disposable form in a follow-up:
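A rough sketch of what such a disposable form might look like, using the opaque({ create, dispose }) shape floated in the open questions. The runStepWith helper is a hypothetical stand-in for the step runtime, and the fake connection object is illustrative only; the point is that dispose runs when the step invocation ends, whether the body succeeds or throws.

```typescript
// Hypothetical disposable handle shape: opaque({ create, dispose }).
interface DisposableOpaque<T> {
  create: () => T | Promise<T>;
  dispose: (value: T) => void | Promise<void>;
}

function opaque<T>(spec: DisposableOpaque<T>): DisposableOpaque<T> {
  return spec; // workflow side: nothing is created yet
}

// Stand-in for the step runtime: materialize, run the body, always clean up.
async function runStepWith<T, R>(
  handle: DisposableOpaque<T>,
  body: (value: T) => Promise<R>,
): Promise<R> {
  const value = await handle.create();
  try {
    return await body(value);
  } finally {
    await handle.dispose(value);
  }
}

// Example resource: a fake connection that records when it is opened and closed.
const events: string[] = [];
const db = opaque({
  create: () => {
    events.push("open");
    return { query: (sql: string) => `rows for ${sql}` };
  },
  dispose: () => {
    events.push("close");
  },
});

const result = runStepWith(db, async (conn) => conn.query("SELECT 1"));
result.then((rows) => console.log(rows, events));
```

Because cleanup sits in a finally block, the step body does not need to remember to release the resource itself.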
Out of scope for v1.
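The identity and lifecycle rules above can be sketched as a cache that lives for exactly one step invocation. This is an illustration under assumptions: Handle, StepScope and materialize are hypothetical names, serialization across the boundary is ignored, and a WeakMap keyed by the handle object stands in for whatever the reviver would actually use to recognize a repeated handle.

```typescript
// Hypothetical handle: identity is the object itself, matching the a === b rule.
interface Handle<T> {
  factory: () => T;
}

function opaque<T>(factory: () => T): Handle<T> {
  return { factory };
}

// One StepScope per step invocation; its cache is dropped when the step ends.
class StepScope {
  private cache = new WeakMap<object, unknown>();

  materialize<T>(handle: Handle<T>): T {
    if (!this.cache.has(handle)) {
      this.cache.set(handle, handle.factory());
    }
    return this.cache.get(handle) as T;
  }
}

let connectionsOpened = 0;
const dbHandle = opaque(() => ({ id: ++connectionsOpened }));

// Step 1 receives the same handle twice: one materialization.
const step1 = new StepScope();
const a = step1.materialize(dbHandle);
const b = step1.materialize(dbHandle);
console.log(a === b, connectionsOpened); // true 1

// Step 2 receives the handle again: a fresh materialization.
const step2 = new StepScope();
const c = step2.materialize(dbHandle);
console.log(a === c, connectionsOpened); // false 2

// A second opaque(...) evaluation is a distinct handle, even with the same body.
const otherHandle = opaque(() => ({ id: ++connectionsOpened }));
console.log(dbHandle === otherHandle); // false
```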
Boundary rules
These rules are enforced by the runtime; violations throw with a clear error message:
- No await opaque(...) or .value accessor. The handle is opaque to the workflow.
- If a workflow returns an Opaque<T> (or includes one in its return value), the run fails with a serialization error explaining the constraint. The client bundle has no factory map.
- Factory bodies cannot reference arguments/this: there is no enclosing function to inherit from once hoisted. Same constraint as nested step bodies (and the SWC plugin emits the same diagnostics).
Open questions
- Naming. opaque() is descriptive but generic. Alternatives: lazy(), deferred(), forStep(), stepValue(), bindRef(). forStep(() => google('gemini-2.5-flash')) reads especially well at use sites.
- Migration of existing AI provider thunks. Do we keep the current () => Promise<LanguageModel> public API and use opaque() internally, or do we make Opaque<LanguageModel> the new public shape? The latter is a breaking change for any user who has hand-written a custom provider.
- Disposable form for v1? opaque({ create, dispose }) covers a lot of real use cases (DB clients, file handles) and is a one-line addition. Argument for shipping it now: deferring it lets users build patterns around the simple form that are awkward to retrofit.
- First-class observability. Should each materialization be visible in the run timeline (similar to step_* events)? Argument for: makes "why did my step receive a stale model?" debuggable. Argument against: high-volume cases (per-step DB client materialization) would flood the event log. Likely answer: opt-in via a third argument or a separate observableOpaque(...) helper.
- Recursive Opaque composition. v1 forbids it. Is there a real use case that justifies lifting this? The natural shape would be opaque(() => ({ db: opaqueDb, agent: opaqueAgent })), which is solvable by recursive resolution in the reviver, but the closure-var serializer would need to know not to evaluate the inner __closureVarsFn early.
- Relationship to "use step" factory functions. Should we eventually reframe 'use step' arrows that return non-serializable values as desugaring to opaque(...)? The mental model would be cleaner ('use step' always returns a serializable value, full stop), but it's a substantial rework of existing code.
- Step input encryption. The closure variables of an opaque(...) flow through the same encryption path as step arguments today. Is there ever a case where the factory itself (not its inputs) needs to be encrypted? I don't think so (the factory body lives in the step bundle and is identified by its registered ID, not transmitted over the wire), but worth confirming.
Implementation phases
- Phase 1: Core primitive. SWC plugin transformation, WORKFLOW_CREATE_OPAQUE global, registeredOpaques registry, reducer/reviver pair, manifest field, end-to-end test that round-trips a non-serializable handle through a step.
- Phase 2: Diagnostics. Compile-time check that the opaque(...) argument is a function literal (not an arbitrary expression). Runtime errors for the boundary rules above with span pointers back to source. TypeScript type for Opaque<T> and the helper that maps step argument types.
- Phase 3: AI provider migration. Convert packages/ai/src/providers/* to use opaque() internally; expose Opaque<LanguageModel> as the new public type alongside (or replacing) the thunk shape.
- Phase 4: @workflow/nest integration. Ship an injectProvider(token) helper backed by ModuleRef. Document the Nest end-to-end story (#1865).
- Phase 5 (optional): Disposable form. opaque({ create, dispose }).
Prior art
… packages/ai/src/providers/.
Refs: #1865, #1935.