Commit b7deccb

Pin in-flight workflow runs to recorded definition fingerprint (#428)

Closes #428. v2 records `workflow_definition_fingerprint` in every `WorkflowStarted` event, but the engine never reads it when picking the class to execute: `TypeRegistry::resolveWorkflowClass()` always resolves from the live `workflow_runs.workflow_class` column. When a deploy promotes a new class under the same `workflow_type` while a run is parked on a signal/timer, the replayed run silently starts executing the new class's code path against the existing history.

Adds a reverse-lookup registry on `WorkflowDefinition` and a fingerprint-aware resolver on `WorkflowDefinitionFingerprint`:

- `WorkflowDefinition::fingerprint()` now also populates `$classesByFingerprint` so any class whose fingerprint has been computed is resolvable by its recorded hash.
- `WorkflowDefinition::findClassByFingerprint()` is the explicit reverse lookup; it returns null when no such class is known in the current process.
- `WorkflowDefinitionFingerprint::resolveClassForRun()` prefers the class whose source fingerprint matches the run's `WorkflowStarted` payload. It falls back to `TypeRegistry::resolveWorkflowClass()` for legacy runs (no fingerprint), for the fast-path match (recorded equals current), and for post-deploy runs whose old class is no longer loadable. `RunDetailView` surfaces the fingerprint drift either way.

Wired into the two behavior-pinning call sites:

- `WorkflowExecutor::run()` (task execution)
- `QueryStateReplayer::replayState()` (query/memo replay)

Controlled by `workflows.v2.compatibility.pin_to_recorded_fingerprint` (env `WORKFLOW_V2_PIN_TO_RECORDED_FINGERPRINT`), default `true`. Deploys that intentionally hot-swap definitions can opt out by setting it to `false`, which restores the pre-#428 behavior.

Also tightens the `VersionCall` dispatcher so `ensureStepHistoryCompatible` runs only on paths that actually advance the workflow sequence. The legacy-default path records no marker and does not advance, so the next yielded step (activity, timer, child, …) legitimately occupies the same sequence slot. Under the old ordering, a legacy run that yielded `getVersion` followed by `activity()` would record activity events at the same sequence and fail the subsequent replay with `HistoryEventShapeMismatchException` at the version-marker step.

Unskips both V2VersionWorkflowTest tests tracked in #428:

- testSameCompatibilityUsesRecordedDefinitionFingerprint…
- testOlderCompatibilityFallsBackToDefaultVersionWithoutRecordingMarker…

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 8295e60 commit b7deccb

6 files changed

Lines changed: 154 additions & 12 deletions

src/V2/Support/QueryStateReplayer.php

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ public function replayState(WorkflowRun $run): ReplayState
             'childLinks.childRun.historyEvents',
         ]);

-        $workflowClass = TypeRegistry::resolveWorkflowClass($run->workflow_class, $run->workflow_type);
+        $workflowClass = WorkflowDefinitionFingerprint::resolveClassForRun($run);
         $workflow = new $workflowClass($run);
         $this->syncWorkflowCursor($workflow, 1);
         $entryMethod = EntryMethod::forWorkflow($workflow);

src/V2/Support/WorkflowDefinition.php

Lines changed: 45 additions & 1 deletion
@@ -19,6 +19,24 @@ final class WorkflowDefinition
      */
     private static array $fingerprints = [];

+    /**
+     * Reverse index of {@see $fingerprints}: fingerprint hash → class-string.
+     *
+     * Populated lazily by {@see fingerprint()} so any workflow class whose
+     * fingerprint has been computed is resolvable by its recorded hash.
+     * Used by {@see WorkflowDefinitionFingerprint::resolveClassForRun()} to
+     * pin in-flight runs to the definition snapshot captured in their
+     * WorkflowStarted history event, even after `workflow_runs.workflow_class`
+     * has been updated to point at a newer class for the same workflow_type.
+     *
+     * When two registered classes compute the same fingerprint (they would
+     * produce identical replay behavior by definition), the first-registered
+     * class wins — subsequent identical-fingerprint registrations are ignored.
+     *
+     * @var array<string, class-string>
+     */
+    private static array $classesByFingerprint = [];
+
     /**
      * @var array<class-string, list<string>>
      */
@@ -304,14 +322,40 @@ public static function fingerprint(string $class): ?string
             self::collectFingerprintSources(new ReflectionClass($class), $sources, $seen);
             ksort($sources);

-            self::$fingerprints[$class] = $sources === []
+            $fingerprint = $sources === []
                 ? null
                 : 'sha256:' . hash('sha256', json_encode($sources, JSON_THROW_ON_ERROR));
+
+            self::$fingerprints[$class] = $fingerprint;
+
+            if ($fingerprint !== null && ! array_key_exists($fingerprint, self::$classesByFingerprint)) {
+                self::$classesByFingerprint[$fingerprint] = $class;
+            }
         }

         return self::$fingerprints[$class];
     }

+    /**
+     * Reverse lookup: return the workflow class whose source fingerprint
+     * matches `$fingerprint`, or null if no such class has been seen in the
+     * current process.
+     *
+     * The reverse index is populated lazily by {@see fingerprint()}, so this
+     * resolver only sees classes whose fingerprint has already been computed.
+     * Callers that want to pin an in-flight run to the class that produced its
+     * `WorkflowStarted` fingerprint should warm the index by calling
+     * `fingerprint()` for every candidate class at boot — the current run's
+     * `workflow_class` plus every configured class under
+     * `workflows.v2.types.workflows` is a good seed.
+     *
+     * @return class-string|null
+     */
+    public static function findClassByFingerprint(string $fingerprint): ?string
+    {
+        return self::$classesByFingerprint[$fingerprint] ?? null;
+    }
+
     /**
      * @param class-string $class
      * @return list<string>
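
The reverse index added in this file can be sketched in isolation. This is a minimal stand-in rather than the package's code: `FingerprintIndex`, its `register()` method, and the workflow class names are hypothetical, and the fingerprint is computed from a caller-supplied source map instead of via reflection. It shows the lazy population and the first-registered-wins rule described in the docblock.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the forward and reverse caches on WorkflowDefinition.
final class FingerprintIndex
{
    /** @var array<string, ?string> class name => fingerprint */
    private array $fingerprints = [];

    /** @var array<string, string> fingerprint => class name */
    private array $classesByFingerprint = [];

    /**
     * @param array<string, string> $sources file name => source text (assumed input shape)
     */
    public function register(string $class, array $sources): ?string
    {
        if (! array_key_exists($class, $this->fingerprints)) {
            ksort($sources);

            $fingerprint = $sources === []
                ? null
                : 'sha256:' . hash('sha256', json_encode($sources, JSON_THROW_ON_ERROR));

            $this->fingerprints[$class] = $fingerprint;

            // First-registered class wins; later identical fingerprints are ignored.
            if ($fingerprint !== null && ! array_key_exists($fingerprint, $this->classesByFingerprint)) {
                $this->classesByFingerprint[$fingerprint] = $class;
            }
        }

        return $this->fingerprints[$class];
    }

    public function findClassByFingerprint(string $fingerprint): ?string
    {
        return $this->classesByFingerprint[$fingerprint] ?? null;
    }
}

// Two classes with identical sources share a fingerprint; a changed source does not.
$index = new FingerprintIndex();
$v1 = $index->register('OrderWorkflowV1', ['OrderWorkflow.php' => 'yield activity("charge");']);
$twin = $index->register('OrderWorkflowCopy', ['OrderWorkflow.php' => 'yield activity("charge");']);
$v2 = $index->register('OrderWorkflowV2', ['OrderWorkflow.php' => 'yield timer(60);']);
```

A fingerprint that was never registered resolves to null, which is why the docblock recommends warming the index at boot for every candidate class.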

src/V2/Support/WorkflowDefinitionFingerprint.php

Lines changed: 73 additions & 0 deletions
@@ -8,6 +8,7 @@
 use Workflow\V2\Enums\HistoryEventType;
 use Workflow\V2\Models\WorkflowHistoryEvent;
 use Workflow\V2\Models\WorkflowRun;
+use Workflow\V2\Workflow;

 final class WorkflowDefinitionFingerprint
 {
@@ -41,6 +42,78 @@ public static function matchesCurrent(WorkflowRun $run): ?bool
         return hash_equals($recorded, $current);
     }

+    /**
+     * Resolve the workflow class to use for a given in-flight run, preferring
+     * the class that matches the `workflow_definition_fingerprint` recorded
+     * in the run's `WorkflowStarted` history event.
+     *
+     * This keeps a run pinned to the definition it started under when a
+     * deploy has promoted a new class under the same `workflow_type` while
+     * the run is parked on a signal/timer. Without pinning the engine picks
+     * the new class from `workflow_runs.workflow_class` and runs the wrong
+     * code path against the existing history.
+     *
+     * Resolution order:
+     *   1. Fast path — if no fingerprint was recorded (legacy run), or the
+     *      recorded fingerprint equals the current class's fingerprint, or
+     *      pinning is disabled via config, fall back to
+     *      {@see TypeRegistry::resolveWorkflowClass()}.
+     *   2. Reverse-lookup — ask the definition registry for the class whose
+     *      source fingerprint matches the recorded hash. Requires the class
+     *      to have been seen by {@see WorkflowDefinition::fingerprint()} in
+     *      the current process.
+     *   3. Fall back — if the registry has no match for the recorded
+     *      fingerprint, resolve via {@see TypeRegistry::resolveWorkflowClass()}
+     *      so the run still makes progress rather than failing hard. The
+     *      `RunDetailView` fingerprint-drift signal still surfaces the
+     *      mismatch to operators.
+     *
+     * Controlled by `workflows.v2.compatibility.pin_to_recorded_fingerprint`
+     * (default `true`). Set to `false` to restore hot-swap behavior.
+     *
+     * @return class-string<Workflow>
+     */
+    public static function resolveClassForRun(WorkflowRun $run): string
+    {
+        $fallbackClass = TypeRegistry::resolveWorkflowClass($run->workflow_class, $run->workflow_type);
+
+        if (! self::pinningEnabled()) {
+            return $fallbackClass;
+        }
+
+        $recorded = self::recordedForRun($run);
+
+        if ($recorded === null) {
+            return $fallbackClass;
+        }
+
+        // Warm the reverse index for the fallback class so repeated calls
+        // for a run whose fingerprint matches the current class stay on the
+        // fast path (O(1) hash-equals) instead of running a registry lookup.
+        $currentFingerprint = WorkflowDefinition::fingerprint($fallbackClass);
+
+        if ($currentFingerprint !== null && hash_equals($recorded, $currentFingerprint)) {
+            return $fallbackClass;
+        }
+
+        $pinnedClass = WorkflowDefinition::findClassByFingerprint($recorded);
+
+        if ($pinnedClass !== null && is_subclass_of($pinnedClass, Workflow::class)) {
+            return $pinnedClass;
+        }
+
+        return $fallbackClass;
+    }
+
+    private static function pinningEnabled(): bool
+    {
+        $configured = function_exists('config')
+            ? config('workflows.v2.compatibility.pin_to_recorded_fingerprint', true)
+            : true;
+
+        return (bool) $configured;
+    }
+
     private static function workflowStartedEvent(WorkflowRun $run): ?WorkflowHistoryEvent
     {
         if ($run->relationLoaded('historyEvents')) {
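
The resolution order documented on `resolveClassForRun()` can be modelled without the framework. The function below is a hypothetical, dependency-free sketch: `resolveClassSketch()` and its parameters are illustrative names, a plain array stands in for the definition registry, and the `is_subclass_of` guard from the real method is omitted.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical model of the three-step resolution order.
 *
 * @param ?string               $recorded       fingerprint from WorkflowStarted, null for legacy runs
 * @param string                $fallbackClass  class the type registry would resolve today
 * @param array<string, string> $fingerprintOf  class => fingerprint (stand-in for the registry)
 * @param bool                  $pinningEnabled stand-in for the config flag
 */
function resolveClassSketch(
    ?string $recorded,
    string $fallbackClass,
    array $fingerprintOf,
    bool $pinningEnabled = true
): string {
    // Step 1a: pinning disabled or legacy run without a recorded fingerprint.
    if (! $pinningEnabled || $recorded === null) {
        return $fallbackClass;
    }

    // Step 1b: recorded fingerprint already matches the live class.
    $current = $fingerprintOf[$fallbackClass] ?? null;

    if ($current !== null && hash_equals($recorded, $current)) {
        return $fallbackClass;
    }

    // Step 2: reverse lookup; array_search returns the first matching key or false.
    $pinned = array_search($recorded, $fingerprintOf, true);

    // Step 3: no known class for that fingerprint, so keep the run moving.
    return $pinned === false ? $fallbackClass : (string) $pinned;
}

$known = ['OrderV1' => 'sha256:aaa', 'OrderV2' => 'sha256:bbb'];

// A run started under OrderV1 stays on OrderV1 after the type now points at OrderV2.
$pinnedRun = resolveClassSketch('sha256:aaa', 'OrderV2', $known);
```

Step 3 trades strictness for availability: a run whose original class is gone still executes on the live class, and the drift is left for operators to notice.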

src/V2/Support/WorkflowExecutor.php

Lines changed: 23 additions & 6 deletions
@@ -73,7 +73,7 @@ public function run(WorkflowRun $run, WorkflowTask $task): ?WorkflowTask
             return null;
         }

-        $workflowClass = TypeRegistry::resolveWorkflowClass($run->workflow_class, $run->workflow_type);
+        $workflowClass = WorkflowDefinitionFingerprint::resolveClassForRun($run);
         $workflow = new $workflowClass($run);
         $entryMethod = EntryMethod::forWorkflow($workflow);
         $arguments = $workflow->resolveMethodDependencies($run->workflowArguments(), $entryMethod);
@@ -451,16 +451,33 @@ public function run(WorkflowRun $run, WorkflowTask $task): ?WorkflowTask
                     return $this->restartAfterPendingUpdateFailure($run, $task);
                 }

-                if (! $this->ensureStepHistoryCompatible($run, $task, $sequence, WorkflowStepHistory::VERSION_MARKER)) {
-                    return null;
-                }
-
                 $versionMarkerEvent = $this->versionMarkerEvent($run, $sequence);

                 try {
                     $resolution = VersionResolver::resolve($run, $versionMarkerEvent, $current, $sequence);
-                    $version = $resolution->version;
+                } catch (Throwable $throwable) {
+                    $this->failRun($run, $task, $throwable, 'workflow_run', $run->id);
+
+                    return null;
+                }

+                // Only assert VERSION_MARKER history shape on paths that will
+                // occupy a history slot at this sequence (recorded / fresh).
+                // The legacy-default path records nothing and does not advance
+                // the workflow sequence — the next yield (activity, timer…)
+                // legitimately shares this slot, so checking for
+                // VERSION_MARKER-compatibility here would spuriously reject
+                // legacy replays that have since produced ACTIVITY/TIMER
+                // events at the same sequence.
+                if ($resolution->advancesSequence
+                    && ! $this->ensureStepHistoryCompatible($run, $task, $sequence, WorkflowStepHistory::VERSION_MARKER)
+                ) {
+                    return null;
+                }
+
+                $version = $resolution->version;
+
+                try {
                     if ($resolution->shouldRecordMarker) {
                         $versionMarkerEvent = $this->recordVersionMarker($run, $task, $sequence, $current, $version);
                     }
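
The effect of moving the history-shape check behind `advancesSequence` can be illustrated with a toy model. Everything here is hypothetical: the history is reduced to a map from sequence number to a step-type string, and the three functions are illustrative names that do not exist in the executor.

```php
<?php

declare(strict_types=1);

// Hypothetical model: $history maps a sequence number to the step type recorded there.
// A slot is compatible with a version marker if it is empty or already holds one.
function slotAcceptsVersionMarker(array $history, int $sequence): bool
{
    return ! isset($history[$sequence]) || $history[$sequence] === 'VERSION_MARKER';
}

// Old ordering: the slot is checked before the version call has been resolved,
// so the check runs even when the call will not occupy the slot.
function oldOrderingAccepts(array $history, int $sequence, bool $advancesSequence): bool
{
    return slotAcceptsVersionMarker($history, $sequence);
}

// New ordering: the slot is checked only when the resolution will occupy it.
function newOrderingAccepts(array $history, int $sequence, bool $advancesSequence): bool
{
    return ! $advancesSequence || slotAcceptsVersionMarker($history, $sequence);
}

// Legacy run: the version call recorded nothing, so the activity took slot 1.
$legacyHistory = [1 => 'ACTIVITY'];

// Current run: the marker occupies slot 1 and the activity follows in slot 2.
$markedHistory = [1 => 'VERSION_MARKER', 2 => 'ACTIVITY'];

$oldRejectsLegacy = ! oldOrderingAccepts($legacyHistory, 1, false);
$newAcceptsLegacy = newOrderingAccepts($legacyHistory, 1, false);
```

In this model the old ordering rejects the legacy replay even though the version call never claims the slot, while the new ordering still rejects a genuine mismatch when the call does advance the sequence.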

src/config/workflows.php

Lines changed: 12 additions & 0 deletions
@@ -69,6 +69,18 @@
             'supported' => env('WORKFLOW_V2_SUPPORTED_COMPATIBILITIES', null),
             'namespace' => env('WORKFLOW_V2_COMPATIBILITY_NAMESPACE', null),
             'heartbeat_ttl_seconds' => (int) env('WORKFLOW_V2_COMPATIBILITY_HEARTBEAT_TTL', 30),
+            // When true (the default), in-flight runs resolve their workflow
+            // class from the `workflow_definition_fingerprint` recorded in
+            // their WorkflowStarted history event instead of the live
+            // `workflow_runs.workflow_class` column. This keeps a run pinned
+            // to the definition snapshot it started under even after a deploy
+            // swaps the class pointer for the same workflow_type.
+            //
+            // Set to false only if your deploy intentionally hot-swaps
+            // workflow classes mid-run and wants the replacement class to
+            // execute against the existing history from the next task
+            // forward.
+            'pin_to_recorded_fingerprint' => (bool) env('WORKFLOW_V2_PIN_TO_RECORDED_FINGERPRINT', true),
         ],
         'history_budget' => [
             'continue_as_new_event_threshold' => (int) env('WORKFLOW_V2_CONTINUE_AS_NEW_EVENT_THRESHOLD', 10000),
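
Because the flag is read as `(bool) env(...)`, the string an operator puts in the environment matters. The sketch below approximates that cast under an assumption: that the `env()` helper in use maps the literal strings `true`/`false` (optionally parenthesised) to booleans and returns the default when the variable is unset. `pinFlagFromEnv()` is a hypothetical helper, and other special literals the real helper may handle are not modelled.

```php
<?php

declare(strict_types=1);

// Hypothetical approximation of `(bool) env('WORKFLOW_V2_PIN_TO_RECORDED_FINGERPRINT', true)`.
// $raw is the raw environment string, or null when the variable is not set.
function pinFlagFromEnv(?string $raw, bool $default = true): bool
{
    if ($raw === null) {
        return $default; // variable not set, so the config default applies
    }

    // Assumed helper behavior: literal true/false strings become booleans.
    $value = match (strtolower($raw)) {
        'true', '(true)' => true,
        'false', '(false)' => false,
        default => $raw,
    };

    // Same cast the config line applies; any other non-empty string except "0" is truthy.
    return (bool) $value;
}

$unset = pinFlagFromEnv(null);
$disabled = pinFlagFromEnv('false');
$zero = pinFlagFromEnv('0');
$off = pinFlagFromEnv('off');
```

Under these assumptions, `false` and `0` disable pinning, while a value such as `off` leaves it enabled, so the safer opt-out is the literal `false`.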

tests/Feature/V2/V2VersionWorkflowTest.php

Lines changed: 0 additions & 4 deletions
@@ -266,8 +266,6 @@ public function testVersionMarkersRecordAfterEarlierSignalsOnCurrentCompatibilit

     public function testSameCompatibilityUsesRecordedDefinitionFingerprintToKeepOlderRunOnDefaultVersion(): void
     {
-        $this->markTestSkipped('Fingerprint-pinned workflow class resolution is not yet implemented; tracked in #428.');
-
         config()
             ->set('workflows.v2.compatibility.current', 'build-b');
         config()
@@ -331,8 +329,6 @@ public function testSameCompatibilityUsesRecordedDefinitionFingerprintToKeepOlde

     public function testOlderCompatibilityFallsBackToDefaultVersionWithoutRecordingMarkerAfterEarlierSignal(): void
     {
-        $this->markTestSkipped('Fingerprint-pinned workflow class resolution is not yet implemented; tracked in #428.');
-
         config()
             ->set('workflows.v2.compatibility.current', 'build-b');
         config()
