
Commit c266e96

d-cs and claude authored
feat(run-ops): run-store routing seam + run-engine read seams (#4116)
## What

Introduces the run-store routing seam and the run-engine read seams that let run lifecycle operations be dispatched to either the control-plane database or a separately-generated run-ops database, depending on where a run/batch resides.

- **run-store** (`internal-packages/run-store`): adds `runOpsStore.ts` and substantially expands `PostgresRunStore.ts` so the store can resolve residency and route reads/writes to the correct backing client. `types.ts` grows the routing/residency types; `NoopRunStore.ts` is removed.
- **run-engine** (`internal-packages/run-engine`): adds `engine/controlPlaneResolver.ts` and routes the per-system read paths (dequeue, enqueue, waitpoint, checkpoint, run-attempt, ttl, delayed-run, execution-snapshot, pending-version, debounce, batch) through the resolver/store instead of talking to a single Prisma client directly. `engine/errors.ts`, `engine/types.ts`, and `engine/index.ts` are extended to support injecting the store/resolver.

Three fixes are included on top of the seam work:

- `c6cadd85f`: routes read-your-writes to the owning store's **writer**, not its lagging replica, so an operation immediately reading back what it just wrote sees a consistent result.
- `05c912e05`: normalizes run-ops-generation Prisma errors to the control-plane error class at the store **write boundary**, so `instanceof` checks and the `P2002` → 422 handling continue to work across the separately-generated run-ops Prisma client.
- `88d12907f`: resolves NEW-resident batches in `ApiBatchResultsPresenter` by routing the batch read through the store, so a dedicated-DB batch resolves instead of returning 404.

The change is heavily test-first: the bulk of the diff is new unit/integration coverage for the store routing, residency, and each run-engine system's control-plane resolver path.

## Why

PR4 of the run-ops split stack (PR1–PR3 land the ClickHouse test-container and earlier plumbing). This PR is the read-path foundation: it adds the seam and read-routing but leaves the write path to route through the same seam in a later PR. Behavior-changing where the three fixes above touch existing read-your-writes / error-normalization / batch-resolution paths; otherwise additive (new store module, new resolver, injectable dependencies with existing single-client behavior preserved when no dedicated store is configured).

## Tests

Extensive new vitest coverage under `run-store/src/*.test.ts` (routing, residency, dual-schema select, cross-generation error normalization, read-after-write, idempotency dedup, mixed residency, waitpoint co-location) and `run-engine/src/engine/**/*.test.ts` (per-system `controlPlaneResolver` tests, injectability, block-edge residency, waitpoint read residency, trigger-create routing, lifecycle router). Testcontainers-backed; no mocks.

## Notes

Draft, **stacked on #4114** (`runops/pr03-clickhouse-tc`). Review that first; this diff is against it. Server-change / changeset note to be added at stack-assembly time.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
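
The `05c912e05` fix in the description above turns on a detail that is easy to miss: two separately generated clients produce error classes that look identical but are distinct constructors, so an `instanceof` check written against one never matches the other. The sketch below illustrates the idea of re-wrapping at a write boundary. Every name in it (`ControlPlaneKnownError`, `RunOpsKnownError`, `atWriteBoundary`, `statusFor`, `tryWrite`) is a hypothetical stand-in, not the commit's code; only the `P2002` → 422 mapping is taken from the description.

```typescript
// ASSUMPTION: hypothetical stand-ins for the "known request error" classes of two
// separately generated clients. Same shape, different constructors.
class ControlPlaneKnownError extends Error {
  code: string;
  constructor(message: string, code: string) {
    super(message);
    this.name = "ControlPlaneKnownError";
    this.code = code;
  }
}

class RunOpsKnownError extends Error {
  code: string;
  constructor(message: string, code: string) {
    super(message);
    this.name = "RunOpsKnownError";
    this.code = code;
  }
}

// Re-wrap a run-ops error as the control-plane class; pass anything else through.
function normalizeError(err: unknown): unknown {
  if (err instanceof RunOpsKnownError) {
    return new ControlPlaneKnownError(err.message, err.code);
  }
  return err;
}

// Write boundary: store writes run through this wrapper, so callers only ever
// see the control-plane error class.
function atWriteBoundary<T>(write: () => T): T {
  try {
    return write();
  } catch (err) {
    throw normalizeError(err);
  }
}

// Caller-side handling that only knows about the control-plane class.
function statusFor(err: unknown): number {
  if (err instanceof ControlPlaneKnownError && err.code === "P2002") {
    return 422; // unique-constraint violation
  }
  return 500;
}

// Runs a write through the boundary and maps the outcome to a status code.
function tryWrite(write: () => void): number {
  try {
    atWriteBoundary(write);
    return 200;
  } catch (err) {
    return statusFor(err);
  }
}
```

Without the boundary, `statusFor` sees a `RunOpsKnownError` and falls through to 500. With it, the same failure maps to 422.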
1 parent ad74568 commit c266e96

73 files changed

Lines changed: 25520 additions & 1062 deletions

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+---
+area: webapp
+type: feature
+---
+
+Route run engine lifecycle operations through the run store and a control-plane resolver so run data can live on a dedicated backing store separate from the control plane.
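
As a rough illustration of what routing through a store means here, the sketch below combines two ideas from the commit description: choosing a backing store by residency, and serving a read-after-write from the owning store's writer rather than its replica. It is an in-memory toy under stated assumptions, not the `run-store` implementation. `BackingStore`, `RoutingStore`, `classifyById`, and the `afterWrite` option are hypothetical names, and the classifier's rule (27-character ids, the string length of a KSUID, are NEW-resident) is an assumption made only for this example.

```typescript
type Residency = "new" | "legacy";

interface Row {
  id: string;
  status: string;
}

// One backing database, modelled as a writer plus a replica that can lag.
class BackingStore {
  private writer = new Map<string, Row>();
  private replica = new Map<string, Row>();

  write(row: Row): void {
    this.writer.set(row.id, row);
  }

  // Simulates replication catching up with the writer.
  replicate(): void {
    this.replica = new Map(this.writer);
  }

  read(id: string, from: "writer" | "replica"): Row | undefined {
    const source = from === "writer" ? this.writer : this.replica;
    return source.get(id);
  }
}

// ASSUMPTION: residency is derived from id shape; 27 characters means NEW.
const classifyById = (id: string): Residency => (id.length === 27 ? "new" : "legacy");

class RoutingStore {
  private stores: Record<Residency, BackingStore>;
  private classify: (id: string) => Residency;

  constructor(stores: Record<Residency, BackingStore>, classify: (id: string) => Residency) {
    this.stores = stores;
    this.classify = classify;
  }

  // Writes always go to the store that owns the id.
  write(row: Row): void {
    this.stores[this.classify(row.id)].write(row);
  }

  // Reads go to the owning store's replica, unless the caller is reading back
  // something it just wrote, in which case the writer is used.
  read(id: string, opts: { afterWrite?: boolean } = {}): Row | undefined {
    const owner = this.stores[this.classify(id)];
    return owner.read(id, opts.afterWrite ? "writer" : "replica");
  }
}
```

A caller that writes a row and reads it back with `afterWrite: true` sees the row even while the replica is still behind; a plain read returns `undefined` until replication catches up.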

apps/webapp/app/presenters/v3/ApiBatchResultsPresenter.server.ts

Lines changed: 13 additions & 10 deletions
@@ -11,19 +11,22 @@ export class ApiBatchResultsPresenter extends BasePresenter {
     env: AuthenticatedEnvironment
   ): Promise<BatchTaskRunExecutionResult | undefined> {
     return this.traceWithEnv("call", env, async (span) => {
-      const batchRun = await this._prisma.batchTaskRun.findFirst({
-        where: {
-          friendlyId,
-          runtimeEnvironmentId: env.id,
-        },
-        include: {
-          items: {
-            select: {
-              taskRunId: true,
+      // Route through the store so a NEW-resident batch resolves under the run-ops split (the
+      // router probes NEW→LEGACY and drops this client hint) instead of 404ing on a control-plane read.
+      const batchRun = await runStore.findBatchTaskRunByFriendlyId(
+        friendlyId,
+        env.id,
+        {
+          include: {
+            items: {
+              select: {
+                taskRunId: true,
+              },
             },
           },
         },
-      });
+        this._prisma
+      );
 
       if (!batchRun) {
         return undefined;
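
The comment in the hunk above says the router probes NEW→LEGACY for a friendlyId lookup and drops the client hint. A minimal sketch of that fan-out, assuming synchronous in-memory stores for brevity (a real store would be asynchronous), could look like the following. `InMemoryBatchStore`, `ProbingRouter`, and the `_clientHint` parameter are hypothetical names for illustration and are not taken from `RoutingRunStore`.

```typescript
interface BatchRow {
  id: string;
  friendlyId: string;
  runtimeEnvironmentId: string;
}

// In-memory stand-in for one physical database.
class InMemoryBatchStore {
  private rows: BatchRow[] = [];

  seed(row: BatchRow): void {
    this.rows.push(row);
  }

  findByFriendlyId(friendlyId: string, environmentId: string): BatchRow | undefined {
    return this.rows.find(
      (row) => row.friendlyId === friendlyId && row.runtimeEnvironmentId === environmentId
    );
  }
}

class ProbingRouter {
  private newStore: InMemoryBatchStore;
  private legacyStore: InMemoryBatchStore;

  constructor(newStore: InMemoryBatchStore, legacyStore: InMemoryBatchStore) {
    this.newStore = newStore;
    this.legacyStore = legacyStore;
  }

  // A friendlyId does not reveal residency, so probe NEW first and fall back to
  // LEGACY. The caller's client hint is accepted for signature compatibility
  // but deliberately ignored: the router decides which database to read.
  findByFriendlyId(
    friendlyId: string,
    environmentId: string,
    _clientHint?: unknown
  ): BatchRow | undefined {
    return (
      this.newStore.findByFriendlyId(friendlyId, environmentId) ??
      this.legacyStore.findByFriendlyId(friendlyId, environmentId)
    );
  }
}
```

With this shape, a caller that previously read one database directly keeps its call site nearly unchanged, and a missing batch still yields `undefined` after both probes miss.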
Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+// Run-ops split resolution LOCK for ApiBatchResultsPresenter.
+//
+// GET /api/v1/batches/:id/results constructs the presenter BARE (no injected client), so it must
+// resolve a batch that lives in the NEW run-ops DB on its own. The presenter routes the batch-row
+// lookup through the `runStore` singleton, whose split router probes NEW→LEGACY. This drives a
+// NEW-resident (ksuid) batch through a REAL two-physical-DB split router and asserts the bare
+// presenter finds it. Fails before the fix (the presenter read the control-plane DB directly and
+// 404'd on a NEW-resident batch).
+
+import { heteroRunOpsPostgresTest } from "@internal/testcontainers";
+import { PostgresRunStore, RoutingRunStore } from "@internal/run-store";
+import type { RunOpsPrismaClient } from "@internal/run-ops-database";
+import type { Organization, PrismaClient, Project } from "@trigger.dev/database";
+import { generateKsuidId } from "@trigger.dev/core/v3/isomorphic";
+import { expect, vi } from "vitest";
+import { ApiBatchResultsPresenter } from "~/presenters/v3/ApiBatchResultsPresenter.server";
+import type { AuthenticatedEnvironment } from "~/services/apiAuth.server";
+
+// The split router built over the two testcontainer DBs; injected in place of the db.server-backed
+// singleton the presenter imports. Populated per-test before the presenter is constructed.
+let testRunStore: RoutingRunStore;
+
+// Presenter reads the batch row via `runStore`; child-run reads also go through it. Neutralize the
+// real db.server singleton (no env DB) and the runStore singleton (use the split router below).
+// The getter defers to `testRunStore` so each test can set its own router before constructing.
+vi.mock("~/db.server", () => ({ prisma: {}, $replica: {} }));
+vi.mock("~/v3/runStore.server", () => ({
+  get runStore() {
+    return testRunStore;
+  },
+}));
+
+vi.setConfig({ testTimeout: 60_000 });
+
+function makeSplitRouter(prisma14: PrismaClient, prisma17: RunOpsPrismaClient) {
+  const legacyStore = new PostgresRunStore({
+    prisma: prisma14,
+    readOnlyPrisma: prisma14,
+    schemaVariant: "legacy",
+  });
+  const newStore = new PostgresRunStore({
+    prisma: prisma17 as never,
+    readOnlyPrisma: prisma17 as never,
+    schemaVariant: "dedicated",
+  });
+  return new RoutingRunStore({ new: newStore, legacy: legacyStore });
+}
+
+function authEnv(environmentId: string): AuthenticatedEnvironment {
+  return {
+    id: environmentId,
+    type: "DEVELOPMENT",
+    project: { id: "proj_split" } as Project,
+    organization: { id: "org_split" } as Organization,
+    orgMember: null,
+  } as unknown as AuthenticatedEnvironment;
+}
+
+heteroRunOpsPostgresTest(
+  "a bare ApiBatchResultsPresenter resolves a NEW-resident (ksuid) batch under the split",
+  async ({ prisma14, prisma17 }) => {
+    testRunStore = makeSplitRouter(prisma14, prisma17);
+
+    const environmentId = "env_split_res";
+    // ksuid internal id → classifies to the NEW store, seeded in the NEW (prisma17) DB. The
+    // friendlyId probe fans out NEW→LEGACY regardless of id shape, so the NEW seed is what matters.
+    const batchInternalId = generateKsuidId();
+    const batchFriendlyId = `batch_${generateKsuidId()}`;
+
+    await prisma17.batchTaskRun.create({
+      data: {
+        id: batchInternalId,
+        friendlyId: batchFriendlyId,
+        runtimeEnvironmentId: environmentId,
+      },
+    });
+
+    // Bare construction — exactly how the results route builds it.
+    const presenter = new ApiBatchResultsPresenter();
+    const result = await presenter.call(batchFriendlyId, authEnv(environmentId));
+
+    // Before the fix this 404s (undefined) because a control-plane read misses the NEW-resident batch.
+    expect(result).toEqual({ id: batchFriendlyId, items: [] });
+  }
+);
+
+heteroRunOpsPostgresTest(
+  "a bare ApiBatchResultsPresenter still returns undefined for a genuinely missing batch",
+  async ({ prisma14, prisma17 }) => {
+    testRunStore = makeSplitRouter(prisma14, prisma17);
+
+    const presenter = new ApiBatchResultsPresenter();
+    const result = await presenter.call("batch_does_not_exist", authEnv("env_none"));
+
+    expect(result).toBeUndefined();
+  }
+);
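
The test file above mocks the `runStore` module with a getter instead of a plain property so that each test can assign `testRunStore` after the mock factory has already run. The difference can be shown without vitest at all. The sketch below is plain TypeScript with hypothetical names (`FakeStore`, `snapshotModule`, `deferredModule`, `describeStore`) and only illustrates the property-versus-getter behaviour, not how `vi.mock` hoists or evaluates factories.

```typescript
interface FakeStore {
  label: string;
}

// Mutable slot that a test fills in later, like `testRunStore` in the test file.
let currentStore: FakeStore | undefined;

// Plain property: the value is captured once, when the object is built.
const snapshotModule: { runStore: FakeStore | undefined } = { runStore: currentStore };

// Getter: the value is looked up again on every property access.
const deferredModule = {
  get runStore(): FakeStore | undefined {
    return currentStore;
  },
};

// Stand-in for a consumer that reads `runStore` at call time.
function describeStore(mod: { readonly runStore: FakeStore | undefined }): string {
  return mod.runStore ? mod.runStore.label : "unset";
}

// The "test" assigns its store after both module objects already exist.
currentStore = { label: "router A" };

const viaSnapshot = describeStore(snapshotModule); // "unset"
const viaGetter = describeStore(deferredModule); // "router A"
```

Only the getter form picks up a store assigned after the module object was created, which is the property the test relies on when it sets its router per test.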
