[pull] main from triggerdotdev:main#160

Merged
pull[bot] merged 7 commits into
Dustin4444:mainfrom
triggerdotdev:main
May 22, 2026
@pull pull Bot commented May 22, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

ericallam and others added 7 commits May 22, 2026 12:29
…3683)

## Summary

`new TriggerClient({...})` exposes the management API (tasks, runs,
schedules, envvars, batch, queues, deployments, prompts, auth) as an
explicit instance with its own auth, preview branch, and baseURL.
Multiple clients can coexist in one process without mutating shared
global state — useful when a single service triggers across multiple
projects, environments, or preview branches.

```ts
import { TriggerClient } from "@trigger.dev/sdk";

const prod = new TriggerClient({ accessToken: process.env.TRIGGER_PROD_KEY });
const preview = new TriggerClient({
  accessToken: process.env.TRIGGER_PREVIEW_KEY,
  previewBranch: "signup-flow",
});

await prod.tasks.trigger("send-email", payload);
await preview.runs.list({ status: ["COMPLETED"] });
```

The existing global `configure()` API keeps working unchanged.

## Design

Instance methods enter an `AsyncLocalStorage`-backed scope (`sdkScope`)
before delegating to the existing module-level functions. The four
"pollution" points that previously read globals now consult the scope
first:

- `apiClientManager.{baseURL, accessToken, branchName}` and
`clientOrThrow` — identity fields are scope-only when scoped; `baseURL`
still falls back to `TRIGGER_API_URL` because plumbing (where the API
lives) is not identity.
- `taskContext.{ctx, worker, isWarmStart, isInsideTask}` — masked inside
an isolated scope so a `client.tasks.trigger(...)` from inside a task
doesn't leak the parent's `parentRunId` / `lockToVersion` / `isTest`
into a trigger that hits a different project.
- Inline `getEnvVar("TRIGGER_VERSION")` reads in `shared.ts` go through
a `scopedEnvVar` helper that returns `undefined` inside an isolated
scope.
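
A minimal sketch of the scope-first lookup described above, assuming a plain `AsyncLocalStorage` store. The names `ScopeConfig`, `scopeStorage`, `configureGlobal`, and the `resolve*` helpers are hypothetical stand-ins, not the SDK's actual `sdkScope` or `apiClientManager` internals:

```ts
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical config shape; the real scope may carry more fields.
type ScopeConfig = { accessToken: string; baseURL?: string; branchName?: string };

const scopeStorage = new AsyncLocalStorage<ScopeConfig>();
let globalConfig: Partial<ScopeConfig> = {};

// Stand-in for the global `configure()` path: later calls override earlier ones.
function configureGlobal(config: Partial<ScopeConfig>): void {
  globalConfig = { ...config };
}

// Identity field: scope-only when a scope is active, global otherwise.
function resolveAccessToken(): string | undefined {
  const scope = scopeStorage.getStore();
  return scope ? scope.accessToken : globalConfig.accessToken;
}

// Identity field: an unset branch inside a scope stays unset (no global fallback).
function resolveBranchName(): string | undefined {
  const scope = scopeStorage.getStore();
  return scope ? scope.branchName : globalConfig.branchName;
}

// Plumbing field: may still fall back to TRIGGER_API_URL inside a scope.
function resolveBaseURL(
  env: Record<string, string | undefined> = process.env
): string | undefined {
  const scope = scopeStorage.getStore();
  const configured = scope ? scope.baseURL : globalConfig.baseURL;
  return configured ?? env.TRIGGER_API_URL;
}

// Runs `fn` with `config` visible to every resolve* call made inside it.
function withScope<T>(config: ScopeConfig, fn: () => T): T {
  return scopeStorage.run(config, fn);
}
```

Under these assumptions, a call made inside `withScope({ accessToken: "scoped-token" }, ...)` sees only the scoped token and no branch, even if the global configuration sets both.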

The `TriggerClient` class itself is a thin wrapper that captures the
scope in its constructor and proxies each namespace method to enter that
scope before calling the existing impl. Generic inference (e.g.
`client.tasks.trigger<typeof t>(...)`) is preserved via `Pick<typeof ns,
keyof curatedSubset>` typings.

Two correctness fixes uncovered along the way are folded in:

- `apiClientManager.setGlobalAPIClientConfiguration` no longer silently
no-ops on the second call. `configure()` now actually overrides as users
expect (this is the root cause behind some "I changed the config but
nothing happened" reports).
- `apiClientManager.runWithConfig` (and therefore `auth.withAuth`) is
now backed by `sdkScope.withScope` instead of "mutate the global and
restore in finally". Two parallel `withAuth` calls with different
configs no longer stomp each other.
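
To illustrate why a scope-backed `withAuth` avoids the stomping problem, here is a small sketch in which two interleaved async calls each keep their own config. `withAuth`, `currentToken`, and `demo` are hypothetical stand-ins written for this example, not the SDK's functions:

```ts
import { AsyncLocalStorage } from "node:async_hooks";

type AuthConfig = { accessToken: string };

const authScope = new AsyncLocalStorage<AuthConfig>();

// Stand-in for a scope-backed `auth.withAuth`: no global is mutated or restored.
function withAuth<T>(config: AuthConfig, fn: () => Promise<T>): Promise<T> {
  return authScope.run(config, fn);
}

function currentToken(): string | undefined {
  return authScope.getStore()?.accessToken;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Reads the token only after yielding, so the two calls below interleave.
async function tokenAfterDelay(ms: number): Promise<string | undefined> {
  await sleep(ms);
  return currentToken();
}

async function demo(): Promise<Array<string | undefined>> {
  return Promise.all([
    withAuth({ accessToken: "token-a" }, () => tokenAfterDelay(20)),
    withAuth({ accessToken: "token-b" }, () => tokenAfterDelay(5)),
  ]);
}
```

With a "mutate the global and restore in finally" approach, the second call would overwrite the first call's token before the first call reads it. With the async-local store, each call resolves the token it was started with.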

Surface curation: instance namespaces drop methods that don't make sense
per-instance — `batch.*AndWait` (runtime-dependent), `schedules.task` /
`schedules.timezones` (definition-time / stateless), `prompts.define`
(definition-time), `auth.configure` / `auth.withAuth` (global-only).

## Test plan

- [x] 9 runtime unit tests in `triggerClient.test.ts` cover: required
accessToken, instance auth + branch headers, no env fallback for
identity fields, no leakage between global and instance, four parallel
calls across two clients stay isolated, taskContext masking +
`inheritContext: true` override, `configure()` second-call override,
parallel `auth.withAuth` isolation.
- [x] 10 type-level assertions in `triggerClient.types.test.ts` using
`expectTypeOf` + `@ts-expect-error` lock in generic inference, return
type passthrough, overload preservation, and curated-surface drift.
- [x] Full SDK suite (219 tests) and core suite (530 tests) pass.
- [x] Webapp typecheck clean.
- [x] End-to-end smoke test against local webapp and a
freshly-provisioned cloud project — six concurrent multi-client triggers
all returned 200 with run IDs, headers per-client as expected.
- [ ] Reviewer: run `references/multi-client` per its `README.md` to
reproduce the smoke test locally.

## Try it

`references/multi-client` is a new reference workspace that exercises
this end-to-end:

- `src/trigger/echo.ts` — trivial target task
- `src/trigger/fanOut.ts` — opens two `TriggerClient`s from inside a
task, fires `echo` through each in parallel
- `src/external/main.ts` — external Node script with two clients
triggering `echo` sequentially and concurrently; logs every outgoing
request's `authorization` + `x-trigger-branch`
- `src/external/isolation.ts` — interleaves global `configure()` and an
instance call, asserts the captured fetch sequence shows no leakage
either way

Added `OrganizationDataStore`, which allows orgs to have their data stored in
specific, separate services.

For now it is used only for ClickHouse. When using ClickHouse, we get a
client from the factory and pass in the org id.

Particular care must be taken with the two hot-insert paths:
1. RunReplicationService
2. OTLPExporter

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
## Summary

Stamp every Sentry event with the signed-in user and the tenant (org /
project / env) the request belongs to, so "Users Impacted" counts
distinct humans and events become filterable per tenant.

**Design after review (current):**

- `user.id = real user cuid` (from `requireUser`). "Users Impacted"
counts humans, not tenants.
- Tenant context (org / project / env slugs, IDs, env type) moves
entirely onto tags: `org_slug`, `project_slug`, `env_slug`, `org_id`,
`project_id`, `project_ref`, `environment_id`, `env_type`, plus
`impersonating` when set.
- Backed by an `AsyncLocalStorage` scope established at the HTTP entry.
Each entry point fills what it knows; loaders enrich the same scope with
what they already have.

**Zero new database queries.** The middleware does a regex match only.
Dashboard loaders that already query Prisma gain a couple of extra
selected columns; nothing new round-trips.

## How it's wired

- **Express middleware (`tenantContextResolver.server.ts`)** — parses
the URL with a regex and always opens an ALS scope. Populates whatever
subset of slugs is present: `/orgs/:o` → just `orgSlug`;
`/orgs/:o/projects/:p` adds `projectSlug`; the full triple adds
`envSlug`. Non-tenant paths get an empty scope so loaders can still
enrich.
- **`_app/route.tsx`** — already calls `requireUser`. Adds
`tenantContext.enrich({ userId: user.id })` for every authenticated
dashboard request. No new query.
- **Env layout loader (`_app.orgs.$o.projects.$p.env.$e/route.tsx`)** —
its existing `prisma.project.findFirst` gains two columns in `select`
(`externalRef`, `organization.id`). After it picks an env, calls
`tenantContext.enrich({ orgId, projectId, projectRef, envId, envType
})`. Same query, +2 columns.
- **API path (`apiBuilder.server.ts`)** — wraps every handler in
`tenantContext.run(tenantContextFromAuthEnvironment(authenticationResult.environment),
…)`. The mapper pulls `userId` from `env.orgMember?.userId` (already
selected by `authIncludeBase` — no schema change). Covers
`createLoaderApiRoute`, `createActionApiRoute`, and
`createMultiMethodApiRoute`.
- **Event processor (`sentryTenantContext.server.ts`)** — registered in
`entry.server.tsx` so it lives in the Remix bundle and shares the same
`tenantContext` ALS instance as the middleware and loaders. Stamps
whatever's present; nothing forced.
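
The wiring above can be sketched roughly as follows. This is an illustration only: the regex, the `TenantContext` field names, and the `get()` accessor are assumptions made for the example, and the real `tenantContextResolver.server.ts` may differ:

```ts
import { AsyncLocalStorage } from "node:async_hooks";

// Assumed context shape, based on the fields named in this description.
type TenantContext = {
  userId?: string;
  orgSlug?: string;
  projectSlug?: string;
  envSlug?: string;
  orgId?: string;
  projectId?: string;
  projectRef?: string;
  envId?: string;
  envType?: string;
};

const storage = new AsyncLocalStorage<TenantContext>();

const tenantContext = {
  // Opens a scope; a copy is stored so callers cannot mutate it from outside.
  run<T>(initial: TenantContext, fn: () => T): T {
    return storage.run({ ...initial }, fn);
  },
  // Patches the active scope in place; does nothing outside run().
  enrich(patch: TenantContext): void {
    const current = storage.getStore();
    if (current) Object.assign(current, patch);
  },
  get(): TenantContext | undefined {
    return storage.getStore();
  },
};

// Assumed URL pattern: /orgs/:o, optionally /projects/:p, optionally /env/:e.
const TENANT_PATH = /^\/orgs\/([^/]+)(?:\/projects\/([^/]+)(?:\/env\/([^/]+))?)?/;

// Returns whatever subset of slugs the path contains; {} for non-tenant paths.
function parseTenantPath(pathname: string): TenantContext {
  const match = TENANT_PATH.exec(pathname);
  if (!match) return {};
  const [, orgSlug, projectSlug, envSlug] = match;
  const ctx: TenantContext = {};
  if (orgSlug) ctx.orgSlug = orgSlug;
  if (projectSlug) ctx.projectSlug = projectSlug;
  if (envSlug) ctx.envSlug = envSlug;
  return ctx;
}
```

A middleware would then call something like `tenantContext.run(parseTenantPath(req.path), next)`, and loaders further down the same async chain would call `tenantContext.enrich({ userId })` without needing the request object.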

## Example events from local verification

| URL | `user.id` | Tags |
|-----|-----------|------|
| `/orgs/:o/projects/:p/env/:e/...` | real user cuid | `org_slug`,
`project_slug`, `env_slug`, `org_id`, `project_id`, `project_ref`,
`environment_id`, `env_type` |
| `/orgs/:o/settings` (non-env-scoped) | real user cuid | `org_slug`
only |
| API request with `orgMember` | `orgMember.userId` | full tenant set |
| API request without `orgMember` | (unset) | full tenant set |
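
As a rough illustration of how the rows above could be produced, the following sketch stamps an event from a context object. It uses a simplified `EventLike` type rather than the Sentry SDK's event type, omits the `impersonating` tag, and the function body is an assumption, not the code in `sentryTenantContext.server.ts`:

```ts
// Assumed context shape, based on the fields named in this description.
type TenantContext = {
  userId?: string;
  orgSlug?: string;
  projectSlug?: string;
  envSlug?: string;
  orgId?: string;
  projectId?: string;
  projectRef?: string;
  envId?: string;
  envType?: string;
};

// Simplified stand-in for a Sentry event.
type EventLike = {
  user?: Record<string, unknown>;
  tags?: Record<string, string>;
};

function addTenantContextToEvent(event: EventLike, ctx: TenantContext | undefined): EventLike {
  if (!ctx) return event;

  const candidates: Array<[string, string | undefined]> = [
    ["org_slug", ctx.orgSlug],
    ["project_slug", ctx.projectSlug],
    ["env_slug", ctx.envSlug],
    ["org_id", ctx.orgId],
    ["project_id", ctx.projectId],
    ["project_ref", ctx.projectRef],
    ["environment_id", ctx.envId],
    ["env_type", ctx.envType],
  ];

  // Conditional emission: only tags with a value are written.
  const tags: Record<string, string> = { ...event.tags };
  for (const [name, value] of candidates) {
    if (value) tags[name] = value;
  }

  const stamped: EventLike = { ...event, tags };
  // Keep any prior user fields and set only `id`.
  if (ctx.userId) stamped.user = { ...event.user, id: ctx.userId };
  return stamped;
}
```

Under this sketch, an API request without `orgMember` yields an event with the tenant tags but no `user.id`, which matches the last table row.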

## Trade-offs

1. On env-scoped pages, errors that fire before the env layout loader's
enrich callback runs get slugs + `user.id` but not the tenant IDs /
`env_type`. Realistic errors deep in async work get the full set. (Same
race as before, narrower window now that slugs/`user.id` are populated
up-front by the middleware and `_app` enrich.)
2. API requests where the environment has no `orgMember` get tenant tags
but no `user.id`. Those events still show in the issue but don't
contribute to "Users Impacted".

## Out of scope (deferred)

Background workers (`redis-worker`, `schedule-engine`) and socket
handlers. Those entry points don't set `tenantContext.run` yet — their
events ship without tenant attribution until each is wired in a
follow-up.

## Tests

31 unit tests across 4 files. New tests notably cover:

- `parseTenantPath`: org-only, org+project, and full-triple URL
variants.
- `tenantContext.enrich`: in-place patch, no-op outside `run()`,
concurrent-scope isolation, empty-scope + enrich pattern (for non-tenant
pages).
- `tenantContextFromAuthEnvironment`: with and without `orgMember` —
verifies the API path's `user.id` mapping.
- `addTenantContextToEvent`: empty scope, userId-only, slugs-only, full
enrichment, conditional tag emission, preservation of prior `event.user`
fields.

## Test plan

- [ ] `pnpm run typecheck --filter webapp`
- [ ] `pnpm run test --filter webapp -- test/tenantContext.test.ts
test/sentryTenantContext.test.ts test/tenantContextResolver.test.ts
test/tenantContextFromAuthEnvironment.test.ts`
- [ ] Local manual: with `SENTRY_DSN` set, hit a dashboard URL and an
API route, confirm the captured events carry `user.id` + the expected
tag set in Sentry.
- [ ] After ship: confirm "Users Impacted" on a real Sentry issue
reflects distinct users (not tenants).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Workloads bundled with CLI versions before v4.4.4 use a strict zod enum
for `checkpoint.type` that only allows DOCKER and KUBERNETES. When a
customer's runs are routed via the compute path, those old runners
receive `type: "COMPUTE"` on `/snapshots/since/...` and `/dequeue`
responses and fail validation - blocking silent migration of existing
deployments.

The workload never reads the field - only validates the shape. Rewriting
COMPUTE -> KUBERNETES on the way out lets older runners keep parsing
while the database and internal services keep the real value. Limited to
the two workload-facing endpoints whose response includes a checkpoint;
`/continue`, `/attempts/start`, `/attempts/complete` all return shapes
without one.
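
A minimal sketch of the outbound rewrite, assuming a response shape with an optional `checkpoint`. The type names, the `id` and `snapshotId` fields, and the function names are hypothetical and chosen only for this example:

```ts
// Internal value set, including the newer COMPUTE type.
type Checkpoint = { id: string; type: "DOCKER" | "KUBERNETES" | "COMPUTE" };

// What runners bundled with older CLI versions accept.
type LegacyCheckpoint = { id: string; type: "DOCKER" | "KUBERNETES" };

type WorkloadResponse = { snapshotId: string; checkpoint?: Checkpoint };
type LegacyWorkloadResponse = { snapshotId: string; checkpoint?: LegacyCheckpoint };

// Maps COMPUTE to KUBERNETES and leaves the other values untouched.
function toLegacyCheckpoint(checkpoint: Checkpoint): LegacyCheckpoint {
  return {
    ...checkpoint,
    type: checkpoint.type === "COMPUTE" ? "KUBERNETES" : checkpoint.type,
  };
}

// Applied only on the way out; the stored value is never modified.
function rewriteForWorkload(response: WorkloadResponse): LegacyWorkloadResponse {
  const { checkpoint, ...rest } = response;
  return checkpoint ? { ...rest, checkpoint: toLegacyCheckpoint(checkpoint) } : rest;
}
```

Because the function returns a new object, the database row and internal services continue to see `COMPUTE`, while the serialized response passes the older enum validation.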

Followup to #3114.
Make the Express server's `keepAliveTimeout` configurable via
`HTTP_KEEPALIVE_TIMEOUT_MS`. Default preserved at 65000 ms — no behavior
change if unset.
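
A sketch of how such a setting might be resolved, using Node's built-in `http` server to stay dependency-free (an Express app's `listen()` returns the same kind of server object). The `resolveKeepAliveTimeout` helper and its fallback on invalid input are assumptions for illustration; the actual parsing in the webapp is not shown here:

```ts
import { createServer } from "node:http";

const DEFAULT_KEEPALIVE_TIMEOUT_MS = 65_000;

// Returns the configured timeout, or the default when unset or not a valid number.
function resolveKeepAliveTimeout(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_KEEPALIVE_TIMEOUT_MS;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : DEFAULT_KEEPALIVE_TIMEOUT_MS;
}

const server = createServer((_req, res) => {
  res.end("ok");
});

// `keepAliveTimeout` is a standard property on Node's http.Server.
server.keepAliveTimeout = resolveKeepAliveTimeout(process.env.HTTP_KEEPALIVE_TIMEOUT_MS);
```
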
## Summary

Drops the unused composite Postgres index
`TaskRun_scheduleId_createdAt_idx`. The schedule list view reads from
ClickHouse, so this index served no Prisma query while still being
maintained on every `TaskRun` INSERT/UPDATE. Removing it reduces write
amplification on the primary database.

Sibling to the prior drop of `TaskRun_scheduleId_idx` and the earlier
removal of the `TaskRun.scheduleId` foreign key — all stemming from
migrating schedule-aware reads to ClickHouse.

## Verification

- Sampled `pg_stat_user_indexes` for `TaskRun` over multiple hours —
zero scans against this index.
- Grepped the codebase for any Prisma query filtering
`TaskRun.scheduleId` — none found. All schedule-aware listing routes
through `clickhouseRunsRepository`.
…3707)

## Summary

When a background worker registers, the engine resolves runs that were
queued before the worker was ready (status `PENDING_VERSION`). That
lookup used to scan a Postgres status index on `TaskRun`. Move it to
ClickHouse: query candidate run ids from `task_runs_v2`, then refetch
the actual rows from Postgres by primary key with a `status =
'PENDING_VERSION'` guard for idempotency.

## Design

The lookup is a pluggable interface on the run engine
(`PendingVersionRunIdLookup`). The webapp wires a ClickHouse-backed
implementation through the org-scoped `clickhouseFactory` using a new
`"engine"` client type, configured by `RUN_ENGINE_CLICKHOUSE_*` env
vars. The URL falls back to `CLICKHOUSE_URL` when unset, so self-hosted
deployments don't need new config to keep working.

When the lookup returns no candidates, one bounded retry is scheduled
~5s later to cover ClickHouse replication lag against `task_runs_v2`.
The Postgres status guard on both the candidate refetch and the inner
`updateMany` prevents double-promotion when a retry races with a
concurrent deploy.
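
The flow above can be sketched with in-memory stand-ins. The method name `findCandidateRunIds`, the `InMemoryRunStore` class, and the `QUEUED` target status are assumptions for illustration. The source names the `PendingVersionRunIdLookup` interface but does not show its shape, and the real retry is scheduled rather than awaited inline:

```ts
type RunStatus = "PENDING_VERSION" | "QUEUED" | "EXECUTING";
type Run = { id: string; status: RunStatus };

// Assumed shape of the pluggable lookup (ClickHouse-backed in the webapp).
interface PendingVersionRunIdLookup {
  findCandidateRunIds(workerId: string): Promise<string[]>;
}

// Stand-in for Postgres: both operations apply the PENDING_VERSION guard.
class InMemoryRunStore {
  private readonly runs: Map<string, Run>;

  constructor(runs: Run[]) {
    this.runs = new Map(runs.map((run) => [run.id, run]));
  }

  // Refetch by primary key, keeping only runs still in PENDING_VERSION.
  findPendingByIds(ids: string[]): Run[] {
    return ids
      .map((id) => this.runs.get(id))
      .filter((run): run is Run => run !== undefined && run.status === "PENDING_VERSION");
  }

  // Guarded update: a run already promoted elsewhere is skipped.
  promote(ids: string[]): number {
    let promoted = 0;
    for (const id of ids) {
      const run = this.runs.get(id);
      if (run && run.status === "PENDING_VERSION") {
        run.status = "QUEUED";
        promoted++;
      }
    }
    return promoted;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves pending runs; retries once after a delay if no candidates are visible yet.
async function resolvePendingVersionRuns(
  lookup: PendingVersionRunIdLookup,
  store: InMemoryRunStore,
  workerId: string,
  retriesLeft = 1,
  retryDelayMs = 5_000
): Promise<number> {
  const candidateIds = await lookup.findCandidateRunIds(workerId);
  if (candidateIds.length === 0) {
    if (retriesLeft <= 0) return 0;
    await sleep(retryDelayMs);
    return resolvePendingVersionRuns(lookup, store, workerId, retriesLeft - 1, retryDelayMs);
  }
  const pending = store.findPendingByIds(candidateIds);
  return store.promote(pending.map((run) => run.id));
}
```

The candidate list is treated as a hint only: a stale or duplicated id from the lookup cannot promote a run twice, because both the refetch and the update check the current status.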

Tests cover three existing PENDING_VERSION cases via a small
Postgres-backed test adapter; new ClickHouse-backed integration tests
will follow.
@pull pull Bot locked and limited conversation to collaborators May 22, 2026
@pull pull Bot added the ⤵️ pull label May 22, 2026
@pull pull Bot merged commit 61ca40b into Dustin4444:main May 22, 2026
0 of 4 checks passed
@pull pull Bot had a problem deploying to dependabot-summary May 23, 2026 09:54 Failure