
Update trigger #4

Open
tylerc-govsignals wants to merge 1041 commits into GovSignals:ConProgramming/two-phase-deploy from triggerdotdev:main

Conversation

@tylerc-govsignals

Collaborator

Closes #

✅ Checklist

  • I have followed every step in the contributing guide
  • The PR title follows the convention.
  • I ran and tested the code works

Testing

[Describe the steps you took to test this change]


Changelog

[Short description of what has changed]


Screenshots

[Screenshots]

💯

ericallam and others added 30 commits May 12, 2026 11:03
…ges (#3559)

## Summary

Make `taskIdentifier` optional on the run-queue message schema. No
behavior change in this PR; readers continue to accept payloads that
include the field. A separate change will stop writing it on the wire to
shrink the per-run payload that lives in Redis while runs wait to be
dequeued.

## Design

The field is written into every payload at enqueue time but no consumer
reads it back on the dequeue path. Both the run-engine and supervisor
derive `taskIdentifier` from the loaded `TaskRun` row instead. Relaxing
the schema first means readers tolerate payloads that omit it, so the
writer-side change can ship without producing schema-parse errors during
a rolling deploy.

`projectId` is left required: `WorkerQueueResolver.#getOverride` reads
it for project-scoped runtime worker-queue overrides.
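
A minimal sketch of the reader-side relaxation described above, written as a hand-rolled parser rather than the real schema. `parseRunQueueMessage`, the `RunQueueMessage` shape, and the `runId` field are hypothetical; only the `taskIdentifier` and `projectId` field names come from this PR.

```typescript
// Hypothetical, simplified message shape; the real schema has more fields.
interface RunQueueMessage {
  runId: string; // assumed field, for illustration only
  projectId: string; // still required: read for worker-queue overrides
  taskIdentifier?: string; // optional: readers must tolerate its absence
}

function parseRunQueueMessage(raw: string): RunQueueMessage {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("payload must be a JSON object");
  }
  const record = data as Record<string, unknown>;

  const runId = record["runId"];
  if (typeof runId !== "string") throw new Error("runId is required");

  const projectId = record["projectId"];
  if (typeof projectId !== "string") throw new Error("projectId is required");

  // Accept both old payloads (field present) and new payloads (field omitted).
  const taskIdentifier = record["taskIdentifier"];
  if (taskIdentifier === undefined) {
    return { runId, projectId };
  }
  if (typeof taskIdentifier !== "string") {
    throw new Error("taskIdentifier must be a string when present");
  }
  return { runId, projectId, taskIdentifier };
}
```

Because the reader accepts both shapes, old and new writers can coexist during a rolling deploy without parse errors.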

## Test plan

- [x] `pnpm run typecheck --filter @internal/run-engine`
- [x] `pnpm run typecheck --filter webapp`
- [x] `pnpm run test ./src/run-queue/tests/enqueueMessage.test.ts
./src/run-queue/tests/workerQueueResolver.test.ts --run` (28/28 passing)
### Style updates to the notifications
- Tightened up the typography
- Brighter background to make it stand out a bit more
- A bit more padding to make it more readable
- Show the close button on hover instead
- Turned the notification into a separate component as it's shared on
the admin page modal
- Minor tweaks to the behavior of toggling the notification between
open/closed side menu states

### Before
<img width="224" height="313" alt="before"
src="https://github.com/user-attachments/assets/c9a9377c-4a3b-4477-921a-3c86385d3f0b"
/>

### After (with image)
<img width="239" height="284" alt="CleanShot 2026-05-11 at 17 22 01"
src="https://github.com/user-attachments/assets/311b4dbc-4853-4e6c-9f83-8173b38bd466"
/>

### After (no image)
<img width="239" height="189" alt="after"
src="https://github.com/user-attachments/assets/884e062b-3608-4cb3-a462-d50597257753"
/>

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
## Summary
1 improvement, 1 bug fix.

## Improvements
- Fail attempts on uncaught exceptions instead of hanging to
`MAX_DURATION_EXCEEDED`. A Node `EventEmitter` (e.g. `node-redis`)
emitting `"error"` with no `.on("error", ...)` listener escalates to
`uncaughtException`, which the worker previously reported but did not
act on — runs drifted to maxDuration with empty attempts. They now fail
fast with the original error and status `FAILED`, and respect the task's
normal retry policy. You should still attach `.on("error", ...)`
listeners to long-lived clients to handle errors gracefully.
([#3529](#3529))
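
As a rough illustration of the escalation described above, the sketch below uses a bare `EventEmitter` as a stand-in for a long-lived client. It only shows Node's documented behavior for unhandled `"error"` events; it is not the worker's own handling code.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a long-lived client such as a Redis connection.
const client = new EventEmitter();

// With no "error" listener, emit("error", err) throws err. When that happens
// inside an I/O callback nothing catches it, so Node raises uncaughtException.
let thrown: unknown;
try {
  client.emit("error", new Error("connection reset"));
} catch (err) {
  thrown = err;
}

// With a listener attached, the same event is delivered to the handler instead.
const handled: string[] = [];
client.on("error", (err: Error) => {
  handled.push(err.message);
});
const delivered = client.emit("error", new Error("connection reset"));
```

Attaching the listener up front keeps a transient client error from failing the attempt at all.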

## Bug fixes
- Fix dev workers spinning at 100% CPU after the parent CLI disconnects.
Orphaned `trigger-dev-run-worker` (and indexer) processes were caught in
an `uncaughtException` feedback loop: a periodic IPC send via
`process.send` would throw `ERR_IPC_CHANNEL_CLOSED` once the parent
closed the channel, which re-entered the same handler that itself called
`process.send`, scheduled via `setImmediate` and amplified by
source-map-support's `prepareStackTrace`. Fixed by (1) silently dropping
packets in `ZodIpcConnection` when the channel is disconnected, (2)
adding a `process.on("disconnect", ...)` handler in dev workers so they
exit cleanly when the CLI closes the IPC channel, and (3) wrapping all
`uncaughtException`-path `process.send` calls in a `safeSend` guard that
checks `process.connected` and swallows synchronous throws.
([#3491](#3491))
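
A minimal sketch of the kind of guard item (3) describes, assuming a `safeSend` that takes the channel as a parameter so it can be exercised without a forked process. The `IpcChannel` interface and the signature are assumptions, not the actual implementation.

```typescript
// Minimal view of the parts of `process` the guard needs.
interface IpcChannel {
  connected?: boolean;
  send?: (message: unknown) => boolean;
}

// Returns true only if the message was handed to the channel.
function safeSend(channel: IpcChannel, message: unknown): boolean {
  if (!channel.connected || typeof channel.send !== "function") {
    return false; // parent is gone or we were never forked: drop the packet
  }
  try {
    channel.send(message);
    return true;
  } catch {
    // A synchronous throw (e.g. the channel closed after the check above)
    // is swallowed so it cannot re-enter the uncaughtException handler.
    return false;
  }
}
```

In a worker, the global `process` object would play the role of `channel`. Swallowing the throw is what breaks the feedback loop: the handler can no longer raise a new exception while reporting the previous one.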

<details>
<summary>Raw changeset output</summary>

# Releases
## @trigger.dev/build@4.4.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.6`

## trigger.dev@4.4.6

### Patch Changes

- Fix dev workers spinning at 100% CPU after the parent CLI disconnects.
Orphaned `trigger-dev-run-worker` (and indexer) processes were caught in
an `uncaughtException` feedback loop: a periodic IPC send via
`process.send` would throw `ERR_IPC_CHANNEL_CLOSED` once the parent
closed the channel, which re-entered the same handler that itself called
`process.send`, scheduled via `setImmediate` and amplified by
source-map-support's `prepareStackTrace`. Fixed by (1) silently dropping
packets in `ZodIpcConnection` when the channel is disconnected, (2)
adding a `process.on("disconnect", ...)` handler in dev workers so they
exit cleanly when the CLI closes the IPC channel, and (3) wrapping all
`uncaughtException`-path `process.send` calls in a `safeSend` guard that
checks `process.connected` and swallows synchronous throws.
([#3491](#3491))
- Fail attempts on uncaught exceptions instead of hanging to
`MAX_DURATION_EXCEEDED`. A Node `EventEmitter` (e.g. `node-redis`)
emitting `"error"` with no `.on("error", ...)` listener escalates to
`uncaughtException`, which the worker previously reported but did not
act on — runs drifted to maxDuration with empty attempts. They now fail
fast with the original error and status `FAILED`, and respect the task's
normal retry policy. You should still attach `.on("error", ...)`
listeners to long-lived clients to handle errors gracefully.
([#3529](#3529))
-   Updated dependencies:
    -   `@trigger.dev/core@4.4.6`
    -   `@trigger.dev/build@4.4.6`
    -   `@trigger.dev/schema-to-json@4.4.6`

## @trigger.dev/core@4.4.6

### Patch Changes

- Fix dev workers spinning at 100% CPU after the parent CLI disconnects.
Orphaned `trigger-dev-run-worker` (and indexer) processes were caught in
an `uncaughtException` feedback loop: a periodic IPC send via
`process.send` would throw `ERR_IPC_CHANNEL_CLOSED` once the parent
closed the channel, which re-entered the same handler that itself called
`process.send`, scheduled via `setImmediate` and amplified by
source-map-support's `prepareStackTrace`. Fixed by (1) silently dropping
packets in `ZodIpcConnection` when the channel is disconnected, (2)
adding a `process.on("disconnect", ...)` handler in dev workers so they
exit cleanly when the CLI closes the IPC channel, and (3) wrapping all
`uncaughtException`-path `process.send` calls in a `safeSend` guard that
checks `process.connected` and swallows synchronous throws.
([#3491](#3491))
- Fail attempts on uncaught exceptions instead of hanging to
`MAX_DURATION_EXCEEDED`. A Node `EventEmitter` (e.g. `node-redis`)
emitting `"error"` with no `.on("error", ...)` listener escalates to
`uncaughtException`, which the worker previously reported but did not
act on — runs drifted to maxDuration with empty attempts. They now fail
fast with the original error and status `FAILED`, and respect the task's
normal retry policy. You should still attach `.on("error", ...)`
listeners to long-lived clients to handle errors gracefully.
([#3529](#3529))

## @trigger.dev/python@4.4.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.6`
    -   `@trigger.dev/build@4.4.6`
    -   `@trigger.dev/sdk@4.4.6`

## @trigger.dev/react-hooks@4.4.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.6`

## @trigger.dev/redis-worker@4.4.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.6`

## @trigger.dev/rsc@4.4.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.6`

## @trigger.dev/schema-to-json@4.4.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.6`

## @trigger.dev/sdk@4.4.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.6`

</details>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…3552)

Closes
[TRI-9234](https://linear.app/triggerdotdev/issue/TRI-9234/retry-task-process-sigsegv-errors-respecting-user-retry-config)

## What this changes

SIGSEGV crashes (`TASK_PROCESS_SIGSEGV`) will now be **retried when an
attempt fails**, in line with the task's configured retry settings
(`retry.maxAttempts` etc.) — the same path SIGTERM and uncaught
exceptions already use. Previously SIGSEGV was hard-classified as
non-retriable and failed the run on the first segfault, ignoring the
user's retry policy.

Tasks without a retry policy still fail fast on the first SIGSEGV.
Behaviour is unchanged for OOM kills (separate machine-bump retry path)
and SIGKILL_TIMEOUT.

## Deploy

**Only the webapp needs to ship.** The retry decision lives entirely in
the webapp:
- V2 path: `internal-packages/run-engine` (bundled into the webapp)
- V1 path: `apps/webapp/app/v3/services/completeAttempt.server.ts`

No supervisor, CLI, SDK, or customer-task-image changes required.
Customers do not need to redeploy. The `@trigger.dev/core` changeset is
just keeping the public package in sync — the published npm version
isn't what makes the fix work.

## Why retry

SIGSEGV in Node tasks is frequently non-deterministic across processes:

- **Native addon races** (`sharp`, `canvas`, `better-sqlite3`,
`node-rdkafka`, `bcrypt`, …) — libuv thread-pool work stepping on V8
handles. Different heap layout / thread schedule on a fresh process →
retry often succeeds.
- **JIT / GC interaction** — V8 turbofan deopt or GC during a native
callback. Timing-dependent.
- **Near-OOM in native code** — when RSS approaches the cgroup limit,
native allocations fail and poorly-written addons dereference NULL →
SIGSEGV instead of clean OOM-kill.
- **Host / hardware issues** — bit flips, kernel quirks. Retry lands on
a different host.

The genuinely deterministic case (a user-code bug always tripping the
same addon) is real, but a subset — and `maxAttempts` bounds the damage.

## Pre-existing inconsistency this resolves

- `shouldRetryError` returned `false` for `TASK_PROCESS_SIGSEGV` →
`fail_run`.
- `shouldLookupRetrySettings` already listed `TASK_PROCESS_SIGSEGV` as
retry-config-aware — but that branch was unreachable because
`shouldRetryError` short-circuited first in `retrying.ts:86-90`.
- We already retry `TASK_RUN_UNCAUGHT_EXCEPTION` (clearly a user-code
bug) under the user's retry policy; refusing to retry SIGSEGV was the
odd one out.
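
The short-circuit can be pictured with a simplified model. This is only a sketch: `decide`, the two sets, and the `PLACEHOLDER_NON_RETRIABLE` code are hypothetical stand-ins for the real `shouldRetryError` / `shouldLookupRetrySettings` logic, and the attempt arithmetic is an assumption.

```typescript
type ErrorCode =
  | "TASK_PROCESS_SIGSEGV"
  | "TASK_RUN_UNCAUGHT_EXCEPTION"
  | "PLACEHOLDER_NON_RETRIABLE"; // hypothetical code, for illustration only

type Decision = "retry" | "fail_run";

// Before this change SIGSEGV sat in the first gate, so the second was never reached.
const nonRetriableBefore = new Set<ErrorCode>(["TASK_PROCESS_SIGSEGV", "PLACEHOLDER_NON_RETRIABLE"]);
const nonRetriableAfter = new Set<ErrorCode>(["PLACEHOLDER_NON_RETRIABLE"]);
const retryConfigAware = new Set<ErrorCode>(["TASK_PROCESS_SIGSEGV", "TASK_RUN_UNCAUGHT_EXCEPTION"]);

function decide(
  code: ErrorCode,
  nonRetriable: Set<ErrorCode>,
  attemptNumber: number,
  maxAttempts?: number // undefined models a task with no retry policy
): Decision {
  if (nonRetriable.has(code)) return "fail_run"; // first gate short-circuits
  if (retryConfigAware.has(code)) {
    return maxAttempts !== undefined && attemptNumber < maxAttempts ? "retry" : "fail_run";
  }
  return "fail_run";
}
```

With the "before" set, a first-attempt SIGSEGV fails the run regardless of `maxAttempts`; with the "after" set, the same call reaches the retry-config branch.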

## Test plan

- [x] `pnpm exec vitest run test/errors.test.ts` in `packages/core` —
26/26 pass (4 new)
- [x] `pnpm run build --filter @trigger.dev/core`
- [ ] CI green on PR

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary

Adds `.claude/REVIEW.md` — a repo-specific source of truth for what AI /
agent code reviewers should treat as critical in this codebase
(rolling-deploy safety, hot-table indexes, recovery-path queries,
testcontainers usage, etc.). Pairs with a Claude-based PR audit that
flags drift between REVIEW.md and the code as it evolves.

## How the audit works

Mirrors the existing `.github/workflows/claude-md-audit.yml` pattern. On
non-draft, non-fork PRs that touch code, `anthropics/claude-code-action`
reads REVIEW.md, samples the PR diff, and posts a sticky comment with up
to 3 of:

- `[stale]` — rule cites a path / function / table that's been removed
or renamed
- `[contradiction]` — code in the PR violates a current rule
- `[missing]` — PR introduces a new pattern future reviewers should know
about
- `[obsolete]` — rule asserts a constraint the repo has moved past

If nothing's off, posts `✅ REVIEW.md looks current for this PR.`

## Test plan

- [ ] Convert this PR to ready-for-review, confirm the audit runs and
posts a sticky comment
- [ ] Verify the audit doesn't run on fork PRs (gated by
`head.repo.full_name == github.repository`)
- [ ] Verify suggestions are actionable on at least one follow-up PR
…3499)

## Summary

Consolidates the webapp's authentication and authorization into a small
set of route helpers, replacing the ad-hoc `requireUser` /
`requireUserId` / `authenticatedEnvironmentForAuthentication` calls
scattered across routes. Same security model, but the per-request flow
(authenticate → authorize → load) now lives in one place per route
family.

Introduces a plugin seam (`@trigger.dev/plugins`) that lets the cloud
build install a richer RBAC implementation without touching webapp code.
The OSS fallback keeps the pre-RBAC permissive behaviour intact, so
self-hosted deployments work unchanged.

Adds a comprehensive end-to-end auth test suite that didn't exist before
— 193 `it()` blocks (vitest reports ~199 after `it.each` expansion)
covering API key, PAT and JWT auth across the public API surface, plus
dashboard session auth for admin pages.

## Changes

### Plugin contract — `@trigger.dev/plugins`

The `RoleBaseAccessController` interface is authoritative for both OSS
(fallback) and cloud (enterprise plugin):
- `authenticateBearer(request, { allowJWT? })` — API-key / public-JWT
auth, returns env + ability
- `authenticateSession(request, { userId, organizationId?, projectId?
})` — dashboard auth, caller resolves `userId` from the session cookie
and passes it in (no `helpers.getSessionUserId` callback — decouples the
plugin host from session-cookie code)
- `authenticatePat(request, { organizationId?, projectId? })` — PAT
auth, returns identity + `lastAccessedAt` so the host can throttle the
per-request update
- `authenticateAuthorize*` variants for the auth-and-check-in-one-call
cases
- `isUsingPlugin(): Promise<boolean>` — capability flag for UI /
branching where plugin-present-ness matters; replaces the
sentinel-string coupling that had `personalAccessToken.server` matching
`"RBAC plugin not installed"` literally

### Dashboard auth (started, partial rollout)

Admin and settings pages are migrated to a unified `dashboardLoader` /
`dashboardAction` helper that authenticates the session, runs an
authorization check, and exposes the result to the route. Other
dashboard routes are still on the old pattern; the remaining migration is
tracked in TRI-8730.

Migrated routes:
- `admin.*` (14 admin / back-office / feature-flags / LLM-models /
notifications / orgs / concurrency pages)
- `_app.orgs.$organizationSlug.settings.team`
- `_app.orgs.$organizationSlug.settings.roles`

### API / realtime / engine auth (complete for the migrated families)

71 routes migrated to a unified `apiBuilder` that centralizes Bearer /
PAT / Public-JWT authentication and applies the per-route authorization
check before the handler runs. Includes:
- `api.v1.*`, `api.v2.*`, and `api.v3.*` — tasks, runs, batches,
queues, prompts, deployments, query, sessions, waitpoints, packets,
workers, idempotency keys
- `realtime.v1.*` — runs, batches, sessions, streams
- `engine.v1.*` — dev / worker-action protocols

29 routes are still on the legacy `authenticateApiRequest*` helpers;
this is tracked as a post-deploy follow-up in TRI-9228.

Multi-resource auth direction is now explicit at the call site via
`anyResource(...)` (OR) and `everyResource(...)` (AND). Bare arrays no
longer typecheck — fixes a class of bug where a JWT scoped to one
resource could implicitly access others under OR semantics.
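
One way to picture the explicit OR/AND direction is with a tagged wrapper type, so a bare array cannot be passed where a direction is required. This is a sketch under assumptions: the `Resource` and `ResourceSet` shapes, the variadic signatures, and `isAuthorized` are illustrative and not taken from the webapp.

```typescript
interface Resource {
  type: string;
  id?: string;
}

// The `mode` tag is what makes the direction explicit at the call site.
interface ResourceSet {
  mode: "any" | "every";
  resources: Resource[];
}

function anyResource(...resources: Resource[]): ResourceSet {
  return { mode: "any", resources };
}

function everyResource(...resources: Resource[]): ResourceSet {
  return { mode: "every", resources };
}

// Hypothetical check: `granted` holds "type" (type-level) or "type:id" entries.
function isAuthorized(granted: Set<string>, required: ResourceSet): boolean {
  const allows = (r: Resource): boolean =>
    granted.has(r.type) || (r.id !== undefined && granted.has(`${r.type}:${r.id}`));
  return required.mode === "any"
    ? required.resources.some(allows)
    : required.resources.every(allows);
}
```

Since `isAuthorized` takes a `ResourceSet`, passing a plain `Resource[]` is a type error, which forces the caller to choose between OR and AND.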

PAT auth path consolidated: was three DB queries per request (legacy
`authenticateApiRequestWithPersonalAccessToken` findFirst +
`rbac.authenticatePat` join + `lastAccessedAt` update). Now one query in
the steady state — plugin returns `lastAccessedAt`, host smart-skips the
update via JS-side throttle when fresh.
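
A minimal sketch of the JS-side throttle, assuming the plugin hands back `lastAccessedAt` as a `Date` or `null`. The five-minute window and the function name are assumptions; the PR does not state the actual threshold.

```typescript
// Assumed throttle window; the real value is not given in this PR.
const LAST_ACCESSED_THROTTLE_MS = 5 * 60 * 1000;

// Decide whether the host should issue the `lastAccessedAt` update at all.
function shouldUpdateLastAccessed(
  lastAccessedAt: Date | null,
  now: Date,
  throttleMs: number = LAST_ACCESSED_THROTTLE_MS
): boolean {
  if (lastAccessedAt === null) return true; // never recorded: always write
  return now.getTime() - lastAccessedAt.getTime() >= throttleMs;
}
```

When the stored timestamp is fresh the write is skipped, which is how the steady state drops to a single query per request.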

Side effect: action aliases preserved historic JWT scope semantics where
the new model is stricter (e.g. a `write:tasks` JWT now also satisfies
`trigger` / `batchTrigger` / `update` actions on the same resource —
matched at the auth boundary, not in the route handler).

### Backwards-compat fixes

The strict-match model regressed several real-world JWT shapes. Each is
preserved via explicit `anyResource(...)` entries in the route's authz
block:

- **Batch retrieve routes** (`api.v1.batches.$batchId`, `api.v2.*`,
`realtime.v1.batches.*`) accept `read:runs` JWTs again (pre-RBAC
literal-match superScope behaviour)
- **Runs list routes** (`api.v1.runs`, `realtime.v1.runs`) accept
type-level `read:tasks` / `read:tags` on unfiltered queries (matched the
legacy `Object.keys` iteration semantic)
- **PAT/OAT auth shape** normalized through `toAuthenticated` so all
auth methods return the same slim `AuthenticatedEnvironment` (was:
API-key returned the slim shape but PAT/OAT returned raw Prisma
`Decimal` / no `orgMember`)
- **Scope `:` preservation** in resource ids — `read:tags:env:staging`
now correctly identifies the tag id as `env:staging`, not `env`
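
The `:` preservation in the last item can be sketched as follows, assuming scopes have the form `action:type[:id]`. `parseScope` and `ParsedScope` are hypothetical names; the real parser may differ.

```typescript
interface ParsedScope {
  action: string;
  type: string;
  id?: string;
}

function parseScope(scope: string): ParsedScope {
  const [action, type, ...rest] = scope.split(":");
  if (!action || !type) {
    throw new Error(`invalid scope: ${scope}`);
  }
  // Re-join the remainder so ids that themselves contain ":" survive intact.
  return rest.length > 0 ? { action, type, id: rest.join(":") } : { action, type };
}
```

Taking only the third segment would truncate `env:staging` to `env`, which is the bug the fix describes.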

### Slim `AuthenticatedEnvironment`

Extracted to `@trigger.dev/core/v3/auth/environment` — a structural
shape independent of `@trigger.dev/database`. The plugin contract
returns this; webapp consumers import from there; the cloud plugin
(Drizzle) returns the same shape without Prisma's `Decimal` class
leaking into the public surface. Lets internal-packages (run-engine,
etc.) refer to `AuthenticatedEnvironment` without pulling Prisma in.

### Auth test suite (new — `*.e2e.full.test.ts`)

193 e2e tests run against a real spawned webapp + Postgres (no mocks).
Coverage matrix:

- **API key auth** — read / write / trigger / batchTrigger / deploy
actions across runs, batches, deployments, prompts, queues, query,
sessions, input-streams, waitpoints, tasks, idempotency keys; multi-key
resources (a run carries batch / tag / task identifiers — auth must
accept any matching scope)
- **Personal Access Token auth** — comprehensive matrix: scope match,
scope mismatch, missing scope, expired token, malformed token
- **Public JWT auth** — sub-vs-URL environment resolution, expired JWTs,
signature verification, scope checking, otu (one-time-use) token
semantics, branch-environment signing-key fallback
- **Dashboard session auth** — admin-only pages reject non-admins;
per-action gating
- **Cross-cutting edge cases** — revoked API key grace window, JWT
cross-environment isolation, MissingResource branch behaviour

### Hygiene cleanups

- Deleted dead `app/services/authorization.server.ts` (legacy
`checkAuthorization` + types — no live consumers post-migration) and its
orphaned test
- Dropped the never-populated `scopes` field from
`ApiAuthenticationResultSuccess`
- `scheduleEmail` moved out of `email.server.ts` into its own module —
breaks a `commonWorker → marqs/V1` import chain that was poisoning the
auth test graph
- OSS Roles page shows a deployment-aware empty state ("Roles aren't
available in this self-hosted deployment" vs the plan-upsell copy) via
`rbac.isUsingPlugin()`
- Team action handler: explicit per-intent ability gates
(`manage:billing` for purchase-seats, `manage:members` for set-role +
remove-member with self-leave carve-out)

### Cross-repo coordination

All public-package contract changes paired in `triggerdotdev/cloud#763`
(rbac-packages branch) — the enterprise plugin implements the same
`RoleBaseAccessController` interface against Drizzle.

## Test plan

- [x] `pnpm run typecheck --filter webapp` clean
- [x] `pnpm --filter webapp exec vitest run --config
vitest.e2e.full.config.ts` — 193/193 pass (requires Docker for
testcontainers)
- [x] Spot-check an authed API endpoint with a valid + invalid API key
against a local stack
- [x] Spot-check the migrated admin pages render and gate non-admins

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… queues (#3558)

## Summary

Queues that use concurrency keys can no longer bypass the per-queue
length cap, and the "Queued | Running" columns in the dashboard now show
the true total across all CK variants instead of 0.

The cap and the dashboard both relied on `ZCARD` of the base queue key,
but CK-keyed runs live under `<base>:ck:<variant>` keys. Any queue that
used concurrency keys read 0 — letting a single CK variant grow
unbounded past the user's configured cap.

## Fix

Two per-base-queue counters are maintained inside the CK Lua scripts:
`<base>:lengthCounter` and `<base>:runningCounter`. Non-CK
enqueue/dequeue paths are untouched.

Counters are lazy-initialized the first time a CK enqueue (or nack)
lands on a queue: the Lua script sums `ZCARD` across the variants
tracked by `ckIndex`, sets the counter, then `INCR`s. Pre-existing CK
backlog on already-populated queues is captured automatically — no batch
migration required.

`INCR`/`DECR` is gated on `ZADD`/`SADD` returning 1 (a new entry vs an
idempotent no-op), so duplicate enqueues or re-dequeues don't inflate
the counter.
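
The lazy-init and gating rules above can be modelled in memory. The class below is a sketch that stands in for the Redis keys and is not the real Lua; `CkQueueModel` and its method names are hypothetical.

```typescript
// In-memory stand-in for the Redis keys involved; not the real Lua scripts.
class CkQueueModel {
  // Models ckIndex plus the per-variant sorted sets: variant -> run ids.
  private variants = new Map<string, Set<string>>();
  // undefined models a missing or expired `<base>:lengthCounter` key.
  private lengthCounter: number | undefined = undefined;

  // Seed a variant directly, simulating backlog that predates the counter.
  seedBacklog(variant: string, runIds: string[]): void {
    this.variants.set(variant, new Set(runIds));
  }

  enqueue(variant: string, runId: string): void {
    let counter = this.lengthCounter;
    if (counter === undefined) {
      // Lazy init: sum the size of every tracked variant (ZCARD per variant).
      let total = 0;
      this.variants.forEach((runs) => {
        total += runs.size;
      });
      counter = total;
    }
    const runs = this.variants.get(variant) ?? new Set<string>();
    this.variants.set(variant, runs);
    const isNew = !runs.has(runId); // mirrors ZADD returning 1 for a new member
    runs.add(runId);
    if (isNew) counter += 1; // a duplicate enqueue leaves the counter alone
    this.lengthCounter = counter;
  }

  // Simulates the 24-hour TTL elapsing.
  expireCounter(): void {
    this.lengthCounter = undefined;
  }

  // A missing counter reads as 0, as on the read path.
  queuedCount(): number {
    return this.lengthCounter ?? 0;
  }
}
```

Seeding from the variants on first use is what lets pre-existing backlog be counted without a migration, and re-seeding after expiry bounds any drift.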

The counter is `SET` with a 24-hour TTL on init. `INCR`/`DECR` do not
extend the TTL, so the counter expires daily and the next CK operation
re-seeds it from `ckIndex`. This bounds any drift that accumulates
during the rolling-deploy overlap window — where old (un-Tracked) and
new (Tracked) webapp instances briefly coexist — to ≤24 hours, with no
admin sweep or background reconciler needed.

Read paths pipeline `ZCARD`/`SCARD` on the base key + `GET` on the
counter and sum. A missing counter is treated as 0, so pure non-CK
queues see the same answer as before.

The counter-aware scripts ship alongside the originals with a `Tracked`
suffix for rolling-deploy safety; a follow-up PR will drop the originals
once this has rolled out.

## Test plan

- [ ] `pnpm run test --filter @internal/run-engine` — 116 tests pass,
including a new `ckCounters.test.ts` covering lazy init from
pre-existing backlog, churn, floor-at-zero, the non-CK regression case,
mixed CK + non-CK on the same base queue, idempotent re-enqueue
(ZADD-already-exists), 24h TTL on the counter, and nack re-seeding after
counter expiry.
- [ ] Verified end-to-end against a live local environment:
  - Triggered 24 CK enqueues across 4 variants → `lengthCounter=16`,
    `runningCounter=8`, dashboard showed Queued=16 / Running=8 for the CK
    queue.
  - Set the env queue cap to 16, triggered 12 more enqueues → 8 succeeded,
    4 rejected with `QueueSizeLimitExceededError`.
  - Deleted the counter on a queue with 31 messages already sitting in CK
    variants, triggered one more enqueue → counter materialized to 31 from
    the `ckIndex` sum, then INCR'd.
## Summary

Local ClickHouse was burning ~325% CPU endlessly merging its own
telemetry tables (`metric_log`, `asynchronous_metric_log`, `part_log`,
`trace_log`) after the container had been running long enough to
accumulate hundreds of GB of system-log data. OrbStack Helper reflected
this on the host (~400% CPU).

These tables are not used by anything in the dev stack. They only exist
for ClickHouse to log itself, so disabling them eliminates the merge
churn entirely.

## Changes

- Adds `docker/config/clickhouse-disable-system-logs.xml`, mounted into
`/etc/clickhouse-server/config.d/`, that removes the noisy system log
tables via `<table remove="1"/>`.
- Mounts the override file in `docker/docker-compose.yml`.

After applying, idle CPU dropped from 325% to ~12% on my machine.

## Test plan

- [ ] `pnpm run docker` brings up the stack cleanly
- [ ] `docker stats clickhouse` shows low idle CPU
- [ ] App functionality unaffected (system log tables are not queried by
the webapp)
…mpling (#3567)

## Summary

Follow-up to #3561. The drift-audit workflow timed out on PR #3542 (92
files, +5962 lines) by hitting `--max-turns 15` before reaching a
verdict, leaving a red ❌ on that PR with no sticky comment.

## Changes

- `--max-turns` bumped from 15 to 30.
- Prompt now opens with an explicit "Strategy" section: read REVIEW.md
once, scan the file-list only, open at most 5 files (3-5 on PRs >50
files), and bias toward finishing over exploring.
- Final rule: *"when in doubt between one more file read and finish now
— finish now."*

The audit is allowed to miss things. It is not allowed to time out and
leave a red X.

## Test plan

- [ ] Verify this PR's audit posts `✅ REVIEW.md looks current for this
PR.` (small diff)
- [ ] After merge, retry the audit on #3542 or a similarly large PR and
confirm it completes
…#3564)

## Summary

- Users on production are hitting `QuotaExceededError: Failed to execute
'setItem' on 'Storage'` when navigating runs, because their localStorage
is full of orphaned `panel-group-react-aria<n>-:<rid>:` entries.
- Each entry is a session-unique key written by the resizable panel
library; they accumulated to thousands per user over the last two months
and now block legitimate `setItem` calls (the run-view inspector can no
longer persist its layout, and the page crashes mid-render).
- This PR evicts the legacy entries once on client boot. The leak itself
is already plugged by the v1.1.3 upgrade in #XXXX — this is the cleanup
that recovers the wasted quota on existing users' machines.

## Root cause (already fixed, for context)

In v0.4.1 of the underlying library, `PanelGroupImpl` defaulted
`autosaveStrategy` to `"localStorage"` unconditionally — so *every*
`PanelGroup` wrote to localStorage on every autosave trigger, including
the four in `QueryEditor`, the one in `ReplayRunDialog`, the storybook
routes, etc. Without an `autosaveId`, the key fell back to
`panel-group-${useId()}`, and React Aria's `useId()` produces a new
session-unique prefix each visit. Result: entries accumulated without
bound across sessions.

The condition was introduced when
[#3282](#3282) removed
the wrapper's explicit `autosaveStrategy="cookie"` override (to fix HTTP
431 cookie-size errors). That worked, but the library default that took
over silently caused this leak.

The v1.1.3 upgrade in the resizable-panel PR changed the default to
`autosaveStrategy = autosaveId ? "localStorage" : undefined`, so no new
entries are being written. Existing residue still needs to be removed
from users' browsers.

## Changes

- New file
[`apps/webapp/app/clientBeforeFirstRender.ts`](apps/webapp/app/clientBeforeFirstRender.ts)
— exports a `clientBeforeFirstRender()` function that runs
synchronously, before React hydrates. Encapsulates a small cleanup
helper that scans `localStorage` and removes:
  - Every key starting with `panel-group-react-aria` (the legacy
    auto-generated keys).
  - The orphan `panel-run-parent-v2` key from before the autosaveId v2→v3
    bump.
- [`apps/webapp/app/entry.client.tsx`](apps/webapp/app/entry.client.tsx)
— imports and invokes `clientBeforeFirstRender()` once, before
`hydrateRoot()`. This guarantees the cleanup completes before any
`ResizablePanelGroup` mounts and tries to write.

The cleanup is wrapped in `try/catch` so private-browsing /
disabled-storage scenarios fail silently. Idempotent: subsequent loads
find no matching keys and exit immediately.
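
A minimal sketch of the cleanup helper, assuming it takes a `Storage`-like object so it can be tested without a browser. `removeLegacyPanelKeys` and `StorageLike` are hypothetical names; the prefix and the orphan key are the ones listed above.

```typescript
// Minimal subset of the Web Storage API that the cleanup relies on.
interface StorageLike {
  readonly length: number;
  key(index: number): string | null;
  removeItem(key: string): void;
}

const LEGACY_PREFIX = "panel-group-react-aria";
const ORPHAN_KEYS = ["panel-run-parent-v2"];

// Returns the number of entries removed (0 when storage is unavailable).
function removeLegacyPanelKeys(storage: StorageLike): number {
  try {
    // Collect first: removing while iterating would shift the indexes.
    const doomed: string[] = [];
    for (let i = 0; i < storage.length; i++) {
      const key = storage.key(i);
      if (key !== null && (key.startsWith(LEGACY_PREFIX) || ORPHAN_KEYS.includes(key))) {
        doomed.push(key);
      }
    }
    for (const key of doomed) {
      storage.removeItem(key);
    }
    return doomed.length;
  } catch {
    return 0; // private browsing or disabled storage: fail silently
  }
}
```

In the browser, `window.localStorage` would be passed in. The orphan key is matched exactly, so keys such as `panel-run-parent-v3` are left alone.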

## Test plan

- [x] Locally seed ~50 fake `panel-group-react-aria…` entries plus a
`panel-run-parent-v2` entry via DevTools console, hard reload → legacy
entries gone, real entries (`panel-run-parent-v3`, `panel-run-tree`)
preserved.
- [x] Idempotency: reload a second time, no errors, no state changes.
- [x] Add a control entry (`panel-run-parent-v3-but-different-suffix`) —
confirmed not over-matched.
- [x] Simulate broken `Storage.setItem` throwing — page still renders,
cleanup swallows the error.
- [x] Typecheck clean.

## Notes

- Customer report: `QuotaExceededError: Failed to execute 'setItem' on
'Storage': Setting the value of 'panel-run-parent-v3' exceeded the
quota.`
- The cleanup runs once per page load. Once a user has loaded the app
after this deploys, their localStorage is clean and the function becomes
a no-op forever.
## Summary
- Recommend deploying NodeLocal DNS and lowering `ndots` to `1` in the
Kubernetes self-hosting guide.
- Recommend storing task events in ClickHouse
(`EVENT_REPOSITORY_DEFAULT_STORE=clickhouse_v2`) in both the Docker and
Kubernetes guides, plus a new row in the webapp env var reference.

`pr_checks` runs the full matrix on every PR. #3609 touched only
`apps/webapp/app/routes/admin.tsx` and still ran the 4-job CLI e2e
matrix and 5-job sdk-compat suite.

Adds a `changes` job using `dorny/paths-filter` and gates each tier:

- webapp + e2e-webapp: `apps/webapp/**`, `packages/**`,
`internal-packages/**`
- packages: `packages/**`
- internal: `internal-packages/**` + `packages/**` (cross-deps)
- e2e (cli-v3): `packages/{cli-v3,build,core,schema-to-json}/**`
- sdk-compat: `packages/{trigger-sdk,core}/**`

`.configs/**`, `package.json`, `pnpm-lock.yaml`, `pnpm-workspace.yaml`,
`turbo.json` are also included in every filter since they affect the
whole workspace.

Inlines the `units` reusable-workflow children so each can be gated
independently (status check names also flatten from `units / webapp /
...` to `webapp / ...`). `unit-tests.yml` is unaffected - still used by
`publish.yml`.

Adds an `all-checks` gate that always runs and short-circuits to success
when every dependent is success-or-skipped. With this in place a single
required status check (`All PR Checks`) is enough; before this,
`paths-ignore` would have left required checks Pending on docs/changeset
PRs ([gh
docs](https://docs.github.com/en/actions/managing-workflow-runs/skipping-workflow-runs)).
…nizations (#3609)

Switching between the Users and Organizations tabs in the admin
dashboard now keeps the current `?search=` value, so you can flip
between the two without re-typing your filter. Other admin tabs don't
take `search` and so don't carry it.

Adds Sessions, a durable, run-aware stream primitive that scopes
session.in / session.out records to a session (not a single run).
Records survive run boundaries; reconnect-from-last-event-id is built in.

Server foundation:
- New /realtime/v1/sessions/:session/:io/append + /records routes
- sessionRunManager + sessionsRepository + clickhouseSessionsRepository
- mintRunToken for short-lived per-session tokens
- s2Append retry-with-backoff + undici cause diagnostics
- /api/v[12]/packets/* exempt from customer rate limits
- BackgroundWorker schema gains taskKind enum (TASK, AGENT, SCHEDULED)
- TaskRun.taskKind column + clickhouse 029_add_task_kind_to_task_runs_v2

Core types:
- new sessionStreams, inputStreams, realtimeStreams packages in @trigger.dev/core
- session-streams-api / realtime-streams-api surface

Sessions dashboard UI (the primitive's own viewer):
- /sessions index + detail routes
- SessionsTable, SessionFilters, SessionStatus, CloseSessionDialog
- AGENT/SCHEDULED filter in RunFilters + TaskTriggerSource

Includes the sessions-primitive changeset.

`tasks.trigger`, `tasks.batchTrigger`, `batch.create`,
`wait.createToken`, `wait.forDuration`, and the input/session stream
waitpoint endpoints all accept a caller-supplied `idempotencyKey` and
store it verbatim against a composite-unique index on `TaskRun`,
`BatchTaskRun`, or `Waitpoint`. The schemas had no length cap, so a
sufficiently long high-entropy key produced an index row larger than the
underlying storage layer can hold. The insert failed at the database,
and the caller saw a generic 500 from
`RunEngineTriggerTaskService.call()` / `CreateBatchService` / waitpoint
creation, depending on the endpoint.

Keys produced by `idempotencyKeys.create()` are 64-character SHA-256
hashes and never trip this — it only manifests for direct REST callers
(or SDK callers passing a raw string they generated themselves).
Low-entropy keys also sail through, because the storage layer compresses
repeated bytes before they reach the index, which is why the failure
mode is intermittent and tied to caller-side key shape.

## Fix

Add `.max(2048, "<field> must be 2048 characters or less")` to the seven
schemas that feed an indexed `idempotencyKey` column:

- `TriggerTaskRequestBody.options.idempotencyKey`
- `BatchTriggerTaskItem.options.idempotencyKey`
- `CreateBatchRequestBody.idempotencyKey`
- `CreateWaitpointTokenRequestBody.idempotencyKey`
- `CreateInputStreamWaitpointRequestBody.idempotencyKey`
- `CreateSessionStreamWaitpointRequestBody.idempotencyKey`
- `WaitForDurationRequestBody.idempotencyKey`

Plus the `idempotency-key` HTTP header on the trigger route (and the
three batch routes that re-export `HeadersSchema`). The header schema is
lifted out of `api.v1.tasks.$taskId.trigger.ts` into
`apps/webapp/app/v3/triggerHeaders.server.ts` so it can be exercised in
tests without dragging the route's import-time side effects.

The 2048 character ceiling is chosen to sit safely under the per-row
index limit while staying generous against existing callers — keys that
fit before still fit. Oversized keys now return a structured Zod 400
instead of a generic 500.
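
A minimal sketch of the cap, written in plain TypeScript so it runs without dependencies. The change itself uses Zod's `.max(2048, ...)` on the schemas listed above; the function name `validateIdempotencyKey` and the result shape below are illustrative assumptions, not code from the repository.

```typescript
// Hypothetical stand-in for the Zod `.max(2048, ...)` check described above.
const MAX_IDEMPOTENCY_KEY_LENGTH = 2048;

type KeyCheck =
  | { ok: true; key: string }
  | { ok: false; status: 400; message: string };

function validateIdempotencyKey(field: string, key: string): KeyCheck {
  // `length` counts UTF-16 code units, not bytes.
  if (key.length > MAX_IDEMPOTENCY_KEY_LENGTH) {
    return {
      ok: false,
      status: 400,
      message: `${field} must be 2048 characters or less`,
    };
  }
  return { ok: true, key };
}

// A key exactly at the boundary is accepted; a 3000-character key is rejected
// before it can reach the database.
const atBoundary = validateIdempotencyKey("idempotencyKey", "k".repeat(2048));
const oversized = validateIdempotencyKey("idempotencyKey", "k".repeat(3000));
console.log(atBoundary.ok, oversized.ok);
```

Rejecting at the schema layer is what turns the failure into a structured 400 rather than a storage-level error surfacing as a 500.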

Limit is documented under `Idempotency key` in `docs/limits.mdx` and as
a `<Note>` on `docs/idempotency.mdx`.

## Test plan

- [x] 15 schema unit tests added
(`packages/core/src/v3/schemas/idempotencyKey.test.ts`,
`apps/webapp/test/routes/triggerHeaders.test.ts`) —
rejection-with-message + boundary acceptance for each capped schema. The
webapp test exercises the extracted `TriggerHeadersSchema` directly with
no mocks.
- [x] `pnpm run build --filter @trigger.dev/core`
- [x] `pnpm run typecheck --filter webapp`
- [x] End-to-end verified locally: baseline (small key) → 200; 3000-char
high-entropy header → 400 with the expected Zod error; same key at the
2048 boundary → 200; same key with the cap reverted → the database
rejected the insert and the route returned 500 to the caller. Cap
restored.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…3542)

## Summary

A `/sessions` dashboard for inspecting durable Sessions, an `AGENT` /
`SCHEDULED` task-kind filter for the runs list, and the server-side
hardening (rate-limit exemption for packets, retry-with-backoff on
stream appends, typed too-large-chunk error) that the `chat.agent`
runtime in #3543 needs. Builds on the Sessions primitive shipped in
#3417.

## Design

The Sessions list + detail routes mirror the run inspector pattern.
`TaskTriggerSource` gains `AGENT` and `SCHEDULED` values, persisted on
`BackgroundWorker.taskKind` and `TaskRun.taskKind` (plus a matching
Clickhouse column), so the runs list can filter by kind.

New `@trigger.dev/core` modules — `sessionStreams`, `inputStreams`, a
`sessionStreamInstance` for realtime streams, and the
`realtime-streams-api` / `session-streams-api` surfaces — expose the
typed shapes that chat.agent will use to drive `session.out`.
`ChatChunkTooLargeError` lets the runtime drop oversized chunks with a
typed surface instead of failing the run. `s2Append` retries transient
failures with exponential backoff. `/api/v[12]/packets/*` is exempt from
customer rate limits so chat snapshot reads and writes don't get
throttled under load.
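
The retry behaviour on `s2Append` can be pictured with a small sketch. The base delay, cap, and attempt count are assumptions chosen for illustration (the PR does not state the real values), and `backoffDelayMs` and `retryWithBackoff` are hypothetical names.

```typescript
// Delay doubles with each attempt and is clamped to a ceiling.
// baseMs and capMs are assumed values, not taken from the PR.
function backoffDelayMs(attempt: number, baseMs = 100, capMs = 2000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Retries only errors the caller classifies as transient; anything else,
// or the final failed attempt, is rethrown unchanged.
async function retryWithBackoff<T>(
  op: () => Promise<T>,
  isTransient: (err: unknown) => boolean,
  maxAttempts = 4
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      if (!isTransient(err) || attempt + 1 >= maxAttempts) throw err;
      const delay = backoffDelayMs(attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```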

## Stack

Part of a 4-PR stack. Merge bottom-up.

1. **This PR** (#3542) → `main`
2. #3543 → #3542: `chat.agent` runtime + browser transport
3. #3545 → #3543: agent-view dashboard
4. #3546 → #3545: ai-chat reference + MCP tooling

Replaces #3173 (closed).

<!-- GitButler Footer Boundary Top -->
---
This is **part 5 of 5 in a stack** made with GitButler:
- <kbd>&nbsp;5&nbsp;</kbd> #3612
- <kbd>&nbsp;4&nbsp;</kbd> #3546
- <kbd>&nbsp;3&nbsp;</kbd> #3545
- <kbd>&nbsp;2&nbsp;</kbd> #3543
- <kbd>&nbsp;1&nbsp;</kbd> #3542 👈 
<!-- GitButler Footer Boundary Bottom -->
The `code` paths filter currently matches `**` minus a tiny exclusion
list, so a PR that only touches `.github/workflows/*.yml` still flips
`code == true` and runs typecheck (~2 min on the runner).

Exclude `.github/**` from `code`, then re-include just `pr_checks.yml`
and `typecheck.yml` so a change to either of those still triggers the
full code check matrix.

Effect:
- workflow-only PRs (this one, future dependabot/codeql/etc.) skip
typecheck; `all-checks` treats the skipped job as non-failure so the
required status passes.
- modifying `pr_checks.yml` or `typecheck.yml` themselves still triggers
typecheck.
- the existing per-suite filters (`webapp`, `packages`, `internal`,
`cli`, `sdk`) already re-include the specific workflows that gate them,
so they're unaffected.
Adds a Mon 08:00 UTC workflow that posts a summary of open Dependabot
alerts and PRs to Slack. Uses env-scoped secrets so the alerts PAT and
Slack token are only available to this workflow.
Adds the chat.agent({...}) task definition (server runtime) and the
browser-side TriggerChatTransport + AgentChat that drives it from a
React or Next.js app. The runtime sits on top of the Sessions primitive
and handles the durable conversational task lifecycle.

Server runtime:
- chat.agent({...}) — session-aware task definition
- Lifecycle hooks: onChatStart, onTurnStart, onTurnComplete, onAction,
  onValidateMessages, hydrateMessages
- chat.history read primitives for HITL flows
- chat.local, chat.headStart, chat.handover, oomMachine
- Delta-only wire + S3 snapshot reconstruction at run boot
- Actions are no longer turns

Browser transport:
- TriggerChatTransport (ai-sdk Transport): delta-only wire sends,
  SSE reconnection with lastEventId resume, stop/abort cleanup,
  dynamic accessToken refresh
- AgentChat: direct programmatic API
- useTriggerChatTransport (React hook)
- chat-tab-coordinator: cross-tab leader election

Includes the chat-agent, chat-agent-delta-wire-snapshots,
chat-history-read-primitives, chat-head-start, chat-actions-no-turn,
chat-session-attributes, agent-skills, and mock-chat-agent-test-harness
changesets.
## Summary

Adds `chat.agent({...})`, a durable conversational task runtime, plus
the browser-side `TriggerChatTransport` + `AgentChat` that drive it from
a React or Next.js app. Conversations survive page refreshes, network
blips, idle suspend, and process restarts, with built-in tools, HITL
approvals, multi-turn state, and stop-mid-stream cancellation. Builds on
#3542.

## Design

Each `/in/append` request carries at most one new message. The agent
reconstructs prior history at run boot from an object-store snapshot
plus a `session.out` replay tail, so conversation context lives
server-side instead of bloating the wire. Awaited snapshot writes after
every `onTurnComplete` keep the chain durable across idle suspend.
Registering `hydrateMessages` short-circuits both paths for customers
who own their own conversation store.
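
As a rough illustration of the boot-time reconstruction described above, the sketch below merges a snapshot with a replay tail and the single incoming message. The record shapes, the `lastSeq` cursor, and the function name are all assumptions; the PR does not describe the actual snapshot format.

```typescript
// Hypothetical shapes; the real snapshot and session.out records may differ.
type ChatMessage = { id: string; role: "user" | "assistant"; text: string };
type Snapshot = { messages: ChatMessage[]; lastSeq: number };
type OutRecord = { seq: number; message: ChatMessage };

function reconstructHistory(
  snapshot: Snapshot | undefined,
  tail: OutRecord[],
  incoming?: ChatMessage
): ChatMessage[] {
  const history = snapshot ? snapshot.messages.slice() : [];
  const cursor = snapshot ? snapshot.lastSeq : -1;
  // Only records written after the snapshot are replayed.
  for (const record of tail) {
    if (record.seq > cursor) history.push(record.message);
  }
  // The wire carries at most one new message per append request.
  if (incoming) history.push(incoming);
  return history;
}
```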

Lifecycle hooks — `onChatStart`, `onTurnStart`, `onTurnComplete`,
`onAction`, `onValidateMessages`, `hydrateMessages` — cover validation,
persistence, and post-turn work. `chat.history` exposes read primitives
(`getPendingToolCalls`, `getResolvedToolCalls`, `extractNewToolResults`,
`findMessage`, `all`) for HITL flows. `chat.local` gives per-run typed
state with Proxy access and dirty tracking. `chat.headStart` bridges
first-turn TTFC via a customer HTTP handler. `oomMachine` opts a chat
into one-shot OOM-retry on a larger machine.

`TriggerChatTransport` is a `Transport` implementation for Vercel's
ai-sdk `useChat`: delta-only wire sends, SSE reconnection with
`lastEventId` resume, stop/abort cleanup, dynamic `accessToken` refresh,
`X-Peek-Settled` fast-close. `AgentChat` is the direct programmatic
equivalent. A cross-tab coordinator does leader election so multiple
open tabs share a single SSE.

```ts
import { chat } from "@trigger.dev/sdk/ai";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export const myChat = chat.agent({
  id: "my-chat",
  run: async ({ messages, signal }) =>
    streamText({ model: openai("gpt-4o"), messages, abortSignal: signal }),
});
```
#3610)

Concurrent `POST /api/v1/deployments` requests for the same environment
race on the `WorkerDeployment(environmentId, version)` unique
constraint. Both requests read the same latest deployment via
`findFirst`, compute the same next version via
`calculateNextBuildVersion`, and both attempt
`prisma.workerDeployment.create()` — one wins, the other crashes with
Prisma `P2002`. The bug is a classic TOCTOU between the version read and
the version write; it's been latent since the version-assignment logic
was first added but only fires when two deploys land within milliseconds
of each other (CI matrices, retried CLI calls, webhook-triggered
redeploys).

## Approach

Extracts the version assignment + create into a small helper
`createDeploymentWithNextVersion`
(`apps/webapp/app/v3/services/initializeDeployment/createDeploymentWithNextVersion.server.ts`).
The helper retries on `P2002 (environmentId, version)` up to 5 times
with randomised 5–50ms jitter so N concurrent racers don't loop in
lockstep. Each attempt re-reads the latest version, recomputes via
`calculateNextBuildVersion`, and re-runs the caller's `buildData`
callback so version-dependent fields (image ref tag, friendlyId) are
always consistent with the version actually persisted. A `logger.warn`
fires per collision so the retry rate is observable in production logs.

When retries are exhausted, the helper throws a dedicated
`DeploymentVersionCollisionError` carrying `environmentId`, `attempts`,
and `lastAttemptedVersion`, with the original
`PrismaClientKnownRequestError` attached as `cause`. Sentry walks the
`cause` chain natively, so contention exhaustion shows up as a
distinguishable wrapper exception linked to the underlying `P2002`
rather than a generic unique-constraint violation that looks identical
to every other duplicate-key bug.
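
The retry loop can be sketched as follows. This is a simplified model under stated assumptions, not the helper from the PR: versions are plain integers, an in-memory array stands in for the `WorkerDeployment` table, the `code === "P2002"` check mimics Prisma's error code, and every name here is hypothetical. The wrapper keeps the original error on an `original` field, where the PR attaches it as `cause`.

```typescript
class VersionCollisionError extends Error {
  attempts: number;
  lastAttemptedVersion: number;
  original: unknown;
  constructor(attempts: number, lastAttemptedVersion: number, original: unknown) {
    super(`version collision after ${attempts} attempts`);
    this.name = "VersionCollisionError";
    this.attempts = attempts;
    this.lastAttemptedVersion = lastAttemptedVersion;
    this.original = original;
  }
}

function isUniqueViolation(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { code?: unknown }).code === "P2002";
}

// In-memory stand-in for an insert guarded by a unique constraint.
function insertVersion(table: number[], version: number): void {
  if (table.indexOf(version) !== -1) {
    const err = new Error("unique constraint failed") as Error & { code: string };
    err.code = "P2002";
    throw err;
  }
  table.push(version);
}

async function createWithNextVersion<T>(opts: {
  readLatest: () => Promise<number>;
  buildAndCreate: (version: number) => Promise<T>;
  maxRetries?: number;
  sleep?: (ms: number) => Promise<void>;
}): Promise<T> {
  const maxRetries = opts.maxRetries ?? 5;
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  let version = 0;
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    // Re-read and recompute on every attempt so version-dependent fields
    // are rebuilt against the version that is actually persisted.
    version = (await opts.readLatest()) + 1;
    try {
      return await opts.buildAndCreate(version);
    } catch (err) {
      if (!isUniqueViolation(err)) throw err; // other errors propagate immediately
      lastError = err;
      // 5-50ms of random jitter so concurrent racers do not retry in lockstep.
      if (attempt < maxRetries) await sleep(5 + Math.random() * 45);
    }
  }
  throw new VersionCollisionError(maxRetries + 1, version, lastError);
}
```

Injecting `sleep` is only a convenience for tests; it lets the collision path run without real timers.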

The behavioural change is limited to "catch P2002 and retry instead of
crashing." The image ref computation stays inside the builder callback
(same call site as before the refactor), so ECR / non-ECR behaviour, S2
stream creation order, and all downstream side effects are unchanged.

## Non-goals

- No new database migrations, no schema changes, no isolation-level /
locking changes. A serialisable transaction or advisory lock would also
fix this; retry-on-conflict is the smaller change that keeps the
existing version-allocation logic intact.
- Does not touch the analogous `calculateNextBuildVersion` call in
`createBackgroundWorker.server.ts`, which likely has the same race shape
against `BackgroundWorker`'s unique constraint — flagged as a follow-up.

## Test plan

- [x] `pnpm run typecheck --filter webapp` passes (no new errors in the
modified files).
- [x] Three real-Postgres tests in
`apps/webapp/test/createDeploymentWithNextVersion.test.ts` via
`containerTest`:
  - 5 concurrent calls all produce distinct, persistable versions
    (`Set(versions).size === concurrency`). The naive read-then-create
    version of the helper fails this test with the exact same `P2002` seen
    in production; the retry version passes.
  - Non-`P2002` errors raised from the `buildData` callback propagate
    immediately without retry, builder invoked exactly once.
  - With `maxRetries: 0`, concurrent racers surface the wrapped
    `DeploymentVersionCollisionError` (not a raw `P2002`); `environmentId`,
    `attempts`, `lastAttemptedVersion` are populated and
    `error.cause.code === "P2002"`.
- [x] Existing `apps/webapp/test/getDeploymentImageRef.test.ts` still
green (the file was untouched in the final diff).

## Follow-ups (not in this PR)

- `createBackgroundWorker.server.ts` likely has the same TOCTOU shape
against its background-worker version unique constraint — should use the
same helper.
- Sentry visibility check: confirm `error.cause` chain renders as a
linked exception in the Sentry UI when the wrapped error fires (requires
a sandboxed triggering of the exhaustion path).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dashboard surfaces for inspecting and debugging chat.agent runs.
Depends on the Sessions primitive (L1) and chat.agent runtime (L2+L3).

Run inspector — chat-aware:
- AgentView + AgentMessageView (run inspector tab for chat.agent runs)
- AIChatMessages + AISpanDetails + types.ts (per-span chat message
  rendering, tool-call/tool-output handling)
- PromptSpanDetails (gen_ai.* span detail panel)
- StreamdownRenderer + shikiTheme (markdown renderer with shiki
  highlighting and v2 patch)
- useAutoScrollToBottom hook

Playground UI (interactive chat.agent debugger):
- /playground index + /playground/$agentParam routes
- /agents route + AgentListPresenter
- PlaygroundPresenter (per-org basin variants, clientData wiring)
- realtime session routes for playground + run inspector chat
- AI-generate-payload + AIPayloadTabContent for the test panel

Navigation + theming:
- SideMenu links for Agents and Playground
- BlankStatePanels copy updates
- tailwind config + tailwind.css storybook hooks
- streamdown@2 dep in apps/webapp/package.json

Includes agent-view-sessions, playground-trigger-config-fields,
run-agent-view, and streamdown-v2-upgrade .server-changes.
## Summary

A chat-aware run inspector and a `/playground` UI for testing
`chat.agent` tasks interactively. Builds on #3543's runtime.

## Design

The run inspector grows a new tab that renders the conversation chain
for any `chat.agent`-kind run. It subscribes to the run's session
streams, threads chat parts through a per-message renderer, and uses a
shared markdown + Shiki component for code highlighting (also used by
the test-payload panel).

The playground is a standalone `/playground` route that lets you drive a
deployed chat agent from the dashboard — pick a task, send messages,
watch tool calls render, and see span detail on every turn. The matching
`/agents` list view shows all deployed agents in the project.
Top of the chat.agent stack: a full Next.js reference project that
exercises chat.agent end-to-end, plus the CLI MCP tools that drive
agent runs from Claude Code / Cursor / etc.

references/ai-chat:
- Full Next.js app with prisma persistence, multi-chat sidebar,
  per-chat model picker, debug panel, tool examples, smoke tests
- Reference tools: getCurrentTime, searchHackerNews, createGithubIssue,
  PR review helpers, code sandbox
- chat-client-test orchestrator for concurrent-send stress
- references/hello-world chatAgent + triggerAndSubscribe examples

CLI MCP tooling for chat.agent:
- mcp/tools/agentChat.ts (start_agent_chat, send_agent_message,
  close_agent_chat)
- mcp/tools/agents.ts + tasks.ts (list agents, agent run details)
- dev-run-controller OOM kill + taskRunProcessPool tweaks
- dev/managed entry-point hooks for skills bundling
- buildWorker + bundleSkills (agent skills support)

Includes ai-tool-helpers + mcp-agent-chat-sessions changesets, plus
the streamdown@2 patch and pnpm-lock reconciliation.

(Will be renamed to feature/ai-chat-reference-and-cli before push.)

fix(cli): preserve lastEventId after sendMessage fallback to avoid stale turn-complete replay
## Summary

A complete Next.js reference project that exercises `chat.agent`
end-to-end, plus the CLI MCP tools that let Claude Code, Cursor, and
similar IDE agents drive a deployed `chat.agent` task from the editor.
Builds on #3545.

## Design

`references/ai-chat` is a full Next.js app: prisma-backed persistence,
multi-chat sidebar, per-chat model picker, debug panel, tool examples
(`getCurrentTime`, `searchHackerNews`, `createGithubIssue`, PR review
helpers, code sandbox), and smoke tests. It's intended both as a
copy-paste starting point and as a place to regression-test SDK changes.

The CLI gains MCP tools (`start_agent_chat`, `send_agent_message`,
`close_agent_chat`, `list_agents`) so an IDE agent can converse with a
deployed `chat.agent` task. The dev runtime adds one-shot OOM kill on
the run controller and skills bundling in the build pipeline.
Follow-up to #3615. The `code` filter currently fires typecheck for any
change outside `docs/`, `.changeset/`, `hosting/`, or `.github/` - so a
docs-only PR like #3623 (touching `references/ai-chat/.env.example` +
`README.md`) triggered the typecheck job. None of the `references/*`
packages declare a `typecheck` script either, so even when a real code
change lands there, `turbo run typecheck` skips them. Running the job is
pure cost.

Tightens the filter to also exclude:

- `references/**` - playground projects, none of them contribute to
`turbo run typecheck` today
- `**/*.md` - markdown anywhere
- `**/.env.example` - example env files anywhere

Two known gaps left open:

- references/ have no real CI typecheck coverage. Separate question -
either add `typecheck` scripts to each (or top-level `tsc -p`), or
accept playground status.
- `changes` job still runs (it's a path-filter step) but the dependent
jobs all skip on irrelevant PRs.
…builds (#3626)

## Summary

`LocalsKey<T>` (the type returned by `locals.create()`) was branded with
a module-level `declare const __local: unique symbol`. Each such
declaration is its own nominal type, and `tshy` emits separate `.d.ts`
files for the ESM and CJS outputs, so each output gets its own `__local`
symbol. Under certain pnpm hoisting layouts a single TypeScript
compilation can resolve `LocalsKey` from both the ESM source path and the
CJS dist path within the same call site, producing two structurally
incompatible variants of the same type. TS surfaces this as the
misleading error:

```
Argument of type 'LocalsKey<X>' is not assignable to parameter of type
'LocalsKey<X>'. Property '[__local]' is missing in type 'LocalsKey<X>'
but required in type 'BrandLocal<X>'.
```

The error has been hitting CI on PRs opened since the chat.agent stack
landed (e.g. #3625 typecheck job), but doesn't reproduce on developer
machines where the pnpm node_modules layout was built up incrementally.

## Fix

Replace the `unique symbol` brand with an optional phantom field that
carries `T` at the type level:

```ts
// before
declare const __local: unique symbol;
type BrandLocal<T> = { [__local]: T };
export type LocalsKey<T> = BrandLocal<T> & {
  readonly id: string;
  readonly __type: unique symbol;
};

// after
export type LocalsKey<T> = {
  readonly id: string;
  readonly __type: symbol;
  /** Phantom carrier for the value type — never read at runtime. */
  readonly __valueType?: T;
};
```

The ESM and CJS `.d.ts` outputs now produce structurally identical
types, so cross-output resolution no longer produces a mismatch. `T` is
still carried at the type level via the optional phantom field. The
runtime shape is unchanged; `manager.ts` was already casting via
`as unknown`, which is no longer needed.
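
A short sketch of how the phantom field still carries `T` through a get/set API. `createKey` and `LocalsStore` are hypothetical stand-ins written for this example; they are not the SDK's `locals.create()` or its manager.

```typescript
type LocalsKey<T> = {
  readonly id: string;
  readonly __type: symbol;
  /** Phantom carrier for the value type; never assigned at runtime. */
  readonly __valueType?: T;
};

function createKey<T>(id: string): LocalsKey<T> {
  // Only `id` and `__type` exist at runtime.
  return { id, __type: Symbol(id) };
}

class LocalsStore {
  private values = new Map<symbol, unknown>();

  set<T>(key: LocalsKey<T>, value: T): void {
    this.values.set(key.__type, value);
  }

  get<T>(key: LocalsKey<T>): T | undefined {
    return this.values.get(key.__type) as T | undefined;
  }
}

const requestId = createKey<string>("requestId");
const retries = createKey<number>("retries");

const store = new LocalsStore();
store.set(requestId, "req_123");
store.set(retries, 3);
// store.set(retries, "three"); // rejected by the compiler: string is not a number

const currentRequestId: string | undefined = store.get(requestId);
console.log(currentRequestId, store.get(retries));
```

Because the type is purely structural, two copies of this declaration (for example one from each of the ESM and CJS outputs) remain assignable to each other.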

## Test plan

- [ ] `pnpm run typecheck --filter @trigger.dev/core --filter
@trigger.dev/sdk`
- [ ] `pnpm run build --filter @trigger.dev/core --filter
@trigger.dev/sdk`
      (clean rebuild) — confirms the ESM and CJS dist `.d.ts` outputs
      no longer carry distinct `unique symbol` declarations
- [ ] `pnpm --filter @trigger.dev/core test test/mockTaskContext.test.ts
--run`
- [ ] `pnpm --filter @trigger.dev/sdk test test/mockChatAgent.test.ts
--run`
## Summary

A "Google auth conflict" Sentry alert fires whenever a user signs in via
Google whose Google account is linked to one user row but whose
Google-provided email is now on a *different* user row. The handler in
`apps/webapp/app/models/user.server.ts:236` already does the right thing
— it returns the existing auth-linked user and skips the update path so
neither row gets mutated — but it logs the situation with
`logger.error`, which routes to Sentry as an exception and pages the
on-call channel.

There's no exception to chase here: the branch is the intended outcome
for a known data shape (user changed their email on one account after
originally signing up via Google on another). Downgrading the call to
`logger.warn` keeps the diagnostic record in our logs (with all the same
context fields — email, both user IDs, authIdentifier) but stops it
firing the production error alert.

## Change

- `logger.error` → `logger.warn` for the conflict branch in
`findOrCreateGoogleUser`. Context payload is unchanged.

## Test plan

- [x] Typecheck only — there's no behavioural change to test, the log
level is the entire diff.
…3625)

## Summary

The trigger-task hotpath used to early-return without a DB query when a
caller passed both a queue override and a per-trigger TTL, which is the
hottest configuration on the trigger API. Adding `triggerSource` to the
resolver so the runs-list "Source" filter could distinguish STANDARD /
SCHEDULED / AGENT runs removed those early-returns, costing +2 DB queries
per trigger on non-locked calls and +1 on locked calls.

This change caches `BackgroundWorkerTask` metadata (`ttl`,
`triggerSource`, `queueId`, `queueName`) in Redis so the resolver can
satisfy every caller configuration with a single `HGET` on the warm path.
PG fallback on miss back-fills the cache.

Follow-up to #3542.

## Design

Two key spaces:

- `task-meta:env:{envId}`: the "current worker" view, refreshed at every
  deploy promotion. 24h safety TTL.
- `task-meta:by-worker:{workerId}`: used for `lockToVersion` triggers.
  Immutable post-create. 30d sliding TTL so historical workers age out.

Cache writes use Lua scripts via `defineCommand` so `DEL` + `HSET` +
`EXPIRE` land atomically — concurrent readers never see the empty
intermediate state of a naive pipeline. Read-path back-fill uses
single-field upserts so concurrent back-fills don't wipe each other's
siblings.
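
The read path can be sketched with an in-memory `Map` standing in for Redis hashes and a synchronous callback standing in for the Postgres fallback. Both are simplifications for illustration, TTL handling is omitted, and `resolveTaskMeta` is a hypothetical name; only the two key patterns come from the description above.

```typescript
type TaskMeta = { ttl?: string; triggerSource: string; queueId: string; queueName: string };

// Stand-in for Redis: hash key -> (field -> JSON string).
const redisHashes = new Map<string, Map<string, string>>();

const envKey = (envId: string) => `task-meta:env:${envId}`;
const byWorkerKey = (workerId: string) => `task-meta:by-worker:${workerId}`;

function hget(key: string, field: string): string | undefined {
  return redisHashes.get(key)?.get(field);
}

// Single-field upsert: sibling fields written by other back-fills survive.
function hsetField(key: string, field: string, value: string): void {
  let hash = redisHashes.get(key);
  if (!hash) {
    hash = new Map<string, string>();
    redisHashes.set(key, hash);
  }
  hash.set(field, value);
}

function resolveTaskMeta(opts: {
  envId: string;
  taskSlug: string;
  lockedWorkerId?: string;
  loadFromDb: () => TaskMeta | undefined;
}): TaskMeta | undefined {
  // Version-locked triggers read the immutable per-worker view.
  const key = opts.lockedWorkerId ? byWorkerKey(opts.lockedWorkerId) : envKey(opts.envId);
  const cached = hget(key, opts.taskSlug);
  if (cached !== undefined) return JSON.parse(cached) as TaskMeta;

  const loaded = opts.loadFromDb(); // cold path: fall back to the database
  if (loaded) hsetField(key, opts.taskSlug, JSON.stringify(loaded));
  return loaded;
}
```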

The cache lives behind its own `TASK_META_CACHE_REDIS_*` env-var prefix
that falls back to the default `REDIS_*` set, so operators can route the
cache to a dedicated Redis instance if they want.

The service/instance file split (`taskMetadataCache.server.ts` for the
pure class, `taskMetadataCacheInstance.server.ts` for the env-wired
singleton) mirrors the existing `runsReplicationService` /
`runsReplicationInstance` pattern.

## Test plan

- [ ] `pnpm run typecheck --filter webapp`
- [ ] `pnpm run test ./test/engine/triggerTask.test.ts --run` — 8
      existing tests untouched + 5 new tests covering warm cache, cold
      miss with back-fill, queue + ttl path, by-worker vs env keyspace,
      and the promotion cache write
- [ ] End-to-end against a dev worker: registering writes both keyspaces
      with the expected TTLs, and
      `redis-cli HGETALL "tr:task-meta:env:<envId>"` returns the cached
      entries


## Benchmark

Measured `DefaultQueueManager.resolveQueueProperties` against a real
Postgres + Redis (vitest `containerTest`, single-host docker). 500
sequential calls and 2,000 parallel calls (concurrency=50) per scenario,
request shaped as `{ taskId, queue: "bench-queue", ttl: "5m" }` — the
hot path this PR restores.

```
sequential (one in flight at a time):
[noop cache (baseline)]  n=500   mean=1.423ms  p50=1.394ms  p95=1.735ms  p99=2.629ms  max=11.100ms
[redis cache, cold   ]  n=500   mean=1.346ms  p50=1.283ms  p95=1.688ms  p99=2.463ms  max=5.058ms
[redis cache, warm   ]  n=500   mean=0.084ms  p50=0.078ms  p95=0.105ms  p99=0.156ms  max=1.129ms
speedup (warm vs baseline, sequential): 16.95x

parallel (concurrency=50):
[noop cache (baseline)]  n=2000  mean=10.069ms  p50=8.850ms  p95=14.718ms  p99=31.887ms  total=405ms  ops/s=4,940
[redis cache, warm   ]  n=2000  mean=0.614ms   p50=0.568ms  p95=1.189ms   p99=1.432ms   total=25ms   ops/s=80,389
throughput speedup (warm vs baseline, parallel): 16.27x
```

Read:

- **Warm cache cuts resolver latency ~17×** (mean, sequential): p50 drops
from ~1.4 ms to ~78 µs per call.
- **Cold cache is on par with baseline** — the extra `HGET` miss adds
<50 µs against the two Postgres queries that follow, so the worst case
is not worse than today.
- **Under burst load (50 concurrent triggers)**, the baseline's p99
jumps to ~32 ms as Postgres connections queue up; warm stays at ~1.4 ms.
The cache moves the saturation point from ~5k ops/s (PG pool) to ~80k
ops/s (single-client Redis pipelining).

Caveats: single-host docker, local Postgres + Redis, resolver-only
measurement (excludes the rest of the trigger transaction). Prod adds
region-local Redis RTT (~0.3–0.8 ms) which shifts warm absolute numbers
up but keeps the ratio intact.
ericallam and others added 30 commits June 16, 2026 15:02
…3963)

## Summary

`chat.headStart` (the warm step-1 fast path) previously handed its
response over only to `chat.agent`. This extends handover to the other
two backends: `chat.customAgent` consumes it with
`conversation.consumeHandover({ payload })` on turn 0, and
`chat.createSession` surfaces it as `turn.handover` (call
`turn.complete()` with no source to finalize a pure-text handover). The
low-level `chat.waitForHandover()` and `accumulator.applyHandover()` are
exported for hand-rolled loops.

It also adds `triggerConfig` to `chat.headStart()` and
`chat.openSession()`, so the auto-triggered handover-prepare run
inherits tags, queue, machine, and the other session run options the
same way `chat.createStartSessionAction()` does. The `chat:{chatId}` tag
is prepended automatically. Because the session is created once on the
first head-start turn (idempotent on the chat id), this is the only
place those options can be set for a head-start chat's lifetime.

## Fix: tool-call resume

When the warm step-1 hands over a pending tool call (rather than pure
text), the agent loop resumes that tool round. For it to merge cleanly
the pipe threads the spliced partial as `originalMessages`, so the
resumed tool-output chunk attaches to the handed-over tool-call instead
of throwing `No tool invocation found`. `MessageAccumulator.addResponse`
now also dedups by id (replace-in-place), so the persisted history
doesn't carry a duplicate assistant message when the resumed response
reuses the partial's id.
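
The replace-in-place dedup can be illustrated with a small pure function. This is a sketch of the idea only; the message shape is simplified and the function is not the actual `MessageAccumulator.addResponse` implementation.

```typescript
// Simplified message shape, assumed for illustration.
type Message = { id: string; role: "user" | "assistant"; parts: string[] };

function addResponse(history: Message[], response: Message): Message[] {
  const index = history.findIndex((m) => m.id === response.id);
  if (index === -1) return history.concat([response]);
  // Same id as an existing entry (e.g. the handed-over partial):
  // replace it where it sits instead of appending a duplicate.
  const next = history.slice();
  next[index] = response;
  return next;
}
```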

Incorporates the `triggerConfig` work from
[#3933](#3933) by
@saasjesus, with `createStartSessionAction` extended to also forward
`maxDuration`, `region`, and `lockToVersion` so the two session entry
points stay consistent.

Verified end-to-end against a local environment: handover (pure-text and
tool-call) on both new backends, a `chat.agent` regression pass, and
`triggerConfig` tags and queue landing on the run.

---------

Co-authored-by: saasjesus <armin@chatarmin.com>
## Summary

Reworks the scheduled task page right-hand sidebar.

- Adds **Overview** / **Schedules** tabs. The Schedules tab is a
paginated table of all schedules attached to the task, declarative
first.
- Surfaces schedule fields (ID, CRON + human-readable description,
next/last run, status) directly in the Overview property table.
- Sidebar can be dragged much wider (up to 80% of the viewport).
- "No schedules attached" panel explains declarative vs imperative and
links to docs.
- Schedule **create / edit / enable / disable / delete** all happen
inside the existing Sheet — no more navigating to the standalone
schedule page. Toasts confirm each action.

## Test plan

- Open a scheduled task page and verify the new tabs
- Create, edit, enable/disable, and delete a schedule — confirm you stay
on the page and see a toast each time
- Visit a task with no schedules attached and confirm the info panel
renders
- Drag the sidebar wider; confirm pagination shows when there are >25
schedules
## Summary

Docs deploy from the `docs-live` branch via Mintlify, so merging to
`main` no longer publishes docs on its own. To publish, push a
`docs-release-*` tag at the commit you want live. The workflow runs the
Mintlify broken-links check against that commit, then fast-forwards
`docs-live` to it, which is what Mintlify deploys from.

## Design

The ref move uses the GitHub API with `force=false`, making it
fast-forward only: a tag that is not ahead of `docs-live` fails the job
rather than rewinding production. Mintlify's GitHub app reacts to the
resulting push and deploys, so no extra deploy credentials are needed.

Usage:

```bash
git tag docs-release-2026.06.16   # tag the main commit you want live
git push origin docs-release-2026.06.16
```
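
For reference, the fast-forward-only ref move corresponds to a request of roughly this shape against GitHub's "update a reference" endpoint. The sketch only builds the request and sends nothing; the owner and repo values are placeholders, and the workflow may issue the call through a different client than the one implied here.

```typescript
type RefUpdateRequest = {
  method: "PATCH";
  url: string;
  body: { sha: string; force: boolean };
};

// Hypothetical helper: describes the call without performing it.
function buildFastForwardRequest(
  owner: string,
  repo: string,
  branch: string,
  sha: string
): RefUpdateRequest {
  return {
    method: "PATCH",
    url: `https://api.github.com/repos/${owner}/${repo}/git/refs/heads/${branch}`,
    // force: false makes GitHub reject any update that is not a fast-forward.
    body: { sha, force: false },
  };
}

const request = buildFastForwardRequest("example-org", "example-repo", "docs-live", "abc123");
console.log(request.method, request.url);
```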
…3964)

## Summary

`chat.headStart` now works with the `chat.customAgent` and
`chat.createSession` backends (not just `chat.agent`), and takes a
`triggerConfig` option. These docs cover both.

The Fast starts guide gets a "Handover with custom agents" section
showing how each backend consumes the handover (`consumeHandover`
returning `{ isFinal, skipped }` for custom agents, `turn.handover` for
createSession), including threading `originalMessages` so a resumed tool
round merges into the handed-over assistant. The `chat.headStart` API
section documents `triggerConfig` (tags, queue, machine, and the rest)
on the auto-triggered run.

The reference picks up `ChatTurn.handover`, `turn.complete()` with no
source, `chat.waitForHandover`, and a new `HeadStartHandlerOptions`
table.

Docs for the SDK changes in
[#3963](#3963).
…served keys (#3966)

Fix Vercel onboarding wizard to properly filter out reserved TRIGGER_
env vars
## Summary

New `/ai-chat/prompt-caching` guide covering how to cache a chat agent's
prompt prefix with Anthropic prompt caching: the system prompt, the
conversation history (a `prepareMessages` breakpoint), and how caching
interacts with compaction. It also shows how to verify cache hits via
usage and the dashboard, the prefix-stability footguns, and an "Other
providers" section (OpenAI and Google cache automatically; Amazon
Bedrock uses `cachePoint` through `systemProviderOptions`).

Registered under Features in the AI Agents nav, next to Compaction.

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Eric Allam <ericallam@users.noreply.github.com>
## Summary

The "What extractNewToolResults returns" reference in the
tool-result-auditing guide did not match the SDK. It listed an `input`
field that `chat.history.extractNewToolResults()` never returns, and
marked `output` as optional when it is always present.

This corrects the block to the real `ChatNewToolResult` shape
(`toolCallId`, `toolName`, `output`, optional `errorText`). Every usage
example in the same guide already reads only those fields, so the
reference now matches both the examples and the code.
…3958)

## Summary

The Models page is now split into two tabs. **Your models** shows the
models your project has actually used in the selected time range, with
usage charts (cost over time, tokens over time, calls by model), a
per-model table of calls / cost / avg TTFC / avg tokens-per-sec, and
calls/tokens trend sparklines. **Model library** is the full catalog,
reordered from alphabetical to a relevance-based provider order
(Anthropic, OpenAI, Google, then the rest), newest models first within
each provider, with a "New" badge on models released in the last 7 days.

One time-range selector drives the whole Your models tab, so the charts,
the table, and the sparklines all share the same window. Opening a model
shows its own metrics with an independent range picker and a "View in AI
metrics" link that opens the AI metrics dashboard filtered to that
model. The active tab is kept in the URL so it survives a refresh and is
shareable.

## Prompt caching & cost accuracy

Both the Your models tab and the AI metrics dashboard now surface
prompt-cache usage: a cache-savings column plus per-model cached-tokens
and cache-hit-rate views, and a caching section on the dashboard (hit
rate, cached tokens, estimated savings, and hit rate by model).

Building this surfaced a cost bug. `input_tokens` is the total prompt
count and already includes cache-read and cache-creation tokens, but the
cost pipeline charged the full input at the input price and then added a
separate cache line, so cached tokens were billed twice (and on
Anthropic, cache reads were never discounted because their price is
keyed differently). The input price now applies only to the non-cached
remainder, with cache prices resolved across the provider-specific keys,
so LLM cost and the cache hit-rate metric are accurate. Hit rate is
computed as cached reads over total input.
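
A worked example of the corrected arithmetic. The prices are made-up integer units per token chosen to keep the numbers exact, and the field and function names are assumptions rather than the pipeline's real identifiers. `doubleBilledCost` reconstructs the bug as described above, for contrast.

```typescript
type Usage = {
  inputTokens: number; // total prompt tokens, cache reads and writes included
  cacheReadTokens: number;
  cacheCreationTokens: number;
  outputTokens: number;
};

// Hypothetical prices in arbitrary units per token.
type Prices = { input: number; cacheRead: number; cacheCreation: number; output: number };

function correctedCost(u: Usage, p: Prices): number {
  // The input price applies only to the non-cached remainder.
  const nonCached = Math.max(0, u.inputTokens - u.cacheReadTokens - u.cacheCreationTokens);
  return (
    nonCached * p.input +
    u.cacheReadTokens * p.cacheRead +
    u.cacheCreationTokens * p.cacheCreation +
    u.outputTokens * p.output
  );
}

function doubleBilledCost(u: Usage, p: Prices): number {
  // Old behaviour: full input at the input price, plus the cache lines again.
  return (
    u.inputTokens * p.input +
    u.cacheReadTokens * p.cacheRead +
    u.cacheCreationTokens * p.cacheCreation +
    u.outputTokens * p.output
  );
}

function cacheHitRate(u: Usage): number {
  return u.inputTokens === 0 ? 0 : u.cacheReadTokens / u.inputTokens;
}

const usage: Usage = { inputTokens: 1000, cacheReadTokens: 800, cacheCreationTokens: 100, outputTokens: 50 };
const prices: Prices = { input: 10, cacheRead: 1, cacheCreation: 12, output: 50 };

console.log(correctedCost(usage, prices)); // 5500
console.log(doubleBilledCost(usage, prices)); // 14500
console.log(cacheHitRate(usage)); // 0.8
```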

## Notes

Also fixes React "invalid DOM property" console warnings from the
provider icons (the Llama and DeepSeek SVGs used raw `fill-rule` /
`clip-rule` / `clip-path` attributes), which this page surfaces by
rendering more provider icons.

## Screenshots

**Your models tab:** usage charts and a per-model table with
calls/tokens trend sparklines.

<img width="2560" height="1267" alt="1-your-models-tab"
src="https://github.com/user-attachments/assets/859bd24f-9047-4828-8bbb-83e5882846d6"
/>


**Model library:** provider-relevance ordering with a "New" badge on
models released in the last 7 days.

<img width="2560" height="1267" alt="2-model-library-tab"
src="https://github.com/user-attachments/assets/46dd54b9-80f9-4922-ade9-5935b08dfebc"
/>


**Model detail, Metrics tab:** per-model range picker and a "View in AI
metrics" link.

<img width="2560" height="1267" alt="3-model-detail-metrics"
src="https://github.com/user-attachments/assets/0f65d9d0-6142-4918-93f0-110bb277101a"
/>


**View in AI metrics:** the dashboard deep-linked and filtered to the
selected model.

<img width="2560" height="1267" alt="4-ai-metrics-filtered"
src="https://github.com/user-attachments/assets/821f256c-e305-493c-98c7-eafaf2f57f83"
/>
…#3939)

## Summary

The agent skills' deep guidance now ships inside `@trigger.dev/sdk` and
is read from `node_modules`, so it tracks the `@trigger.dev/sdk` version
installed in your project automatically. This updates the Skills page,
the Building with AI step, and the rules-redirect page to drop the old
"pinned to the CLI version, re-run to refresh" framing and describe the
version-pinned reference instead.

Pairs with the SDK/CLI change in #3937. Keep this draft until that
ships, since it describes behavior that is not released yet.
## Summary

Typing in the search bar on the task page could clear or reset the input
mid-keystroke. This fixes the re-render race so the field stays stable
while you type.

## Root cause

Two things compounded:

- `SearchInput`'s sync effect depended on `text`, so it re-ran on every
keystroke and could overwrite the input with the URL/controlled value
while focused.
- Each task row unmounted and remounted its activity chart during the
side-panel open/close animation (25 charts at once), forcing heavy
re-renders that the search effect raced against.

## Fix

- `SearchInput` now tracks the last synced value in a ref instead of
comparing against `text`, keeping the effect off the keystroke path. It
only writes to state when the incoming URL/controlled value actually
changes, and never while the input is focused.
- Activity charts are now hidden (`hidden` attribute) instead of
unmounted during the panel animation, so the rows don't churn the tree
and the resize stays smooth.
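
The sync rule in the first bullet can be modelled as a small pure function, independent of React. This is a minimal sketch under assumptions: `SyncState` and `syncExternalValue` are hypothetical names, `lastSynced` stands in for the ref, and deferring a change that arrives while focused (rather than dropping it) is a guess at the behaviour, not the component's confirmed logic.

```ts
// Hypothetical model of the sync decision; `lastSynced` plays the role of the ref.
interface SyncState {
  text: string; // what the input currently shows
  lastSynced: string; // last URL/controlled value written into the input
}

function syncExternalValue(
  state: SyncState,
  incoming: string,
  isFocused: boolean
): SyncState {
  // The external value has not changed: keep whatever the user typed.
  if (incoming === state.lastSynced) return state;
  // Never overwrite the field while the user is typing in it.
  // Assumption: the change is picked up on a later sync once focus is lost.
  if (isFocused) return state;
  // A genuinely new external value and an unfocused input: adopt it.
  return { text: incoming, lastSynced: incoming };
}
```

Because the comparison is against `lastSynced` rather than `text`, a keystroke that changes `text` never makes the incoming value look "new", which is what keeps the effect off the keystroke path.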

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ngs (#3970)

## Summary

Three improvements to the SDK-bundled agent skills (follow-up to the
skills installer):

- **`trigger-` namespace.** The installed skills (`authoring-tasks`,
`getting-started`, …) had generic names that collide with unrelated
skills in a shared agent skills directory. They're now prefixed —
`trigger-authoring-tasks`, `trigger-getting-started`, etc. — matching
the convention the public skills repo already uses.
- **New `trigger-cost-savings` skill.** An MCP-driven cost audit:
right-sizes machines, flags missing `maxDuration`, spots sequential
triggers that could batch, and reviews schedule frequency, using
`list_runs` / `get_run_details` for live analysis.
- **Bundle the full docs.** `@trigger.dev/sdk` now bundles the entire
"Documentation" section of the docs (157 pages) instead of a curated
55-page subset, so an agent has the complete, version-pinned reference
in `node_modules`.

## How the bundling works

`scripts/bundleSdkDocs.ts` now reads `docs/docs.json`, walks the
"Documentation" dropdown, and copies every page under it into the SDK.
The set tracks the docs navigation automatically — add a page to the nav
and it ships, no skill edits needed. The API reference and Guides &
examples dropdowns are intentionally excluded. A skill's `sources:`
frontmatter is now informational only.
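
The walk described above amounts to a recursive flatten of one dropdown's groups. The sketch below assumes a navigation shape in which each dropdown holds groups and each group's `pages` mixes page paths with nested groups. That shape, the type names, and `pagesForDropdown` are assumptions for illustration, not the contents of `scripts/bundleSdkDocs.ts`.

```ts
// Assumed navigation shape: pages are either a path string or a nested group.
type NavPage = string | NavGroup;

interface NavGroup {
  group: string;
  pages: NavPage[];
}

interface NavDropdown {
  dropdown: string;
  groups: NavGroup[];
}

// Depth-first flatten, preserving navigation order.
function collectPages(pages: NavPage[], out: string[]): void {
  for (const page of pages) {
    if (typeof page === "string") {
      out.push(page);
    } else {
      collectPages(page.pages, out);
    }
  }
}

// Returns every page path under the named dropdown, or [] if it is absent.
function pagesForDropdown(dropdowns: NavDropdown[], name: string): string[] {
  const out: string[] = [];
  for (const dropdown of dropdowns) {
    if (dropdown.dropdown !== name) continue;
    for (const group of dropdown.groups) {
      collectPages(group.pages, out);
    }
  }
  return out;
}
```

Selecting by dropdown name is what lets other dropdowns stay excluded without a separate deny list.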

The dropped idea of a dedicated `trigger-config` skill is replaced by
references to the bundled build-extension docs (`config/extensions/*`)
from the `trigger-authoring-tasks` config section and the chat-agent
skills.
Adds an opt-in mechanism to route a configurable percentage of
organizations onto the compute (MicroVM) backing of their region at
trigger time, without changing their stored region settings.

Routing is gated by three global feature flags -
`computeMigrationEnabled`, `computeMigrationFreePercentage`,
`computeMigrationPaidPercentage` - plus a per-org
`computeMigrationEnabled` override that wins in both directions. A
region's compute backing is resolved from a new
`WorkerInstanceGroup.region` column: a container group and its MicroVM
group share one geo `region`, so the migration swaps the resolved worker
queue to the backing group's queue. Orgs are bucketed deterministically
by id, so ramping a percentage down keeps a strict subset rather than
reshuffling, and a region with no compute backing is never touched.
Everything is off by default - behaviour is unchanged unless the flags
are set.
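
To illustrate why deterministic bucketing keeps a strict subset when a percentage is ramped down, here is a minimal sketch. The FNV-1a hash, the `OrgInfo` and `MigrationFlags` shapes, and the precedence order (no compute backing first, then the per-org override, then the global flags) are all assumptions made for the example; the actual hashing and ordering are not specified here.

```ts
// Stable 32-bit FNV-1a hash mapped to a bucket in [0, 100).
// Assumption: any stable hash of the org id would serve the same purpose.
function bucketFor(orgId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < orgId.length; i++) {
    hash ^= orgId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 100;
}

// Hypothetical shapes mirroring the three global flags and the per-org override.
interface MigrationFlags {
  enabled: boolean;
  freePercentage: number;
  paidPercentage: number;
}

interface OrgInfo {
  id: string;
  isPaid: boolean;
  override?: boolean; // per-org computeMigrationEnabled, if set
}

function shouldMigrate(
  org: OrgInfo,
  flags: MigrationFlags,
  regionHasComputeBacking: boolean
): boolean {
  // A region with no compute backing is never touched.
  if (!regionHasComputeBacking) return false;
  // The per-org override wins in both directions.
  if (org.override !== undefined) return org.override;
  if (!flags.enabled) return false;
  const percentage = org.isPaid ? flags.paidPercentage : flags.freePercentage;
  // An org's bucket never changes, so lowering the percentage only removes orgs.
  return bucketFor(org.id) < percentage;
}
```

An org in bucket 7 is migrated at any percentage above 7 and not otherwise, so moving from 50% to 10% removes the orgs in buckets 10 through 49 and leaves the rest in place.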

The flags and the worker-region groups are read on the trigger hot path
from in-memory snapshots rather than the database: a small
`createReloadingRegistry` helper loads each at startup and refreshes
them on an interval, so no per-trigger query is added and a percentage
or kill-switch change propagates within the reload interval. A cold
replica whose snapshot hasn't loaded yet reads as not-migrated (the
container path) and self-corrects on the next load - the same cold-start
contract as the datastore / LLM-pricing registries, with a
`reloading_registry_loaded` metric so a never-loaded registry is
alertable.
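
The cold-start contract can be sketched as follows. This is not the real `createReloadingRegistry`; `createSnapshotRegistry` and its interface are hypothetical, and keeping the previous snapshot when a reload fails is an assumption. The sketch only shows the idea of serving a fallback until the first load lands and exposing a loaded flag that a metric could report.

```ts
// Hypothetical interface; the real helper's API may differ.
interface SnapshotRegistry<T> {
  get(): T; // synchronous read, safe on a hot path
  isLoaded(): boolean; // false until the first successful load
  reload(): Promise<void>;
}

function createSnapshotRegistry<T>(
  load: () => Promise<T>,
  fallback: T
): SnapshotRegistry<T> {
  let snapshot = fallback;
  let loaded = false;
  return {
    get: () => snapshot,
    isLoaded: () => loaded,
    reload: async () => {
      try {
        snapshot = await load();
        loaded = true;
      } catch {
        // Assumption: keep the previous snapshot and let the next reload retry.
      }
    },
  };
}
```

A caller would invoke `reload()` once at startup and then on an interval. Reads never wait on the loader, so a replica that has not loaded yet sees the fallback (here, the not-migrated state) until the next successful reload.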

The same migration decision is consulted at deploy-time template
creation so a migrated org gets a compute template built ahead of its
first run. This runs in shadow mode (best-effort, never fails the
deploy) by default, or - when the `computeMigrationRequireTemplate` flag
is on - in required mode, built synchronously at deploy so the first run
never builds on-demand and template errors surface at deploy time.

So operators keep "which runs ran where" while customers only see
geography: the run's actual worker queue is stored raw, and the geo
region is stamped separately on `TaskRun.region` (and a new ClickHouse
`region` column) at trigger time. Read surfaces - the dashboard, the
API, and the Query/Logs page - show the geo region, falling back to the
worker queue for runs written before the column existed.

Minor follow-ups left out of scope: the percentage flags render as text
inputs on the admin flags page (the catalog UI has no numeric control
type yet), and `createReloadingRegistry` could later gain pub/sub for
sub-second cross-replica propagation if the reload interval proves too
slow.
## Summary
7 improvements.

## Improvements
- `@trigger.dev/sdk` now bundles the Trigger.dev agent skills and a
curated snapshot of the docs those skills reference. The skills that
`trigger skills` installs into your coding agent read this content from
node_modules, so the guidance your AI assistant follows is pinned to the
SDK version installed in your project and stays current across upgrades
instead of going stale until the next reinstall.
([#3937](#3937))
- Running a CLI command like `dev`, `deploy`, `preview`, or `update`
before initializing a project no longer crashes with a raw `Cannot find
matching package.json` stack trace. The CLI now detects the missing
project and points you to `npx trigger.dev@latest init` instead.
([#3929](#3929))
- The agent skills installed by `trigger skills` are now namespaced with
a `trigger-` prefix (e.g. `trigger-authoring-tasks`,
`trigger-getting-started`) so they don't collide with unrelated skills
in your coding agent's skills directory. Adds a `trigger-cost-savings`
skill for auditing and reducing compute spend (right-sizing machines,
`maxDuration`, batching, debounce), and `@trigger.dev/sdk` now bundles
the full Trigger.dev documentation so your agent can read the complete,
version-pinned reference directly from node_modules.
([#3970](#3970))
- The run span API response now includes `cachedCost` and
`cacheCreationCost` on the `ai` object, alongside the existing
`inputCost` / `outputCost` / `totalCost`. `inputCost` reflects only the
non-cached input, so these fields let you reconstruct the full cost
breakdown for prompt-cached calls.
([#3958](#3958))
- `chat.headStart` now works with the `chat.customAgent` and
`chat.createSession` backends, not only `chat.agent`. The warm step-1
response hands over to your loop the same way it does for a managed
agent. ([#3963](#3963))
  
  In a `chat.customAgent` loop, consume the handover on turn 0:
  
  ```ts
  const conversation = new chat.MessageAccumulator();
  const { isFinal, skipped } = await conversation.consumeHandover({ payload });
  if (skipped) return; // warm handler aborted, so exit without a turn
  if (isFinal) {
    await chat.writeTurnComplete(); // step 1 is the response, no streamText
  } else {
    const result = streamText({ model, messages: conversation.modelMessages, tools });
    // Pass originalMessages so the handed-over tool round merges into the
    // step-1 assistant instead of starting a new message.
    const response = await chat.pipeAndCapture(result, {
      originalMessages: conversation.uiMessages,
    });
    if (response) await conversation.addResponse(response);
  }
  ```

  With `chat.createSession`, the iterator surfaces it as `turn.handover`;
  call `turn.complete()` with no argument on a final handover. The
  lower-level `chat.waitForHandover()` and `accumulator.applyHandover()`
  are also exported for hand-rolled loops.
- Cache your chat agent's system prompt with Anthropic prompt caching.
`chat.toStreamTextOptions()` now emits the system prompt as a cacheable
message when you opt in, so a large, stable system block is billed at
cache-read rates on every turn instead of full price.
([#3952](#3952))
  
  ```ts
  // at the streamText call site (Anthropic sugar)
  streamText({
    ...chat.toStreamTextOptions({ cacheControl: { type: "ephemeral" } }),
    messages,
  });

  // provider-agnostic equivalent
  chat.toStreamTextOptions({
    systemProviderOptions: { anthropic: { cacheControl: { type: "ephemeral" } } },
  });

  // or where the prompt is defined
  chat.prompt.set(SYSTEM_PROMPT, {
    providerOptions: { anthropic: { cacheControl: { type: "ephemeral" } } },
  });
  ```

  Without an option, `system` stays a plain string. Pairs with a
  `prepareMessages` cache breakpoint to cache the conversation prefix
  across turns too.
- Three fixes for custom agent loops (`chat.customAgent`,
`chat.createSession`, and hand-rolled `MessageAccumulator` loops):
([#3936](#3936))
  
  - Continuation runs no longer replay already-answered user messages into
    the first turn. The `.in` resume cursor is now seeded before any
    listener attaches (the same boot logic `chat.agent` uses), so a chat
    that continues after a cancel, crash, or upgrade only sees genuinely new
    messages.
  - Steering a hand-rolled loop mid-stream no longer wipes the in-flight
    assistant response. `chat.pipeAndCapture` now stamps a server-generated
    message id on the stream, so a `prepareStep` injection keeps the partial
    text instead of replacing the message.
  - Task-backed tools (`ai.toolExecute`) now work from custom agent loops:
    the parent's session is threaded to the child run, so child tasks can
    stream progress into the chat with `chat.stream.writer({ target: "root" })`
    instead of failing with "session handle is not initialized".

<details>
<summary>Raw changeset output</summary>

⚠️⚠️⚠️⚠️⚠️⚠️

`main` is currently in **pre mode** so this branch has prereleases
rather than normal releases. If you want to exit prereleases, run
`changeset pre exit` on `main`.

⚠️⚠️⚠️⚠️⚠️⚠️

# Releases
## @trigger.dev/build@4.5.0-rc.7

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.7`

## trigger.dev@4.5.0-rc.7

### Patch Changes

- `@trigger.dev/sdk` now bundles the Trigger.dev agent skills and a
curated snapshot of the docs those skills reference. The skills that
`trigger skills` installs into your coding agent read this content from
node_modules, so the guidance your AI assistant follows is pinned to the
SDK version installed in your project and stays current across upgrades
instead of going stale until the next reinstall.
([#3937](#3937))
- Running a CLI command like `dev`, `deploy`, `preview`, or `update`
before initializing a project no longer crashes with a raw `Cannot find
matching package.json` stack trace. The CLI now detects the missing
project and points you to `npx trigger.dev@latest init` instead.
([#3929](#3929))
- The agent skills installed by `trigger skills` are now namespaced with
a `trigger-` prefix (e.g. `trigger-authoring-tasks`,
`trigger-getting-started`) so they don't collide with unrelated skills
in your coding agent's skills directory. Adds a `trigger-cost-savings`
skill for auditing and reducing compute spend (right-sizing machines,
`maxDuration`, batching, debounce), and `@trigger.dev/sdk` now bundles
the full Trigger.dev documentation so your agent can read the complete,
version-pinned reference directly from node_modules.
([#3970](#3970))
-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.7`
    -   `@trigger.dev/build@4.5.0-rc.7`
    -   `@trigger.dev/schema-to-json@4.5.0-rc.7`

## @trigger.dev/core@4.5.0-rc.7

### Patch Changes

- The run span API response now includes `cachedCost` and
`cacheCreationCost` on the `ai` object, alongside the existing
`inputCost` / `outputCost` / `totalCost`. `inputCost` reflects only the
non-cached input, so these fields let you reconstruct the full cost
breakdown for prompt-cached calls.
([#3958](#3958))

## @trigger.dev/python@4.5.0-rc.7

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/sdk@4.5.0-rc.7`
    -   `@trigger.dev/core@4.5.0-rc.7`
    -   `@trigger.dev/build@4.5.0-rc.7`

## @trigger.dev/react-hooks@4.5.0-rc.7

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.7`

## @trigger.dev/redis-worker@4.5.0-rc.7

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.7`

## @trigger.dev/rsc@4.5.0-rc.7

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.7`

## @trigger.dev/schema-to-json@4.5.0-rc.7

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.7`

## @trigger.dev/sdk@4.5.0-rc.7

### Patch Changes

- `@trigger.dev/sdk` now bundles the Trigger.dev agent skills and a
curated snapshot of the docs those skills reference. The skills that
`trigger skills` installs into your coding agent read this content from
node_modules, so the guidance your AI assistant follows is pinned to the
SDK version installed in your project and stays current across upgrades
instead of going stale until the next reinstall.
([#3937](#3937))

- `chat.headStart` now works with the `chat.customAgent` and
`chat.createSession` backends, not only `chat.agent`. The warm step-1
response hands over to your loop the same way it does for a managed
agent. ([#3963](#3963))

    In a `chat.customAgent` loop, consume the handover on turn 0:

    ```ts
    const conversation = new chat.MessageAccumulator();
    const { isFinal, skipped } = await conversation.consumeHandover({ payload });
    if (skipped) return; // warm handler aborted, so exit without a turn
    if (isFinal) {
      await chat.writeTurnComplete(); // step 1 is the response, no streamText
    } else {
      const result = streamText({ model, messages: conversation.modelMessages, tools });
      // Pass originalMessages so the handed-over tool round merges into the
      // step-1 assistant instead of starting a new message.
      const response = await chat.pipeAndCapture(result, {
        originalMessages: conversation.uiMessages,
      });
      if (response) await conversation.addResponse(response);
    }
    ```

    With `chat.createSession`, the iterator surfaces it as `turn.handover`;
    call `turn.complete()` with no argument on a final handover. The
    lower-level `chat.waitForHandover()` and `accumulator.applyHandover()`
    are also exported for hand-rolled loops.

- Add `triggerConfig` support to `chat.headStart()` and
`chat.openSession()`, so the auto-triggered handover-prepare run
inherits tags, queue, machine, and other session trigger options the
same way `chat.createStartSessionAction()` does. The `chat:{chatId}` tag
is prepended automatically.
([#3963](#3963))

    ```ts
    export const POST = chat.headStart({
      agentId: "my-agent",
      triggerConfig: { tags: ["org:acme"], queue: "chat" },
      run: async ({ chat }) => streamText({ ...chat.toStreamTextOptions(), model }),
    });
    ```

    Because the session is created once on the first head-start turn and is
    idempotent on the chat id, this is the only place to set those options
    for a head-start chat's lifetime. `chat.createStartSessionAction()` now
    also forwards `maxDuration`, `region`, and `lockToVersion` so both
    session entry points stay consistent.

- Cache your chat agent's system prompt with Anthropic prompt caching.
`chat.toStreamTextOptions()` now emits the system prompt as a cacheable
message when you opt in, so a large, stable system block is billed at
cache-read rates on every turn instead of full price.
([#3952](#3952))

    ```ts
    // at the streamText call site (Anthropic sugar)
    streamText({
      ...chat.toStreamTextOptions({ cacheControl: { type: "ephemeral" } }),
      messages,
    });

    // provider-agnostic equivalent
    chat.toStreamTextOptions({
      systemProviderOptions: { anthropic: { cacheControl: { type: "ephemeral" } } },
    });

    // or where the prompt is defined
    chat.prompt.set(SYSTEM_PROMPT, {
      providerOptions: { anthropic: { cacheControl: { type: "ephemeral" } } },
    });
    ```

    Without an option, `system` stays a plain string. Pairs with a
    `prepareMessages` cache breakpoint to cache the conversation prefix
    across turns too.

- Three fixes for custom agent loops (`chat.customAgent`,
`chat.createSession`, and hand-rolled `MessageAccumulator` loops):
([#3936](#3936))

    - Continuation runs no longer replay already-answered user messages into
      the first turn. The `.in` resume cursor is now seeded before any
      listener attaches (the same boot logic `chat.agent` uses), so a chat
      that continues after a cancel, crash, or upgrade only sees genuinely new
      messages.
    - Steering a hand-rolled loop mid-stream no longer wipes the in-flight
      assistant response. `chat.pipeAndCapture` now stamps a server-generated
      message id on the stream, so a `prepareStep` injection keeps the partial
      text instead of replacing the message.
    - Task-backed tools (`ai.toolExecute`) now work from custom agent loops:
      the parent's session is threaded to the child run, so child tasks can
      stream progress into the chat with `chat.stream.writer({ target: "root" })`
      instead of failing with "session handle is not initialized".

- The agent skills installed by `trigger skills` are now namespaced with
a `trigger-` prefix (e.g. `trigger-authoring-tasks`,
`trigger-getting-started`) so they don't collide with unrelated skills
in your coding agent's skills directory. Adds a `trigger-cost-savings`
skill for auditing and reducing compute spend (right-sizing machines,
`maxDuration`, batching, debounce), and `@trigger.dev/sdk` now bundles
the full Trigger.dev documentation so your agent can read the complete,
version-pinned reference directly from node_modules.
([#3970](#3970))

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.7`

## @trigger.dev/plugins@4.5.0-rc.7

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.7`

</details>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Replicates `TaskRun.planType` into the `task_runs_v2` ClickHouse table
so run analytics can group by plan type.

Adds a `plan_type` column (goose migration `033`,
`LowCardinality(String)`), the replication insert mapping, and the
matching schema/column/type entries - same shape as the recent `region`
addition. Write-once at trigger, so it just rides along on existing
replicated rows. Internal analytics only; not exposed in the Query API.
#3960)

## Summary

Prisma infrastructure failures (P1xxx-class: database unreachable, timed
out, connection dropped, engine init/panic) carry the database hostname
in their `.message`. This captures them centrally for observability and
ensures they never reach API clients verbatim.

## Design

A `$allOperations` client extension on the writer and replica clients
logs infrastructure errors with the originating model and operation,
then rethrows the **original** error unchanged — call sites that branch
on `error.code` (unique-violation idempotency, not-found handling) and
transaction retries keep working. Only infrastructure errors are logged;
routine query/validation errors (P2xxx) are left alone.

`$allOperations` can't see the transaction boundary (`$transaction` is a
client method, not an operation), so infrastructure errors surfacing
from `$transaction()` without a Prisma code — e.g.
`PrismaClientInitializationError` — are logged separately at the
transaction wrapper, where the existing coded-error path would otherwise
miss them.

`clientSafeErrorMessage()` swaps an infrastructure error's message for
`"Internal Server Error"` at the API routes that previously returned
`error.message` raw. Status codes, headers, and every non-infrastructure
message are unchanged.
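
The message-swapping step might look roughly like the sketch below. It is not the webapp's `clientSafeErrorMessage()`: the function names are hypothetical, and identifying infrastructure errors by a `P1xxx` code or by the `PrismaClientInitializationError` / `PrismaClientRustPanicError` class names is an assumption about how the classification could be done without importing Prisma.

```ts
// Assumption: errors without a Prisma code are recognised by class name.
const INFRA_ERROR_NAMES = [
  "PrismaClientInitializationError",
  "PrismaClientRustPanicError",
];

function isInfrastructureError(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const candidate = error as { code?: unknown; name?: unknown };
  // P1xxx codes cover connectivity-class failures; P2xxx are query errors.
  if (typeof candidate.code === "string" && /^P1\d{3}$/.test(candidate.code)) {
    return true;
  }
  return (
    typeof candidate.name === "string" &&
    INFRA_ERROR_NAMES.indexOf(candidate.name) !== -1
  );
}

// Hypothetical stand-in for the helper described above.
function toClientSafeMessage(error: unknown): string {
  if (isInfrastructureError(error)) return "Internal Server Error";
  return error instanceof Error ? error.message : String(error);
}
```

Only the message is replaced. The error object itself is left alone, so code that branches on `error.code` upstream of the response still sees the original.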

## Test plan

- [x] P2002 / P2025 rethrow with code intact and are not logged
- [x] Statement errors inside `$transaction` keep their code (retry
logic intact)
- [x] Raw queries wrapped without crashing on the undefined model
- [x] A genuine connectivity failure is logged with model/operation/code
- [x] `clientSafeErrorMessage` obfuscates infra messages, preserves all
others
- [x] `pnpm run typecheck --filter webapp` (12/12)

## Note

Overlaps with #3391 (Prisma 7 migration) on
`apps/webapp/app/db.server.ts` — coordinate rebasing.
The global feature flags admin page had a few rough edges.

The percentage flags are numeric (`z.coerce.number()`) but rendered as
free-text inputs, so you could type non-numeric values that only failed
validation after submitting - and the error surfaced behind the confirm
dialog. The control-type detection now recognises numbers and renders a
proper number input, with the min/max range as the placeholder so the
type is clear even when the field is unset. The save error also shows
inside the confirm dialog now, not just behind it.

The action buttons were unreachable without zooming out. The admin
layout wrapped each page in a plain block, so `h-full` page content
overran the viewport by the height of the tab bar and got clipped by the
`overflow-hidden` body. Making the layout a flex column bounds each page
to the space below the tabs, so the existing per-page scroll works and
the feature flags page scrolls like the Users/Orgs tabs. Also capped the
confirm dialog's diff list so its footer stays on screen when there are
many changes.
## Summary

Adds a "Duration and cost while paused" section to the human-in-the-loop
page. It explains that a HITL pause (a no-execute tool waiting on
`addToolOutput`) suspends the run and frees compute, so the human's
thinking time does not count against `maxDuration` (which measures
active CPU time and excludes suspended waitpoint time, the same as
`wait.for`). Customers don't need to raise `maxDuration` or end the run
to support long human waits.

This was a recurring point of confusion: readers assumed the pause holds
the run open and burns the budget. Also updates the how-it-works
pseudocode ("Agent suspends (compute freed)") and links `wait.for` and
`maxDuration` on first mention.
## Summary

Adds a "Stopping generation" section to the Custom agents page. It
documents how stop works when you drop down from `chat.agent` to
`chat.createSession`: pass `turn.signal` (a combined stop-and-cancel
`AbortSignal`) to `streamText`, and `turn.complete()` cleans up the
aborted partial, accumulates it as its own assistant message, and keeps
the run alive for the next turn. `turn.stopped` distinguishes a user
stop from a full run cancel.

Until now the createSession stop story only existed as scattered fields
in the reference table; the client side (`transport.stopGeneration`) and
the `chat.agent` run-callback signals were documented, but not the
custom-agent turn loop. Steering for these backends is already covered
on the pending messages page, which this page links to.
…lling routes (#3948)

## Summary

Several dashboard routes performed actions a restricted role should not
be able to do (cancel or replay runs, manage prompt versions, invite and
manage members, manage billing) without any permission check. This adds
role-based permission enforcement to those routes, and disables the
matching UI controls (with a tooltip) when the current role lacks
permission.

Covered actions:

- Runs: cancel and replay (single, bulk create, bulk abort)
- Prompts: create or edit override versions, and promote a version to
current
- Members: invite, resend invite, revoke invite
- Billing: change plan, billing alerts, and the customer portal

## How

Each affected route now goes through the `dashboardLoader` /
`dashboardAction` route builders with an `authorization` block declaring
the required permission (or a per-intent check where one route handles
several intents). Existing tenancy and data-scoping queries are
untouched; this only layers permission checks on top. The UI follows
disable-don't-hide: controls stay visible but disabled with a "You don't
have permission to ..." tooltip.

Two reusable pieces support this: `checkPermissions(ability, checks)`
turns a set of checks into a boolean map a loader returns to the client,
and `PermissionButton` / `PermissionLink` disable the underlying control
and show a tooltip when a permission flag is false.
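
The checks-to-boolean-map idea can be sketched as below. The `Ability` interface, the `PermissionCheck` shape, and the name `checkPermissionsSketch` are assumptions for illustration; the real `checkPermissions(ability, checks)` signature may differ.

```ts
// Hypothetical ability interface; a real RBAC plugin may expose something else.
interface Ability {
  can(action: string, resource: string): boolean;
}

interface PermissionCheck {
  action: string;
  resource: string;
}

// Evaluates each named check and returns a flag per name, so a loader can
// send plain booleans to the client instead of the ability object itself.
function checkPermissionsSketch<K extends string>(
  ability: Ability,
  checks: Record<K, PermissionCheck>
): Record<K, boolean> {
  const result = {} as Record<K, boolean>;
  for (const key of Object.keys(checks) as K[]) {
    const check = checks[key];
    result[key] = ability.can(check.action, check.resource);
  }
  return result;
}
```

A button component can then read a flag such as `canCancel` from loader data and render itself disabled with a tooltip when the flag is false, which matches the disable-don't-hide approach.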

## Behaviour

No change in the default configuration: permissions are permissive, so
every control stays enabled and every route behaves as before. The
checks only take effect when an RBAC plugin is installed. This also
makes role assignment on invite-accept non-fatal, so a failure there
cannot block joining an org.

Verified with `pnpm run typecheck --filter webapp`; `checkPermissions`
has unit tests.
A batch of technical-SEO fixes across the docs, all reader-facing
(titles, links, redirects):

- Canonicalize the duplicate CLI command pages: the bare `/cli-dev` and
`/cli-deploy` paths now permanently redirect to their `-commands`
equivalents, and a duplicate navigation entry is removed.
- Give the three pages that all rendered as "Overview" distinct titles
(Building with AI, self-hosting overview, Management API overview), with
sidebar labels unchanged.
- Replace the generic "Learn more" links in the introduction's
build-extension list with descriptive anchor text.
- Switch two http links to https in the Supabase guides, point a
troubleshooting page's help link to Discord, and add missing meta
descriptions to three help and troubleshooting pages.
## ✅ Checklist

- [x] I have followed every step in the [contributing
guide](https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md)
- [x] The PR title follows the convention.
- [x] I ran and tested the code works

---

## Testing

Ran the webapp locally with the change applied; it compiles and serves.
The edit only swaps the chart card title string from "LLM spend" to "LLM
spend ($)" on the agent landing page.

---

## Changelog

The agent dashboard "LLM spend" chart label now includes the currency
unit, reading "LLM spend ($)".

Pushes new organizations and users into the Attio CRM at signup time,
for Customer Success (TRI-10431).

- Orgs → Attio `workspaces`, users → Attio `users`, keyed on Attio's
built-in unique `workspace_id` / `user_id` so writes are idempotent
upserts.
- Runs on the common Redis worker (not inline), so a slow or unavailable
Attio never blocks the signup path; failures retry (3 attempts).
- Hooks: user-created (alongside the existing Loops call) and
org-created (`createOrganization`).
- Gated behind `ATTIO_API_KEY`: with no key the sync is skipped
entirely, so OSS / self-hosted installs are unaffected.

Only creation is covered here (the record "shell"); spend, runs, plan
changes, churn, and role/relationship linking are populated by the
scheduled full sync, tracked separately.

**Deploy note:** requires an Attio API key set as `ATTIO_API_KEY` in the
webapp env, with scopes **Records (read-write)** + **Object
Configuration (read)**; the assert/upsert endpoint reads object config
to resolve the matching attribute. Without the key the sync no-ops.

---------

Co-authored-by: Matt Aitken <matt@mattaitken.com>
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
## Summary

Adds the `4.5.0-rc.7` entry to the AI chat changelog, covering the
agent-facing changes in
[v4.5.0-rc.7](https://github.com/triggerdotdev/trigger.dev/releases/tag/v4.5.0-rc.7):

- `chat.headStart` now works with the `chat.customAgent` and
`chat.createSession` backends, not just `chat.agent`
- Opt-in Anthropic system-prompt caching via
`chat.toStreamTextOptions()`
- Three custom-agent-loop fixes: continuation replay, mid-stream
steering, and task-backed tools
- `trigger skills` follow-ups: `trigger-` namespacing, SDK-bundled docs,
and a new cost-savings skill

Generic, non-agent rc.7 items (the CLI uninitialized-project error
message, run-span cost fields) are intentionally left out to keep this
changelog scoped to AI chat agents.
## Summary

The Personal Access Tokens page now shows each token's maximum role in a
new column, so you can see at a glance what a token is capped to. The
column only appears when an RBAC plugin is installed, and shows "-" for
tokens with no cap. Its header tooltip reuses the same explanation shown
in the create-token panel.
## Summary

Replaces the multi-select popover task type filter on the Tasks page
with a single-select segmented control: **All** plus icon-only
**Agent**, **Standard**, and **Scheduled** segments. Each segment has a
tooltip showing its label and a number-key shortcut (0-3), and the
search field no longer autofocuses so the shortcuts work on page load.

## ✅ Checklist

- [x] I have followed every step in the [contributing
guide](https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md)
- [x] The PR title follows the convention.
- [x] I ran and tested the code works
#3997)

## Summary

Adds a short-lived, delegated token (`tr_uat_...`) that authenticates
against the API as a user without handing out a long-lived personal
access token. You mint one from a PAT, optionally narrow it to a set of
scopes, and give it a lifetime; the API then treats requests as that
user, subject to their role.

`trigger.dev mint-token` is the entry point (it uses your stored PAT):

```bash
UAT=$(trigger.dev mint-token --ttl 3600 --cap read:runs)
```

The token works anywhere a PAT does for user-level endpoints, and can be
exchanged for an environment JWT at `POST
/api/v1/projects/:ref/:env/jwt` to reach environment-scoped data (the
same exchange a PAT supports).

## How it works

A user-actor token is a short-lived JWT verified by a new first-class
`authenticateUserActor` method on the RBAC plugin. Self-hosters get a
built-in fallback; role-aware enforcement comes from the plugin.
Effective permissions are the intersection of the user's role and the
token's optional scope cap, so a token is only ever narrower than the
user, never broader.
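
The intersection rule can be shown in a few lines. This is a sketch under assumptions: permissions are modelled as plain strings in the same `read:runs` style as the `--cap` value, and `effectivePermissions` is a hypothetical name, not the plugin's API.

```ts
// Effective permissions: what the user's role grants, narrowed by the
// token's optional scope cap. With no cap, the role applies unchanged.
function effectivePermissions(
  rolePermissions: string[],
  scopeCap?: string[]
): string[] {
  if (scopeCap === undefined) return rolePermissions.slice();
  const cap: string[] = scopeCap;
  // Keep only permissions present in both sets. A scope the role lacks is
  // dropped, so a cap can never widen access.
  return rolePermissions.filter((permission) => cap.indexOf(permission) !== -1);
}
```

For a role granting `read:runs` and `write:runs`, a cap of `read:runs` leaves only `read:runs`, and a cap naming a permission the role does not hold adds nothing.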

Minting is restricted to personal access tokens (a token can't mint
another one, and an environment key can't mint one). Tokens default to a
1 hour lifetime (max 365 days). When exchanged for an environment JWT,
the user is stamped on it for attribution and the scope cap is carried
through.
## Summary

The CLAUDE.md audit job (`.github/workflows/claude-md-audit.yml`)
frequently hits its 15-turn cap before it finishes reviewing a PR, so
the job fails without posting a verdict. For example, the audit job
failed on [this
run](https://github.com/triggerdotdev/trigger.dev/actions/runs/27837408945/job/82390460772?pr=3990).

This raises `--max-turns` from 15 to 25 to give the review room to
complete, and pins `--model claude-opus-4-8` (the job previously
inherited the action default model).