[pull] main from triggerdotdev:main#217

Merged: pull[bot] merged 16 commits into Dustin4444:main from triggerdotdev:main on Jun 12, 2026
pull[bot] commented on Jun 12, 2026:

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

ericallam and others added 16 commits June 12, 2026 11:56
Vouches `saasjesus` as a contributor (vouch request #3915) so their PRs
clear the vouch check instead of being auto-closed.
…/plugins internal (#3919)

## What

`buildJwtAbility` — the decoder for public-token scope strings
(`read:tags:…`, `read:runs:run_abc`, `admin`, …) — now lives in
`@trigger.dev/plugins` as the single source of truth.
`@trigger.dev/rbac` re-exports it, so the built-in fallback and any auth
plugin interpret a token identically.

Scope strings are split on only the first **two** colons
(`action:type:id`), so a resource id that itself contains colons — e.g.
a tag like `user:123` — is matched in full rather than truncated to its
first segment. (The fallback already did this; this makes it the one
shared implementation.)
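
The two-colon split described above can be sketched as follows. This is a minimal illustration, not the actual `buildJwtAbility` code: `parseScope` and `ParsedScope` are hypothetical names, and treating a scope with one colon as `action:type` is an assumption.

```typescript
// Hypothetical shape; the real decoder is `buildJwtAbility` in `@trigger.dev/plugins`.
type ParsedScope = { action: string; type?: string; id?: string };

function parseScope(scope: string): ParsedScope {
  const first = scope.indexOf(":");
  if (first === -1) {
    return { action: scope }; // bare scope such as "admin"
  }
  const second = scope.indexOf(":", first + 1);
  if (second === -1) {
    // Assumption: a single colon means `action:type` with no resource id.
    return { action: scope.slice(0, first), type: scope.slice(first + 1) };
  }
  return {
    action: scope.slice(0, first),
    type: scope.slice(first + 1, second),
    // Everything after the second colon is the id, further colons included.
    id: scope.slice(second + 1),
  };
}

// A tag id containing a colon is kept whole: id is "user:123", not "user".
console.log(parseScope("read:tags:user:123"));
```

Splitting with `indexOf` rather than `split(":")` is what keeps the id intact, since `split` would break the id at every colon.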

`@trigger.dev/plugins` is now **private (unpublished)** and gains a
`@triggerdotdev/source` export condition, so consumers bundle it from
source per-commit like `@trigger.dev/core` instead of resolving a
published version — no cross-version coordination.

## Why

Two hand-maintained copies of the scope grammar drift, and the
difference silently changes what a token grants. One shared decoder
removes that class of bug.

## Notes

- No changeset: `@trigger.dev/plugins` is now private and
`@trigger.dev/rbac` is internal — neither is published.
- Unit coverage for the colon-id path lives in
`internal-packages/rbac/src/ability.test.ts` (now exercising the shared
function).
## Problem

The item-streaming endpoint of the two-phase batch API (`POST
/api/v3/batches/:batchId/items`) processed streamed items strictly
sequentially. For a batch of many large payloads, each offloaded to
object storage inline, this serialized N object-store round-trips inside
a single request and could exceed Node's default `server.requestTimeout`
(300s). The webapp then returned `408`, which the SDK reads as `408
terminated` and retries up to 5 times, turning a slow ingest into a
failure that takes tens of minutes to surface.

## Fix

Ingest now runs through `p-map` over the NDJSON async iterable with
bounded concurrency (`STREAMING_BATCH_INGEST_CONCURRENCY`, default 10):

- `p-map` pulls lazily from the stream, so at most `concurrency` items
are read and in-flight at once. Peak memory stays bounded to roughly
`concurrency × STREAMING_BATCH_ITEM_MAXIMUM_SIZE` and request-body
backpressure is preserved.
- Set the env to `1` for fully sequential ingestion (escape hatch).
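
As a rough illustration of the lazy, bounded pull described above, here is a hand-rolled stand-in for `p-map` over an async iterable. It is a sketch under stated assumptions, not the service code: `mapWithConcurrency` and `numbers` are hypothetical names, and the real implementation delegates this to `p-map`.

```typescript
// Runs `worker` over an async iterable with at most `concurrency` items
// pulled and in flight at once. Returns the number of items processed.
async function mapWithConcurrency<T>(
  source: AsyncIterable<T>,
  concurrency: number,
  worker: (item: T, index: number) => Promise<void>
): Promise<number> {
  const iterator = source[Symbol.asyncIterator]();
  let nextIndex = 0;
  let stopped = false;
  let tail: Promise<unknown> = Promise.resolve();

  // Pulls are chained so only one `next()` call is outstanding at a time.
  const pullNext = (): Promise<{ value: T; index: number } | undefined> => {
    const pulled = tail.then(async () => {
      if (stopped) return undefined;
      const result = await iterator.next();
      if (result.done) {
        stopped = true;
        return undefined;
      }
      return { value: result.value, index: nextIndex++ };
    });
    tail = pulled.catch(() => undefined);
    return pulled;
  };

  // Each lane pulls its next item only after finishing the previous one,
  // so the source is never read ahead of what is being processed.
  const lane = async (): Promise<void> => {
    for (;;) {
      const next = await pullNext();
      if (next === undefined) return;
      try {
        await worker(next.value, next.index);
      } catch (error) {
        stopped = true; // stop pulling new items once any item fails
        throw error;
      }
    }
  };

  await Promise.all(Array.from({ length: Math.max(1, concurrency) }, lane));
  return nextIndex;
}

// Hypothetical stand-in for the NDJSON item stream.
async function* numbers(count: number): AsyncGenerator<number> {
  for (let i = 0; i < count; i++) yield i;
}
```

With `concurrency` set to 1 there is a single lane, which reduces to the fully sequential behaviour of the escape hatch.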

## Why this is safe (ordering and idempotency unchanged)

- Ordering derives from each item's index (enqueue `timestamp =
batch.createdAt + index`), not enqueue order.
- Dedup is atomic per index in `enqueueBatchItem`.
- The NDJSON parser now stamps oversized-item markers with their emit
position, removing the consumer's sequential `lastIndex` assumption (the
only order-dependent bit).
- The count-check and conditional-seal path is untouched.
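
To see why out-of-order completion is harmless, consider this small in-memory model of the first two points. It is only a sketch: `BatchQueueModel` is a hypothetical name, and the real dedup in `enqueueBatchItem` is atomic in the backing store rather than a `Map` lookup.

```typescript
type EnqueuedItem = { index: number; timestamp: number };

// Ordering key from the description above: batch.createdAt + index.
function enqueueTimestamp(batchCreatedAtMs: number, index: number): number {
  return batchCreatedAtMs + index;
}

class BatchQueueModel {
  private readonly byIndex = new Map<number, EnqueuedItem>();
  private readonly createdAtMs: number;

  constructor(createdAtMs: number) {
    this.createdAtMs = createdAtMs;
  }

  // Returns false when the index was already enqueued (per-index dedup).
  enqueue(index: number): boolean {
    if (this.byIndex.has(index)) return false;
    const timestamp = enqueueTimestamp(this.createdAtMs, index);
    this.byIndex.set(index, { index, timestamp });
    return true;
  }

  // Items come back in index order regardless of arrival order.
  orderedIndexes(): number[] {
    return Array.from(this.byIndex.values())
      .sort((a, b) => a.timestamp - b.timestamp)
      .map((item) => item.index);
  }
}
```

Enqueuing indexes 2, 0, 1 and then 1 again in this model yields the order 0, 1, 2, with the repeated index rejected.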

## Scope

This speeds up every batch ingested through the streaming endpoint, not
just large-payload batches. Each item does a per-item Redis enqueue
regardless of size, and those now overlap. Large payloads benefit most
because they add an object-store offload round-trip on top of the
enqueue.

## Verification

Added an integration test (`streamBatchItems.test.ts`) that drives the
real service against Postgres + Redis + RunEngine and times a 150-item
batch at increasing concurrency. Object-store offload is modelled as a
fixed per-item latency (local round-trips are too small to compare
meaningfully):

```
runCount=150
  large payloads (10ms/item offload):
    concurrency=1   1739ms
    concurrency=10  192ms  (9.1x faster)
    concurrency=50  57ms   (30.7x faster)
  small payloads (Redis enqueue only, no offload):
    concurrency=1   90ms
    concurrency=10  24ms   (3.7x faster)
```

The test asserts correctness at every concurrency (all items accepted,
sealed, enqueued exactly once), that parallel ingest beats the
sequential floor, and that the small-payload case is strictly faster
than sequential, so the win is not specific to large payloads.

Also exercised end-to-end over real HTTP against a local server: a
20-item batch (12MB body) ingests and seals, a re-stream of the sealed
batch returns `sealed: true` with zero re-accepted items (idempotent
retry), and an oversized item still seals at its correct index.

Existing coverage stays green: concurrent ingest of a 100-item batch,
in-flight processing never exceeding the configured concurrency,
concurrent dedup on streaming retry, and emit-position marker indexing.

## Follow-ups (not in this PR)

- SDK pre-offload of large item payloads (send `application/store` refs
instead of raw blobs) to remove object-store work from the request hot
path and shrink the request body.
- Optional `server.requestTimeout` bump as a safety net.

## CI fix

Added `.github/workflows/codeql.yml` to replace GitHub's automatic
("dynamic") CodeQL scanning. The dynamic setup was failing to upload
SARIF results because the auto-generated `GITHUB_TOKEN` lacked the
`security-events: write` permission. The explicit workflow grants that
permission at the job level and pins all actions to commit SHAs,
consistent with the repo's security conventions.

## ✅ Checklist

- [ ] I have followed every step in the [contributing
guide](https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md)
- [ ] The PR title follows the convention.
- [ ] I ran and tested the code works

---

## Testing

- Integration test (`streamBatchItems.test.ts`) validates correctness
and performance at concurrency 1, 10, and 50 for both large and small
payloads.
- End-to-end verified over real HTTP: 20-item/12MB batch ingests and
seals, idempotent retry returns `sealed: true`, oversized item seals at
correct index.

---

## Changelog

Streaming batch ingest now processes items with bounded concurrency
instead of one at a time, so batches of many large payloads ingest far
faster and no longer time out. Concurrency is configurable via
`STREAMING_BATCH_INGEST_CONCURRENCY` (default 10); set it to 1 for fully
sequential ingestion.


---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…3923)

`@trigger.dev/plugins` re-exported
`sanitizeBranchName`/`isValidGitBranchName` from `@trigger.dev/core` as
a convenience forwarder. Nothing actually imports them through this
package — every consumer (webapp, `@trigger.dev/rbac`, …) imports them
directly from `@trigger.dev/core/v3/utils/gitBranch`.

Removing the forwarder keeps the package entry free of **runtime** core
imports (only type re-exports + `buildJwtAbility` remain), so consumers
that bundle `@trigger.dev/plugins` from source don't pull an unrelated
core subpath into their build.

No behavior change; the helpers remain available from
`@trigger.dev/core` where they're defined.
… 404ing fresh sessions (#3914)

## Summary

Two read-replica races on the session APIs could break chats whose first
activity lands inside the replication window (or any time the replica
lags):

1. A session's first `.in` append or `.out` subscribe could fail with a
404 for a session that exists on the writer, because the route resolved
the Session row on the replica only.
2. `ensureRunForSession` probed run liveness on the replica, so a probe
miss on a run triggered moments earlier was judged "run is dead" and a
second live run was spawned for the same session. Both runs then
consumed the same input stream, producing duplicated turns and doubled
responses (and doubled LLM cost).

## Fix

Liveness now re-probes the writer before declaring the current run dead
(the old code already fell back to the writer, but only to recover the
friendlyId, after the wrong verdict was made). Session resolution on the
append and subscribe/init routes goes through a new
`resolveSessionWithWriterFallback`, which stays replica-first on the hot
path and only touches the writer on a miss.
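
The replica-first lookup can be sketched as below. This is an illustration of the pattern only; `resolveWithWriterFallback` and `Lookup` are hypothetical names, and the real `resolveSessionWithWriterFallback` signature is not shown in this PR description.

```typescript
// A lookup returns the row, or null when this database does not have it (yet).
type Lookup<T> = (id: string) => Promise<T | null>;

async function resolveWithWriterFallback<T>(
  id: string,
  replica: Lookup<T>,
  writer: Lookup<T>
): Promise<T | null> {
  // Hot path: the replica answers and the writer is never touched.
  const fromReplica = await replica(id);
  if (fromReplica !== null) return fromReplica;
  // Miss: the row may exist on the writer but not be replicated yet.
  return writer(id);
}
```

A `null` from this function means both databases missed, which is the only case where a 404 is a correct answer.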

Reproduced and verified against a local streaming replica with an
artificial apply delay: pre-fix, a send immediately after session
creation reliably produced either the 404 or two executing runs with a
doubled response; post-fix, the same flow produces exactly one run and
one response.

Also rides along: the local docker replica's default apply delay drops
from 150ms to a realistic 20ms (override via `REPLICA_APPLY_DELAY` when
you want to deliberately widen the race window).
…tinuation boots (#3920)

## Summary

Fixes two `chat.createSession()` bugs that broke chats built at that
abstraction level:

1. **Stopping a generation wedged the run forever.** `turn.complete()`
bare-awaited the AI SDK's `totalUsage` promise, which never settles
after a stop-abort. The run stayed stuck inside the stopped turn (trace
shows a permanently partial `ai.streamText` span and no further `waiting
for next message`), so the chat could never take another message. Fixed
with the same 2s `Promise.race` guard `chat.agent`'s turn loop already
uses.

2. **Continuation runs invoked the model with an empty prompt.** The
first turn only waited for a message on `preload` boots. A continuation
run (spawned after a cancel, crash, or version upgrade) arrives with the
boot payload stripped, so the loop ran a turn with zero messages and
errored with `AI_InvalidPromptError: messages must not be empty`.
Message-less continuation boots now wait for the next session input
("waiting for first message (continuation)"), and `turn.continuation` is
preserved across the wait so user code can seed stored history off it.
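
The `Promise.race` guard from item 1 can be sketched as follows. This is a minimal illustration, not the SDK code: `raceWithTimeout` is a hypothetical helper, and what the turn loop substitutes when the timeout wins (the `fallback` here) is an assumption.

```typescript
// Resolves with `promise` if it settles within `ms`, otherwise with `fallback`.
async function raceWithTimeout<T>(
  promise: Promise<T>,
  ms: number,
  fallback: T
): Promise<T> {
  let cancel = () => {};
  const timeout = new Promise<T>((resolve) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    cancel = () => clearTimeout(timer);
  });
  try {
    // Whichever settles first wins; a promise that never settles loses to the timer.
    return await Promise.race([promise, timeout]);
  } finally {
    cancel(); // avoid leaving a pending timer behind when `promise` wins
  }
}
```

A call such as `raceWithTimeout(totalUsage, 2000, undefined)` would then return after at most two seconds even when `totalUsage` never settles, instead of awaiting it forever.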

Both reproduced and verified end-to-end against a live environment (stop
followed by a next turn; cancel followed by a continuation turn with
seeded history), plus the existing unit suite.
Adds the purchase UI for extra schedules, mirroring the preview branches add-on.

## Changes
- `setSchedulesAddOn` platform client + `SetSchedulesAddOnService`
(purchase + quota-increase via Plain).
- `ScheduleListPresenter` surfaces add-on / quota / pricing;
`checkSchedule` counts purchased schedules toward the limit (`base +
purchased`).
- `PurchaseSchedulesModal` on the Schedules page — bought in **bundles
of 1,000 ($10/mo each)**; bundle increments enforced client-side and in
the action's zod schema.
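
The bundle rule and the `base + purchased` limit can be illustrated in plain TypeScript. This is a sketch, not the PR's code: the real validation lives in a zod schema, and `validateScheduleQuantity` and `scheduleLimit` are hypothetical names.

```typescript
const BUNDLE_SIZE = 1000; // schedules per bundle
const BUNDLE_PRICE_USD_PER_MONTH = 10;

type QuantityCheck =
  | { ok: true; bundles: number; monthlyUsd: number }
  | { ok: false; reason: string };

// Accepts only whole, non-negative multiples of the bundle size.
function validateScheduleQuantity(quantity: number): QuantityCheck {
  if (!Number.isInteger(quantity) || quantity < 0) {
    return { ok: false, reason: "quantity must be a non-negative integer" };
  }
  if (quantity % BUNDLE_SIZE !== 0) {
    return { ok: false, reason: `quantity must be a multiple of ${BUNDLE_SIZE}` };
  }
  const bundles = quantity / BUNDLE_SIZE;
  return { ok: true, bundles, monthlyUsd: bundles * BUNDLE_PRICE_USD_PER_MONTH };
}

// Purchased schedules count toward the limit: base + purchased.
function scheduleLimit(base: number, purchased: number): number {
  return base + purchased;
}
```

Running the same check on the client and in the server action means a hand-crafted request cannot buy a partial bundle.
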
## Summary
7 improvements, 1 bug fix.

## Improvements
- `trigger init` now sets up your AI coding assistant as part of project
setup: pick the MCP server, the agent skills, or both, then scaffold
with the CLI or hand off to your assistant. Adds a new `getting-started`
agent skill that teaches assistants how to bootstrap Trigger.dev
(install the SDK, write `trigger.config.ts`, create a first task, run
`trigger dev`), so the AI-driven setup path works end to end. It ships
in the CLI alongside the existing skills, version-matched to your SDK.
([#3872](#3872))
- `dev` and `deploy` now fail with a clear error when two tasks are
defined with the same id, including across different task types (e.g. a
scheduled task and a regular task sharing an id). Previously the second
definition silently overwrote the first, so one of the tasks would
vanish with no warning. Duplicate task ids are now detected during
indexing (naming each offending id and the files it was found in), and
the same rule is enforced server-side when the background worker is
registered.
([#3865](#3865))
- `trigger skills` installs Trigger.dev agent skills into your coding
agent so it knows how to write tasks, schedules, realtime, and
chat.agent code. The skills ship with the CLI and are copied into each
tool's native skills directory (Claude Code, Cursor, GitHub Copilot, and
Codex / AGENTS.md), and `trigger dev` offers to install them on first
run. ([#3868](#3868))
- Reliability fixes for `chat.agent`. A user message sent while the
agent is streaming is no longer delivered twice (which could run a
duplicate turn), input appends now carry an idempotency key so a retried
send can't duplicate a message, stopping a generation clears the
streaming state so a page reload doesn't replay the stopped turn, and
runs can now carry the full set of dashboard tags instead of being
silently truncated. `onTurnComplete` now fires on errored turns (with
the thrown error attached) and the failed turn's user message is
persisted so it isn't lost on the next run. Custom agents and manual
`chat.writeTurnComplete` callers now trim the output stream, sending a
custom action no longer leaves a second stream reader running, and a
long-lived `watch` subscription no longer grows its dedupe set without
bound. ([#3891](#3891))
- Continuation chat boots no longer stall for around 10 seconds before
the first turn. The `session.in` resume cursor is now found with a
non-blocking records read instead of draining an SSE long-poll (which
always waited out its full 5 second inactivity window, twice per boot),
the boot reads run concurrently, and chat snapshots carry the cursor so
subsequent boots skip the scan entirely.
([#3907](#3907))
- Record client-side dequeue API latency in the supervisor consumer pool
as a Prometheus histogram
(`queue_consumer_pool_dequeue_duration_seconds`, labelled by `outcome`:
success/empty/error).
([#3887](#3887))
- Add `GetProjectEnvironmentsResponseBody` and `ProjectEnvironment`
schemas for the new `GET /api/v1/projects/{projectRef}/environments`
endpoint, which lists the parent environments (dev, staging, preview,
prod) a personal access token can access for a project. Dev is scoped to
the token owner and branch (preview child) environments are excluded.
([#3880](#3880))
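
For the duplicate task id check in the list above, the indexing-time detection could look roughly like this. It is a sketch only: `IndexedTask` and `findDuplicateTaskIds` are hypothetical names, not the CLI's actual types.

```typescript
type IndexedTask = { id: string; filePath: string };

// Maps each duplicated task id to every file that defines it.
function findDuplicateTaskIds(tasks: IndexedTask[]): Map<string, string[]> {
  const filesById = new Map<string, string[]>();
  for (const task of tasks) {
    const files = filesById.get(task.id) ?? [];
    files.push(task.filePath);
    filesById.set(task.id, files);
  }
  const duplicates = new Map<string, string[]>();
  filesById.forEach((files, id) => {
    if (files.length > 1) duplicates.set(id, files);
  });
  return duplicates;
}
```

Grouping by id alone, with no regard to task type, is what catches a scheduled task and a regular task sharing an id.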

## Bug fixes
- Fix two `chat.createSession()` bugs: stopping a generation no longer
wedges the run (the turn loop raced a `totalUsage` promise that never
settles after a stop-abort), and continuation runs now wait for the next
message instead of invoking the model with an empty prompt.
([#3920](#3920))

<details>
<summary>Raw changeset output</summary>

⚠️⚠️⚠️⚠️⚠️⚠️

`main` is currently in **pre mode** so this branch has prereleases
rather than normal releases. If you want to exit prereleases, run
`changeset pre exit` on `main`.

⚠️⚠️⚠️⚠️⚠️⚠️

# Releases
## @trigger.dev/build@4.5.0-rc.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.6`

## trigger.dev@4.5.0-rc.6

### Patch Changes

- `trigger init` now sets up your AI coding assistant as part of project
setup: pick the MCP server, the agent skills, or both, then scaffold
with the CLI or hand off to your assistant. Adds a new `getting-started`
agent skill that teaches assistants how to bootstrap Trigger.dev
(install the SDK, write `trigger.config.ts`, create a first task, run
`trigger dev`), so the AI-driven setup path works end to end. It ships
in the CLI alongside the existing skills, version-matched to your SDK.
([#3872](#3872))

- `dev` and `deploy` now fail with a clear error when two tasks are
defined with the same id, including across different task types (e.g. a
scheduled task and a regular task sharing an id). Previously the second
definition silently overwrote the first, so one of the tasks would
vanish with no warning. Duplicate task ids are now detected during
indexing (naming each offending id and the files it was found in), and
the same rule is enforced server-side when the background worker is
registered.
([#3865](#3865))

- `trigger skills` installs Trigger.dev agent skills into your coding
agent so it knows how to write tasks, schedules, realtime, and
chat.agent code. The skills ship with the CLI and are copied into each
tool's native skills directory (Claude Code, Cursor, GitHub Copilot, and
Codex / AGENTS.md), and `trigger dev` offers to install them on first
run. ([#3868](#3868))

    ```bash
    trigger skills --target claude-code
    ```

    Replaces the previous `install-rules` command, which stays as an alias.

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.6`
    -   `@trigger.dev/build@4.5.0-rc.6`
    -   `@trigger.dev/schema-to-json@4.5.0-rc.6`

## @trigger.dev/core@4.5.0-rc.6

### Patch Changes

- Reliability fixes for `chat.agent`. A user message sent while the
agent is streaming is no longer delivered twice (which could run a
duplicate turn), input appends now carry an idempotency key so a retried
send can't duplicate a message, stopping a generation clears the
streaming state so a page reload doesn't replay the stopped turn, and
runs can now carry the full set of dashboard tags instead of being
silently truncated. `onTurnComplete` now fires on errored turns (with
the thrown error attached) and the failed turn's user message is
persisted so it isn't lost on the next run. Custom agents and manual
`chat.writeTurnComplete` callers now trim the output stream, sending a
custom action no longer leaves a second stream reader running, and a
long-lived `watch` subscription no longer grows its dedupe set without
bound. ([#3891](#3891))
- Continuation chat boots no longer stall for around 10 seconds before
the first turn. The `session.in` resume cursor is now found with a
non-blocking records read instead of draining an SSE long-poll (which
always waited out its full 5 second inactivity window, twice per boot),
the boot reads run concurrently, and chat snapshots carry the cursor so
subsequent boots skip the scan entirely.
([#3907](#3907))
- Record client-side dequeue API latency in the supervisor consumer pool
as a Prometheus histogram
(`queue_consumer_pool_dequeue_duration_seconds`, labelled by `outcome`:
success/empty/error).
([#3887](#3887))
- `dev` and `deploy` now fail with a clear error when two tasks are
defined with the same id, including across different task types (e.g. a
scheduled task and a regular task sharing an id). Previously the second
definition silently overwrote the first, so one of the tasks would
vanish with no warning. Duplicate task ids are now detected during
indexing (naming each offending id and the files it was found in), and
the same rule is enforced server-side when the background worker is
registered.
([#3865](#3865))
- Add `GetProjectEnvironmentsResponseBody` and `ProjectEnvironment`
schemas for the new `GET /api/v1/projects/{projectRef}/environments`
endpoint, which lists the parent environments (dev, staging, preview,
prod) a personal access token can access for a project. Dev is scoped to
the token owner and branch (preview child) environments are excluded.
([#3880](#3880))

## @trigger.dev/python@4.5.0-rc.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/sdk@4.5.0-rc.6`
    -   `@trigger.dev/core@4.5.0-rc.6`
    -   `@trigger.dev/build@4.5.0-rc.6`

## @trigger.dev/react-hooks@4.5.0-rc.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.6`

## @trigger.dev/redis-worker@4.5.0-rc.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.6`

## @trigger.dev/rsc@4.5.0-rc.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.6`

## @trigger.dev/schema-to-json@4.5.0-rc.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.6`

## @trigger.dev/sdk@4.5.0-rc.6

### Patch Changes

- Reliability fixes for `chat.agent`. A user message sent while the
agent is streaming is no longer delivered twice (which could run a
duplicate turn), input appends now carry an idempotency key so a retried
send can't duplicate a message, stopping a generation clears the
streaming state so a page reload doesn't replay the stopped turn, and
runs can now carry the full set of dashboard tags instead of being
silently truncated. `onTurnComplete` now fires on errored turns (with
the thrown error attached) and the failed turn's user message is
persisted so it isn't lost on the next run. Custom agents and manual
`chat.writeTurnComplete` callers now trim the output stream, sending a
custom action no longer leaves a second stream reader running, and a
long-lived `watch` subscription no longer grows its dedupe set without
bound. ([#3891](#3891))
- Continuation chat boots no longer stall for around 10 seconds before
the first turn. The `session.in` resume cursor is now found with a
non-blocking records read instead of draining an SSE long-poll (which
always waited out its full 5 second inactivity window, twice per boot),
the boot reads run concurrently, and chat snapshots carry the cursor so
subsequent boots skip the scan entirely.
([#3907](#3907))
- Fix `chat.headStart` when `hydrateMessages` is registered. The warm
route's step-1 partial now reaches the agent's accumulator on the
hydrate path, so `onTurnComplete` carries the full first turn (the
head-start user message included), tool-call handovers resume from step
2 instead of re-running step 1, and the assistant `messageId` stays
stable across the handover.
([#3907](#3907))
- Preserve reasoning parts across the `chat.headStart` handover.
Extended-thinking models' step-1 reasoning now lands in the durable
session history (and `onTurnComplete`) under the same assistant
`messageId`, with provider metadata intact so Anthropic thinking
signatures survive replays.
([#3907](#3907))
- Fix two `chat.createSession()` bugs: stopping a generation no longer
wedges the run (the turn loop raced a `totalUsage` promise that never
settles after a stop-abort), and continuation runs now wait for the next
message instead of invoking the model with an empty prompt.
([#3920](#3920))
-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.6`

## @trigger.dev/plugins@4.5.0-rc.6

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.6`

</details>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…3694)

## Summary

Docs follow-up for #3683 (`TriggerClient` for per-instance SDK
configuration). Adds a dedicated reference page and threads the new
pattern through the existing management + preview-branches docs.

## What's in

**New page** `docs/management/multiple-clients.mdx` — when to use
`TriggerClient` vs `configure()` vs `auth.withAuth`, env-var fallback
rules, isolation contract, namespace surface, `inheritContext` opt-in,
and a when-to-use-what table.

**Updated pages**

- `docs/management/authentication.mdx` — rewrote the `auth.withAuth`
section to reflect the now-ALS-backed semantics (the prior version
warned about concurrency races and pointed at issue #3298 as a tracked
fix; that fix landed in #3683). Added `tr_preview_*` to the key prefix
list. Reframed the multi-target use case to lead with `TriggerClient`,
with `auth.withAuth` as the temporary-override helper.
- `docs/management/overview.mdx` — added a `Multiple clients in one
process` subsection.
- `docs/deployment/preview-branches.mdx` — added a `Triggering across
multiple branches from one process` example.
- `docs/triggering.mdx` — one-liner pointing at the new page for
cross-project triggering.
- `docs/docs.json` — slotted `management/multiple-clients` into the
Management API nav, right after authentication.

Paired with #3683.

## Test plan

- [ ] Mintlify preview renders cleanly
- [ ] Code samples in each updated page run as documented
- [ ] Cross-page links resolve (`/management/multiple-clients`,
`/management/authentication`)
…3871)

## Summary

Updates the AI-tooling docs for the new `trigger skills` installer that
shipped in #3868. The Skills page now documents `trigger skills` (skills
bundled with the CLI, version-matched to your SDK) and the four bundled
skills: `authoring-tasks`, `realtime-and-frontend`,
`authoring-chat-agent`, `chat-agent-advanced`. The old Agent Rules page
becomes a short "rules are now skills" redirect (kept because existing
redirects and the CLI link point at it), and the Building with AI
overview collapses the three-way Skills/Rules/MCP comparison into Skills
vs MCP.

Hold until the v4.5 CLI release ships, since `trigger skills` is not on
npm until then.
## Summary

Accuracy fixes across the AI chat docs: drop the non-existent per-call
option from `transport.preload`, clarify that `onValidateMessages` only
fires on turns carrying incoming messages, soften the turn-complete
token-refresh wording (the header is optional), document the new
`onTurnComplete` `error` field and `finishReason`, and correct the
idle-timeout default to 30 seconds.
… page (#3908)

## Summary

Two documentation improvements for the AI chat docs.

**Head-start persistence contract.** The fast starts page now documents
what your hooks can rely on across a head-start handover: one stable
assistant `messageId` for the whole turn, `onTurnComplete` as the
canonical persistence point, reasoning parts flowing into durable
history, and how Head Start composes with `hydrateMessages` (the
first-turn history arrives as `incomingMessages`, and the runtime
splices the warm partial onto the hydrated chain, deduplicated by id).
The hydrate examples on the lifecycle hooks and database persistence
pages now upsert their conversation row, since head-start first turns
run without a preload to create it.
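
The "spliced onto the hydrated chain, deduplicated by id" step can be pictured with a small sketch. Everything here is hypothetical (`ChatMessage`, `spliceDedupById`), and letting the warm-partial copy replace a hydrated message with the same id is an assumption about which copy wins.

```typescript
type ChatMessage = { id: string; role: "user" | "assistant"; text: string };

// Appends warm-partial messages to the hydrated chain, one entry per id.
function spliceDedupById(
  hydrated: ChatMessage[],
  warmPartial: ChatMessage[]
): ChatMessage[] {
  const merged = hydrated.slice();
  const positions = new Map<string, number>();
  merged.forEach((message, index) => positions.set(message.id, index));

  for (const message of warmPartial) {
    const at = positions.get(message.id);
    if (at === undefined) {
      positions.set(message.id, merged.length);
      merged.push(message);
    } else {
      // Assumption: the warm partial is the newer copy, so it replaces in place.
      merged[at] = message;
    }
  }
  return merged;
}
```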

**Sessions page.** The page opened with "a durable, task-bound,
bi-directional I/O channel pair", which reads as jargon and omitted run
orchestration entirely. It now leads with the plain mental model (a pair
of durable streams: input carries user messages, output carries
everything the agent produces) plus the Session's role orchestrating
runs, a diagram, a minimal runnable example, and a section on the
one-session-many-runs lifecycle.

Documents behavior shipping in
[#3907](#3907).
…ding-agents anatomy entry (#3921)

## Summary

Documents the two lower-level chat backend APIs and restructures the
Building agents section so it has a sane reading order.

**Custom agents page.** `chat.customAgent()` was effectively
undocumented (one passing mention) and `chat.createSession()` was buried
at the bottom of the Backend page, prompted by a customer asking whether
dropping down a level was supported at all. Both now live on one
dedicated page framed as a composition: register with `customAgent`,
then drive turns with the managed `createSession` iterator or a
hand-rolled primitives loop. The page covers the patterns the managed
lifecycle otherwise handles for you, each verified against a running
agent: seeding history on continuation runs (and why the seed must go
through the turn-0 `addIncoming`, which replaces the accumulator),
persisting the user message before streaming so a mid-stream reload
keeps it, racing `totalUsage` after a stop so the loop cannot wedge, and
the single-message wire shape.

**Backend page.** Now leads with a decision table across the three
abstraction levels and focuses on `chat.agent()`, routing to the new
page. Stale examples that read a plural `messages` field off the wire
payload are fixed (copy-pasting them broke turn accumulation), and the
ChatSessionOptions / ChatTurn reference tables gain their missing rows
(`compaction`, `pendingMessages`, usage fields, `setMessages`,
`prepareStep`).

**Anatomy page + reorder.** The Building agents group opened with the
long How it works mechanics page, a wall right after the Quick Start. A
short Anatomy page now leads the group: the three moving parts, one
annotated example where each region names the page that covers it, and a
routing table. How it works moves to the end of the group as the depth
payoff, matching where peer docs put their internals pages.

All pages visually verified against a local Mintlify build; cross-links
and anchors updated across the section.
## Summary

Adds the 4.5.0-rc.6 entry to the AI chat changelog, covering the
chat-facing items shipping in
[#3870](#3870): the
chat.agent reliability batch, the continuation boot latency fix, the
chat.headStart hydration and reasoning fixes, the chat.createSession
stop and continuation fixes, and the new trigger skills installer.

Should merge alongside the release so the changelog matches the
published version.
)

## Summary

Running `trigger.dev dev` before setting up a project crashed with a raw
`Cannot find matching package.json` stack trace from a transitive
dependency, instead of telling the user what to do next. It happens
whenever `dev` (or `update`) runs in a directory with no `package.json`
in it or any parent directory, for example right after creating an empty
project folder, or when `init` was exited before it scaffolded anything.

The CLI now detects the missing project and prints actionable guidance
pointing at `init`.

## Fix

`dev` runs an embedded package-version check before it loads any project
config. That check resolved `package.json` through a helper that throws
when nothing is found up the tree, and nothing caught it. It is now
wrapped, so a missing `package.json` produces a clear "run init" message
and a clean exit.

The config loader had the same latent crash on the `--skip-update-check`
path. Its resolvers for `package.json`, the lockfile, and the workspace
root all ran before the friendly "couldn't find your trigger.config.ts"
check, so any of them throwing masked it. That check now runs first and
short-circuits before the resolvers touch the filesystem.
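
The wrapped lookup can be sketched like this. It is an illustration only: `resolvePackageJson` stands in for the throwing helper, `checkProject` for the wrapper, the `exists` predicate replaces real filesystem access, and the message wording is invented for the example.

```typescript
// Walks up from `dir` and throws when no package.json exists in any parent.
function resolvePackageJson(dir: string, exists: (path: string) => boolean): string {
  let current = dir;
  for (;;) {
    const candidate = current === "/" ? "/package.json" : `${current}/package.json`;
    if (exists(candidate)) return candidate;
    if (current === "/") throw new Error("Cannot find matching package.json");
    current = current.slice(0, current.lastIndexOf("/")) || "/";
  }
}

type ProjectCheck =
  | { ok: true; packageJson: string }
  | { ok: false; message: string };

// Converts the raw throw into guidance the CLI can print before exiting.
function checkProject(dir: string, exists: (path: string) => boolean): ProjectCheck {
  try {
    return { ok: true, packageJson: resolvePackageJson(dir, exists) };
  } catch {
    return {
      ok: false,
      message: "No package.json found here or in any parent directory. Run `init` first.",
    };
  }
}
```

Returning a result instead of rethrowing lets the caller choose a clean exit code without a stack trace reaching the user.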

Verified live: in an empty directory, `dev`, `dev --skip-update-check`,
and `update` all print a "run init" message and exit cleanly; in a
configured project, `dev` still resolves config and boots normally.
…ads (#3930)

## Summary

`triggerAndWait` (and other locked-version triggers) could
intermittently fail with `Task '<id>' not found on locked version
'<version>'` for a task that was registered on that version. The
failures came in bursts and recovered on their own, so a retry minutes
later would succeed.

## Root cause

For a locked-version trigger, the queue resolver looks up the task's
`BackgroundWorkerTask` metadata from the read replica (behind a Redis
cache). On a cache miss it queried the replica, and a `null` result was
treated as "task not registered" and turned into a non-retryable 422. A
read replica can return an empty result for a row that already exists on
the primary, so a momentarily-behind replica produced a false negative
even though the locked worker (resolved on the primary in the same
request) clearly had the task.

## Fix

On a cache miss, when the replica returns no row the resolver now
re-checks the primary before concluding the task is missing. If the
primary has the row it is used (and the cache is back-filled); the error
fires only when the primary genuinely lacks it, which is the only case
where the 422 is correct. The extra read happens on the
cache-miss-and-replica-empty path only, so the hot path is unchanged.
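
Adding the cache layer to the replica-then-primary lookup gives roughly the following shape. This is a sketch under stated assumptions: `resolveTaskMetadata`, `TaskMetadata`, and the in-memory `Map` standing in for the Redis cache are hypothetical, and caching replica hits as well as primary hits is assumed.

```typescript
type TaskMetadata = { slug: string; queueName: string };
type Reader = (key: string) => Promise<TaskMetadata | null>;

async function resolveTaskMetadata(
  key: string,
  cache: Map<string, TaskMetadata>,
  replica: Reader,
  primary: Reader
): Promise<TaskMetadata> {
  const cached = cache.get(key);
  if (cached !== undefined) return cached; // hot path: no database read

  // The primary is consulted only when the replica returns no row.
  const row = (await replica(key)) ?? (await primary(key));
  if (row === null) {
    // Only now is "not found" a trustworthy verdict.
    throw new Error(`Task '${key}' not found on locked version`);
  }
  cache.set(key, row); // back-fill so later lookups skip both databases
  return row;
}
```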

Verified with a unit test (replica stub vs. real primary) and end-to-end
against a local streaming replica with replication paused to reproduce
the stale read.

TRI-10868
pull[bot] locked and limited conversation to collaborators Jun 12, 2026
pull[bot] added the ⤵️ pull label Jun 12, 2026
pull[bot] merged commit 5232067 into Dustin4444:main Jun 12, 2026