[pull] main from triggerdotdev:main #114

Merged: pull[bot] merged 3 commits into Dustin4444:main from triggerdotdev:main on May 12, 2026
@pull pull Bot commented May 12, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

ericallam added 3 commits May 12, 2026 18:37
… queues (#3558)

## Summary

Queues that use concurrency keys can no longer bypass the per-queue
length cap, and the "Queued | Running" columns in the dashboard now show
the true total across all CK variants instead of 0.

The cap and the dashboard both relied on `ZCARD` of the base queue key,
but CK-keyed runs live under `<base>:ck:<variant>` keys. Any queue that
used concurrency keys read 0 — letting a single CK variant grow
unbounded past the user's configured cap.
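
To make the failure mode concrete, here is a minimal sketch that models the key layout with an in-memory map rather than Redis. The queue name, the cap value, and the `zcard`/`zadd` helpers are hypothetical stand-ins; only the `<base>:ck:<variant>` key shape comes from the description above.

```typescript
// In-memory stand-in for Redis sorted sets: key -> set of run IDs (scores omitted).
const zsets = new Map<string, Set<string>>();

// Mimics ZCARD: a missing key has cardinality 0.
function zcard(key: string): number {
  return zsets.get(key)?.size ?? 0;
}

// Mimics ZADD for a single member (the score is ignored in this sketch).
function zadd(key: string, member: string): void {
  const set = zsets.get(key) ?? new Set<string>();
  set.add(member);
  zsets.set(key, set);
}

const base = "queue:my-task"; // hypothetical base queue key
const cap = 2; // hypothetical per-queue length cap

// CK-keyed runs land under <base>:ck:<variant>, never under the base key.
zadd(`${base}:ck:tenant-a`, "run_1");
zadd(`${base}:ck:tenant-a`, "run_2");
zadd(`${base}:ck:tenant-b`, "run_3");

// A check that only reads the base key sees nothing, so it never rejects.
const lengthSeenByOldCheck = zcard(base);
const oldCheckWouldReject = lengthSeenByOldCheck >= cap;

// The real backlog is the sum across the variant keys.
const trueLength = zcard(`${base}:ck:tenant-a`) + zcard(`${base}:ck:tenant-b`);
```

In this model the base-key check reads 0 while three runs are queued, which is how a single CK variant could grow past the cap.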

## Fix

Two per-base-queue counters are maintained inside the CK Lua scripts:
`<base>:lengthCounter` and `<base>:runningCounter`. Non-CK
enqueue/dequeue paths are untouched.

Counters are lazy-initialized the first time a CK enqueue (or nack)
lands on a queue: the Lua script sums `ZCARD` across the variants
tracked by `ckIndex`, sets the counter, then `INCR`s. Pre-existing CK
backlog on already-populated queues is captured automatically — no batch
migration required.
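
The lazy-init order described above (sum, set, then increment) can be sketched in memory as follows. This is not the Lua script from the PR: the shape of `ckIndex` (assumed here to be a set of variant keys per base queue), the `enqueueWithCk` function, and the queue names are all assumptions made for illustration.

```typescript
// In-memory stand-ins for Redis; the real logic lives in Lua scripts not shown here.
const zsets = new Map<string, Set<string>>(); // sorted sets (scores omitted)
const ckIndex = new Map<string, Set<string>>(); // base -> variant keys (assumed shape)
const counters = new Map<string, number>(); // string counters

function zcard(key: string): number {
  return zsets.get(key)?.size ?? 0;
}

// Returns true when the member is new, like ZADD returning 1.
function zadd(key: string, member: string): boolean {
  const set = zsets.get(key) ?? new Set<string>();
  zsets.set(key, set);
  if (set.has(member)) return false;
  set.add(member);
  return true;
}

function enqueueWithCk(base: string, variant: string, runId: string): number {
  const counterKey = `${base}:lengthCounter`;
  const variantKey = `${base}:ck:${variant}`;

  // Lazy init: seed the counter from the backlog already sitting in known variants.
  if (!counters.has(counterKey)) {
    let seeded = 0;
    ckIndex.get(base)?.forEach((key) => {
      seeded += zcard(key);
    });
    counters.set(counterKey, seeded);
  }

  // Track the variant, then enqueue and count the new entry.
  const variants = ckIndex.get(base) ?? new Set<string>();
  variants.add(variantKey);
  ckIndex.set(base, variants);
  if (zadd(variantKey, runId)) {
    counters.set(counterKey, (counters.get(counterKey) ?? 0) + 1);
  }
  return counters.get(counterKey) ?? 0;
}

// Backlog written before counters existed: 3 runs across 2 variants, no counter key.
const base = "queue:my-task"; // hypothetical base queue key
zadd(`${base}:ck:a`, "run_1");
zadd(`${base}:ck:a`, "run_2");
zadd(`${base}:ck:b`, "run_3");
ckIndex.set(base, new Set([`${base}:ck:a`, `${base}:ck:b`]));

// The first tracked enqueue seeds the counter to 3, then increments it.
const afterFirstTrackedEnqueue = enqueueWithCk(base, "a", "run_4");
```

Seeding before the new entry is added keeps the sum and the increment from double-counting the run being enqueued.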

`INCR`/`DECR` is gated on `ZADD`/`SADD` returning 1 (a new entry vs an
idempotent no-op), so duplicate enqueues or re-dequeues don't inflate
the counter.
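
A small in-memory sketch of that gating, assuming a dequeue moves a run from the queue into a running set and that the counters only move when the set actually gains a member. The `addOnce` helper, the function names, and the exact commands on the dequeue path are illustrative assumptions, not the PR's Lua code.

```typescript
// In-memory stand-ins: one CK variant's queue and its set of running runs.
const queued = new Set<string>();
const running = new Set<string>();
const counters = { length: 0, running: 0 };

// Like ZADD/SADD on a single member: 1 if newly added, 0 if already present.
function addOnce(set: Set<string>, member: string): 0 | 1 {
  if (set.has(member)) return 0;
  set.add(member);
  return 1;
}

function trackedEnqueue(runId: string): void {
  // Only a genuinely new entry moves the counter.
  if (addOnce(queued, runId) === 1) counters.length += 1;
}

function trackedDequeue(runId: string): void {
  queued.delete(runId);
  // Gate on the running set gaining a member, so a repeated dequeue is a no-op.
  if (addOnce(running, runId) === 1) {
    counters.length = Math.max(0, counters.length - 1); // floor at zero
    counters.running += 1;
  }
}

trackedEnqueue("run_1");
trackedEnqueue("run_1"); // duplicate enqueue: the add reports 0, counter unchanged
trackedEnqueue("run_2");
trackedDequeue("run_1");
trackedDequeue("run_1"); // re-dequeue: the add reports 0, counters unchanged
```

After this sequence the model holds one queued run and one running run, matching the set contents despite the duplicate calls.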

The counter is `SET` with a 24-hour TTL on init. `INCR`/`DECR` do not
extend the TTL, so the counter expires daily and the next CK operation
re-seeds it from `ckIndex`. This bounds any drift that accumulates
during the rolling-deploy overlap window — where old (un-Tracked) and
new (Tracked) webapp instances briefly coexist — to ≤24 hours, with no
admin sweep or background reconciler needed.
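
The expiry behaviour can be illustrated with a simulated clock. In this sketch the `TrackedCounter` class and its `apply` method are hypothetical, and `trueBacklog` stands in for the `ckIndex` sum; the only properties taken from the description are the 24-hour TTL set at seeding time and increments that leave the TTL untouched.

```typescript
const TTL_MS = 24 * 60 * 60 * 1000; // the 24-hour TTL from the description

interface CounterEntry {
  value: number;
  expiresAt: number; // set once by the seeding SET, never extended afterwards
}

class TrackedCounter {
  private entry: CounterEntry | undefined = undefined;

  // An expired key behaves exactly like a missing key.
  private live(now: number): CounterEntry | undefined {
    if (this.entry !== undefined && now >= this.entry.expiresAt) {
      this.entry = undefined;
    }
    return this.entry;
  }

  // One CK operation at time `now`. `trueBacklog` stands in for the ckIndex
  // ZCARD sum and is only consulted when the counter has to be (re)seeded.
  apply(now: number, trueBacklog: number, delta: number): number {
    let entry = this.live(now);
    if (entry === undefined) {
      entry = { value: trueBacklog, expiresAt: now + TTL_MS }; // SET with TTL
      this.entry = entry;
    }
    entry.value = Math.max(0, entry.value + delta); // INCR/DECR, TTL untouched
    return entry.value;
  }
}

const HOUR_MS = 60 * 60 * 1000;
const counter = new TrackedCounter();

// t = 0h: no counter yet, so it is seeded from a backlog of 10 and incremented.
const seeded = counter.apply(0, 10, 1);
// t = 23h: the counter is still live, so a true backlog of 15 is not consulted.
const drifted = counter.apply(23 * HOUR_MS, 15, 1);
// t = 24h: the key has expired, so the next operation re-seeds from 20.
const reseeded = counter.apply(24 * HOUR_MS, 20, 1);
```

The operation at 23 hours does not push the expiry out, which is why the drift in this model is corrected at the 24-hour mark rather than later.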

Read paths pipeline `ZCARD`/`SCARD` on the base key + `GET` on the
counter and sum. A missing counter is treated as 0, so pure non-CK
queues see the same answer as before.
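
As a rough sketch of the read-side arithmetic, the function below combines the two pipelined replies. The name `totalQueued` is hypothetical; it assumes the replies arrive the way Redis clients typically return them, with `ZCARD` as a number and `GET` as a string or `null` when the key is missing.

```typescript
// Combine a ZCARD reply for the base key with a GET reply for the counter key.
function totalQueued(baseZcard: number, counterReply: string | null): number {
  // A missing counter contributes 0, so pure non-CK queues are unchanged.
  const ckCount = counterReply === null ? 0 : parseInt(counterReply, 10);
  return baseZcard + ckCount;
}

const nonCkQueue = totalQueued(5, null); // no counter key exists
const ckOnlyQueue = totalQueued(0, "16"); // all runs live under CK variants
const mixedQueue = totalQueued(3, "4"); // CK and non-CK runs on the same base queue
```

The same shape would apply to the running count, with `SCARD` in place of `ZCARD` and `<base>:runningCounter` as the counter key.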

The counter-aware scripts ship alongside the originals with a `Tracked`
suffix for rolling-deploy safety; a follow-up PR will drop the originals
once this has rolled out.

## Test plan

- [ ] `pnpm run test --filter @internal/run-engine` — 116 tests pass,
including a new `ckCounters.test.ts` covering lazy init from
pre-existing backlog, churn, floor-at-zero, the non-CK regression case,
mixed CK + non-CK on the same base queue, idempotent re-enqueue
(ZADD-already-exists), 24h TTL on the counter, and nack re-seeding after
counter expiry.
- [ ] Verified end-to-end against a live local environment:
  - Triggered 24 CK enqueues across 4 variants → `lengthCounter=16`,
    `runningCounter=8`, dashboard showed Queued=16 / Running=8 for the CK
    queue.
  - Set the env queue cap to 16, triggered 12 more enqueues → 8 succeeded,
    4 rejected with `QueueSizeLimitExceededError`.
  - Deleted the counter on a queue with 31 messages already sitting in CK
    variants, triggered one more enqueue → counter materialized to 31 from
    the `ckIndex` sum, then INCR'd.

## Summary

Local ClickHouse was burning ~325% CPU endlessly merging its own
telemetry tables (`metric_log`, `asynchronous_metric_log`, `part_log`,
`trace_log`) after the container had been running long enough to
accumulate hundreds of GB of system-log data. OrbStack Helper reflected
this on the host (~400% CPU).

These tables are not used by anything in the dev stack. They only exist
for ClickHouse to log itself, so disabling them eliminates the merge
churn entirely.

## Changes

- Adds `docker/config/clickhouse-disable-system-logs.xml`, mounted into
`/etc/clickhouse-server/config.d/`, that removes the noisy system log
tables via `<table remove="1"/>`.
- Mounts the override file in `docker/docker-compose.yml`.

After applying the override, idle CPU dropped from ~325% to ~12% on my machine.

## Test plan

- [ ] `pnpm run docker` brings up the stack cleanly
- [ ] `docker stats clickhouse` shows low idle CPU
- [ ] App functionality unaffected (system log tables are not queried by
the webapp)
…mpling (#3567)

## Summary

Follow-up to #3561. The drift-audit workflow timed out on PR #3542 (92
files, +5962 lines) by hitting `--max-turns 15` before reaching a
verdict, leaving a red ❌ on that PR with no sticky comment.

## Changes

- `--max-turns` bumped from 15 to 30.
- Prompt now opens with an explicit "Strategy" section: read REVIEW.md
once, scan the file-list only, open at most 5 files (3-5 on PRs >50
files), and bias toward finishing over exploring.
- Final rule: *"when in doubt between one more file read and finish now
— finish now."*

The audit is allowed to miss things. It is not allowed to time out and
leave a red X.

## Test plan

- [ ] Verify this PR's audit posts `✅ REVIEW.md looks current for this
PR.` (small diff)
- [ ] After merge, retry the audit on #3542 or a similarly large PR and
confirm it completes
@pull pull Bot locked and limited conversation to collaborators May 12, 2026
@pull pull Bot added the ⤵️ pull label May 12, 2026
@pull pull Bot merged commit 6b0e78f into Dustin4444:main May 12, 2026
1 check was pending