feat(agent): bundle rtk with the agent package#3145
Conversation
Vendor a pinned rtk binary into the agent package's dist/rtk/ at build time so RTK output compression uses one consistent version everywhere and is available for cloud runs, instead of depending on rtk being on PATH. - build/rtk-binary.mjs downloads and caches the pinned rtk release for the build target; tsup copies it into dist/rtk/ (best-effort, warns and skips offline). - Cloud: agent-server bin defaults POSTHOG_RTK to the bundled binary when unset. - Desktop: a vite plugin stages dist/rtk/ into .vite/build/rtk/ (asar-unpacked) and the host points POSTHOG_RTK at it, mirroring the Claude CLI binary. The runtime resolver is unchanged and still honors an explicit POSTHOG_RTK path or a 0/false opt-out, falling back to PATH when no bundle is present. Generated-By: PostHog Code Task-Id: ea7c6128-03d7-492c-a1c8-734175f03927
|
Merging to
After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here |
|
React Doctor found no issues in the changed files. 🎉 Reviewed by React Doctor for commit |
|
Address Greptile review on #3145: - Pin the SHA-256 of each rtk release archive alongside RTK_VERSION and verify the download before extraction, so a swapped release asset or a truncated download fails the build instead of being cached and bundled (P1 security). - Move POSTHOG_RTK cleanup to afterEach so teardown runs even if an assertion throws (P2). Generated-By: PostHog Code Task-Id: ea7c6128-03d7-492c-a1c8-734175f03927
|
Reviews (2): Last reviewed commit: "fix(agent): verify rtk archive checksum ..." | Re-trigger Greptile |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0146ce3708
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Address qa-swarm review nits on the rtk bundling: - Document the tar-vs-zip archive-layout assumption and make the post-extraction failure message name the target whose layout may have changed, so a future rtk release that nests the binary fails loudly. - Derive the cached archive filename from the build target rather than re-parsing the download URL. - Clarify the bin.ts comment: any explicit POSTHOG_RTK value wins, not only the 0/false opt-out. - Drop now-redundant per-test POSTHOG_RTK deletes (afterEach covers them). Generated-By: PostHog Code Task-Id: ea7c6128-03d7-492c-a1c8-734175f03927
…ault The desktop auth-adapter unconditionally set POSTHOG_RTK to the bundled rtk path, clobbering an explicit operator override (a custom path, or `0`/`false` to opt out). Guard the assignment with `!process.env.POSTHOG_RTK` so an explicit value wins, matching the cloud bin's default behavior, and add a test covering the opt-out case. Generated-By: PostHog Code Task-Id: ea7c6128-03d7-492c-a1c8-734175f03927
|
Note 🤖 Automated comment by QA Swarm (via PR Shepherd) — not written by a human Multi-perspective review: qa-team (specialists + generalists), paul-reviewer, xp-reviewer, security-audit Verdict: 💬 APPROVE WITH NITS (final @ bc53fab)Nine review rounds across the PR's evolution (rtk bundling → gauge telemetry → desktop reporter → env scrub → runnability probe); every finding is fixed and verified except the one cross-repo item below. CI green (20 pass / 0 fail as of b9c14cc; latest push re-running). Awaiting human sign-off (requested: @joshsny) — stamphog's gates require it for the auth-adapter touch and third-party binary pipeline regardless of review state. Open item (cross-repo, unresolvable here)
Highlights of what the loop fixed
Supply-chain posture (discussed, tracked)Hash-pinned download closes post-pin tampering; agreed follow-up is mirroring rtk release artifacts into PostHog-controlled storage so builds stop fetching third-party infra (per-platform npm packages as the longer-term option). Previous rounds (8)r1 @ 2168f7e — nits on retry shape/teardown (fixed). r2 @ 4e5352d — 2 MED + 5 LOW + 3 NIT; 5 fixed, 3 deferred. r3 @ 20cb39c — convergent HIGH (EXDEV) in r2's own fix; fixed. r4 @ 258e019 — polish only; fixed. r5 @ 4763eb5 — comment accuracy; fixed. r6 @ 5fe748e — MED counter_id lifetime + recipe gaps; fixed. r7 @ 7b77dfd — desktop reporter delta: 2 LOW + 3 NIT (bindings map, sync machineId, init catch); fixed. r8 @ b9c14cc — env scrub + polish delta: clean. Automated by QA Swarm — not a human review |
Read RTK's own per-run savings tally (`rtk gain --format json`) and emit it as a `_posthog/rtk_savings` event through the existing task-run event stream, so we can report how much context RTK output-compression saves. Best-effort and cloud-only: never disrupts a run, and no-ops when RTK is disabled/unavailable or tracked nothing. Stacked on top of the rtk-bundling PR (#3145). Generated-By: PostHog Code Task-Id: ea7c6128-03d7-492c-a1c8-734175f03927
emitRtkSavings called resolveRtkSavings() with process-global defaults, so agent-server tests emitted real machine-wide savings wherever rtk is installed and silently skipped the emission path on CI. Inject the resolver through AgentServerConfig, stub it null in the test server, and add a test that the rtk_savings notification is emitted exactly once, before terminal cleanup stops event ingest. Generated-By: PostHog Code Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
rtk keeps one machine-global cumulative tally shared by every session using the same database, so per-run "savings" only made sense in the ephemeral cloud sandbox. Reshape the event to gauge semantics: the cumulative_* properties are counter reads grouped by counter_id, and consumers difference readings per counter (treating a drop as a reset) instead of summing events. Concurrent sessions reading the shared counter now dedupe under max() rather than double-count, which is what lets a future desktop emit reuse the same shape with the install's device id as counter_id. In the cloud sandbox the counter starts at zero, so a single reading still equals the run's savings. Generated-By: PostHog Code Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
There was a problem hiding this comment.
Gates denied this PR because it touches auth-related code (auth-adapter.ts) and exceeds the size/complexity ceiling for automated approval. Additionally, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human owner to sign off regardless of gate status.
pauldambra
left a comment
There was a problem hiding this comment.
Note
🤖 Automated comment by QA Swarm — not written by a human
QA Swarm round 6 (delta review of 5fe748e, the gauge reshape). One convergent MEDIUM: counter_id = run_id breaks on warm snapshot-resumed sandboxes where the rtk tally survives into a new run_id.
…field counter_id was the run_id, but a warm snapshot-resume restarts the sandbox with its filesystem — and rtk tally — intact under a new run_id, so a resumed run's reading re-counted the prior run's savings under a fresh counter. The task tracks the database's lifetime; a cold resume reads a dropped value, which the reset rule already handles. Also make the zero-baseline rule explicit in the consumption contract (a single reading counts in full) and stop emitting avg_savings_pct — an average cannot be differenced; consumers derive the ratio from the two cumulative token columns. Generated-By: PostHog Code Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
There was a problem hiding this comment.
The PR matches the auth deny-list (touches auth-adapter.ts) and exceeds the size/complexity ceiling. Beyond the gate failures, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human owner to sign off regardless of how thoroughly the inline review comments were addressed.
Desktop sessions live for days, so a quit-time emit would never fire; instead a core RtkSavingsReporter reads the bundled rtk's cumulative tally at app start and once a day while running, and captures it as the "RTK savings gauge" analytics event in the same cumulative_*/counter_id shape the cloud agent emits. A reading missed at shutdown is never lost — the counter is cumulative and the next start reads it — and unchanged readings are skipped so idle installs emit nothing. The counter id hashes machine id + OS username, since rtk's database is per OS user. The bundled-rtk stage dir and binary name move into shared consts on the rtk-savings module (now exported as @posthog/agent/server/ rtk-savings), replacing the "keep in sync" comment duplication in the workspace-server resolver and bin.ts. Generated-By: PostHog Code Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
There was a problem hiding this comment.
Gates denied this PR because it touches auth-related code (auth-adapter.ts) and exceeds the size/complexity ceiling for auto-approval. Beyond the gate failures, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human owner to sign off regardless of how thoroughly inline review comments were addressed.
Every analytics event in the app pairs its ANALYTICS_EVENTS constant with a *Properties interface and an EventPropertyMap entry (that map backs the renderer's typed track helper, and main-process events like CLOUD_STREAM_DISCONNECTED register too). RTK savings gauge was the one event missing both — add RtkSavingsGaugeProperties, the map entry, and a satisfies check at the reporter's call site so the emitted bag cannot drift from the declared shape. Generated-By: PostHog Code Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
Register the reporter token in MainBindings so the eager typed get is compile-checked; use async machineId() instead of spawning a sync child process on the Electron main thread; guard the reporter's ready-then- schedule path against dispose-before-ready and rejections; correct the counterId lifetime doc to "per machine and OS user". Generated-By: PostHog Code Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
There was a problem hiding this comment.
Gates denied this PR: it touches auth-related code (auth-adapter.ts) and exceeds the size/complexity ceiling. Beyond gate failures, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human team member to sign off regardless of how thoroughly inline review comments were addressed.
There was a problem hiding this comment.
Gates denied this PR because it touches auth-related code (auth-adapter.ts) and exceeds the size/complexity ceiling. Beyond the gate failures, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human team member to sign off regardless of how thoroughly the inline review comments were addressed.
rtk gain only reads rtk's own stats database, but received the full process env — API keys and tokens included — on every telemetry read. Allowlist just the variables it needs to locate its data dir and spawn per platform. The rewrite-hook path is untouched: there rtk runs the user's own commands, which genuinely need the real env. Generated-By: PostHog Code Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
There was a problem hiding this comment.
This PR touches auth-related code (auth-adapter.ts), exceeds the size/complexity ceiling, and introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a significant security surface change that requires a human team member to sign off regardless of how thoroughly the inline review comments were addressed.
A bundled rtk can exist but be unexecutable — wrong arch/libc for the image, or a mac build whose ad-hoc signing failed — and rewriting every eligible command through it would fail them all, strictly worse than having no rtk. The rewrite hook now probes `rtk --version` lazily on the first command it would rewrite, memoized per session: a failed probe logs once and downgrades the session to no-rewrite. Both hosts flow through this one seam, so neither adoption site (bin.ts, auth-adapter) needs its own check, and `rtk gain` already tolerates a broken binary. Generated-By: PostHog Code Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
There was a problem hiding this comment.
Gates denied this PR because it touches auth-related code (auth-adapter.ts) and exceeds the size/complexity ceiling for auto-approval. Beyond the gate failures, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human team member to sign off.
|
Reviews (3): Last reviewed commit: "fix(agent): probe rtk before rewriting c..." | Re-trigger Greptile |
|
I explicitly didn't do this since I don't think bundling |
Problem
Follow-up to #3096, which added RTK command-output compression to the agent but only activated when
rtkhappened to be onPATH. That leaves cloud runs (and any machine without rtk installed) getting no compression, and different environments potentially running different rtk versions. This packages a pinned rtk with the agent so the token savings apply consistently everywhere — including cloud runs — as discussed in the thread on #3096.Refs #1076
Changes
packages/agent/build/rtk-binary.mjs— pins the rtk version and downloads the matching GitHub release for the build target, caching it undernode_modules/.cache. The agent'stsupbuild copies it intodist/rtk/. Best-effort: an offline build warns and skips rather than failing, and the runtime then falls back toPATH.agent-serverbin defaultsPOSTHOG_RTKto the bundled binary (itsdist/rtk/sibling) when the env var is unset.dist/rtk/into.vite/build/rtk/(asar-unpacked), and the host pointsPOSTHOG_RTKat it, mirroring how the Claude CLI binary is bundled. Single source of truth keeps the version consistent across hosts.The runtime resolver added in #3096 is unchanged: it still honors an explicit
POSTHOG_RTKpath or a0/falseopt-out, and no bundle means it falls back toPATH(zero behavior change).How did you test this?
pnpm --filter @posthog/agent build— confirmed it downloads and vendorsdist/rtk/rtk(rtk 0.43.0), executable, at the path the cloud bin resolves (dist/server/../rtk/rtk).pnpm --filter @posthog/agent typecheck,@posthog/workspace-server typecheck,code typecheck— all clean.pnpm --filter @posthog/agent test rtk(38 pass) and theauth-adaptersuite, including a new test covering thePOSTHOG_RTKwiring (7 pass).Automatic notifications
Created with PostHog from a Slack thread