Skip to content

feat(agent): bundle rtk with the agent package#3145

Open
pauldambra wants to merge 19 commits into
mainfrom
posthog-code/bundle-rtk-with-agent
Open

feat(agent): bundle rtk with the agent package#3145
pauldambra wants to merge 19 commits into
mainfrom
posthog-code/bundle-rtk-with-agent

Conversation

@pauldambra

Copy link
Copy Markdown
Member

Problem

Follow-up to #3096, which added RTK command-output compression to the agent but only activated when rtk happened to be on PATH. That leaves cloud runs (and any machine without rtk installed) getting no compression, and different environments potentially running different rtk versions. This packages a pinned rtk with the agent so the token savings apply consistently everywhere — including cloud runs — as discussed in the thread on #3096.

Refs #1076

Changes

  • packages/agent/build/rtk-binary.mjs — pins the rtk version and downloads the matching GitHub release for the build target, caching it under node_modules/.cache. The agent's tsup build copies it into dist/rtk/. Best-effort: an offline build warns and skips rather than failing, and the runtime then falls back to PATH.
  • Cloud runs — the agent-server bin defaults POSTHOG_RTK to the bundled binary (its dist/rtk/ sibling) when the env var is unset.
  • Desktop — a vite plugin stages dist/rtk/ into .vite/build/rtk/ (asar-unpacked), and the host points POSTHOG_RTK at it, mirroring how the Claude CLI binary is bundled. Single source of truth keeps the version consistent across hosts.

The runtime resolver added in #3096 is unchanged: it still honors an explicit POSTHOG_RTK path or a 0/false opt-out, and no bundle means it falls back to PATH (zero behavior change).

How did you test this?

  • pnpm --filter @posthog/agent build — confirmed it downloads and vendors dist/rtk/rtk (rtk 0.43.0), executable, at the path the cloud bin resolves (dist/server/../rtk/rtk).
  • Verified the target mapping resolves the right release triple for linux/darwin/win x64+arm64 (win-arm64 has no asset and skips cleanly).
  • pnpm --filter @posthog/agent typecheck, @posthog/workspace-server typecheck, code typecheck — all clean.
  • pnpm --filter @posthog/agent test rtk (38 pass) and the auth-adapter suite, including a new test covering the POSTHOG_RTK wiring (7 pass).
  • Biome clean on all changed files; host-boundary check reports no new violations.

Automatic notifications

  • Publish to changelog?
  • Alert Sales and Marketing teams?

Created with PostHog from a Slack thread

Vendor a pinned rtk binary into the agent package's dist/rtk/ at build time so
RTK output compression uses one consistent version everywhere and is available
for cloud runs, instead of depending on rtk being on PATH.

- build/rtk-binary.mjs downloads and caches the pinned rtk release for the build
  target; tsup copies it into dist/rtk/ (best-effort, warns and skips offline).
- Cloud: agent-server bin defaults POSTHOG_RTK to the bundled binary when unset.
- Desktop: a vite plugin stages dist/rtk/ into .vite/build/rtk/ (asar-unpacked)
  and the host points POSTHOG_RTK at it, mirroring the Claude CLI binary.

The runtime resolver is unchanged and still honors an explicit POSTHOG_RTK path
or a 0/false opt-out, falling back to PATH when no bundle is present.

Generated-By: PostHog Code
Task-Id: ea7c6128-03d7-492c-a1c8-734175f03927
@trunk-io

trunk-io Bot commented Jul 4, 2026

Copy link
Copy Markdown

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here

@github-actions

github-actions Bot commented Jul 4, 2026

Copy link
Copy Markdown

React Doctor found no issues in the changed files. 🎉

Reviewed by React Doctor for commit bc53fab.

@greptile-apps

greptile-apps Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Security Review

  • No checksum verification on downloaded RTK binary (packages/agent/build/rtk-binary.mjs): The archive downloaded from github.com/rtk-ai/rtk/releases is not verified against a published SHA-256 or checksums file before extraction. The cached binary is subsequently bundled and executed as POSTHOG_RTK to compress agent command output. A corrupted download (e.g. truncated stream, disk-level error) or a compromised upstream release asset would be cached indefinitely and executed with agent privileges. Adding a checksum step — fetching and verifying the companion .sha256 file RTK releases ship, or pinning a known hash in the build script — would close this gap.

Reviews (1): Last reviewed commit: "feat(agent): bundle rtk with the agent p..." | Re-trigger Greptile

Comment thread packages/agent/build/rtk-binary.mjs
Comment thread packages/workspace-server/src/services/agent/auth-adapter.test.ts
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026
Address Greptile review on #3145:

- Pin the SHA-256 of each rtk release archive alongside RTK_VERSION and verify
  the download before extraction, so a swapped release asset or a truncated
  download fails the build instead of being cached and bundled (P1 security).
- Move POSTHOG_RTK cleanup to afterEach so teardown runs even if an assertion
  throws (P2).

Generated-By: PostHog Code
Task-Id: ea7c6128-03d7-492c-a1c8-734175f03927
@pauldambra pauldambra marked this pull request as ready for review July 4, 2026 12:56
stamphog[bot]

This comment was marked as outdated.

@stamphog stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026
@greptile-apps

greptile-apps Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Reviews (2): Last reviewed commit: "fix(agent): verify rtk archive checksum ..." | Re-trigger Greptile

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0146ce3708

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agent/build/rtk-binary.mjs
Comment thread packages/workspace-server/src/services/agent/auth-adapter.ts Outdated
Address qa-swarm review nits on the rtk bundling:

- Document the tar-vs-zip archive-layout assumption and make the
  post-extraction failure message name the target whose layout may have
  changed, so a future rtk release that nests the binary fails loudly.
- Derive the cached archive filename from the build target rather than
  re-parsing the download URL.
- Clarify the bin.ts comment: any explicit POSTHOG_RTK value wins, not only
  the 0/false opt-out.
- Drop now-redundant per-test POSTHOG_RTK deletes (afterEach covers them).

Generated-By: PostHog Code
Task-Id: ea7c6128-03d7-492c-a1c8-734175f03927
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026 — with PostHog
stamphog[bot]

This comment was marked as outdated.

@stamphog stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026
…ault

The desktop auth-adapter unconditionally set POSTHOG_RTK to the bundled rtk
path, clobbering an explicit operator override (a custom path, or `0`/`false`
to opt out). Guard the assignment with `!process.env.POSTHOG_RTK` so an explicit
value wins, matching the cloud bin's default behavior, and add a test covering
the opt-out case.

Generated-By: PostHog Code
Task-Id: ea7c6128-03d7-492c-a1c8-734175f03927
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026 — with PostHog

pauldambra commented Jul 4, 2026

Copy link
Copy Markdown
Member Author

Note

🤖 Automated comment by QA Swarm (via PR Shepherd) — not written by a human

Multi-perspective review: qa-team (specialists + generalists), paul-reviewer, xp-reviewer, security-audit

Verdict: 💬 APPROVE WITH NITS (final @ bc53fab)

Nine review rounds across the PR's evolution (rtk bundling → gauge telemetry → desktop reporter → env scrub → runnability probe); every finding is fixed and verified except the one cross-repo item below. CI green (20 pass / 0 fail as of b9c14cc; latest push re-running). Awaiting human sign-off (requested: @joshsny) — stamphog's gates require it for the auth-adapter touch and third-party binary pipeline regardless of review state.

Open item (cross-repo, unresolvable here)

  • 🟡 Cloud consumer for _posthog/rtk_savings — verified against posthog master: the tasks backend stores the notification but never captures it as an analytics event. Follow-up in posthog: map it to the "RTK savings gauge" event (RtkSavingsGaugeProperties in @posthog/shared), consumed as a per-counter diff, never a sum.

Highlights of what the loop fixed

  • 🟠 EXDEV cross-device rename introduced by an earlier fix — extraction now lives wholly in a per-process temp dir beside the cache.
  • 🟡 counter_id re-keyed run→task after spotting warm snapshot-resumes preserve the rtk tally across run_ids.
  • 🟡 Present-but-unrunnable binary hazard — the rewrite hook now probes rtk --version once per session and downgrades to no-rewrite on failure.
  • 🔒 rtk gain env scrubbed to a platform-essentials allowlist (no more tokens/keys reaching the third-party binary); telemetry reshaped to gauge semantics (cumulative counter reads, diff-not-sum contract); desktop reports daily via the typed analytics event map.

Supply-chain posture (discussed, tracked)

Hash-pinned download closes post-pin tampering; agreed follow-up is mirroring rtk release artifacts into PostHog-controlled storage so builds stop fetching third-party infra (per-platform npm packages as the longer-term option).

Previous rounds (8)

r1 @ 2168f7e — nits on retry shape/teardown (fixed). r2 @ 4e5352d — 2 MED + 5 LOW + 3 NIT; 5 fixed, 3 deferred. r3 @ 20cb39c — convergent HIGH (EXDEV) in r2's own fix; fixed. r4 @ 258e019 — polish only; fixed. r5 @ 4763eb5 — comment accuracy; fixed. r6 @ 5fe748e — MED counter_id lifetime + recipe gaps; fixed. r7 @ 7b77dfd — desktop reporter delta: 2 LOW + 3 NIT (bindings map, sync machineId, init catch); fixed. r8 @ b9c14cc — env scrub + polish delta: clean.


Automated by QA Swarm — not a human review

stamphog[bot]

This comment was marked as outdated.

@stamphog stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026 — with PostHog
@pauldambra pauldambra enabled auto-merge (squash) July 4, 2026 15:20
stamphog[bot]

This comment was marked as outdated.

@stamphog stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026
pauldambra added a commit that referenced this pull request Jul 4, 2026
Read RTK's own per-run savings tally (`rtk gain --format json`) and emit it as
a `_posthog/rtk_savings` event through the existing task-run event stream, so we
can report how much context RTK output-compression saves. Best-effort and
cloud-only: never disrupts a run, and no-ops when RTK is disabled/unavailable or
tracked nothing.

Stacked on top of the rtk-bundling PR (#3145).

Generated-By: PostHog Code
Task-Id: ea7c6128-03d7-492c-a1c8-734175f03927
@pauldambra pauldambra added the Create Release This will trigger a new release label Jul 4, 2026 — with PostHog
emitRtkSavings called resolveRtkSavings() with process-global defaults,
so agent-server tests emitted real machine-wide savings wherever rtk is
installed and silently skipped the emission path on CI. Inject the
resolver through AgentServerConfig, stub it null in the test server, and
add a test that the rtk_savings notification is emitted exactly once,
before terminal cleanup stops event ingest.

Generated-By: PostHog Code
Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
rtk keeps one machine-global cumulative tally shared by every session
using the same database, so per-run "savings" only made sense in the
ephemeral cloud sandbox. Reshape the event to gauge semantics: the
cumulative_* properties are counter reads grouped by counter_id, and
consumers difference readings per counter (treating a drop as a reset)
instead of summing events. Concurrent sessions reading the shared
counter now dedupe under max() rather than double-count, which is what
lets a future desktop emit reuse the same shape with the install's
device id as counter_id. In the cloud sandbox the counter starts at
zero, so a single reading still equals the run's savings.

Generated-By: PostHog Code
Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026

@stamphog stamphog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gates denied this PR because it touches auth-related code (auth-adapter.ts) and exceeds the size/complexity ceiling for automated approval. Additionally, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human owner to sign off regardless of gate status.

@stamphog stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026

@pauldambra pauldambra left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

QA Swarm round 6 (delta review of 5fe748e, the gauge reshape). One convergent MEDIUM: counter_id = run_id breaks on warm snapshot-resumed sandboxes where the rtk tally survives into a new run_id.

Comment thread packages/agent/src/server/agent-server.ts Outdated
Comment thread packages/agent/src/server/agent-server.ts Outdated
Comment thread packages/agent/src/server/agent-server.ts Outdated
Comment thread packages/agent/src/server/agent-server.test.ts
…field

counter_id was the run_id, but a warm snapshot-resume restarts the
sandbox with its filesystem — and rtk tally — intact under a new
run_id, so a resumed run's reading re-counted the prior run's savings
under a fresh counter. The task tracks the database's lifetime; a cold
resume reads a dropped value, which the reset rule already handles.
Also make the zero-baseline rule explicit in the consumption contract
(a single reading counts in full) and stop emitting avg_savings_pct —
an average cannot be differenced; consumers derive the ratio from the
two cumulative token columns.

Generated-By: PostHog Code
Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026

@stamphog stamphog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR matches the auth deny-list (touches auth-adapter.ts) and exceeds the size/complexity ceiling. Beyond the gate failures, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human owner to sign off regardless of how thoroughly the inline review comments were addressed.

@stamphog stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026
Desktop sessions live for days, so a quit-time emit would never fire;
instead a core RtkSavingsReporter reads the bundled rtk's cumulative
tally at app start and once a day while running, and captures it as the
"RTK savings gauge" analytics event in the same cumulative_*/counter_id
shape the cloud agent emits. A reading missed at shutdown is never lost
— the counter is cumulative and the next start reads it — and unchanged
readings are skipped so idle installs emit nothing. The counter id
hashes machine id + OS username, since rtk's database is per OS user.

The bundled-rtk stage dir and binary name move into shared consts on
the rtk-savings module (now exported as @posthog/agent/server/
rtk-savings), replacing the "keep in sync" comment duplication in the
workspace-server resolver and bin.ts.

Generated-By: PostHog Code
Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026

@stamphog stamphog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gates denied this PR because it touches auth-related code (auth-adapter.ts) and exceeds the size/complexity ceiling for auto-approval. Beyond the gate failures, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human owner to sign off regardless of how thoroughly inline review comments were addressed.

@stamphog stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026
Every analytics event in the app pairs its ANALYTICS_EVENTS constant
with a *Properties interface and an EventPropertyMap entry (that map
backs the renderer's typed track helper, and main-process events like
CLOUD_STREAM_DISCONNECTED register too). RTK savings gauge was the one
event missing both — add RtkSavingsGaugeProperties, the map entry, and
a satisfies check at the reporter's call site so the emitted bag cannot
drift from the declared shape.

Generated-By: PostHog Code
Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
Register the reporter token in MainBindings so the eager typed get is
compile-checked; use async machineId() instead of spawning a sync child
process on the Electron main thread; guard the reporter's ready-then-
schedule path against dispose-before-ready and rejections; correct the
counterId lifetime doc to "per machine and OS user".

Generated-By: PostHog Code
Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026

@stamphog stamphog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gates denied this PR: it touches auth-related code (auth-adapter.ts) and exceeds the size/complexity ceiling. Beyond gate failures, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human team member to sign off regardless of how thoroughly inline review comments were addressed.

@stamphog stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026

@stamphog stamphog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gates denied this PR because it touches auth-related code (auth-adapter.ts) and exceeds the size/complexity ceiling. Beyond the gate failures, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human team member to sign off regardless of how thoroughly the inline review comments were addressed.

rtk gain only reads rtk's own stats database, but received the full
process env — API keys and tokens included — on every telemetry read.
Allowlist just the variables it needs to locate its data dir and spawn
per platform. The rewrite-hook path is untouched: there rtk runs the
user's own commands, which genuinely need the real env.

Generated-By: PostHog Code
Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026

@stamphog stamphog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR touches auth-related code (auth-adapter.ts), exceeds the size/complexity ceiling, and introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a significant security surface change that requires a human team member to sign off regardless of how thoroughly the inline review comments were addressed.

@stamphog stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026
@pauldambra pauldambra requested a review from joshsny July 4, 2026 22:28
A bundled rtk can exist but be unexecutable — wrong arch/libc for the
image, or a mac build whose ad-hoc signing failed — and rewriting every
eligible command through it would fail them all, strictly worse than
having no rtk. The rewrite hook now probes `rtk --version` lazily on
the first command it would rewrite, memoized per session: a failed
probe logs once and downgrades the session to no-rewrite. Both hosts
flow through this one seam, so neither adoption site (bin.ts,
auth-adapter) needs its own check, and `rtk gain` already tolerates a
broken binary.

Generated-By: PostHog Code
Task-Id: b86391cc-4634-4a2b-ae34-fe1f9c0626a0
@pauldambra pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026

@stamphog stamphog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gates denied this PR because it touches auth-related code (auth-adapter.ts) and exceeds the size/complexity ceiling for auto-approval. Beyond the gate failures, this PR introduces a third-party binary download-and-bundle pipeline wired into the agent execution environment — a meaningful security surface change that requires a human team member to sign off.

@stamphog stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jul 4, 2026
@greptile-apps

greptile-apps Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Reviews (3): Last reviewed commit: "fix(agent): probe rtk before rewriting c..." | Re-trigger Greptile

@gantoine

gantoine commented Jul 4, 2026

Copy link
Copy Markdown
Member

I explicitly didn't do this since I don't think bundling rtk is the right choice/should be a concern of posthog code, but I'll leave that up to the code team to make that product call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Create Release This will trigger a new release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants