Commit 4aa50e1

feat: improve feature-video skill with GitHub native video upload (#344)
1 parent 423e692 commit 4aa50e1

3 files changed

Lines changed: 497 additions & 178 deletions

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
---
title: "Persistent GitHub authentication for agent-browser using named sessions"
category: integrations
date: 2026-03-22
tags:
  - agent-browser
  - github
  - authentication
  - chrome
  - session-persistence
  - lightpanda
related_to:
  - plugins/compound-engineering/skills/feature-video/SKILL.md
  - plugins/compound-engineering/skills/agent-browser/SKILL.md
  - plugins/compound-engineering/skills/agent-browser/references/authentication.md
  - plugins/compound-engineering/skills/agent-browser/references/session-management.md
---

# agent-browser Chrome Authentication for GitHub

## Problem

agent-browser needs authenticated access to GitHub for workflows like the native video
upload in the feature-video skill. Multiple authentication approaches were evaluated
before finding one that works reliably with 2FA, SSO, and OAuth.

## Investigation

| Approach | Result |
|---|---|
| `--profile` flag | Lightpanda (default engine on some installs) throws "Profiles are not supported with Lightpanda". Must use `--engine chrome`. |
| Fresh Chrome profile | No GitHub cookies. Shows "Sign up for free" instead of comment form. |
| `--auto-connect` | Requires Chrome pre-launched with `--remote-debugging-port`. Error: "No running Chrome instance found" in normal use. Impractical. |
| Auth vault (`auth save`/`auth login`) | Cannot handle 2FA, SSO, or OAuth redirects. Only works for simple username/password forms. |
| `--session-name` with Chrome engine | Cookies auto-save/restore. One-time headed login handles any auth method. **This works.** |

## Working Solution

### One-time setup (headed, user logs in manually)

```bash
# Close any running daemon first; a reused daemon ignores engine/option changes
agent-browser close

# Open GitHub login in headed Chrome with a named session
agent-browser --engine chrome --headed --session-name github open https://github.com/login
# User logs in manually -- handles 2FA, SSO, OAuth, any method

# Verify auth
agent-browser open https://github.com/settings/profile
# If profile page loads, auth is confirmed
```

### Session validity check (before each workflow)

```bash
agent-browser close
agent-browser --engine chrome --session-name github open https://github.com/settings/profile
agent-browser get title
# Title contains username or "Profile" -> session valid, proceed
# Title contains "Sign in" or URL is github.com/login -> session expired, re-auth
```
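
The decision rule in the comments above can be sketched as a small shell function. This is a minimal sketch, not part of agent-browser: `classify_github_session` is a hypothetical helper name, and the title and URL patterns are assumptions taken from the comments above rather than a stable GitHub contract.

```bash
# Hypothetical helper: classify a GitHub session from a page title and URL.
# Prints "valid", "expired", or "unknown" (when neither rule matches).
classify_github_session() {
  local title="$1" url="$2" username="${3:-}"

  # A redirect to the login page means the session cookies were rejected.
  case "$url" in
    https://github.com/login*) echo "expired"; return 0 ;;
  esac

  case "$title" in
    *"Sign in"*) echo "expired"; return 0 ;;
    *"Profile"*) echo "valid"; return 0 ;;
  esac

  # Optional third argument: treat a title containing the username as valid.
  if [ -n "$username" ] && [ "${title#*"$username"}" != "$title" ]; then
    echo "valid"
  else
    echo "unknown"
  fi
}
```

A caller could pass the output of `agent-browser get title` as the first argument; how the current URL is obtained is left to the caller, and an empty string is acceptable. Treating "unknown" the same as "expired" is the safer default before an upload.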
63+
64+
### All subsequent runs (headless, cookies persist)
65+
66+
```bash
67+
agent-browser --engine chrome --session-name github open https://github.com/...
68+
```
69+
70+
## Key Findings
71+
72+
### Engine requirement
73+
74+
MUST use `--engine chrome`. Lightpanda does not support profiles, session persistence,
75+
or state files. Any workflow that uses `--session-name`, `--profile`, `--state`, or
76+
`state save/load` requires the Chrome engine.
77+
78+
Include `--engine chrome` explicitly in every command that uses an authenticated session.
79+
Do not rely on environment defaults -- `AGENT_BROWSER_ENGINE` may be set to `lightpanda`
80+
in some environments.
81+
82+
### Daemon restart
83+
84+
Must run `agent-browser close` before switching engine or session options. A running
85+
daemon ignores new flags like `--engine`, `--headed`, or `--session-name`.

### Session lifetime

Cookies expire when GitHub invalidates them (typically weeks). Periodic re-authentication
is required. The feature-video skill handles this by checking session validity before
the upload step and prompting for re-auth only when needed.

### Auth vault limitations

The auth vault (`agent-browser auth save`/`auth login`) can only handle login forms with
visible username and password fields. It cannot handle:

- 2FA (TOTP, SMS, push notification)
- SSO with identity provider redirect
- OAuth consent flows
- CAPTCHA
- Device verification prompts

For GitHub and most modern services, use the one-time headed login approach instead.

### `--auto-connect` viability

Impractical for automated workflows. Requires Chrome to be pre-launched with
`--remote-debugging-port=9222`, which is not how users normally run Chrome.

## Prevention

### Skills requiring auth must declare engine

State the engine requirement in the Prerequisites section of any skill that needs
browser auth. Include `--engine chrome` in every `agent-browser` command that touches
an authenticated session.

### Session check timing

Perform the session check immediately before the step that needs auth, not at skill
start. A session valid at start may expire during a long workflow (video encoding can
take minutes).

### Recovery without restart

When expiry is detected at upload time, the video file is already encoded. Recovery:
re-authenticate, then retry only the upload step. Do not restart from the beginning.

### Concurrent sessions

Use `--session-name` with a semantically descriptive name (e.g., `github`) when multiple
skills or agents may run concurrently. Two concurrent runs sharing the default session
will interfere with each other.

### State file security

Session state files in `~/.agent-browser/sessions/` contain cookies in plaintext.
Do not commit to repositories. Add to `.gitignore` if the session directory is inside
a repo tree.
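
One way to apply the `.gitignore` advice without adding duplicate entries is sketched below. `ignore_session_dir` is a hypothetical helper, and the `.agent-browser/` pattern used in the test is an assumption that only applies when the session directory sits under the repository root.

```bash
# Hypothetical helper: add a pattern to a .gitignore file only if it is missing.
ignore_session_dir() {
  local gitignore="$1" pattern="$2"
  touch "$gitignore"

  # -x: match the whole line, -F: treat the pattern as a literal string.
  if ! grep -qxF -- "$pattern" "$gitignore"; then
    # If the file is non-empty and lacks a trailing newline, add one first
    # so the new pattern does not get glued onto the last existing line.
    if [ -s "$gitignore" ] && [ -n "$(tail -c 1 "$gitignore")" ]; then
      printf '\n' >> "$gitignore"
    fi
    printf '%s\n' "$pattern" >> "$gitignore"
  fi
}
```

Running it twice with the same arguments leaves a single entry, so it is safe to call from a skill's setup step on every run.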

## Integration Points

This pattern is used by:
- `feature-video` skill (GitHub native video upload)
- Any future skill requiring authenticated GitHub browser access
- Potential use for other OAuth-protected services (same pattern, different session name)
Lines changed: 141 additions & 0 deletions
@@ -0,0 +1,141 @@
---
title: "GitHub inline video embedding via programmatic browser upload"
category: integrations
date: 2026-03-22
tags:
  - github
  - video-embedding
  - agent-browser
  - playwright
  - feature-video
  - pr-description
related_to:
  - plugins/compound-engineering/skills/feature-video/SKILL.md
  - plugins/compound-engineering/skills/agent-browser/SKILL.md
  - plugins/compound-engineering/skills/agent-browser/references/authentication.md
---

# GitHub Native Video Upload for PRs

## Problem

Embedding video demos in GitHub PR descriptions required external storage (R2/rclone)
or GitHub Release assets. Release asset URLs render as plain download links, not inline
video players. Only `user-attachments/assets/` URLs render with GitHub's native inline
video player -- the same result as pasting a video into the PR editor manually.

The distinction is absolute:

| URL namespace | Rendering |
|---|---|
| `github.com/releases/download/...` | Plain download link (bad UX, triggers download on mobile) |
| `github.com/user-attachments/assets/...` | Native inline `<video>` player with controls |

## Investigation

1. **Public upload API** -- No public API exists. The `/upload/policies/assets` endpoint
   requires browser session cookies and is not exposed via REST or GraphQL. GitHub CLI
   (`gh`) has no support; issues cli/cli#1895, #4228, and #4465 are all closed as
   "not planned". GitHub keeps this private to limit abuse surface (malware hosting,
   spam CDN, DMCA liability).

2. **Release asset approach (Strategy B)** -- URLs render as download links, not video
   players. Clickable GIF previews trigger downloads on mobile. Unacceptable UX.

3. **Claude-in-Chrome JavaScript injection with base64** -- Blocked by CSP/mixed-content
   policy. HTTPS github.com cannot fetch from HTTP localhost. Base64 chunking is possible
   but does not scale for larger videos.

4. **`tonkotsuboy/github-upload-image-to-pr`** -- Open-source reference confirming
   browser automation is the only working approach for producing native URLs.

5. **agent-browser `upload` command** -- Works. Playwright sets files directly on hidden
   file inputs without base64 encoding or fetch requests. CSP is not a factor because
   Playwright's `setInputFiles` operates at the browser engine level, not via JavaScript.

## Working Solution

### Upload flow

```bash
# Navigate to PR page (authenticated Chrome session)
agent-browser --engine chrome --session-name github \
  open "https://github.com/[owner]/[repo]/pull/[number]"
agent-browser scroll down 5000

# Upload video via the hidden file input
agent-browser upload '#fc-new_comment_field' tmp/videos/feature-demo.mp4

# Wait for GitHub to process the upload (typically 3-5 seconds)
agent-browser wait 5000

# Extract the URL GitHub injected into the textarea
agent-browser eval "document.getElementById('new_comment_field').value"
# Returns: https://github.com/user-attachments/assets/[uuid]

# Clear the textarea without submitting (upload already persisted server-side)
agent-browser eval "const ta = document.getElementById('new_comment_field'); \
  ta.value = ''; ta.dispatchEvent(new Event('input', { bubbles: true }))"

# Embed in PR description (URL on its own line renders as inline video player)
gh pr edit [number] --body "[body with video URL on its own line]"
```
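
The last step needs the video URL on a line of its own. A minimal sketch of building such a body is shown here; `append_video_to_body` is a hypothetical helper, and fetching the existing body with `gh pr view` is an assumption about how a skill would obtain it.

```bash
# Hypothetical helper: append a video URL to a PR body, separated by a blank
# line so the URL sits alone on its own line.
append_video_to_body() {
  local body="$1" url="$2"
  printf '%s\n\n%s\n' "$body" "$url"
}

# Possible usage (not executed here; the PR number 123 is illustrative):
#   current="$(gh pr view 123 --json body --jq .body)"
#   gh pr edit 123 --body "$(append_video_to_body "$current" "$video_url")"
```

The blank line before the URL keeps it from being merged into the preceding paragraph when the markdown is rendered.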

### Key selectors (validated March 2026)

| Selector | Element | Purpose |
|---|---|---|
| `#fc-new_comment_field` | Hidden `<input type="file">` | Target for `agent-browser upload`. Accepts `.mp4`, `.mov`, `.webm` and many other types. |
| `#new_comment_field` | `<textarea>` | GitHub injects the `user-attachments/assets/` URL here after processing the upload. |

GitHub's comment form contains the hidden file input. After Playwright sets the file,
GitHub uploads it server-side and injects a markdown URL into the textarea. The upload
is persisted even if the form is never submitted.

## What Was Removed

The following approaches were removed from the feature-video skill:

- R2/rclone setup and configuration
- Release asset upload flow (`gh release upload`)
- GIF preview generation (unnecessary with native inline video player)
- Strategy B fallback logic

Total: approximately 100 lines of SKILL.md content removed. The skill is now simpler
and has zero external storage dependencies.

## Prevention

### URL validation

After any upload step, confirm the extracted URL contains `user-attachments/assets/`
before writing it into the PR description. If the URL does not match, the upload failed
or used the wrong method.
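
The check above can be sketched as a function that pulls the native URL out of whatever the textarea returned. `extract_attachment_url` is a hypothetical helper, and the assumption that the asset identifier consists only of letters, digits, and hyphens follows from the `[uuid]` placeholder used earlier, not from any GitHub documentation.

```bash
# Hypothetical helper: print the first user-attachments URL found in the text,
# or nothing if the text contains no such URL.
extract_attachment_url() {
  printf '%s\n' "$1" \
    | grep -oE 'https://github\.com/user-attachments/assets/[A-Za-z0-9-]+' \
    | head -n 1
}
```

Because it searches inside the text rather than comparing the whole string, the same function works whether the textarea holds a bare URL or a URL wrapped in markup. An empty result means the upload should be treated as failed.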
114+
115+
### Upload failure handling
116+
117+
If the textarea is empty after the wait, check:
118+
1. Session validity (did GitHub redirect to login?)
119+
2. Wait time (processing can be slow under load -- retry after 3-5 more seconds)
120+
3. File size (10MB free, 100MB paid accounts)
121+
122+
Do not silently substitute a release asset URL. Report the failure and offer to retry.
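
The "retry after a few more seconds" advice can be sketched as a bounded polling loop instead of a single fixed wait. `poll_nonempty` is a hypothetical helper that runs any command until it prints something; it knows nothing about agent-browser, and the attempt count and delay are assumptions to tune.

```bash
# Hypothetical helper: run a command up to N times, sleeping between attempts,
# until it prints non-empty output. Prints that output and returns 0 on
# success; returns 1 if every attempt came back empty.
poll_nonempty() {
  local attempts="$1" delay="$2"
  shift 2
  local out i
  for ((i = 1; i <= attempts; i++)); do
    out="$("$@" 2>/dev/null)" || out=""
    if [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
    # Skip the sleep after the final attempt.
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  return 1
}
```

The polled command should print nothing while the upload is still processing. If the command used to read the textarea prints an empty value as a quoted string, wrap it in a function that filters for the expected URL first. A non-zero return is the point at which to report the failure and offer a retry.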

### DOM selector fragility

`#fc-new_comment_field` and `#new_comment_field` are GitHub's internal element IDs and
may change in future UI updates. If the upload stops working, snapshot the PR page and
inspect the current comment form structure for updated selectors.

### Size limits

- Free accounts: 10MB per file
- Paid (Pro, Team, Enterprise): 100MB per file

Check file size before attempting upload. Re-encode at lower quality if needed.
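
A pre-upload size check could look like the sketch below. `within_upload_limit` is a hypothetical helper, and counting 1 MB as 1,000,000 bytes is a deliberately conservative assumption, since the exact byte threshold GitHub enforces is not stated here.

```bash
# Hypothetical helper: succeed if the file is at most limit_mb megabytes.
# Returns 0 when within the limit, 1 when too large, 2 if the file is unreadable.
within_upload_limit() {
  local file="$1" limit_mb="$2"
  local bytes
  bytes="$(wc -c < "$file")" || return 2
  # Assumption: 1 MB = 1,000,000 bytes (the stricter interpretation).
  (( bytes <= limit_mb * 1000 * 1000 ))
}
```

For example, `within_upload_limit tmp/videos/feature-demo.mp4 10` would gate an upload from a free account, with re-encoding as the fallback when it fails.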

## References

- GitHub CLI issues: cli/cli#1895, #4228, #4465 (all closed "not planned")
- `tonkotsuboy/github-upload-image-to-pr` -- reference implementation
- GitHub Community Discussions: #29993, #46951, #28219
