A GStack Browser server can be shared with any AI agent that can make HTTP requests. The agent gets scoped access to a real Chromium browser: navigate pages, read content, click elements, fill forms, take screenshots. Each agent gets its own tab.
This document is the reference for remote agents. The quick-start instructions are
generated by $B pair-agent with the actual credentials baked in.

    Your Machine                                  Remote Agent
    ─────────────                                 ────────────
    GStack Browser Server                         Any AI agent
    ├── Chromium (Playwright)            (OpenClaw, Hermes, Codex, etc.)
    ├── Local listener 127.0.0.1:LOCAL                  │
    │   (bootstrap, CLI, sidebar, cookies)              │
    ├── Tunnel listener 127.0.0.1:TUNNEL ◄──────────────┤
    │   (pair-agent only: /connect, /command,           │
    │    /sidebar-chat, locked allowlist)               │
    ├── ngrok tunnel (forwards tunnel port only)        │
    │   https://xxx.ngrok.dev ──────────────────────────┘
    └── Token Registry
        ├── Root token (local listener only)
        ├── Setup keys (5 min, one-time)
        ├── Session tokens (24h, scoped)
        └── SSE session cookies (30 min, stream-scope)

The daemon binds two HTTP sockets. The local listener serves the full command surface to 127.0.0.1 only and is never forwarded. The tunnel listener is bound lazily on /tunnel/start (and torn down on /tunnel/stop) with a locked path allowlist. ngrok forwards only the tunnel port.
A caller who stumbles onto your ngrok URL cannot reach /health, /cookie-picker, /inspector/*, or /welcome — those paths don't exist on that TCP socket. Root tokens sent over the tunnel get 403. The tunnel listener accepts only /connect, /command (with a scoped token + the 26-command browser-driving allowlist), and /sidebar-chat.
See ARCHITECTURE.md for the full endpoint table.
- User runs `$B pair-agent` (or `/pair-agent` in Claude Code)
- Server creates a one-time setup key (expires in 5 minutes)
- User copies the instruction block into the other agent's chat
- Remote agent runs `POST /connect` with the setup key
- Server returns a scoped session token (24h default)
- Remote agent creates its own tab via `POST /command` with `newtab`
- Remote agent browses using `POST /command` with its session token + tabId
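
From the remote agent's side, the pairing steps above can be sketched roughly as follows. This is a minimal sketch using only the Python standard library, not the project's own client: the helper names (`build_connect_request`, `build_command_request`, `send`, `pair_and_open`), the placeholder base URL, and the `Content-Type: application/json` header are assumptions made for illustration. The request bodies follow the JSON shapes documented for `/connect` and `/command`.

```python
import json
import urllib.request

# Placeholder tunnel URL; the real one comes from the pair-agent instruction block.
BASE_URL = "https://xxx.ngrok.dev"


def build_connect_request(base_url, setup_key):
    """Build the unauthenticated POST /connect request that redeems a setup key."""
    body = json.dumps({"setup_key": setup_key}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/connect",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed, not documented
        method="POST",
    )


def build_command_request(base_url, token, command, args=(), tab_id=None):
    """Build an authenticated POST /command request for one browser command."""
    payload = {"command": command, "args": list(args)}
    if tab_id is not None:
        payload["tabId"] = tab_id
    return urllib.request.Request(
        f"{base_url}/command",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",  # assumed, not documented
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def pair_and_open(base_url, setup_key, url):
    """Redeem the setup key, then open the agent's own tab with newtab."""
    token = json.loads(send(build_connect_request(base_url, setup_key)))["token"]
    newtab_result = send(build_command_request(base_url, token, "newtab", [url]))
    return token, newtab_result
```

Keeping request construction separate from sending makes the payloads easy to inspect before anything goes over the tunnel. Later commands would pass `tab_id` so that writes land on the tab the agent owns.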
All command endpoints require a Bearer token:

    Authorization: Bearer gsk_sess_...

/connect is unauthenticated (rate-limited) — it's how a remote agent exchanges a setup key for a scoped session token. /health is unauthenticated on the local listener (bootstrap) but does NOT exist on the tunnel listener (404).
SSE endpoints (/activity/stream, /inspector/events) accept either a Bearer token or the HttpOnly gstack_sse cookie (minted via POST /sse-session, 30-minute TTL, stream-scope only — cannot be used against /command). As of v1.6.0.0 the ?token=<ROOT> query-string auth is no longer accepted.
POST /connect: Exchange a setup key for a session token. No auth required. Rate-limited to 300/minute (flood defense; setup keys are 24 random bytes, so they cannot be brute-forced).

    Request: {"setup_key": "gsk_setup_..."}
    Response: {"token": "gsk_sess_...", "expires": "ISO8601", "scopes": ["read","write"], "agent": "agent-name"}

POST /command: Send a browser command. Requires Bearer auth.

    Request: {"command": "goto", "args": ["https://example.com"], "tabId": 1}
    Response: (plain text result of the command)

/health: Server status. No auth required. Returns status, tabs, mode, uptime.
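
An agent will usually want to keep the `/connect` response around and notice when the token has lapsed. The sketch below parses the documented response fields. The function names are hypothetical, and treating an `expires` value without a UTC offset as UTC is an assumption rather than documented behavior.

```python
import json
from datetime import datetime, timezone


def parse_connect_response(body):
    """Turn the /connect JSON body into a dict with a timezone-aware expiry."""
    data = json.loads(body)
    expires_text = data["expires"]
    # datetime.fromisoformat accepts a trailing "Z" only on Python 3.11+,
    # so normalize it to an explicit offset first.
    if expires_text.endswith("Z"):
        expires_text = expires_text[:-1] + "+00:00"
    expires = datetime.fromisoformat(expires_text)
    if expires.tzinfo is None:
        # Assumption: a timestamp without an offset is UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    return {
        "token": data["token"],
        "expires": expires,
        "scopes": set(data["scopes"]),
        "agent": data["agent"],
    }


def is_expired(session, now=None):
    """Return True once the session token's expiry time has been reached."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= session["expires"]
```

Checking `is_expired` before each command lets the agent ask the user to pair again instead of running into a 401.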
| Command | Args | Description |
|---|---|---|
| `goto` | `["URL"]` | Navigate to a URL |
| `back` | `[]` | Go back |
| `forward` | `[]` | Go forward |
| `reload` | `[]` | Reload page |

| Command | Args | Description |
|---|---|---|
| `snapshot` | `["-i"]` | Interactive snapshot with @ref labels (most useful) |
| `text` | `[]` | Full page text |
| `html` | `["selector?"]` | HTML of element or full page |
| `links` | `[]` | All links on page |
| `screenshot` | `["/tmp/s.png"]` | Take a screenshot |
| `url` | `[]` | Current URL |

| Command | Args | Description |
|---|---|---|
| `click` | `["@e3"]` | Click an element (use @ref from snapshot) |
| `fill` | `["@e5", "text"]` | Fill a form field |
| `select` | `["@e7", "option"]` | Select dropdown value |
| `type` | `["text"]` | Type text (keyboard) |
| `press` | `["Enter"]` | Press a key |
| `scroll` | `["down"]` | Scroll the page |

| Command | Args | Description |
|---|---|---|
| `newtab` | `["URL?"]` | Create a new tab (required before writing) |
| `tabs` | `[]` | List all tabs |
| `closetab` | `["id?"]` | Close a tab |

The snapshot and @ref workflow is the most powerful browsing pattern. Instead of writing CSS selectors:
- Run `snapshot -i` to get an interactive snapshot with labeled elements
- The snapshot returns text like:

      [Page Title]
      @e1 [link] "Home"
      @e2 [button] "Sign In"
      @e3 [input] "Search..."

- Use the `@e` refs directly in commands: `click @e2`, `fill @e3 "search query"`

This is how the snapshot system works, and it is much more reliable than guessing CSS selectors. Always run `snapshot -i` first, then use the refs.
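
An agent can pick refs out of the snapshot text mechanically. The sketch below assumes every labeled element looks like the sample above (`@eN [role] "label"`); real snapshots may carry more detail, so the regular expression and the names `SnapshotRef`, `parse_refs`, and `find_ref` are illustrative only.

```python
import re
from typing import NamedTuple


class SnapshotRef(NamedTuple):
    ref: str    # e.g. "@e2"
    role: str   # e.g. "button"
    label: str  # e.g. "Sign In"


# Matches entries shaped like: @e2 [button] "Sign In"
REF_ENTRY = re.compile(r'(@e\d+)\s+\[([^\]]+)\]\s+"([^"]*)"')


def parse_refs(snapshot_text):
    """Return every labeled element found in the snapshot text, in order."""
    return [SnapshotRef(*match.groups()) for match in REF_ENTRY.finditer(snapshot_text)]


def find_ref(snapshot_text, role, label):
    """Return the first ref with this role and label (case-insensitive), or None."""
    for item in parse_refs(snapshot_text):
        if item.role == role and item.label.lower() == label.lower():
            return item.ref
    return None
```

The ref returned by `find_ref` can go straight into the `args` of a `click` or `fill` command. Refs come from a specific snapshot, so the agent should take a fresh `snapshot -i` before acting on them.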
| Scope | What it allows |
|---|---|
| `read` | snapshot, text, html, links, screenshot, url, tabs, console, etc. |
| `write` | goto, click, fill, scroll, newtab, closetab, etc. |
| `admin` | eval, js, cookies, storage, cookie-import, useragent, etc. |
| `meta` | tab, diff, frame, responsive, watch |
Default tokens get read + write. Admin requires --admin flag when pairing.
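
A client can check a command against its granted scopes before sending it, which avoids a needless 403. The mapping below is a sketch built only from the commands named in the table. The table's lists end in "etc.", so this mapping is deliberately incomplete, and the function names are hypothetical.

```python
# Partial mapping taken from the scope table; the real server knows more commands.
COMMAND_SCOPES = {
    "read": {"snapshot", "text", "html", "links", "screenshot", "url", "tabs", "console"},
    "write": {"goto", "click", "fill", "scroll", "newtab", "closetab"},
    "admin": {"eval", "js", "cookies", "storage", "cookie-import", "useragent"},
    "meta": {"tab", "diff", "frame", "responsive", "watch"},
}


def required_scope(command):
    """Return the scope a command needs, or None if this sketch does not know it."""
    for scope, commands in COMMAND_SCOPES.items():
        if command in commands:
            return scope
    return None


def is_allowed(command, granted_scopes):
    """Return True if the command's scope is among the token's granted scopes."""
    scope = required_scope(command)
    return scope is not None and scope in granted_scopes
```

With the default `{"read", "write"}` token, `is_allowed("click", ...)` holds while `is_allowed("js", ...)` does not. Treating unknown commands as denied is a conservative choice made for this sketch, not a statement about server behavior.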
Each agent owns the tabs it creates. Rules:
- Read: Any agent can read any tab (snapshot, text, screenshot)
- Write: Only the tab owner can write (click, fill, goto, etc.)
- Unowned tabs: Pre-existing tabs are root-only for writes
- First step: Always `newtab` before trying to interact
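
The ownership rules above can be modeled in a few lines. The `TabRegistry` class below is a toy model for reasoning about what an agent may do, not the server's actual data structure. The `is_root` flag is a hypothetical stand-in for a request authenticated with the root token.

```python
from typing import Dict, Optional


class TabRegistry:
    """Toy model of tab ownership: anyone reads, only the owner writes."""

    def __init__(self) -> None:
        # tab id -> owning agent name; None marks a pre-existing (unowned) tab
        self._owners: Dict[int, Optional[str]] = {}
        self._next_id = 1

    def _add(self, owner: Optional[str]) -> int:
        tab_id = self._next_id
        self._next_id += 1
        self._owners[tab_id] = owner
        return tab_id

    def add_preexisting_tab(self) -> int:
        """Register a tab that existed before any agent paired."""
        return self._add(None)

    def newtab(self, agent: str) -> int:
        """Create a tab owned by the calling agent."""
        return self._add(agent)

    def can_read(self, agent: str, tab_id: int) -> bool:
        """Any agent can read any existing tab."""
        return tab_id in self._owners

    def can_write(self, agent: str, tab_id: int, is_root: bool = False) -> bool:
        """Only the owner writes; unowned tabs are root-only for writes."""
        if tab_id not in self._owners:
            return False
        owner = self._owners[tab_id]
        if owner is None:
            return is_root
        return owner == agent
```

The model makes the "first step" rule concrete: until an agent has called `newtab`, there is no tab for which `can_write` returns true for it.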
| Code | Meaning | What to do |
|---|---|---|
| 401 | Token invalid, expired, or revoked | Ask user to run /pair-agent again |
| 403 | Command not in scope, or tab not yours | Use newtab, or ask for --admin |
| 429 | Rate limit exceeded (>10 req/s) | Wait for Retry-After header |
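
The table above translates into a small decision helper on the client side. This is a sketch under stated assumptions: the action labels are invented for illustration, `Retry-After` is assumed to carry a number of seconds, and the one-second fallback is an arbitrary default used when the header is missing or not numeric.

```python
from typing import Mapping, Optional, Tuple

DEFAULT_BACKOFF_SECONDS = 1.0  # assumption: fallback when Retry-After is unusable


def retry_delay(headers: Mapping[str, str]) -> float:
    """Read Retry-After (header names are case-insensitive) as seconds."""
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after", "")
    try:
        return max(0.0, float(raw))
    except ValueError:
        return DEFAULT_BACKOFF_SECONDS


def next_action(
    status: int, headers: Optional[Mapping[str, str]] = None
) -> Tuple[str, Optional[float]]:
    """Map an HTTP status to (action label, seconds to wait or None)."""
    if status == 401:
        return ("ask_user_to_pair_again", None)
    if status == 403:
        return ("use_newtab_or_request_admin", None)
    if status == 429:
        return ("wait_then_retry", retry_delay(headers or {}))
    return ("ok" if 200 <= status < 300 else "unexpected", None)
```

A caller would sleep for the returned delay before repeating a rate-limited command, and stop retrying entirely on 401 or 403, since neither resolves by waiting.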
- Physical port separation. Local listener and tunnel listener are separate TCP sockets. ngrok only forwards the tunnel port. Tunnel callers cannot reach bootstrap endpoints at all (404, wrong port).
- Tunnel command allowlist. `/command` over the tunnel only accepts 26 browser-driving commands (goto, click, fill, snapshot, text, newtab, tabs, back, forward, reload, closetab, etc.). Server-management commands (tunnel, pair, token, useragent, js) are denied on the tunnel.
- Root token is tunnel-blocked. A request bearing the root token over the tunnel listener returns 403 with a pairing hint. Only scoped session tokens work over the tunnel.
- Setup keys expire in 5 minutes and can only be used once.
- Session tokens expire in 24 hours (configurable).
- The root token never appears in instruction blocks or connection strings.
- Admin scope (JS execution, cookie access) is denied by default.
- Tokens can be revoked instantly: `$B tunnel revoke agent-name`
- SSE auth uses a 30-minute HttpOnly SameSite=Strict cookie, stream-scope only (never valid against `/command`).
- Path traversal guarded on `/welcome`: `GSTACK_SLUG` must match `^[a-z0-9_-]+$` or falls back to the built-in template.
- SSRF guards on `goto`, `download`, and scrape paths: validates URL target against a localhost/private-range blocklist.
- Tunnel surface denial logging. Every rejection on the tunnel listener (`path_not_on_tunnel`, `root_token_on_tunnel`, `missing_scoped_token`, `disallowed_command:*`) is appended to `~/.gstack/security/attempts.jsonl` with timestamp, source IP, path, method. Rate-capped at 60 writes/min.
- All agent activity is logged with attribution (clientId).
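
Two of the guards in the list above, the slug check and the SSRF blocklist, are easy to illustrate. The sketch below is not the server's implementation: the function names and the `"default"` fallback value are hypothetical. The URL check inspects only literal IP addresses and `localhost` names, whereas a real guard would also need to resolve hostnames before deciding.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Same character class as the documented ^[a-z0-9_-]+$ pattern; fullmatch
# replaces the anchors so a trailing newline cannot slip through.
SLUG_PATTERN = re.compile(r"[a-z0-9_-]+")


def safe_slug(slug, default="default"):
    """Return the slug if it is well-formed, otherwise the fallback name."""
    return slug if SLUG_PATTERN.fullmatch(slug or "") else default


def is_blocked_target(url):
    """Return True for URLs that point at localhost or private address ranges."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # no host at all: refuse rather than guess
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name; this sketch does not resolve it, a real guard should.
        return False
    return address.is_loopback or address.is_private or address.is_link_local
```

For example, `is_blocked_target("http://127.0.0.1:8080/")` and `is_blocked_target("http://10.0.0.5/")` are both true, while a public hostname passes this literal check.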
Known non-goal (tracked as #1136): on Windows, the cookie-import-browser path launches Chrome with --remote-debugging-port=<random>. With App-Bound Encryption v20, a same-user local process can connect to that port and exfiltrate decrypted v20 cookies — an elevation path relative to reading the SQLite DB directly. Fix direction is --remote-debugging-pipe instead of TCP.
If both agents are on the same machine, skip the copy-paste:

    $B pair-agent --local openclaw   # writes to ~/.openclaw/skills/gstack/browse-remote.json
    $B pair-agent --local codex      # writes to ~/.codex/skills/gstack/browse-remote.json
    $B pair-agent --local cursor     # writes to ~/.cursor/skills/gstack/browse-remote.json

No tunnel needed. Uses localhost directly.
For remote agents on different machines:
- Sign up at ngrok.com (free tier works)
- Copy your auth token from the dashboard
- Save it: `echo 'NGROK_AUTHTOKEN=your_token' > ~/.gstack/ngrok.env`
- Optionally claim a stable domain: `echo 'NGROK_DOMAIN=your-name.ngrok-free.dev' >> ~/.gstack/ngrok.env`
- Start with tunnel: `BROWSE_TUNNEL=1 $B restart`
- Run `$B pair-agent`; it will use the tunnel URL automatically