Commit 7fd8bd6

feat(browser): rewrite network for agent-native discovery (#1100)
* feat(browser): rewrite network command for agent-native discovery

Replace the index-based list + pretty-printed --detail flow with a structured JSON interface built around stable keys, body-shape previews, and a persistent capture cache. Agents can now reference captured requests by operationName (GraphQL) or `METHOD host+pathname` (REST) instead of array indexes that shift on every rerun.

- `browser network` now emits JSON: `{workspace, captured_at, count, filtered_out, entries: [{key, method, status, url, ct, size, shape}], detail_hint}`, with no body payloads by default
- Shape inference (src/browser/shape.ts) walks response JSON into a flat path -> descriptor map with depth cap 6 and a 2KB budget per entry, so agents see structure without paying body tokens
- Stable key generator (src/browser/network-key.ts) derives `operationName` from graphql URLs and `METHOD host+pathname` elsewhere, disambiguating collisions with `#N` suffixes
- Persistent cache (src/browser/network-cache.ts) snapshots every capture to `~/.opencli/cache/browser-network/<workspace>.json` with a 24h TTL, so `--detail <key>` survives later commands
- `--detail <key>` returns `{key, url, method, status, ct, size, shape, body}` with structured error codes (cache_missing / cache_expired / cache_corrupt / key_not_found, the latter including available_keys)
- Add `--raw` for agents that want every full body inline, `--ttl` for cache lookups
- Update opencli-adapter-author + opencli-autofix skill docs to reference `--detail <key>` and the shape-first discovery flow

Supersedes the cache prototype in #1051.

Co-authored-by: freemandealer <freeman.zhang1992@gmail.com>

* fix(browser): structured errors for capture/save, shape budget guard

Self-review findings on the network refactor:

- captureNetworkItems throwing (browser crashed / CDP dropped) now emits `error.code: capture_failed` on stdout rather than leaking a bare stderr line from browserAction's generic handler. Agents get a parseable JSON blob on every failure path, matching the design goal.
- saveNetworkCache throwing (disk full, read-only path) is a soft failure: the captured data is already in hand, so surface a `cache_warning` field in the envelope and keep going instead of aborting. `--detail` lookups on that run will miss the cache but the listing still reaches the agent.
- shape.ts: guard the sub-walk on `add()`'s return value so the "budget hits on the array/object descriptor itself" path can never emit a stray child without its parent marker.
- network-key.ts: document that `#N` suffixes start at `#2`: the first occurrence stays bare, there is no `#1`. Matches test + code.

Added regression tests: `capture_failed` on readNetworkCapture throw, `cache_warning` on persistence failure, shape budget hit on array descriptor.

Co-authored-by: freemandealer <freeman.zhang1992@gmail.com>

---------

Co-authored-by: freemandealer <freeman.zhang1992@gmail.com>
1 parent 295c523 commit 7fd8bd6
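
The commit message describes shape inference as a walk over response JSON that produces a flat path -> descriptor map, capped at depth 6 with a 2KB budget, where a child is only emitted if its parent marker was recorded. `src/browser/shape.ts` itself is not shown on this page, so the following is only a minimal sketch of that idea: the names `inferShape` and `ShapeMap`, the descriptor strings, and the way the budget is counted (path length plus descriptor length) are all assumptions, not the project's implementation.

```typescript
// Hypothetical sketch of shape inference; not the code from src/browser/shape.ts.
type ShapeMap = Record<string, string>;

function inferShape(value: unknown, maxDepth = 6, budgetBytes = 2048): ShapeMap {
  const shape: ShapeMap = {};
  let used = 0;

  // Record one path -> descriptor pair. Returns false once the budget is spent.
  const add = (path: string, descriptor: string): boolean => {
    const cost = path.length + descriptor.length; // assumed cost model
    if (used + cost > budgetBytes) return false;
    used += cost;
    shape[path] = descriptor;
    return true;
  };

  const walk = (node: unknown, path: string, depth: number): void => {
    if (node === null) {
      add(path, 'null');
      return;
    }
    if (Array.isArray(node)) {
      // Descend only when the array marker itself fit into the budget.
      if (!add(path, `array(${node.length})`)) return;
      if (node.length > 0 && depth < maxDepth) walk(node[0], `${path}[0]`, depth + 1);
      return;
    }
    if (typeof node === 'object') {
      // Same guard for objects: no children without the parent marker.
      if (!add(path, 'object')) return;
      if (depth >= maxDepth) return;
      for (const [k, v] of Object.entries(node as Record<string, unknown>)) {
        walk(v, `${path}.${k}`, depth + 1);
      }
      return;
    }
    add(path, typeof node);
  };

  walk(value, '$', 0);
  return shape;
}

const shapeDemo = inferShape({ data: { total: 2, items: [{ title: 'a', price: 1 }] } });
console.log(shapeDemo);
```

Only the first array element is sampled here, which keeps the map small for long lists; whether the real walker samples one element or merges several is not stated in the commit message.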

10 files changed

Lines changed: 796 additions & 68 deletions

skills/opencli-adapter-author/references/api-discovery.md

Lines changed: 20 additions & 12 deletions
@@ -14,26 +14,34 @@
 opencli browser network
 ```

-Filter this noise out of the output:
-- `.css / .js / .woff / .png / .svg / .webp / .mp4`: static assets
-- `googletagmanager / sentry / crazyegg / doubleclick / tracking / beacon`: analytics and tracking
-- `/healthz / /ping / /heartbeat`: health checks
+The default output is JSON, and every candidate carries:
+- `key`: a stable reference (`operationName` for GraphQL, `METHOD host+pathname` for REST)
+- `shape`: a path→type map of the response body (the raw body is omitted to save tokens)
+- `status / url / method / ct / size`

-Every remaining entry is a candidate. Look first at the ones whose URL contains a business term (`list / detail / quote / feed / timeline / stock / user`, etc.):
+Static assets, analytics, and tracking requests are filtered out by default; use `--all` to see everything
+
+### Screen by shape
+
+Look first at the `shape` of entries whose `key` contains a business term (`list / detail / Timeline / User / Tweets / Quote`):
+
+- `$.data` is an `object` with `array(N)` / `total` / `page` beneath it → most likely the one
+- the paths include `nickname / avatar / title / price / tweets / items` → that is the one
+- the shape is only `$: string` or is all HTML noise → move on to the next entry
+
+### Pull the full body
+
+Once a candidate is chosen, pull its full body (by key, not by index, because array order changes with every capture):

 ```bash
-opencli browser network --detail <N>
+opencli browser network --detail <key>
 ```

-Look at the first 200 bytes of the response body:
-
-- contains an array / `total` / `page` field → most likely the one
-- contains `nickname / avatar / title / price` → that is the one
-- is HTML or pure advertising → move on to the next entry
+Captures are persisted to `~/.opencli/cache/browser-network/<workspace>.json` (default TTL 24h), so `--detail` still works after several other commands have run.

 ### Key request headers

-After finding a candidate, inspect the request headers in `--detail <N>`:
+`browser network` currently captures only responses (body + status + ct), not request headers. To see the request headers, click the request in the DevTools Network panel, or replay it manually with `fetch(url)` through `browser eval` and observe the headers the browser sends:

 | What you see | Meaning | Strategy |
 |------|------|---------|
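
The "screen by shape" rules in the doc above can be read as a small classifier over the path→type map. The sketch below is one hypothetical way an agent might encode them; `screenShape`, `ShapeHints`, and the three verdict labels are illustrative names that do not appear in the commit, and the field list is simply the one quoted in the doc.

```typescript
// Hypothetical helper that mirrors the three screening rules from the skill doc.
type ShapeHints = Record<string, string>;
type Verdict = 'likely' | 'skip' | 'unclear';

const FIELD_HINTS = ['nickname', 'avatar', 'title', 'price', 'tweets', 'items'];

function screenShape(shape: ShapeHints): Verdict {
  const paths = Object.keys(shape);

  // Rule 3: the shape is only `$: string` (HTML or plain text), so skip it.
  if (paths.length === 1 && shape['$'] === 'string') return 'skip';

  // Rule 1: `$.data` is an object with an array / total / page underneath.
  const hasListLayout =
    shape['$.data'] === 'object' &&
    paths.some(
      (p) =>
        p.startsWith('$.data.') &&
        (/^array\(\d+\)$/.test(shape[p] ?? '') || /\.(total|page)$/.test(p)),
    );

  // Rule 2: some path ends in one of the business field names.
  const hasBusinessField = paths.some((p) => FIELD_HINTS.some((f) => p.endsWith(`.${f}`)));

  return hasListLayout || hasBusinessField ? 'likely' : 'unclear';
}

console.log(screenShape({ '$': 'object', '$.data': 'object', '$.data.items': 'array(20)' }));
console.log(screenShape({ '$': 'string' }));
```

An `unclear` verdict is deliberately not a rejection: the doc's rules are heuristics for ordering candidates, so an agent would still fall back to `--detail <key>` for entries that match nothing.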

skills/opencli-autofix/SKILL.md

Lines changed: 2 additions & 2 deletions
@@ -130,8 +130,8 @@ opencli browser open https://example.com/target-page && opencli browser state
 # Interact to trigger API calls
 opencli browser click <N> && opencli browser network

-# Inspect specific API response
-opencli browser network --detail <index>
+# Inspect specific API response (key is the `key` field from the default JSON output)
+opencli browser network --detail <key>
 ```

 ## Step 4: Patch the Adapter

src/browser/network-cache.test.ts

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+import { afterEach, beforeEach, describe, expect, it } from 'vitest';
+import * as fs from 'node:fs';
+import * as os from 'node:os';
+import * as path from 'node:path';
+import {
+  DEFAULT_TTL_MS,
+  findEntry,
+  getCachePath,
+  loadNetworkCache,
+  saveNetworkCache,
+  type CachedNetworkEntry,
+  type NetworkCacheFile,
+} from './network-cache.js';
+
+function makeEntry(key: string, body: unknown = { ok: true }): CachedNetworkEntry {
+  return { key, url: `https://x.com/${key}`, method: 'GET', status: 200, size: 2, ct: 'application/json', body };
+}
+
+describe('network-cache', () => {
+  let baseDir: string;
+
+  beforeEach(() => {
+    baseDir = fs.mkdtempSync(path.join(os.tmpdir(), 'opencli-netcache-'));
+  });
+  afterEach(() => {
+    fs.rmSync(baseDir, { recursive: true, force: true });
+  });
+
+  it('sanitizes workspace names into safe filenames', () => {
+    const p = getCachePath('browser:default', baseDir);
+    expect(path.basename(p)).toBe('browser_default.json');
+  });
+
+  it('round-trips entries through save + load', () => {
+    saveNetworkCache('ws', [makeEntry('UserTweets'), makeEntry('UserByScreenName')], baseDir);
+    const res = loadNetworkCache('ws', { baseDir });
+    expect(res.status).toBe('ok');
+    expect(res.file?.entries).toHaveLength(2);
+    expect(res.file?.entries[0].key).toBe('UserTweets');
+  });
+
+  it('reports missing when cache file does not exist', () => {
+    expect(loadNetworkCache('nope', { baseDir }).status).toBe('missing');
+  });
+
+  it('reports expired when the cache is older than ttl', () => {
+    saveNetworkCache('ws', [makeEntry('A')], baseDir);
+    const future = Date.now() + DEFAULT_TTL_MS + 60_000;
+    const res = loadNetworkCache('ws', { baseDir, now: future });
+    expect(res.status).toBe('expired');
+    expect(res.file?.entries).toHaveLength(1);
+  });
+
+  it('reports corrupt for malformed json', () => {
+    const file = getCachePath('ws', baseDir);
+    fs.mkdirSync(path.dirname(file), { recursive: true });
+    fs.writeFileSync(file, '{not json');
+    expect(loadNetworkCache('ws', { baseDir }).status).toBe('corrupt');
+  });
+
+  it('reports corrupt for wrong schema version', () => {
+    const file = getCachePath('ws', baseDir);
+    fs.mkdirSync(path.dirname(file), { recursive: true });
+    fs.writeFileSync(file, JSON.stringify({ version: 0, entries: [] }));
+    expect(loadNetworkCache('ws', { baseDir }).status).toBe('corrupt');
+  });
+
+  it('findEntry returns matching entry or null', () => {
+    const file: NetworkCacheFile = {
+      version: 1, workspace: 'ws', savedAt: new Date().toISOString(),
+      entries: [makeEntry('A'), makeEntry('B')],
+    };
+    expect(findEntry(file, 'B')?.key).toBe('B');
+    expect(findEntry(file, 'missing')).toBeNull();
+  });
+});

src/browser/network-cache.ts

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
+/**
+ * Persistent cache for browser network captures.
+ *
+ * The live capture buffer (JS interceptor / daemon ring) can be cleared
+ * by navigation or lost between CLI invocations. Agents still need
+ * stable references to request bodies after running other commands,
+ * so every `browser network` call snapshots its results to disk.
+ *
+ * Layout: <cacheDir>/browser-network/<workspace>.json
+ * Entries expire after DEFAULT_TTL_MS (24h).
+ */
+
+import * as fs from 'node:fs';
+import * as os from 'node:os';
+import * as path from 'node:path';
+
+export const DEFAULT_TTL_MS = 24 * 60 * 60 * 1000;
+
+export interface CachedNetworkEntry {
+  key: string;
+  url: string;
+  method: string;
+  status: number;
+  size: number;
+  ct: string;
+  body: unknown;
+}
+
+export interface NetworkCacheFile {
+  version: 1;
+  workspace: string;
+  savedAt: string;
+  entries: CachedNetworkEntry[];
+}
+
+function getDefaultCacheDir(): string {
+  return process.env.OPENCLI_CACHE_DIR || path.join(os.homedir(), '.opencli', 'cache');
+}
+
+export function getCachePath(workspace: string, baseDir: string = getDefaultCacheDir()): string {
+  const safe = workspace.replace(/[^a-zA-Z0-9_-]+/g, '_');
+  return path.join(baseDir, 'browser-network', `${safe}.json`);
+}
+
+export function saveNetworkCache(
+  workspace: string,
+  entries: CachedNetworkEntry[],
+  baseDir?: string,
+): void {
+  const target = getCachePath(workspace, baseDir);
+  fs.mkdirSync(path.dirname(target), { recursive: true });
+  const payload: NetworkCacheFile = {
+    version: 1,
+    workspace,
+    savedAt: new Date().toISOString(),
+    entries,
+  };
+  fs.writeFileSync(target, JSON.stringify(payload), 'utf-8');
+}
+
+export interface LoadOptions {
+  baseDir?: string;
+  ttlMs?: number;
+  now?: number;
+}
+
+export interface LoadResult {
+  status: 'ok' | 'missing' | 'expired' | 'corrupt';
+  file?: NetworkCacheFile;
+  ageMs?: number;
+}
+
+export function loadNetworkCache(workspace: string, opts: LoadOptions = {}): LoadResult {
+  const target = getCachePath(workspace, opts.baseDir);
+  let raw: string;
+  try { raw = fs.readFileSync(target, 'utf-8'); }
+  catch { return { status: 'missing' }; }
+
+  let parsed: NetworkCacheFile;
+  try {
+    const obj = JSON.parse(raw);
+    if (!obj || obj.version !== 1 || !Array.isArray(obj.entries)) {
+      return { status: 'corrupt' };
+    }
+    parsed = obj as NetworkCacheFile;
+  } catch {
+    return { status: 'corrupt' };
+  }
+
+  const ttl = opts.ttlMs ?? DEFAULT_TTL_MS;
+  const now = opts.now ?? Date.now();
+  const savedAt = Date.parse(parsed.savedAt);
+  if (!Number.isFinite(savedAt)) return { status: 'corrupt' };
+  const ageMs = now - savedAt;
+  if (ageMs > ttl) return { status: 'expired', file: parsed, ageMs };
+
+  return { status: 'ok', file: parsed, ageMs };
+}
+
+export function findEntry(file: NetworkCacheFile, key: string): CachedNetworkEntry | null {
+  return file.entries.find((e) => e.key === key) ?? null;
+}
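
The commit message lists four structured error codes for `--detail <key>` (cache_missing / cache_expired / cache_corrupt / key_not_found, the last one with available_keys), and the cache module above reports load status as `ok | missing | expired | corrupt`. The command code that connects the two is not shown on this page, so the sketch below only illustrates one plausible mapping. `resolveDetail`, `DetailOutcome`, and the local `*Like` types are hypothetical stand-ins, declared inline so the example runs without importing the module.

```typescript
// Hypothetical glue between a cache load result and the --detail error codes.
// The *Like types are trimmed local copies, not imports from network-cache.ts.
interface EntryLike { key: string; body: unknown }
interface LoadResultLike {
  status: 'ok' | 'missing' | 'expired' | 'corrupt';
  file?: { entries: EntryLike[] };
}

type DetailOutcome =
  | { entry: EntryLike }
  | { error: { code: string; available_keys?: string[] } };

function resolveDetail(res: LoadResultLike, key: string): DetailOutcome {
  if (res.status !== 'ok' || !res.file) {
    // An 'ok' status without a file should not occur; report it as corrupt.
    const code = res.status === 'ok' ? 'cache_corrupt' : `cache_${res.status}`;
    return { error: { code } };
  }
  const entry = res.file.entries.find((e) => e.key === key);
  if (entry) return { entry };
  // Unknown key: list what is available so the caller can retry with a valid one.
  return {
    error: { code: 'key_not_found', available_keys: res.file.entries.map((e) => e.key) },
  };
}

const cached: LoadResultLike = { status: 'ok', file: { entries: [{ key: 'UserTweets', body: {} }] } };
console.log(JSON.stringify(resolveDetail(cached, 'UserTweets')));
console.log(JSON.stringify(resolveDetail(cached, 'Missing')));
console.log(JSON.stringify(resolveDetail({ status: 'expired' }, 'UserTweets')));
```

Note that `loadNetworkCache` still returns the parsed file on `expired`; this sketch ignores it and reports the error, which is an assumption about the intended behaviour rather than something the diff confirms.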

src/browser/network-key.test.ts

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+import { describe, expect, it } from 'vitest';
+import { assignKeys, deriveKey } from './network-key.js';
+
+describe('deriveKey', () => {
+  it('extracts operationName from Twitter-style graphql URLs', () => {
+    expect(deriveKey({
+      method: 'GET',
+      url: 'https://x.com/i/api/graphql/6fWQaBPK51aGyC_VC7t9GQ/UserTweets?variables=...',
+    })).toBe('UserTweets');
+  });
+
+  it('handles graphql URLs without a query id', () => {
+    expect(deriveKey({
+      method: 'POST',
+      url: 'https://example.com/graphql/MyOp?vars=1',
+    })).toBe('MyOp');
+  });
+
+  it('uses METHOD host+pathname for REST calls', () => {
+    expect(deriveKey({
+      method: 'get',
+      url: 'https://api.example.com/v1/users?page=1',
+    })).toBe('GET api.example.com/v1/users');
+  });
+
+  it('falls back to truncated raw url when URL parsing fails', () => {
+    const key = deriveKey({ method: 'GET', url: 'not-a-valid-url' });
+    expect(key.startsWith('GET ')).toBe(true);
+    expect(key).toContain('not-a-valid-url');
+  });
+});
+
+describe('assignKeys', () => {
+  it('disambiguates collisions with #N suffixes', () => {
+    const out = assignKeys([
+      { url: 'https://x.com/i/api/graphql/a/UserTweets', method: 'GET' },
+      { url: 'https://x.com/i/api/graphql/b/UserTweets', method: 'GET' },
+      { url: 'https://api.example.com/v1/u', method: 'GET' },
+      { url: 'https://api.example.com/v1/u', method: 'GET' },
+      { url: 'https://api.example.com/v1/u', method: 'GET' },
+    ]);
+    expect(out.map(o => o.key)).toEqual([
+      'UserTweets',
+      'UserTweets#2',
+      'GET api.example.com/v1/u',
+      'GET api.example.com/v1/u#2',
+      'GET api.example.com/v1/u#3',
+    ]);
+  });
+
+  it('preserves extra fields on each request', () => {
+    const out = assignKeys([{ url: 'https://a.com/x', method: 'GET', status: 200 }]);
+    expect(out[0]).toMatchObject({ status: 200, key: 'GET a.com/x' });
+  });
+});

src/browser/network-key.ts

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+/**
+ * Stable keys for network capture entries.
+ *
+ * Agents reference entries by key (e.g. `UserTweets`, `GET api.x.com/1.1/home`)
+ * instead of array index, so the mapping survives new captures.
+ *
+ * Rules:
+ *   GraphQL (URL contains `/graphql/`): key = operationName derived from URL path
+ *     (the segment after a 22-char query id, or the last segment)
+ *   Everything else: key = `METHOD host+pathname`
+ *
+ * On collision assignKeys suffixes duplicates as `base#2`, `base#3`, ...:
+ * the first occurrence stays bare (there is no `#1`).
+ */
+
+export interface KeyableRequest {
+  url: string;
+  method: string;
+}
+
+export function deriveKey(req: KeyableRequest): string {
+  const parsed = safeParseUrl(req.url);
+  if (!parsed) return `${req.method.toUpperCase()} ${truncate(req.url, 120)}`;
+
+  const path = parsed.pathname;
+  if (path.includes('/graphql/')) {
+    const op = graphqlOperationName(path);
+    if (op) return op;
+  }
+
+  return `${req.method.toUpperCase()} ${parsed.host}${path}`;
+}
+
+export function assignKeys<T extends KeyableRequest>(requests: T[]): Array<T & { key: string }> {
+  const counts = new Map<string, number>();
+  const out: Array<T & { key: string }> = [];
+  for (const req of requests) {
+    const base = deriveKey(req);
+    const n = counts.get(base) ?? 0;
+    counts.set(base, n + 1);
+    const key = n === 0 ? base : `${base}#${n + 1}`;
+    out.push({ ...req, key });
+  }
+  return out;
+}
+
+function graphqlOperationName(pathname: string): string | null {
+  // Patterns we've seen in the wild:
+  //   /i/api/graphql/<queryId>/UserTweets
+  //   /graphql/<queryId>/SomeOp
+  //   /graphql/SomeOp (rare, no id)
+  const segments = pathname.split('/').filter(Boolean);
+  const idx = segments.indexOf('graphql');
+  if (idx < 0) return null;
+  const tail = segments.slice(idx + 1);
+  if (tail.length === 0) return null;
+  if (tail.length === 1) return tail[0];
+  // tail[0] is usually a query id; the operation name is the next segment.
+  return tail[1] || tail[0];
+}
+
+function safeParseUrl(url: string): URL | null {
+  try { return new URL(url); }
+  catch { return null; }
+}
+
+function truncate(s: string, max: number): string {
+  return s.length <= max ? s : `${s.slice(0, max - 1)}…`;
+}
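
Because duplicates are suffixed as `base#2`, `base#3`, ... and the first occurrence stays bare, a consumer that wants to group entries by their base key has to undo that suffix. The commit does not include such a helper; `parseKey` and `ParsedKey` below are hypothetical names for a minimal sketch of the inverse mapping, assuming a trailing `#N` with N >= 2 is always a collision suffix.

```typescript
// Hypothetical inverse of the `#N` collision suffix; not part of network-key.ts.
interface ParsedKey {
  base: string;    // key without the collision suffix
  ordinal: number; // 1 for the bare first occurrence, N for `base#N`
}

function parseKey(key: string): ParsedKey {
  const m = /^(.*)#(\d+)$/.exec(key);
  if (!m) return { base: key, ordinal: 1 };
  const ordinal = Number(m[2]);
  // Suffixes start at #2, so `#0` or `#1` is treated as part of the base itself.
  if (ordinal < 2) return { base: key, ordinal: 1 };
  return { base: m[1] ?? key, ordinal };
}

console.log(parseKey('UserTweets'));
console.log(parseKey('GET api.example.com/v1/u#3'));
```

One caveat with this assumption: when URL parsing fails, `deriveKey` falls back to the raw URL, which may itself end in a fragment such as `#2`, so a key produced by that fallback path could be misread by this helper.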
