Commit 1eac8e0

fix(browser): drop session injection from extension exec results (#1518)
`pageScopedResult()` in extension/src/background.ts was spreading the lease's session into the result `data` for every page-scoped command. For the `exec` action, which routes user JavaScript through page.evaluate(), this contaminated arbitrary user-JS returns:

* Array / primitive returns came back as `{ session, data: <value> }` envelopes. Adapters that did `Array.isArray(result)` got `false` and treated the page as having no rows. Visible repro: `opencli google search ...` and `opencli xiaohongshu search ...`. Chrome rendered results correctly but adapters extracted an empty array (reported in #1518 from the Browser Bridge v1.0.12 envelope).
* Plain-object returns had an extra `session` key spliced in, silently overwriting any user `session` field with the lease's value.

Fix in the extension layer instead of compensating client-side: `pageScopedResult` now returns `{ id, ok, data, page }`, the same form it had before #1461 added the workspace→session refactor. Client-side unwrapping is no longer needed and the original PR #1518 `Page.evaluate` heuristic is dropped (it only covered the array path and would have missed the plain-object path).

Two adapter improvements kept from the original PR:

* `clis/google/search.js`: wait for `#rso a h3` (with a 5s timeout) before extracting. On Chrome 148 / Linux Wayland the DOM can settle before SERP anchors are populated, so the existing fixed `wait 2` could return empty even with the envelope fix.
* `clis/xiaohongshu/search.js`: extract initially visible cards before scrolling, then merge post-scroll rows by URL. Xiaohongshu's virtualized masonry can evict the initial note cards from the DOM after scroll, causing extraction to return [] even though the browser had rendered results correctly.

Extension version bumped to 1.0.14.

Repro environment (from #1518):

* OpenCLI 1.7.18
* Browser Bridge extension 1.0.12 → 1.0.14
* Chrome 148.0.7778.96
* Linux Wayland, Node 22.22.1

Tests: extension/src/background.test.ts navigate same-url assertion updated to no longer expect `session` in `data`. Three Page.evaluate unwrap test cases removed.
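The envelope behaviour described in the commit message can be reproduced in isolation. The sketch below copies the removed data-shaping expression from `pageScopedResult` into a standalone function; `legacyScopedData`, `fixedScopedData`, and the sample `'twitter'` session value are illustrative names chosen for this example, not part of the extension's API.

```javascript
// Hypothetical stand-in for the data shaping that pageScopedResult() used to do.
// `session` plays the role of lease?.session in the removed code.
function legacyScopedData(data, session) {
  return data && typeof data === 'object' && !Array.isArray(data)
    ? { session, ...data }
    : { session, data };
}

// After the fix, `data` is passed through untouched.
function fixedScopedData(data) {
  return data;
}

const rows = [{ title: 'Result A' }];

// Array return: wrapped into an envelope, so an Array.isArray guard fails.
console.log(Array.isArray(legacyScopedData(rows, 'twitter'))); // false
console.log(Array.isArray(fixedScopedData(rows)));             // true

// Primitive return: also wrapped.
console.log(legacyScopedData(42, 'twitter')); // { session: 'twitter', data: 42 }

// Plain-object return: gains a `session` key it never had.
console.log(legacyScopedData({ count: 3 }, 'twitter')); // { session: 'twitter', count: 3 }
```

Note that in the removed line as it appears in the diff, the spread comes after the `session` key, so in this sketch a user-supplied `session` field would take precedence over the lease value; objects without one gain the extra key.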
1 parent 6af4db2 commit 1eac8e0

10 files changed

Lines changed: 76 additions & 34 deletions

CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -2,6 +2,12 @@
 
 ## Unreleased
 
+### Bug Fixes
+
+* **browser** — `page.evaluate()` / `evaluateInFrame()` now return the user JavaScript value directly. Browser Bridge `exec` previously routed through a shared `pageScopedResult` helper that spread / wrapped the lease's `session` into the result `data`, contaminating arbitrary user returns: array / primitive returns came back as `{ session, data }` envelopes, and plain-object returns had an extra `session` key injected (overwriting any user `session` field). `google search` and `xiaohongshu search` were the visible repro — Chrome rendered results correctly but adapters extracted an empty array. Fixed in extension 1.0.14 by reverting `pageScopedResult` to its pre-1461 form (`{ id, ok, data, page }`); no client-side unwrap is needed.
+* **google/search** — wait for `#rso a h3` before extracting, falling back to the existing fixed wait. On Chrome 148 + Linux Wayland the DOM can settle before SERP anchors are populated, making extraction return empty even with the envelope bug fixed.
+* **xiaohongshu/search** — extract initially visible cards before scrolling, then merge post-scroll rows by URL. Xiaohongshu's virtualized masonry layout can evict the initial cards from the DOM after scroll, so the previous always-scroll-then-extract flow could lose the top results.
+
 ### Features
 
 * **browser** — add `page.evaluate(fn, ...args)` for type-safe browser-context evaluation with JSON-serialized arguments. String evaluation remains supported, but new adapter code should use function form to avoid implicit `wrapForEval` auto-IIFE magic.
@@ -14,6 +20,7 @@
 
 ### Internal
 
+* **extension 1.0.14** — `pageScopedResult` no longer injects `session` into `data`. The field had no consumers and contaminated `exec` results with arbitrary user-JS shapes; routing-relevant identity is already exposed via `Result.page`.
 * **extension 1.0.13** — remove the internal command-session lease-key backdoor.
 
 ## [1.7.18](https://github.com/jackwener/opencli/compare/v1.7.17...v1.7.18) (2026-05-12)

clis/google/search.js

Lines changed: 11 additions & 2 deletions
@@ -29,7 +29,16 @@ cli({
     const lang = encodeURIComponent(args.lang);
     const url = `https://www.google.com/search?q=${keyword}&hl=${lang}&num=${limit}`;
     await page.goto(url);
-    await page.wait(2);
+    // Wait until at least one SERP title link is present. On Chrome 148 /
+    // Linux Wayland, DOM stability can be reached before #rso anchors are
+    // populated, making browser execution look visually correct while the
+    // adapter extracts an empty array.
+    try {
+        await page.wait({ selector: '#rso a h3', timeout: 5 });
+    }
+    catch {
+        await page.wait(2);
+    }
     const results = await page.evaluate(`
     (function() {
         var results = [];
@@ -63,7 +72,7 @@ cli({
 
         var href = link.href || '';
         // Skip non-http, Google internal links, and duplicates
-        if (!href.match(/^https?:\\/\\//)) continue;
+        if (!(href.startsWith('http://') || href.startsWith('https://'))) continue;
         if (href.indexOf('google.com/search') !== -1) continue;
         if (seenUrls[href]) continue;
         seenUrls[href] = true;
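The wait-then-fall-back pattern from the first hunk can be exercised without a browser. This is a minimal sketch assuming, as the `try`/`catch` in the diff implies, that `page.wait({ selector, timeout })` rejects when the selector does not appear in time; `waitForResults` and `createMockPage` are hypothetical helpers written for this example.

```javascript
// Wait for a selector; if that wait rejects, fall back to a fixed delay.
// Returns which path was taken so the behaviour is easy to observe.
async function waitForResults(page, selector, timeoutSeconds, fallbackSeconds) {
  try {
    await page.wait({ selector, timeout: timeoutSeconds });
    return 'selector';
  } catch {
    // Selector never appeared: keep the old fixed delay as a safety net.
    await page.wait(fallbackSeconds);
    return 'fallback';
  }
}

// Hypothetical mock page: the selector wait rejects when `selectorAppears`
// is false, while a numeric wait always resolves.
function createMockPage(selectorAppears) {
  const calls = [];
  return {
    calls,
    async wait(arg) {
      calls.push(arg);
      if (typeof arg === 'object' && !selectorAppears) {
        throw new Error(`timeout waiting for ${arg.selector}`);
      }
    },
  };
}

async function demo() {
  const fast = createMockPage(true);
  const slow = createMockPage(false);
  return [
    await waitForResults(fast, '#rso a h3', 5, 2),
    await waitForResults(slow, '#rso a h3', 5, 2),
    slow.calls.length,
  ];
}

demo().then((out) => console.log(out)); // [ 'selector', 'fallback', 2 ]
```

On the slow path the mock records two `wait` calls, the failed selector wait followed by the fixed delay, which matches the order of operations in the adapter change.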

clis/xiaohongshu/search.js

Lines changed: 26 additions & 6 deletions
@@ -227,12 +227,32 @@ export const command = cli({
         if (waitResult === 'login_wall') {
             throw new AuthRequiredError('www.xiaohongshu.com', 'Xiaohongshu search results are blocked behind a login wall');
         }
-        // Scroll until enough rows are rendered or the lazy-load plateaus.
-        // Replaces the previous fixed `autoScroll({ times: 2 })` which capped
-        // extraction at ~13 notes regardless of `--limit` (#1471).
-        await page.evaluate(buildScrollUntilJs(limit));
-        const payload = await page.evaluate(buildSearchExtractJs('www.xiaohongshu.com'));
-        const data = Array.isArray(payload) ? payload : [];
+        // Extract before scrolling. Xiaohongshu uses a virtualized masonry
+        // layout, so scrolling to the bottom can evict the initially visible
+        // note cards from the DOM and make extraction return [] even though the
+        // browser rendered results correctly.
+        const initialPayload = await page.evaluate(buildSearchExtractJs('www.xiaohongshu.com'));
+        let payload = Array.isArray(initialPayload) ? initialPayload : [];
+        if (payload.length < limit) {
+            // Scroll until enough rows are rendered or the lazy-load plateaus.
+            // Replaces the previous fixed `autoScroll({ times: 2 })` which capped
+            // extraction at ~13 notes regardless of `--limit` (#1471).
+            await page.evaluate(buildScrollUntilJs(limit));
+            const scrolledPayload = await page.evaluate(buildSearchExtractJs('www.xiaohongshu.com'));
+            if (Array.isArray(scrolledPayload)) {
+                const seen = new Set(payload.map((item) => item.url).filter(Boolean));
+                for (const item of scrolledPayload) {
+                    if (item?.url && seen.has(item.url))
+                        continue;
+                    if (item?.url)
+                        seen.add(item.url);
+                    payload.push(item);
+                    if (payload.length >= limit)
+                        break;
+                }
+            }
+        }
+        const data = payload;
         return data
             .filter((item) => item.title)
             .slice(0, limit)
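The merge-by-URL step in this hunk can be read as a small pure function. The sketch below is an illustration rather than the adapter's code: `mergeByUrl` is a hypothetical name (the adapter inlines the loop), and the limit check is moved to the top of the loop so the function is also safe when the initial rows already meet the limit.

```javascript
// Merge post-scroll rows into the initially extracted rows, skipping rows
// whose URL was already seen, and stop once `limit` rows are collected.
function mergeByUrl(initialRows, scrolledRows, limit) {
  const merged = [...initialRows];
  const seen = new Set(merged.map((item) => item.url).filter(Boolean));
  for (const item of scrolledRows) {
    if (merged.length >= limit) break;
    if (item?.url && seen.has(item.url)) continue; // already extracted before scrolling
    if (item?.url) seen.add(item.url);
    merged.push(item); // rows without a URL are kept; they cannot be deduplicated
  }
  return merged;
}

const initial = [{ title: 'Result A', url: 'https://www.xiaohongshu.com/search_result/aaa' }];
const scrolled = [
  { title: 'Result A', url: 'https://www.xiaohongshu.com/search_result/aaa' },
  { title: 'Result B', url: 'https://www.xiaohongshu.com/search_result/bbb' },
  { title: 'Result C', url: 'https://www.xiaohongshu.com/search_result/ccc' },
];

console.log(mergeByUrl(initial, scrolled, 2).map((item) => item.title));
// [ 'Result A', 'Result B' ]
```

Keeping the initial rows first preserves the top-of-page ordering, which is the part a virtualized list may have evicted from the DOM by the time the post-scroll extraction runs.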

clis/xiaohongshu/search.test.js

Lines changed: 26 additions & 13 deletions
@@ -65,9 +65,7 @@ describe('xiaohongshu search', () => {
         const page = createPageMock([
             // First evaluate: MutationObserver wait (content appeared)
             'content',
-            // Second evaluate: scroll-until-enough (returns final note count)
-            1,
-            // Third evaluate: main DOM extraction (returns array directly)
+            // Second evaluate: initial DOM extraction (already enough results)
             [
                 {
                     title: '某鱼买FSD被坑了4万',
@@ -99,9 +97,7 @@ describe('xiaohongshu search', () => {
         const page = createPageMock([
             // First evaluate: MutationObserver wait (content appeared)
             'content',
-            // Second evaluate: scroll-until-enough (returns final note count)
-            3,
-            // Third evaluate: main DOM extraction (returns array directly)
+            // Second evaluate: initial DOM extraction (already enough valid rows)
             [
                 {
                     title: 'Result A',
@@ -137,17 +133,36 @@ describe('xiaohongshu search', () => {
         const page = createPageMock([
             // First evaluate: MutationObserver wait (content appeared)
             'content',
-            // Second evaluate: scroll-until-enough (no rows rendered)
-            0,
-            // Third evaluate: extraction (returns empty array)
+            // Second evaluate: initial extraction (no rows rendered)
             [],
         ]);
         const result = (await cmd.func(page, { query: '测试等待', limit: 5 }));
         expect(result).toHaveLength(0);
         // Only one navigation, no retry
         expect(page.goto).toHaveBeenCalledTimes(1);
-        // Three evaluate calls: wait + scroll-until + extraction
-        expect(page.evaluate).toHaveBeenCalledTimes(3);
+        // Four evaluate calls: wait, initial extraction, scroll-until, post-scroll extraction.
+        expect(page.evaluate).toHaveBeenCalledTimes(4);
+    });
+    it('scrolls only when the initial extraction has fewer rows than requested', async () => {
+        const cmd = getRegistry().get('xiaohongshu/search');
+        expect(cmd?.func).toBeTypeOf('function');
+        const page = createPageMock([
+            'content',
+            [
+                { title: 'Result A', author: 'UserA', likes: '10', url: 'https://www.xiaohongshu.com/search_result/aaa', author_url: '' },
+            ],
+            3,
+            [
+                { title: 'Result A', author: 'UserA', likes: '10', url: 'https://www.xiaohongshu.com/search_result/aaa', author_url: '' },
+                { title: 'Result B', author: 'UserB', likes: '5', url: 'https://www.xiaohongshu.com/search_result/bbb', author_url: '' },
+            ],
+        ]);
+
+        const result = (await cmd.func(page, { query: '测试等待', limit: 2 }));
+
+        expect(result).toHaveLength(2);
+        expect(result.map((item) => item.title)).toEqual(['Result A', 'Result B']);
+        expect(page.evaluate).toHaveBeenCalledTimes(4);
     });
     it('separates fallback author text from appended relative date', async () => {
         const cmd = getRegistry().get('xiaohongshu/search');
@@ -165,8 +180,6 @@ describe('xiaohongshu search', () => {
         markVisible(dom.window.document.querySelector('section.note-item'));
         const page = createPageMock([]);
         page.evaluate.mockImplementationOnce(async () => 'content');
-        // scroll-until-enough returns the final visible row count
-        page.evaluate.mockImplementationOnce(async () => 1);
         page.evaluate.mockImplementationOnce(async (script) => Function('document', 'getComputedStyle', `return (${script})`)(dom.window.document, dom.window.getComputedStyle.bind(dom.window)));
 
         const result = await cmd.func(page, { query: '测试', limit: 1 });
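The updated call counts in these tests follow from the new evaluate order: two calls when the initial extraction already satisfies the limit, four otherwise. The sketch below illustrates that with a queue-backed mock. `createQueuedPage` and `searchFlow` are hypothetical simplifications written for this example; the real `createPageMock` and adapter are not shown in full in this diff, and an exhausted queue is assumed here to yield `undefined`.

```javascript
// Hypothetical queue-backed mock: each evaluate() call returns the next
// queued value and increments a call counter.
function createQueuedPage(results) {
  const queue = [...results];
  const page = {
    evaluateCalls: 0,
    async evaluate() {
      page.evaluateCalls += 1;
      return queue.shift();
    },
  };
  return page;
}

// Simplified outline of the adapter's evaluate sequence after this change.
async function searchFlow(page, limit) {
  await page.evaluate('wait');                       // 1: wait for content
  const initial = await page.evaluate('extract');    // 2: initial extraction
  let rows = Array.isArray(initial) ? initial : [];
  if (rows.length < limit) {
    await page.evaluate('scroll');                   // 3: scroll-until-enough
    const scrolled = await page.evaluate('extract'); // 4: post-scroll extraction
    if (Array.isArray(scrolled)) {
      const seen = new Set(rows.map((r) => r.url));
      rows = rows.concat(scrolled.filter((r) => !seen.has(r.url)));
    }
  }
  return rows.slice(0, limit);
}

async function demo() {
  const enough = createQueuedPage(['content', [{ url: 'a' }]]);
  const short = createQueuedPage(['content', [{ url: 'a' }], 3, [{ url: 'a' }, { url: 'b' }]]);
  const empty = createQueuedPage(['content', []]);
  await searchFlow(enough, 1);
  await searchFlow(short, 2);
  await searchFlow(empty, 5);
  return [enough.evaluateCalls, short.evaluateCalls, empty.evaluateCalls];
}

demo().then((counts) => console.log(counts)); // [ 2, 4, 4 ]
```

The empty case reaches four calls even though only two values are queued, because an empty initial extraction is always below the limit and therefore triggers the scroll and the second extraction.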

extension/dist/background.js

Lines changed: 1 addition & 3 deletions
@@ -1480,9 +1480,7 @@ async function resolveTab(tabId, leaseKey, initialUrl) {
 }
 async function pageScopedResult(id, tabId, data) {
   const page = await resolveTargetId(tabId);
-  const lease = [...automationSessions.values()].find((session) => session.preferredTabId === tabId);
-  const scopedData = data && typeof data === "object" && !Array.isArray(data) ? { session: lease?.session, ...data } : { session: lease?.session, data };
-  return { id, ok: true, data: scopedData, page };
+  return { id, ok: true, data, page };
 }
 async function resolveTabId(tabId, leaseKey, initialUrl) {
   const resolved = await resolveTab(tabId, leaseKey, initialUrl);

extension/manifest.json

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 {
   "manifest_version": 3,
   "name": "OpenCLI",
-  "version": "1.0.13",
+  "version": "1.0.14",
   "description": "Browser automation bridge for the OpenCLI CLI tool. Executes commands in Chrome tab leases via a local daemon.",
   "permissions": [
     "debugger",

extension/package-lock.json

Lines changed: 2 additions & 2 deletions
Generated file; diff not rendered by default.

extension/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "opencli-extension",
-  "version": "1.0.13",
+  "version": "1.0.14",
   "private": true,
   "opencli": {
     "compatRange": ">=1.7.0"

extension/src/background.test.ts

Lines changed: 0 additions & 1 deletion
@@ -574,7 +574,6 @@ describe('background tab isolation', () => {
         title: 'bilibili',
         url: 'https://www.bilibili.com/',
         timedOut: false,
-        session: 'twitter',
       },
     });
     expect(update).not.toHaveBeenCalled();

extension/src/background.ts

Lines changed: 1 addition & 5 deletions
@@ -1110,11 +1110,7 @@ async function resolveTab(tabId: number | undefined, leaseKey: string, initialUr
 /** Build a page-scoped success result with targetId resolved from tabId */
 async function pageScopedResult(id: string, tabId: number, data?: unknown): Promise<Result> {
   const page = await identity.resolveTargetId(tabId);
-  const lease = [...automationSessions.values()].find((session) => session.preferredTabId === tabId);
-  const scopedData = data && typeof data === 'object' && !Array.isArray(data)
-    ? { session: lease?.session, ...(data as Record<string, unknown>) }
-    : { session: lease?.session, data };
-  return { id, ok: true, data: scopedData, page };
+  return { id, ok: true, data, page };
 }
 
 /** Convenience wrapper returning just the tabId (used by most handlers) */
