Commit d6f4a44

docs: add browser use-cases column
1 parent f2a36d6 commit d6f4a44

7 files changed

Lines changed: 264 additions & 192 deletions

docs-site/how-to/browser-automation.md

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,8 @@
22

33
Use browser tools when `web_fetch` isn't enough (JS-rendered pages, multi-step flows, forms, screenshots).
44

5+
For copy-paste recipes, see: [Browser Use Cases](browser-use-cases.md).
6+
57
## Prerequisites
68

79
Install Playwright (Python):
Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
1+
# Browser Use Cases (Recipes)
2+
3+
This page is intentionally **copy-paste friendly**: concrete prompts, expected artifacts, and small troubleshooting notes.
4+
5+
See also: [Browser Automation (Playwright)](browser-automation.md).
6+
7+
## Prerequisites (Playwright)
8+
9+
Install Playwright (Python):
10+
11+
```bash
12+
python3 -m pip install playwright
13+
python3 -m playwright install chromium
14+
```
15+
16+
If your Python executable isn't `python3`, set the `REXOS_BROWSER_PYTHON` environment variable to its name (example: `python`).
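
Before running a recipe, it can help to confirm that the chosen interpreter can actually import Playwright. The sketch below assumes RexOS falls back to `python3` when `REXOS_BROWSER_PYTHON` is unset, which the note above implies but does not state outright. `pick_python` and `check_playwright` are hypothetical helper names, not RexOS commands.

```bash
# Hypothetical helpers for a pre-flight check; not part of RexOS.

# Print the interpreter name: REXOS_BROWSER_PYTHON if set and non-empty,
# otherwise python3 (assumed default, matching the install commands above).
pick_python() {
  printf '%s\n' "${REXOS_BROWSER_PYTHON:-python3}"
}

# Succeed only if that interpreter can import the playwright package.
check_playwright() {
  local py
  py="$(pick_python)"
  if "$py" -c 'import playwright' 2>/dev/null; then
    echo "ok: $py can import playwright"
  else
    echo "missing: run '$py -m pip install playwright' first" >&2
    return 1
  fi
}
```

Running `check_playwright` once per machine should be enough. It only checks the Python package, so a missing Chromium download would still surface later.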
17+
18+
## 1) GUI smoke check (example.com)
19+
20+
**Goal:** verify that the `browser_*` tools work end to end and leave evidence files in your workspace.
21+
22+
=== "Bash (macOS/Linux)"
23+
```bash
24+
mkdir -p rexos-demo && cd rexos-demo
25+
export REXOS_BROWSER_HEADLESS=0
26+
27+
rexos agent run --workspace . --prompt "Use browser tools to open https://example.com, read the page, write a 3-bullet summary to notes/example.md, save a screenshot to .rexos/browser/example.png, then close the browser."
28+
```
29+
30+
=== "PowerShell (Windows)"
31+
```powershell
32+
mkdir rexos-demo -Force | Out-Null
33+
cd rexos-demo
34+
$env:REXOS_BROWSER_HEADLESS = "0"
35+
36+
rexos agent run --workspace . --prompt "Use browser tools to open https://example.com, read the page, write a 3-bullet summary to notes/example.md, save a screenshot to .rexos/browser/example.png, then close the browser."
37+
```
38+
39+
**What to expect**
40+
41+
- `notes/example.md`
42+
- `.rexos/browser/example.png`
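
The two files listed above can be checked from the shell once the agent finishes. This is a minimal sketch: `check_artifacts` is a hypothetical helper, and it only tests that each file exists and is non-empty, not that the content is sensible.

```bash
# Hypothetical helper: report which expected files are missing or empty.
# Returns 0 when every file exists with a non-zero size, 1 otherwise.
check_artifacts() {
  local missing=0 f
  for f in "$@"; do
    if [ -s "$f" ]; then
      echo "ok: $f"
    else
      echo "missing or empty: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (run from the workspace directory after the recipe finishes):
# check_artifacts notes/example.md .rexos/browser/example.png
```

The same helper works for the other recipes on this page by passing their artifact paths instead.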
43+
44+
## 2) Real-world flow: Baidu “today’s weather” (Browser + Ollama)
45+
46+
**Goal:** open a Baidu results page, extract today’s weather info, and save evidence.
47+
48+
### Recommended model (Ollama)
49+
50+
Make sure Ollama has a capable instruction-following model pulled locally (example):
51+
52+
```bash
53+
ollama pull qwen3:4b
54+
```
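
To skip a redundant download, you could first check whether the model is already present. The sketch below assumes the model name appears in the first column of `ollama list` output, which may differ between Ollama versions. `has_model` is a hypothetical helper.

```bash
# Hypothetical helper: read `ollama list` output on stdin and succeed if
# the first column of any line equals the given model name exactly.
has_model() {
  awk -v want="$1" '$1 == want { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage: pull only when the model is not listed yet.
# ollama list | has_model qwen3:4b || ollama pull qwen3:4b
```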
55+
56+
Then set it as the default model in `~/.rexos/config.toml`:
57+
58+
```toml
59+
[providers.ollama]
60+
default_model = "qwen3:4b"
61+
```
62+
63+
### Run (GUI mode)
64+
65+
=== "Bash (macOS/Linux)"
66+
```bash
67+
export REXOS_BROWSER_HEADLESS=0
68+
69+
rexos agent run --workspace . --prompt "Use browser tools to open https://www.baidu.com/s?wd=%E5%8C%97%E4%BA%AC%20%E4%BB%8A%E5%A4%A9%E5%A4%A9%E6%B0%94 . Wait for #content_left, then read the page. Extract today's weather info (conditions, temperature range, wind) from the page text. Write it to notes/weather.md. Save a screenshot to .rexos/browser/baidu_weather.png. Close the browser. If you can't find the weather, say so, but still save the screenshot."
70+
```
71+
72+
=== "PowerShell (Windows)"
73+
```powershell
74+
$env:REXOS_BROWSER_HEADLESS = "0"
75+
76+
rexos agent run --workspace . --prompt "Use browser tools to open https://www.baidu.com/s?wd=%E5%8C%97%E4%BA%AC%20%E4%BB%8A%E5%A4%A9%E5%A4%A9%E6%B0%94 . Wait for #content_left, then read the page. Extract today's weather info (conditions, temperature range, wind) from the page text. Write it to notes/weather.md. Save a screenshot to .rexos/browser/baidu_weather.png. Close the browser. If you can't find the weather, say so, but still save the screenshot."
77+
```
78+
79+
**What to expect**
80+
81+
- `notes/weather.md`
82+
- `.rexos/browser/baidu_weather.png`
83+
84+
!!! note "If you hit a CAPTCHA"
85+
Some sites may show CAPTCHAs or block automation. If that happens, try a different query/site, or switch to `web_search` + `web_fetch` when the content is not JS-heavy.
86+
87+
## 3) Wikipedia: open → summarize → screenshot
88+
89+
**Goal:** run a quick demo against a stable site that needs no login.
90+
91+
=== "Bash (macOS/Linux)"
92+
```bash
93+
export REXOS_BROWSER_HEADLESS=0
94+
95+
rexos agent run --workspace . --prompt "Use browser tools to open https://en.wikipedia.org/wiki/Rust_(programming_language) . Read the page. Write a short summary to notes/wiki_rust.md. Save a screenshot to .rexos/browser/wiki_rust.png. Close the browser."
96+
```
97+
98+
=== "PowerShell (Windows)"
99+
```powershell
100+
$env:REXOS_BROWSER_HEADLESS = "0"
101+
102+
rexos agent run --workspace . --prompt "Use browser tools to open https://en.wikipedia.org/wiki/Rust_(programming_language) . Read the page. Write a short summary to notes/wiki_rust.md. Save a screenshot to .rexos/browser/wiki_rust.png. Close the browser."
103+
```
104+
105+
**What to expect**
106+
107+
- `notes/wiki_rust.md`
108+
- `.rexos/browser/wiki_rust.png`
109+
110+
## 4) (From source) Run the browser + Ollama smoke test
111+
112+
If you're hacking on RexOS itself, you can run the smoke test that is marked `#[ignore]` (skipped by default):
113+
114+
```bash
115+
REXOS_OLLAMA_MODEL=qwen3:4b cargo test -p rexos --test browser_baidu_weather_smoke -- --ignored --nocapture
116+
```
117+
118+
Expected output includes a line like:
119+
120+
- `[rexos][baidu_weather] summary=...`
121+
122+
This test uses a temporary workspace and cleans it up afterwards. Use the recipes above (`rexos agent run`) if you want to keep screenshots and files.
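
Because `--nocapture` mixes the test's own output with Cargo's, it can be handy to filter for the summary line. A minimal sketch, assuming the line has exactly the `[rexos][baidu_weather] summary=` prefix shown above; `extract_summary` is a hypothetical helper.

```bash
# Hypothetical helper: print only the text after "summary=" from lines
# tagged [rexos][baidu_weather]; all other lines are dropped.
# In a basic regular expression only "[" needs escaping; "]" is literal.
extract_summary() {
  sed -n 's/^.*\[rexos]\[baidu_weather] summary=//p'
}

# Usage (2>&1 in case the line is written to stderr):
# REXOS_OLLAMA_MODEL=qwen3:4b cargo test -p rexos --test browser_baidu_weather_smoke -- --ignored --nocapture 2>&1 | extract_summary
```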
123+
124+
## Tips
125+
126+
- For search engines, consider opening a **results URL** directly (more reliable than typing into the homepage search box).
127+
- Always call `browser_close` at the end, even when a step fails.
128+
- Do not enter credentials or complete purchases without explicit user confirmation.
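
The first tip relies on a percent-encoded query string: the `wd=` value in the Baidu recipe is the encoded form of `北京 今天天气`. One way to build such a URL for another query is sketched below. `encode_query` and `baidu_results_url` are hypothetical helpers; they call Python's standard `urllib.parse.quote` and reuse the `REXOS_BROWSER_PYTHON` fallback to `python3` as an assumption.

```bash
# Hypothetical helper: percent-encode a search query with Python's
# standard library (spaces become %20, non-ASCII becomes UTF-8 escapes).
encode_query() {
  "${REXOS_BROWSER_PYTHON:-python3}" -c \
    'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1]))' "$1"
}

# Build a Baidu results URL for the given query.
baidu_results_url() {
  printf 'https://www.baidu.com/s?wd=%s\n' "$(encode_query "$1")"
}

# Usage: paste the printed URL into the recipe prompt.
# baidu_results_url "上海 今天天气"
```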

docs-site/how-to/use-cases.md

Lines changed: 1 addition & 96 deletions
@@ -174,7 +174,7 @@ Validate tool-calling + harness flow with Ollama first, then switch routing to b
174174

175175
Use browser tools when you need to interact with dynamic pages (JS-rendered content, clicking, typing, screenshots).
176176

177-
See also: [Browser Automation (Playwright)](browser-automation.md).
177+
See also: [Browser Automation (Playwright)](browser-automation.md) and [Browser Use Cases](browser-use-cases.md).
178178

179179
### Prerequisites
180180

@@ -226,98 +226,3 @@ export REXOS_WEBHOOK_URL="https://example.com/my-webhook"
226226
rexos agent run --workspace . --prompt "Use channel_send to enqueue: channel=webhook recipient=user1 message=hello"
227227
rexos channel drain
228228
```
229-
230-
---
231-
232-
## 10) Browser demo: GUI screenshot + summary (example.com)
233-
234-
Use this to verify browser automation end-to-end, with **persistent artifacts** in your workspace.
235-
236-
### Steps
237-
238-
1) Install Playwright (Python):
239-
240-
```bash
241-
python3 -m pip install playwright
242-
python3 -m playwright install chromium
243-
```
244-
245-
2) Run the demo (GUI mode):
246-
247-
=== "Bash (macOS/Linux)"
248-
```bash
249-
export REXOS_BROWSER_HEADLESS=0
250-
rexos agent run --workspace . --prompt "Use browser tools to open https://example.com, read the page, write a 3-bullet summary to notes/example.md, save a screenshot to .rexos/browser/example.png, then close the browser."
251-
```
252-
253-
=== "PowerShell (Windows)"
254-
```powershell
255-
$env:REXOS_BROWSER_HEADLESS = "0"
256-
rexos agent run --workspace . --prompt "Use browser tools to open https://example.com, read the page, write a 3-bullet summary to notes/example.md, save a screenshot to .rexos/browser/example.png, then close the browser."
257-
```
258-
259-
### What to expect
260-
261-
- `notes/example.md`
262-
- `.rexos/browser/example.png`
263-
264-
---
265-
266-
## 11) Browser + Ollama: Baidu “today’s weather” (real-world flow)
267-
268-
This is a more “real” flow: open a search results page, extract weather info, and save it.
269-
270-
### Steps
271-
272-
1) Make sure Ollama has an instruction model (example):
273-
274-
```bash
275-
ollama pull qwen3:4b
276-
```
277-
278-
2) (Optional, recommended) Use it as RexOS default model:
279-
280-
Edit `~/.rexos/config.toml` and set:
281-
282-
```toml
283-
[providers.ollama]
284-
default_model = "qwen3:4b"
285-
```
286-
287-
3) Run (GUI mode):
288-
289-
=== "Bash (macOS/Linux)"
290-
```bash
291-
export REXOS_BROWSER_HEADLESS=0
292-
rexos agent run --workspace . --prompt "Use browser tools to open https://www.baidu.com/s?wd=%E5%8C%97%E4%BA%AC%20%E4%BB%8A%E5%A4%A9%E5%A4%A9%E6%B0%94 . Wait for #content_left, then read the page. Extract today's weather info (conditions, temperature range, wind) from the page text. Write it to notes/weather.md. Save a screenshot to .rexos/browser/baidu_weather.png. Close the browser. If you can't find the weather, say so, but still save the screenshot."
293-
```
294-
295-
=== "PowerShell (Windows)"
296-
```powershell
297-
$env:REXOS_BROWSER_HEADLESS = "0"
298-
rexos agent run --workspace . --prompt "Use browser tools to open https://www.baidu.com/s?wd=%E5%8C%97%E4%BA%AC%20%E4%BB%8A%E5%A4%A9%E5%A4%A9%E6%B0%94 . Wait for #content_left, then read the page. Extract today's weather info (conditions, temperature range, wind) from the page text. Write it to notes/weather.md. Save a screenshot to .rexos/browser/baidu_weather.png. Close the browser. If you can't find the weather, say so, but still save the screenshot."
299-
```
300-
301-
### What to expect
302-
303-
- `notes/weather.md`
304-
- `.rexos/browser/baidu_weather.png`
305-
306-
!!! note "If you hit a CAPTCHA"
307-
Some sites may show CAPTCHAs or block automation. If that happens, try a different query/site, or switch to `web_search` + `web_fetch` when the content is not JS-heavy.
308-
309-
---
310-
311-
## 12) (From source) Run the browser + Ollama smoke test
312-
313-
If you're hacking on RexOS itself, you can run the ignored smoke test:
314-
315-
```bash
316-
REXOS_OLLAMA_MODEL=qwen3:4b cargo test -p rexos --test browser_baidu_weather_smoke -- --ignored --nocapture
317-
```
318-
319-
Expected output includes a line like:
320-
321-
- `[rexos][baidu_weather] summary=...`
322-
323-
This test uses a temp workspace and cleans it up. Use the recipes above if you want to keep screenshots and files.

docs-site/zh/how-to/browser-automation.md

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,8 @@
22

33
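Use `browser_*` tools when `web_fetch` isn't enough (JS-rendered pages, multi-step interactions, form filling, screenshots as evidence); they are more reliable in those cases.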
44

5+
For more copy-paste recipes, see: [Browser Use Cases](browser-use-cases.md).
6+
57
## Prerequisites
68

79
Install Playwright (Python):
Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
1+
# Browser Use Cases (Recipes)
2+
3+
This page is written as copy-paste recipes wherever possible: concrete prompts, expected artifacts, and a few short notes.
4+
5+
See also: [Browser Automation (Playwright)](browser-automation.md).
6+
7+
## Prerequisites (Playwright)
8+
9+
Install Playwright (Python):
10+
11+
```bash
12+
python3 -m pip install playwright
13+
python3 -m playwright install chromium
14+
```
15+
16+
If your Python executable isn't `python3`, you can specify it through the `REXOS_BROWSER_PYTHON` environment variable (for example, `python`).
17+
18+
## 1) GUI smoke check (example.com)
19+
20+
**Goal:** verify that the `browser_*` tools work end to end and leave evidence files in the workspace.
21+
22+
=== "Bash (macOS/Linux)"
23+
```bash
24+
mkdir -p rexos-demo && cd rexos-demo
25+
export REXOS_BROWSER_HEADLESS=0
26+
27+
rexos agent run --workspace . --prompt "Use browser tools to open https://example.com, read the page content, write 3 key points to notes/example.md, save a screenshot to .rexos/browser/example.png, then close the browser."
28+
```
29+
30+
=== "PowerShell (Windows)"
31+
```powershell
32+
mkdir rexos-demo -Force | Out-Null
33+
cd rexos-demo
34+
$env:REXOS_BROWSER_HEADLESS = "0"
35+
36+
rexos agent run --workspace . --prompt "Use browser tools to open https://example.com, read the page content, write 3 key points to notes/example.md, save a screenshot to .rexos/browser/example.png, then close the browser."
37+
```
38+
39+
**Expected result**
40+
41+
- `notes/example.md`
42+
- `.rexos/browser/example.png`
43+
44+
## 2) Closer to a real-world scenario: Baidu “today’s weather” (Browser + Ollama)
45+
46+
**Goal:** open a Baidu search results page, extract the key “today’s weather” information, and save a screenshot as evidence.
47+
48+
### Recommended model (Ollama)
49+
50+
Make sure Ollama has a reasonably strong instruction model (example):
51+
52+
```bash
53+
ollama pull qwen3:4b
54+
```
55+
56+
Then set the default model in `~/.rexos/config.toml`:
57+
58+
```toml
59+
[providers.ollama]
60+
default_model = "qwen3:4b"
61+
```
62+
63+
### Run (GUI mode)
64+
65+
=== "Bash (macOS/Linux)"
66+
```bash
67+
export REXOS_BROWSER_HEADLESS=0
68+
69+
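rexos agent run --workspace . --prompt "Use browser tools to open https://www.baidu.com/s?wd=%E5%8C%97%E4%BA%AC%20%E4%BB%8A%E5%A4%A9%E5%A4%A9%E6%B0%94 . Wait for #content_left to appear, then read the page. Extract the key information about today's weather (weather conditions, temperature range, wind force/direction) from the page text and write it to notes/weather.md. Save a screenshot to .rexos/browser/baidu_weather.png. Finally, close the browser. If you can't find the weather information, say so, but still save the screenshot."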
70+
```
71+
72+
=== "PowerShell (Windows)"
73+
```powershell
74+
$env:REXOS_BROWSER_HEADLESS = "0"
75+
76+
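rexos agent run --workspace . --prompt "Use browser tools to open https://www.baidu.com/s?wd=%E5%8C%97%E4%BA%AC%20%E4%BB%8A%E5%A4%A9%E5%A4%A9%E6%B0%94 . Wait for #content_left to appear, then read the page. Extract the key information about today's weather (weather conditions, temperature range, wind force/direction) from the page text and write it to notes/weather.md. Save a screenshot to .rexos/browser/baidu_weather.png. Finally, close the browser. If you can't find the weather information, say so, but still save the screenshot."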
77+
```
78+
79+
**Expected result**
80+
81+
- `notes/weather.md`
82+
- `.rexos/browser/baidu_weather.png`
83+
84+
!!! note "If you hit a CAPTCHA"
85+
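Some sites may show a CAPTCHA or restrict automation. If that happens, try a different site or search term, or switch to `web_search` + `web_fetch` when the content doesn't depend on JS.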
86+
87+
## 3) Wikipedia: open → summarize → screenshot
88+
89+
**Goal:** a more stable site that needs no login, for quick demos.
90+
91+
=== "Bash (macOS/Linux)"
92+
```bash
93+
export REXOS_BROWSER_HEADLESS=0
94+
95+
rexos agent run --workspace . --prompt "Use browser tools to open https://en.wikipedia.org/wiki/Rust_(programming_language) . Read the page content, write a short summary to notes/wiki_rust.md, save a screenshot to .rexos/browser/wiki_rust.png, and finally close the browser."
96+
```
97+
98+
=== "PowerShell (Windows)"
99+
```powershell
100+
$env:REXOS_BROWSER_HEADLESS = "0"
101+
102+
rexos agent run --workspace . --prompt "Use browser tools to open https://en.wikipedia.org/wiki/Rust_(programming_language) . Read the page content, write a short summary to notes/wiki_rust.md, save a screenshot to .rexos/browser/wiki_rust.png, and finally close the browser."
103+
```
104+
105+
**Expected result**
106+
107+
- `notes/wiki_rust.md`
108+
- `.rexos/browser/wiki_rust.png`
109+
110+
## 4) (From source) Run the browser + Ollama smoke test
111+
112+
If you're developing RexOS itself, you can run this smoke test, which is marked `#[ignore]`:
113+
114+
```bash
115+
REXOS_OLLAMA_MODEL=qwen3:4b cargo test -p rexos --test browser_baidu_weather_smoke -- --ignored --nocapture
116+
```
117+
118+
The expected output includes a line like:
119+
120+
- `[rexos][baidu_weather] summary=...`
121+
122+
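Note: this test uses a temporary workspace and cleans it up automatically. If you want to keep the screenshots and files, use the recipes above with `rexos agent run`.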
123+
124+
## Tips
125+
126+
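- For search engines, opening the **results-page URL** directly is usually more reliable (less likely to be blocked than typing into the homepage search box).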
127+
- Try to call `browser_close` at the end, even when something fails.
128+
- Without explicit user confirmation, do not enter account credentials or perform any payment or ordering action.
