Commit 28baaba

Include finish task in cli
1 parent a142e24 commit 28baaba

2 files changed

Lines changed: 137 additions & 30 deletions

README.md

Lines changed: 35 additions & 27 deletions
@@ -52,7 +52,7 @@ const page = await agent.newPage();
 
 // 1. AI navigation — recordable, replayable.
 const nav = await page.ai(
-  "Go to news.ycombinator.com, open the Show section, then click 'More' to go to the next page"
+  "Go to Hacker News show section, go to next page"
 );
 await agent.savePlan("hn show page 2", nav, "./hn.plan.json");
 
@@ -81,43 +81,31 @@ await agent.replay("./hn.plan.json", { page }); // zero tokens
 const { articles } = await page.extract(/* ... */); // tokens only here
 ```
 
-## `page.ai` vs `agent.executeTask` vs `agent.executeTaskAsync`
-
-All three drive the browser with AI, return the same `TaskOutput`, and can be recorded + replayed.
-
-| API | Use when |
-| ------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `page.ai(task)` | You already have a page and want to mix Playwright calls (`page.goto`, `page.clickElement`) with AI steps on the same tab. Resolves when done. |
-| `agent.executeTask(task)` | "Here's a goal, figure it out." The agent owns the page; include URLs in the prompt and it navigates itself. Resolves when done. |
-| `agent.executeTaskAsync(task)` | Same as `executeTask` but returns a `Task` control handle immediately — `task.pause()`, `task.resume()`, `task.cancel()`, and per-step event callbacks. For long-running flows, CLIs, or anything a user can interrupt. |
-
-## Record & replay
-
-- `agent.savePlan(task, result, path)` writes a JSON plan with the action sequence and a stable `xpath` + `cssPath` for each clicked / typed element.
-- `agent.replay(path, { page })` re-runs those actions with no LLM calls, no screenshots, no DOM map.
-- `aiFallback: true` re-plans **only** a drifted step with the LLM; the rest stays free.
-- `startingUrl` (option, or `--url` on the CLI) retargets a plan at a different URL — useful for staging / preview deploys / different queries.
-- Plans are human-readable and hand-editable (tweak an `inputText` value, reorder or delete steps).
-
-> The `output` string the model produced while recording is frozen in the plan — replay does **not** regenerate it. If the value of the run is live content or fresh reasoning, keep that in a follow-up `.extract()` / `.ai()`, not inside the recorded plan.
-
 ## CLI
 
 Everything above is available without writing code:
 
 ```bash
 # Record while running
 browser-agent-cli run --save-plan ./hn.plan.json \
-  -c "Go to news.ycombinator.com, open the Show section, then click 'More' to go to the next page"
+  -c "Go to Hacker News show section, go to next page and find top 3 articles"
 
-# Replay (zero LLM calls)
+# Replay: deterministic navigation (no LLM), then one fresh AI pass on the
+# result page to produce an up-to-date final response. The navigation part
+# is free; only the final pass spends tokens.
 browser-agent-cli replay ./hn.plan.json
 
-# Self-heal drifted steps
-browser-agent-cli replay ./hn.plan.json --ai-fallback
+# Pure replay — skip the final AI pass and just get the browser onto the
+# result page (zero LLM calls end-to-end).
+browser-agent-cli replay ./hn.plan.json --no-ai-finish
+
+# Use a different finishing task (e.g. ask for a custom summary of the
+# current page instead of re-running the recorded task).
+browser-agent-cli replay ./hn.plan.json \
+  --finish-task "Return the titles of the first 3 posts as a bullet list"
 
-# Retarget at a different URL (e.g. start from the Ask section instead)
-browser-agent-cli replay ./hn.plan.json --url https://news.ycombinator.com/ask
+# Self-heal drifted steps during replay (independent of the finish pass).
+browser-agent-cli replay ./hn.plan.json --ai-fallback
 ```
 
 LLM auto-detected from `GOOGLE_API_KEY` / `GEMINI_API_KEY`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`. Override the model with `--llm-model` or `GEMINI_MODEL` / `OPENAI_MODEL` / `ANTHROPIC_MODEL`. `replay` only needs an LLM with `--ai-fallback`. Interactive: `ctrl+p` pause, `ctrl+r` resume.
@@ -166,6 +154,26 @@ const agent = new BrowserAgent({
 ```
 </details>
 
+## `page.ai` vs `agent.executeTask` vs `agent.executeTaskAsync`
+
+All three drive the browser with AI, return the same `TaskOutput`, and can be recorded + replayed.
+
+| API | Use when |
+| ------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `page.ai(task)` | You already have a page and want to mix Playwright calls (`page.goto`, `page.clickElement`) with AI steps on the same tab. Resolves when done. |
+| `agent.executeTask(task)` | "Here's a goal, figure it out." The agent owns the page; include URLs in the prompt and it navigates itself. Resolves when done. |
+| `agent.executeTaskAsync(task)` | Same as `executeTask` but returns a `Task` control handle immediately — `task.pause()`, `task.resume()`, `task.cancel()`, and per-step event callbacks. For long-running flows, CLIs, or anything a user can interrupt. |
+
+## Record & replay
+
+- `agent.savePlan(task, result, path)` writes a JSON plan with the action sequence and a stable `xpath` + `cssPath` for each clicked / typed element.
+- `agent.replay(path, { page })` re-runs those actions with no LLM calls, no screenshots, no DOM map.
+- `aiFallback: true` re-plans **only** a drifted step with the LLM; the rest stays free.
+- `startingUrl` (option, or `--url` on the CLI) retargets a plan at a different URL — useful for staging / preview deploys / different queries.
+- Plans are human-readable and hand-editable (tweak an `inputText` value, reorder or delete steps).
+
+> The `output` string the model produced while recording is frozen in the plan — the programmatic `agent.replay()` does **not** regenerate it. The CLI's `replay` command, by default, runs one fresh AI pass (`page.ai(plan.task, { maxSteps: 3 })`) on the result page after navigation so every CLI run ends with an up-to-date response; pass `--no-ai-finish` to get pure token-free replay and fall back to the recorded output. If you're wiring this up programmatically, run your own `.extract()` / `.ai()` on the page after `agent.replay()` instead of relying on the recorded `output`.
+
 ## License
 
 MIT. Forked from [HyperAgent](https://github.com/hyperbrowserai/HyperAgent) (b49afe). Serverless browser support by [@sparticuz/chromium](https://github.com/Sparticuz/chromium).
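
Review note on the README's record-and-replay guidance: the programmatic `agent.replay()` leaves the recorded `output` frozen, so a caller who wants a fresh answer has to run one bounded `page.ai()` call afterwards. Below is a minimal sketch of that pattern. The interfaces `AiPageLike` and `ReplayAgentLike` and the helper `replayThenFinish` are hypothetical stand-ins, typed only from the calls that appear in this diff (`agent.replay(path, { page })` and `page.ai(task, { maxSteps: 3 })`); the real BrowserAgent types, and whatever `replay` returns, may differ.

```typescript
// Hypothetical structural types, inferred from calls shown in the diff.
interface TaskOutputLike {
  output?: string;
}

interface AiPageLike {
  ai(task: string, opts?: { maxSteps?: number }): Promise<TaskOutputLike>;
}

interface ReplayAgentLike<P> {
  // Return type is unknown here; the diff does not show it.
  replay(path: string, opts: { page: P }): Promise<unknown>;
}

// Replays a saved plan, then runs one bounded AI pass on the resulting page.
async function replayThenFinish<P extends AiPageLike>(
  agent: ReplayAgentLike<P>,
  page: P,
  planPath: string,
  finishTask: string,
): Promise<string> {
  // Deterministic navigation: per the README, no LLM calls happen here.
  await agent.replay(planPath, { page });

  // The only token-spending step. maxSteps: 3 mirrors the bound the CLI uses.
  const result = await page.ai(finishTask, { maxSteps: 3 });
  return result.output ?? "";
}
```

Keeping the finishing prompt as an explicit argument mirrors `--finish-task` on the CLI: the caller decides whether to re-ask the recorded task or something narrower.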

src/cli/index.ts

Lines changed: 102 additions & 3 deletions
@@ -393,15 +393,25 @@ program
 
 program
   .command("replay")
-  .description("Replay a saved plan without calling the LLM")
+  .description(
+    "Replay a saved plan deterministically, then (by default) run a single AI pass on the result page to produce a fresh final response",
+  )
   .argument(
     "<file>",
     "Path to a plan JSON file previously saved with --save-plan",
   )
   .option("-d, --debug", "Enable debug mode")
   .option(
     "--ai-fallback",
-    "Fall back to .ai() for individual steps that fail (requires an LLM to be configured)",
+    "Fall back to .ai() for individual steps that fail during replay (requires an LLM to be configured)",
+  )
+  .option(
+    "--no-ai-finish",
+    "Skip the final AI pass after replay (by default, one fresh .ai() call runs on the result page to regenerate the final response)",
+  )
+  .option(
+    "--finish-task <task>",
+    "Override the task used for the final AI pass (defaults to the plan's recorded task)",
   )
   .option(
     "-u, --url <url>",
@@ -411,13 +421,21 @@ program
     const options = this.opts();
     const debug = (options.debug as boolean) || false;
     const aiFallback = (options.aiFallback as boolean) || false;
+    // commander's `--no-ai-finish` sets options.aiFinish === false; default is true.
+    const aiFinishRequested = options.aiFinish !== false;
+    const finishTaskOverride = (options.finishTask as string) || undefined;
     const startingUrl = (options.url as string) || undefined;
 
     console.log(chalk.blue("BrowserAgent Replay"));
     const spinner = ora();
 
     try {
-      const llm = aiFallback ? await createDefaultLlm() : undefined;
+      // An LLM is needed for either --ai-fallback (mid-replay re-planning)
+      // or the post-replay finishing pass. --ai-fallback is strict (errors
+      // if no LLM); the finishing pass is best-effort (silently skipped if
+      // no LLM is configured) so `replay` still works without env vars.
+      const needsLlm = aiFallback || aiFinishRequested;
+      const llm = needsLlm ? await createDefaultLlm() : undefined;
       if (aiFallback && !llm) {
         console.error(
           chalk.red(
@@ -426,13 +444,22 @@ program
         );
         process.exit(1);
       }
+      const willRunAiFinish = aiFinishRequested && !!llm;
 
       const agent = new BrowserAgent({
         llm,
         debug,
         browserProvider: "Local",
       });
 
+      // Read the plan up-front so we can (a) use the recorded task as the
+      // default finishing prompt and (b) surface the recorded final
+      // response when no fresh pass is run. The recorded `output` is
+      // frozen at record time — it is NOT re-generated by plain replay.
+      const planJson = JSON.parse(
+        (await fs.promises.readFile(file)).toString(),
+      ) as { output?: string; task?: string };
+
       const page = await agent.newPage();
       spinner.start(`Replaying plan from ${file}`);
 
@@ -458,6 +485,78 @@ program
 
       spinner.succeed(chalk.green("Replay complete."));
 
+      // Post-replay: run a fresh AI pass so the CLI produces a real,
+      // up-to-date answer on every run. Bounded to `maxSteps: 3` so the
+      // model can look at the current page (and make a tiny correction if
+      // needed) but cannot re-do the whole navigation.
+      if (willRunAiFinish) {
+        const finishTask =
+          finishTaskOverride ??
+          planJson.task ??
+          "Based on the current page, produce the final answer to the original task.";
+        spinner.start(chalk.blue("Running final AI pass on result page..."));
+        try {
+          const result = await page.ai(finishTask, { maxSteps: 3 });
+          spinner.stop();
+          console.log(
+            boxen(result.output || "No Response", {
+              title: chalk.yellow("BrowserAgent Response"),
+              titleAlignment: "center",
+              float: "center",
+              padding: 1,
+              margin: { top: 2, left: 0, right: 0, bottom: 0 },
+            }),
+          );
+        } catch (err) {
+          spinner.fail(
+            chalk.red(
+              `Final AI pass failed: ${err instanceof Error ? err.message : String(err)}`,
+            ),
+          );
+          // Fall back to showing the recorded output so the user still
+          // sees something useful.
+          if (planJson.output) {
+            console.log(
+              boxen(planJson.output, {
+                title: chalk.dim(
+                  "Recorded Response (frozen at record time — fresh pass failed)",
+                ),
+                titleAlignment: "center",
+                float: "center",
+                padding: 1,
+                margin: { top: 1, left: 0, right: 0, bottom: 0 },
+                borderStyle: "single",
+                dimBorder: true,
+              }),
+            );
+          }
+        }
+      } else if (planJson.output) {
+        // No fresh pass requested (either --no-ai-finish or no LLM
+        // configured). Print the recorded output so the CLI run has *some*
+        // visible output, clearly labeled as archival.
+        if (aiFinishRequested && !llm) {
+          console.log(
+            chalk.dim(
+              "(No LLM configured — skipping final AI pass. Set an API key to get a fresh response on the current page.)",
+            ),
+          );
+        }
+        console.log(
+          boxen(planJson.output, {
+            title: chalk.dim(
+              "Recorded Response (frozen at record time — not re-generated)",
+            ),
+            titleAlignment: "center",
+            float: "center",
+            padding: 1,
+            margin: { top: 1, left: 0, right: 0, bottom: 0 },
+            borderStyle: "single",
+            dimBorder: true,
+          }),
+        );
+      }
+
       const shouldExit = await inquirer.confirm({
         message: "Close browser and exit?",
         default: true,
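
Review note: the branching this commit adds to the `replay` action can be summarised as a pure function, which is easier to reason about and unit-test than the inline version. The following is a sketch under assumptions, not code from the commit. `ReplayFlags`, `RecordedPlan`, `FinishDecision`, and `decideFinish` are hypothetical names, and `llmAvailable` stands in for whether `createDefaultLlm()` returned a model.

```typescript
// Hypothetical model of the flags parsed by the `replay` command.
interface ReplayFlags {
  aiFallback?: boolean; // --ai-fallback
  aiFinish?: boolean; // false only when --no-ai-finish is passed
  finishTask?: string; // --finish-task <task>
}

// The two plan fields the CLI reads for the finishing pass.
interface RecordedPlan {
  task?: string;
  output?: string;
}

type FinishDecision =
  | { kind: "error"; reason: string }
  | { kind: "fresh-pass"; task: string; maxSteps: number }
  | { kind: "recorded-output"; output: string; warnNoLlm: boolean }
  | { kind: "nothing" };

const DEFAULT_FINISH_TASK =
  "Based on the current page, produce the final answer to the original task.";

function decideFinish(
  flags: ReplayFlags,
  plan: RecordedPlan,
  llmAvailable: boolean,
): FinishDecision {
  // Negated option: anything other than an explicit `false` means "on".
  const aiFinishRequested = flags.aiFinish !== false;

  // --ai-fallback is strict: it fails without an LLM.
  if (flags.aiFallback && !llmAvailable) {
    return { kind: "error", reason: "--ai-fallback requires an LLM" };
  }

  // The finishing pass is best-effort: it runs only when an LLM exists.
  if (aiFinishRequested && llmAvailable) {
    const task =
      (flags.finishTask || undefined) ?? plan.task ?? DEFAULT_FINISH_TASK;
    return { kind: "fresh-pass", task, maxSteps: 3 };
  }

  // Otherwise show the frozen output, with a hint if the pass was wanted.
  if (plan.output) {
    return {
      kind: "recorded-output",
      output: plan.output,
      warnNoLlm: aiFinishRequested && !llmAvailable,
    };
  }
  return { kind: "nothing" };
}
```

The sketch covers only the up-front decision. It does not model the runtime fallback in the diff, where a fresh pass that throws still prints the recorded output.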
