Commit a89cb5a

B4nan and claude authored
feat: add @crawlee/stagehand package for AI-powered browser automation (apify#3331)
## Summary

Creates new `@crawlee/stagehand` package integrating Browserbase's Stagehand library with Crawlee. Extends `BrowserCrawler` to provide AI-powered browser automation capabilities.

## AI Methods on Page Object

- `page.act()` - Natural language browser interactions
- `page.extract()` - Structured data extraction with Zod schemas
- `page.observe()` - Get available page actions
- `page.agent()` - Multi-step autonomous agents

## Key Features

- `StagehandCrawler` extends `BrowserCrawler`
- Full anti-blocking support via `BrowserPool` integration
- Browser fingerprinting automatically applied
- Support for LOCAL and BROWSERBASE environments
- Session-based fingerprint caching
- Automatic proxy rotation on blocking

## Architecture

The package implements a `BrowserPlugin` that wraps Stagehand, allowing Crawlee to:

- Apply fingerprinted launch options to Stagehand's browser
- Manage Stagehand instances through `BrowserPool`
- Provide AI methods directly on the page object
- Maintain all existing Crawlee features (hooks, scaling, sessions)

## Test Coverage

- 38 tests across 5 test files
- All tests passing
- Covers plugin, controller, launcher, crawler, and utilities

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Closes apify#3064

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 585bf17 commit a89cb5a

37 files changed

Lines changed: 4227 additions & 40 deletions

.github/workflows/test-e2e.yml

Lines changed: 4 additions & 0 deletions
```diff
@@ -57,6 +57,9 @@ jobs:
       - name: Login to Apify
         run: npx -y apify-cli@beta login -t ${{ secrets.APIFY_SCRAPER_TESTS_API_TOKEN }}
 
+      - name: Add Apify secrets for E2E tests
+        run: npx -y apify-cli@beta secrets add anthropicApiKey ${{ secrets.ANTHROPIC_API_KEY }}
+
       - name: Install Dependencies
         run: yarn
 
@@ -72,3 +75,4 @@ jobs:
         env:
           STORAGE_IMPLEMENTATION: ${{ matrix.storage }}
           APIFY_HTTPBIN_TOKEN: ${{ secrets.APIFY_HTTPBIN_TOKEN }}
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```

docs/guides/stagehand_crawler.mdx

Lines changed: 249 additions & 0 deletions

---
id: stagehand-crawler-guide
title: "StagehandCrawler guide"
sidebar_label: "StagehandCrawler"
description: AI-powered web crawling with natural language browser automation
---

import ApiLink from '@site/src/components/ApiLink';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';

import StagehandBasicSource from '!!raw-loader!./stagehand_crawler_basic.ts';
import StagehandExtractSource from '!!raw-loader!./stagehand_crawler_extract.ts';
import StagehandCombinedSource from '!!raw-loader!./stagehand_crawler_combined.ts';

&#8203;<ApiLink to="stagehand-crawler/class/StagehandCrawler">`StagehandCrawler`</ApiLink> combines Crawlee's powerful crawling infrastructure with [Stagehand's](https://github.com/browserbase/stagehand) AI-powered browser automation. Instead of writing CSS selectors or XPath queries, you can interact with web pages using natural language instructions.

## What is Stagehand

[Stagehand](https://github.com/browserbase/stagehand) is an AI-powered browser automation library from Browserbase. It allows you to control a browser using natural language commands like "click the login button" or "extract the product price". Under the hood, Stagehand uses large language models (OpenAI, Anthropic, or Google) to understand the page structure and execute your instructions.

## How StagehandCrawler works

StagehandCrawler extends <ApiLink to="browser-crawler/class/BrowserCrawler">`BrowserCrawler`</ApiLink> and enhances each page with AI-powered methods. Here's the architecture:

1. **Stagehand launches the browser** - When a new browser is needed, Stagehand initializes and launches a Chromium browser
2. **Playwright connects via CDP** - Crawlee connects Playwright to the same browser using the Chrome DevTools Protocol (CDP)
3. **Pages are enhanced with AI methods** - Each page gets `act()`, `extract()`, `observe()`, and `agent()` methods
4. **BrowserPool manages scaling** - Crawlee's BrowserPool handles browser lifecycle, retries, and concurrency
```
┌─────────────────────────────────────────────────────────┐
│                    StagehandCrawler                     │
├─────────────────────────────────────────────────────────┤
│  BrowserPool (manages browser lifecycle & concurrency)  │
├─────────────────────────────────────────────────────────┤
│  Stagehand Instance                                     │
│  ├── Launches Chromium browser                          │
│  ├── Provides CDP endpoint                              │
│  └── Handles AI operations (act/extract/observe)        │
├─────────────────────────────────────────────────────────┤
│  Playwright (connected via CDP)                         │
│  └── Standard page operations (goto, click, type, etc.) │
└─────────────────────────────────────────────────────────┘
```

## Key features

The enhanced page object provides four AI-powered methods:

### `page.act(instruction)` - Perform actions

Execute actions on the page using natural language. See the [Stagehand act() documentation](https://docs.stagehand.dev/reference/act) for more details.

```ts
await page.act('Click the "Add to Cart" button');
await page.act('Fill in the email field with test@example.com');
await page.act('Scroll down to load more products');
```

### `page.extract(instruction, schema)` - Extract structured data

Extract data from the page using a Zod schema for type safety. See the [Stagehand extract() documentation](https://docs.stagehand.dev/reference/extract) for more details.

```ts
import { z } from 'zod';

const productSchema = z.object({
    title: z.string(),
    price: z.number(),
    description: z.string(),
});

const product = await page.extract('Get the product details', productSchema);
// product is typed as { title: string, price: number, description: string }
```

### `page.observe()` - Discover page actions

Analyze the page and get AI-suggested actions. This is useful for exploring unfamiliar pages or building adaptive scrapers. See the [Stagehand observe() documentation](https://docs.stagehand.dev/reference/observe) for more details.

```ts
const actions = await page.observe();
// Returns available actions like:
// [
//   { action: 'click', element: 'Load More button', selector: '.load-more' },
//   { action: 'click', element: 'Next Page link', selector: 'a.pagination-next' },
//   { action: 'fill', element: 'Search input', selector: '#search' },
// ]

// Use observe to find pagination dynamically
for (const action of actions) {
    if (action.element?.toLowerCase().includes('next page')) {
        await page.act(`Click the ${action.element}`);
        break;
    }
}
```

### `page.agent(config)` - Autonomous agents

Create an autonomous agent for complex multi-step workflows. Unlike `act()`, which executes a single action, `agent()` can plan and execute multiple steps autonomously to achieve a goal. See the [Stagehand agent() documentation](https://docs.stagehand.dev/reference/agent) for more details.

```ts
const agent = page.agent({ model: 'gpt-4.1-mini' });
const result = await agent.execute('Find the cheapest laptop and add it to cart');
```

**When to use `act()` vs `agent()`:**

- Use `act()` for single, discrete actions ("click this button", "fill this form")
- Use `agent()` for goals requiring multiple steps with decision-making ("find and purchase the cheapest item")

Note that `agent()` makes multiple LLM calls and can be slower and more expensive than sequential `act()` calls where you control the flow.

## Requirements

StagehandCrawler requires an API key for the AI model provider. The recommended way is to use the `apiKey` option:

```ts
const crawler = new StagehandCrawler({
    stagehandOptions: {
        model: 'openai/gpt-4.1-mini',
        apiKey: 'sk-...', // Your OpenAI API key
    },
});
```

Alternatively, you can use environment variables (used as a fallback when `apiKey` is not provided):

- **OpenAI**: `OPENAI_API_KEY`
- **Anthropic**: `ANTHROPIC_API_KEY`
- **Google**: `GOOGLE_API_KEY`
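This fallback order can be pictured as a tiny resolver. The following is a hypothetical sketch for illustration only - `resolveApiKey` and the provider-to-variable map are not part of the package's API:

```ts
// Hypothetical sketch of the documented precedence: an explicit
// `apiKey` option wins; otherwise the provider inferred from the
// model name selects an environment variable as the fallback.
const ENV_VAR_BY_PROVIDER: Record<string, string> = {
    openai: 'OPENAI_API_KEY',
    anthropic: 'ANTHROPIC_API_KEY',
    google: 'GOOGLE_API_KEY',
};

function resolveApiKey(
    model: string,
    apiKey: string | undefined,
    env: Record<string, string | undefined>,
): string | undefined {
    if (apiKey) return apiKey; // explicit option takes precedence
    const provider = model.split('/')[0]; // 'openai/gpt-4.1-mini' -> 'openai'
    const envVar = ENV_VAR_BY_PROVIDER[provider];
    return envVar ? env[envVar] : undefined;
}
```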

## Limitations

Some Crawlee features work differently or are unavailable with StagehandCrawler:

### Chromium only

Stagehand uses the Chrome DevTools Protocol (CDP), so only Chromium browsers are supported. The `launcher` option is ignored - you cannot use Firefox or WebKit.

### Reduced fingerprinting control

Since Stagehand controls the browser launch process, Crawlee's advanced fingerprinting features are limited:

- **Browser fingerprints** - Basic fingerprinting (viewport, user-agent) is applied, but low-level browser properties cannot be modified
- **`launchOptions`** - Only a subset of Playwright launch options is passed through to Stagehand (`headless`, `args`, `executablePath`, `proxy`, `viewport`)
- **Browser context options** - Custom context configurations are not fully supported, since Stagehand manages the browser context

Stagehand provides its own anti-detection measures, but you have less granular control compared to PlaywrightCrawler.
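The launch-option pass-through described above can be sketched as a simple allowlist filter. This is a hypothetical illustration - `pickForwardedOptions` is not the package's actual implementation:

```ts
// Hypothetical sketch: only an allowlisted subset of Playwright
// launch options is forwarded to Stagehand; everything else is dropped.
const FORWARDED_OPTIONS = ['headless', 'args', 'executablePath', 'proxy', 'viewport'] as const;

function pickForwardedOptions(launchOptions: Record<string, unknown>): Record<string, unknown> {
    const forwarded: Record<string, unknown> = {};
    for (const key of FORWARDED_OPTIONS) {
        if (key in launchOptions) forwarded[key] = launchOptions[key];
    }
    return forwarded;
}
```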
## When to use StagehandCrawler

**Use StagehandCrawler when:**

- Pages have complex, dynamic structures that are hard to scrape with selectors
- You need to interact with pages in ways that are difficult to express programmatically
- You want to quickly prototype scrapers without writing detailed selectors
- The target website frequently changes its structure

**Consider alternatives when:**

- You need maximum performance (use CheerioCrawler or PlaywrightCrawler)
- You need to minimize costs (LLM API calls add up)
- You need fine-grained browser control (use PlaywrightCrawler)
- You need Firefox or WebKit support (use PlaywrightCrawler)

## Basic example

Here's a simple example that extracts code examples from the Crawlee website:

<CodeBlock language="ts" title="src/main.ts">{StagehandBasicSource}</CodeBlock>

## Data extraction example

Here's an example showing structured data extraction with Zod schemas:

<CodeBlock language="ts" title="src/main.ts">{StagehandExtractSource}</CodeBlock>

## Configuration options

### Stagehand options

Configure the AI behavior through `stagehandOptions`:

```ts
const crawler = new StagehandCrawler({
    stagehandOptions: {
        // Environment: 'LOCAL' or 'BROWSERBASE'
        env: 'LOCAL',

        // AI model to use (e.g., 'openai/gpt-4.1-mini', 'anthropic/claude-sonnet-4-20250514')
        model: 'openai/gpt-4.1-mini',

        // API key for the LLM provider (environment variables are used as a fallback)
        apiKey: process.env.OPENAI_API_KEY,

        // Logging verbosity: 0 (minimal), 1 (standard), 2 (debug)
        verbose: 1,

        // Enable automatic error recovery
        selfHeal: true,

        // Timeout for the DOM to stabilize (ms)
        domSettleTimeout: 30000,
    },
});
```

### Environment variables

Stagehand options can alternatively be set via environment variables. Programmatic options always take precedence over environment variables:

| Environment variable | Option | Notes |
|---------------------|--------|-------|
| `OPENAI_API_KEY` | `apiKey` | Fallback for OpenAI models |
| `ANTHROPIC_API_KEY` | `apiKey` | Fallback for Anthropic models |
| `GOOGLE_API_KEY` | `apiKey` | Fallback for Google models |
| `STAGEHAND_ENV` | `env` | |
| `STAGEHAND_MODEL` | `model` | |
| `STAGEHAND_VERBOSE` | `verbose` | |
| `STAGEHAND_API_KEY` | `apiKey` | Browserbase API key |
| `STAGEHAND_PROJECT_ID` | `projectId` | Browserbase project ID |
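The precedence in the table can be expressed as a one-line rule per option. The variable-to-option mapping below follows the table, but `resolveStagehandOption` itself is a hypothetical sketch, not the package's API:

```ts
// Hypothetical sketch: a programmatic option always wins; the
// corresponding STAGEHAND_* environment variable is the fallback.
const ENV_VAR_BY_OPTION: Record<string, string> = {
    env: 'STAGEHAND_ENV',
    model: 'STAGEHAND_MODEL',
    verbose: 'STAGEHAND_VERBOSE',
    apiKey: 'STAGEHAND_API_KEY',
    projectId: 'STAGEHAND_PROJECT_ID',
};

function resolveStagehandOption(
    name: string,
    options: Record<string, string | undefined>,
    env: Record<string, string | undefined>,
): string | undefined {
    return options[name] ?? env[ENV_VAR_BY_OPTION[name]];
}
```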

## Using with Browserbase

For cloud browser infrastructure, you can use [Browserbase](https://browserbase.com/):

```ts
const crawler = new StagehandCrawler({
    stagehandOptions: {
        env: 'BROWSERBASE',
        apiKey: process.env.BROWSERBASE_API_KEY,
        projectId: process.env.BROWSERBASE_PROJECT_ID,
        model: 'openai/gpt-4.1-mini',
    },
});
```

## Combining AI and standard methods

You can mix AI-powered methods with standard Playwright methods:

<CodeBlock language="ts" title="src/main.ts">{StagehandCombinedSource}</CodeBlock>

## Further reading

- [Stagehand documentation](https://docs.stagehand.dev/)
- [Browserbase documentation](https://docs.browserbase.com/)
- <ApiLink to="stagehand-crawler/class/StagehandCrawler">`StagehandCrawler` API reference</ApiLink>

docs/guides/stagehand_crawler_basic.ts

Lines changed: 32 additions & 0 deletions

```ts
import { StagehandCrawler } from '@crawlee/stagehand';
import { z } from 'zod';

const crawler = new StagehandCrawler({
    stagehandOptions: {
        env: 'LOCAL',
        model: 'openai/gpt-4.1-mini',
        verbose: 1,
    },
    async requestHandler({ page, request, log, pushData }) {
        log.info(`Processing ${request.url}`);

        // Use AI to extract the page title
        const title = await page.extract('Get the main heading of the page', z.string());

        // Use AI to click on a navigation element
        await page.act('Click on the Documentation link');

        // Extract structured data after navigation
        const navItems = await page.extract('Get all sidebar navigation items', z.array(z.string()));

        log.info(`Found ${navItems.length} navigation items`);

        await pushData({
            url: request.url,
            title,
            navItems,
        });
    },
});

await crawler.run(['https://crawlee.dev']);
```

docs/guides/stagehand_crawler_combined.ts

Lines changed: 38 additions & 0 deletions

```ts
import { StagehandCrawler } from '@crawlee/stagehand';
import { z } from 'zod';

const crawler = new StagehandCrawler({
    stagehandOptions: {
        model: 'openai/gpt-4.1-mini',
    },
    async requestHandler({ page, request, log, pushData }) {
        // Use standard Playwright navigation
        await page.goto(request.url);

        // Use AI to interact with the page
        await page.act('Accept the cookie consent banner');

        // Use standard Playwright for precise operations
        await page.waitForSelector('.product-list');

        // Use AI for complex extraction
        const products = await page.extract(
            'Get all product names and prices',
            z.array(
                z.object({
                    name: z.string(),
                    price: z.number(),
                }),
            ),
        );

        log.info(`Extracted ${products.length} products`);

        // Use standard Playwright for screenshots
        await page.screenshot({ path: 'products.png' });

        await pushData({ url: request.url, products });
    },
});

await crawler.run(['https://example.com/products']);
```

docs/guides/stagehand_crawler_extract.ts

Lines changed: 57 additions & 0 deletions

```ts
import { StagehandCrawler, Dataset } from '@crawlee/stagehand';
import { z } from 'zod';

// Define a schema for the data you want to extract
const ProductSchema = z.object({
    name: z.string(),
    price: z.number(),
    description: z.string(),
    inStock: z.boolean(),
});

const ProductListSchema = z.object({
    products: z.array(ProductSchema),
    totalCount: z.number(),
});

const crawler = new StagehandCrawler({
    stagehandOptions: {
        env: 'LOCAL',
        model: 'anthropic/claude-sonnet-4-20250514',
        verbose: 1,
    },
    maxRequestsPerCrawl: 10,
    async requestHandler({ page, request, log, enqueueLinks }) {
        log.info(`Scraping ${request.url}`);

        // Extract structured product data using AI
        const data = await page.extract(
            'Extract all products from this page including their names, prices, descriptions, and availability',
            ProductListSchema,
        );

        log.info(`Found ${data.products.length} products`);

        // Save each product to the dataset
        for (const product of data.products) {
            await Dataset.pushData({
                ...product,
                url: request.url,
                scrapedAt: new Date().toISOString(),
            });
        }

        // Use AI to find and click "Next page" if it exists
        try {
            await page.act('Click the next page button if available');
            // Enqueue the new URL after navigation
            await enqueueLinks({
                strategy: 'same-domain',
            });
        } catch {
            log.info('No more pages to scrape');
        }
    },
});

await crawler.run(['https://example-shop.com/products']);
```
