Commit ae32f73

[docs]: add ignoreSelectors to docs for extract() (#2088)
# what changed
- adds new param to reference section
- adds example to "targeted extract" section

---

## Summary by cubic
Documented the new `ignoreSelectors` option for `extract()` to exclude elements (and their descendants) from the snapshot. Added a targeted-extract example and clarified how it interacts with `selector`, plus minor wording updates for supported page types.

Written for commit f039e45.
1 parent 82bf5d4 commit ae32f73

2 files changed

Lines changed: 32 additions & 4 deletions


packages/docs/v3/basics/extract.mdx

Lines changed: 20 additions & 1 deletion
@@ -204,6 +204,25 @@ const tableData = await stagehand.extract(
 );
 ```
 
+You can also exclude specific nodes (including their descendant nodes) with `ignoreSelectors`.
+
+```typescript
+const article = await stagehand.extract(
+  "extract the article title and body",
+  z.object({
+    title: z.string(),
+    body: z.string()
+  }),
+  {
+    ignoreSelectors: [".ad", ".newsletter-modal", "nav.related-posts"]
+  }
+);
+```
+
+<Note>
+`ignoreSelectors` removes all matches for each selector, along with each matched node's descendants. `selector` still scopes extraction to a single resolved subtree.
+</Note>
+
 
 ## Best practices
 

@@ -357,4 +376,4 @@ for (const pageNum of pageNumbers) {
 <Card title="Observe" icon="magnifying-glass" href="/v3/basics/observe">
 Analyze pages and preview actions
 </Card>
-</CardGroup>
+</CardGroup>
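
Reviewer sketch, not part of the commit: the note added in this file contrasts `ignoreSelectors` (every match is removed, together with its descendants) with `selector` (extraction is scoped to one resolved subtree). The toy model below illustrates that difference on a plain object tree. Everything in it is hypothetical: `ToyNode`, `toyMatches`, `pruneIgnored`, `scopeTo`, and `collectText` are illustration-only names, a "selector" is reduced to a tag name, "one resolved subtree" is modelled as the first depth-first match, and scoping is applied before pruning. None of this is taken from Stagehand's implementation.

```typescript
// Toy stand-in for a DOM node; not a Stagehand or browser type.
interface ToyNode {
  tag: string;
  text?: string;
  children: ToyNode[];
}

// Hypothetical matcher: a "selector" here is just a tag name.
const toyMatches = (node: ToyNode, selector: string): boolean =>
  node.tag === selector;

// ignoreSelectors-style pruning: drop EVERY matching node and its whole subtree.
function pruneIgnored(node: ToyNode, ignore: string[]): ToyNode | null {
  if (ignore.some((s) => toyMatches(node, s))) return null;
  const children = node.children
    .map((child) => pruneIgnored(child, ignore))
    .filter((child): child is ToyNode => child !== null);
  return { ...node, children };
}

// selector-style scoping: keep a single subtree (assumed: first depth-first match).
function scopeTo(node: ToyNode, selector: string): ToyNode | null {
  if (toyMatches(node, selector)) return node;
  for (const child of node.children) {
    const found = scopeTo(child, selector);
    if (found) return found;
  }
  return null;
}

// Gather the text that would remain visible in the pruned tree.
function collectText(node: ToyNode): string[] {
  const own = node.text ? [node.text] : [];
  return own.concat(...node.children.map((child) => collectText(child)));
}

const toyPage: ToyNode = {
  tag: "body",
  children: [
    { tag: "nav", text: "Related posts", children: [] },
    {
      tag: "article",
      text: "Title",
      children: [
        {
          tag: "aside",
          text: "Ad",
          children: [{ tag: "span", text: "Buy now", children: [] }],
        },
        { tag: "p", text: "Body", children: [] },
        { tag: "aside", text: "Newsletter", children: [] },
      ],
    },
  ],
};

// Scope to one subtree, then remove every <aside> inside it.
const scopedArticle = scopeTo(toyPage, "article");
const cleanedArticle = scopedArticle ? pruneIgnored(scopedArticle, ["aside"]) : null;
const visibleText = cleanedArticle ? collectText(cleanedArticle) : [];
// visibleText is ["Title", "Body"]: both asides are gone, including the nested "Buy now".
```

Both `aside` nodes disappear even though only one selector string was passed, while `scopeTo` returns a single subtree regardless of how many nodes could match.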

packages/docs/v3/references/extract.mdx

Lines changed: 12 additions & 3 deletions
@@ -43,6 +43,7 @@ interface ExtractOptions {
   model?: ModelConfiguration;
   timeout?: number;
   selector?: string;
+  ignoreSelectors?: string[];
   page?: PlaywrightPage | PuppeteerPage | PatchrightPage | Page;
   serverCache?: boolean;
 }
@@ -98,12 +99,20 @@ type ModelConfiguration =
 Optional selector (XPath, CSS selector, etc.) to limit extraction scope to a specific part of the page. Reduces token usage and improves accuracy.
 </ParamField>
 
+<ParamField path="ignoreSelectors" type="string[]" optional>
+Optional list of selectors to exclude from the extracted snapshot before extraction runs. Each selector removes all matching elements and their descendants.
+
+<Note>
+`ignoreSelectors` applies to all matches for each selector. `selector` keeps its single-target scoping behavior.
+</Note>
+</ParamField>
+
 <ParamField path="page" type="PlaywrightPage | PuppeteerPage | PatchrightPage | Page" optional>
 Optional: Specify which page to perform the extraction on. Supports multiple browser automation libraries:
-- **Playwright**: Native Playwright Page objects
+- **Stagehand Page**: Native Stagehand Page objects
+- **Playwright**: Playwright Page objects
 - **Puppeteer**: Puppeteer Page objects
 - **Patchright**: Patchright Page objects
-- **Stagehand Page**: Stagehand's wrapped Page object
 
 If not specified, defaults to the current "active" page in your Stagehand instance.
 </ParamField>
@@ -422,4 +431,4 @@ The following errors may be thrown by the `extract()` method:
 - **LLMResponseError** - Error in LLM response processing
 - **MissingLLMConfigurationError** - No LLM API key or client configured
 - **UnsupportedModelError** - The specified model is not supported for this operation
-- **InvalidAISDKModelFormatError** - Model string does not follow the required `provider/model` format
+- **InvalidAISDKModelFormatError** - Model string does not follow the required `provider/model` format
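
Reviewer sketch, not part of the commit: since `ExtractOptions` now lists both `selector` and `ignoreSelectors` as optional fields, a caller could presumably pass them together. The snippet below shows one way that call might look. `ScopingOptions` and `ExtractClient` are hypothetical stand-ins that copy only the two fields and the three-argument call shape seen in the diff, `schema` is typed as `unknown` to avoid depending on `zod`, and `"main article"` is an invented container selector.

```typescript
// Subset of ExtractOptions: only the two scoping fields from the reference diff.
interface ScopingOptions {
  selector?: string;
  ignoreSelectors?: string[];
}

// Hypothetical stand-in for an object exposing extract(); NOT the real Stagehand type.
interface ExtractClient {
  extract(
    instruction: string,
    schema: unknown,
    options?: ScopingOptions
  ): Promise<unknown>;
}

// Scope extraction to one container, then exclude noisy nodes inside it.
function extractArticle(client: ExtractClient, schema: unknown): Promise<unknown> {
  return client.extract("extract the article title and body", schema, {
    selector: "main article", // hypothetical container selector
    ignoreSelectors: [".ad", ".newsletter-modal", "nav.related-posts"],
  });
}
```

Keeping the interface narrow means the helper can be exercised with a fake client in tests, without a browser or an LLM key.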
