Feat: Strands search crawling Apify tools #3

Open
daveomri wants to merge 24 commits into main from feat/strands-search-crawling-actor-tools

Collaborator

@daveomri commented Apr 1, 2026

Tester guides

link

Apify docs

link


Description

Part 2 of 3 — This PR builds on the core Apify tools introduced in feat/strands-core-apify-tools and should be reviewed after that PR is merged. A third PR (feat/strands-social-media-actor-tools) will add social media tools on top of both. A separate strands docs PR will cover documentation for all three sets of tools (core, search & crawling, and social).

This PR extends the Apify integration (src/strands_tools/apify.py) with five new search & crawling tools that wrap popular Actors from Apify Store, giving agents structured access to Google Search, Google Maps, YouTube, multi-page website crawling, and e-commerce product data.

New tools

| Tool | Underlying Actor | Description |
| --- | --- | --- |
| apify_google_search_scraper | apify/google-search-scraper | Search Google and return structured results (organic, ads, People Also Ask) |
| apify_google_places_scraper | compass/crawler-google-places | Search Google Maps for businesses and places, optionally with reviews |
| apify_youtube_scraper | streamers/youtube-scraper | Scrape YouTube videos, channels, or search results |
| apify_website_content_crawler | apify/website-content-crawler | Crawl a website and extract markdown content from multiple pages (multi-page counterpart to apify_scrape_url) |
| apify_ecommerce_scraper | apify/e-commerce-scraping-tool | Scrape product data from e-commerce sites (Amazon, eBay, Walmart, etc.) |

All tools share the existing ApifyToolClient, error handling, and Rich console output from the core module. A shared _search_crawl_result helper keeps the implementations DRY.
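As a rough illustration of how a shared helper can keep five thin tool wrappers DRY, here is a minimal sketch. It is not the PR's `_search_crawl_result`: the function name `search_crawl_result`, its signature, the injected `run_actor` callable, and the `status`/`content` result shape are all assumptions made so the example runs offline.

```python
import json
from typing import Any, Callable

# Hypothetical type for "run this Actor with this input and return its dataset items".
RunActor = Callable[[str, dict[str, Any]], list[dict[str, Any]]]


def search_crawl_result(run_actor: RunActor, actor_id: str, run_input: dict[str, Any]) -> dict[str, Any]:
    """Run an Actor and wrap its items (or its failure) in a tool-result dict."""
    try:
        items = run_actor(actor_id, run_input)
    except Exception as exc:
        # Report Actor failures to the agent as an error result instead of raising.
        return {"status": "error", "content": [{"text": f"{actor_id} failed: {exc}"}]}
    return {"status": "success", "content": [{"text": json.dumps(items)}]}


def fake_run_actor(actor_id: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    # Stand-in for the real Apify call; it simply echoes what it was given.
    return [{"actor": actor_id, "input": run_input}]


def failing_run_actor(actor_id: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    # Stand-in for an Actor run that fails.
    raise RuntimeError("run did not succeed")


ok = search_crawl_result(fake_run_actor, "apify/google-search-scraper", {"queries": "strands agents"})
failed = search_crawl_result(failing_run_actor, "apify/google-search-scraper", {"queries": "strands agents"})
```

With a helper of this shape, each tool only has to build its Actor-specific input and delegate the run, serialization, and error handling.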

Two new export lists are added:

  • APIFY_SEARCH_TOOLS: the five search & crawling tools
  • APIFY_ALL_TOOLS: APIFY_CORE_TOOLS + APIFY_SEARCH_TOOLS, for convenience
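The relationship between the lists can be sketched as plain list concatenation. The placeholder callables below are hypothetical stand-ins for the real `@tool`-decorated functions; only the list names and tool names come from this PR series.

```python
def _placeholder(name):
    """Build a stand-in callable so the sketch runs without strands or Apify installed."""
    def tool(**kwargs):
        return {"tool": name, "input": kwargs}
    tool.__name__ = name
    return tool


# Core tools from part 1 of the series.
APIFY_CORE_TOOLS = [_placeholder(n) for n in (
    "apify_run_actor",
    "apify_get_dataset_items",
    "apify_run_actor_and_get_dataset",
    "apify_run_task",
    "apify_run_task_and_get_dataset",
    "apify_scrape_url",
)]

# Search & crawling tools added in this PR.
APIFY_SEARCH_TOOLS = [_placeholder(n) for n in (
    "apify_google_search_scraper",
    "apify_google_places_scraper",
    "apify_youtube_scraper",
    "apify_website_content_crawler",
    "apify_ecommerce_scraper",
)]

# Convenience list: everything in one place.
APIFY_ALL_TOOLS = APIFY_CORE_TOOLS + APIFY_SEARCH_TOOLS

tool_names = [t.__name__ for t in APIFY_ALL_TOOLS]
```

With the real module, a caller would presumably import one of these lists from `strands_tools.apify` and pass it to an agent's `tools` argument rather than listing each tool individually.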

Files changed

| File | Change |
| --- | --- |
| src/strands_tools/apify.py | 5 new @tool-decorated functions, _search_crawl_result helper, Actor ID constants, APIFY_SEARCH_TOOLS and APIFY_ALL_TOOLS exports |
| tests/test_apify.py | 25+ new test cases covering all five tools |
| docs/apify_tool.md | Search & Crawling section with usage examples and parameter reference tables |
| README.md | Tool table entries and usage examples for search & crawling tools |

PR series

| # | Branch | Scope | Status |
| --- | --- | --- | --- |
| 1 | feat/strands-core-apify-tools | Core tools: apify_run_actor, apify_get_dataset_items, apify_run_actor_and_get_dataset, apify_run_task, apify_run_task_and_get_dataset, apify_scrape_url | Review first |
| 2 | feat/strands-search-crawling-actor-tools | Search & crawling tools (this PR) | Review after #1 merges |
| 3 | feat/strands-social-media-actor-tools | Social media tools: Instagram, LinkedIn, Twitter/X, TikTok, Facebook | Review after #2 merges |
| | strands docs repo | Documentation for all three tool sets | Link TBD |

Related Issues

#1
#2

Documentation PR

apify/docs#1

Type of Change

New Tool

Testing

  • Added unit tests for all five search & crawling tools, including:

    • Success paths with correct Actor ID and input mapping for each tool
    • Google Search Scraper: multi-page calculation, optional country/language codes, omission of unset params
    • Google Places Scraper: reviews toggle (maxReviews set to 0 when disabled), optional language
    • YouTube Scraper: search query, specific URLs, both combined, error when neither provided
    • Website Content Crawler: default parameters, URL validation
    • E-commerce Scraper: product vs. listing URL type routing, invalid url_type validation, URL validation
    • Missing dependency and missing token paths for each tool
    • Actor failure propagation
  • Ran hatch run prepare locally — no warnings or lint issues.
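The input-mapping behaviours listed above can be sketched as small pure functions. This is an illustration, not the PR's code: the function names, parameter names, defaults, the page size of 10, and every Actor field name except `maxReviews` are assumptions and may not match the Actors' actual input schemas.

```python
import math
from typing import Any, Optional

RESULTS_PER_PAGE = 10  # assumed page size for the multi-page calculation


def build_google_search_input(query: str, results_limit: int = 10,
                              country_code: Optional[str] = None,
                              language_code: Optional[str] = None) -> dict[str, Any]:
    run_input: dict[str, Any] = {
        "queries": query,
        "resultsPerPage": RESULTS_PER_PAGE,
        # Round up so a limit of 25 requests 3 pages, not 2.
        "maxPagesPerQuery": math.ceil(results_limit / RESULTS_PER_PAGE),
    }
    # Unset optional parameters are omitted rather than sent as None.
    if country_code:
        run_input["countryCode"] = country_code
    if language_code:
        run_input["languageCode"] = language_code
    return run_input


def build_google_places_input(query: str, results_limit: int = 20,
                              include_reviews: bool = False, max_reviews: int = 5,
                              language: Optional[str] = None) -> dict[str, Any]:
    run_input: dict[str, Any] = {
        "searchStringsArray": [query],
        "maxCrawledPlacesPerSearch": results_limit,
        # Reviews toggle: disabled means maxReviews is forced to 0.
        "maxReviews": max_reviews if include_reviews else 0,
    }
    if language:
        run_input["language"] = language
    return run_input


def build_youtube_input(search_query: Optional[str] = None,
                        urls: Optional[list[str]] = None,
                        results_limit: int = 10) -> dict[str, Any]:
    if not search_query and not urls:
        raise ValueError("Provide search_query, urls, or both.")
    run_input: dict[str, Any] = {"maxResults": results_limit}
    if search_query:
        run_input["searchQueries"] = [search_query]
    if urls:
        run_input["startUrls"] = [{"url": u} for u in urls]
    return run_input


def build_ecommerce_input(url: str, url_type: str = "product") -> dict[str, Any]:
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"Invalid URL: {url!r}")
    # Route the URL to a different input field depending on its type.
    field_by_type = {"product": "detailsUrls", "listing": "listingUrls"}
    if url_type not in field_by_type:
        raise ValueError(f"url_type must be one of {sorted(field_by_type)}")
    return {field_by_type[url_type]: [{"url": url}]}
```

Keeping the mapping in pure functions like these makes each bullet above checkable without a network call or an Apify token.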

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jirispilka left a comment

Looks good to me 💪🏻

I have one question though.

I'm not sure about the previous integrations. When we add specific Actors, shouldn't we follow the Actor description and Input parameters description from the Actor details?

I'm not saying they are the best for LLMs but we might want to keep it the same? For example, for google maps scraper, the description and input field is queries.

There are two approaches:

  1. Mirror the Actor schema — stay aligned with the Actor's own parameter names and descriptions

  2. Unified abstraction layer — normalize parameters across all scrapers, which actually makes sense given each scraper has slightly different parameters (different developers, different conventions)

Looking at all the scrapers, the unification layer in Strands might actually be good.

Anyway, I wanted to raise it here as we should have some clarity across all the integrations we are making.

@drobnikj What do you think?
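To make the two approaches concrete, here is a small sketch contrasting them. The Actor keys `actor_a` and `actor_b` and all field names are hypothetical placeholders, not real Actor schemas.

```python
from typing import Any

# Hypothetical per-Actor field names, standing in for schemas written by
# different developers with different conventions.
FIELD_MAPS = {
    "actor_a": {"query": "queries", "limit": "maxItems"},
    "actor_b": {"query": "searchTerms", "limit": "resultsLimit"},
}


def mirrored_input(**actor_fields: Any) -> dict[str, Any]:
    """Approach 1: the tool exposes the Actor's own field names and passes them through."""
    return dict(actor_fields)


def unified_input(actor: str, query: str, limit: int) -> dict[str, Any]:
    """Approach 2: the tool exposes one normalized signature and translates per Actor."""
    names = FIELD_MAPS[actor]
    return {names["query"]: query, names["limit"]: limit}


# The caller must know each Actor's vocabulary under approach 1...
a_mirrored = mirrored_input(queries="coffee", maxItems=5)
# ...but uses the same two parameters for every Actor under approach 2.
a_unified = unified_input("actor_a", query="coffee", limit=5)
b_unified = unified_input("actor_b", query="coffee", limit=5)
```

Under these assumptions both approaches send the same payload to a given Actor; the difference is whether the agent sees the Actor's vocabulary or a single normalized one.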

