title: Websearch
id: websearch-api
description: Web search engine for Haystack.
slug: /websearch-api

searchapi

SearchApiWebSearch

Uses SearchApi to search the web for relevant documents.

Usage example: {/* test-ignore */}

from haystack.components.websearch import SearchApiWebSearch
from haystack.utils import Secret

websearch = SearchApiWebSearch(top_k=10, api_key=Secret.from_env_var("SEARCHAPI_API_KEY"))
results = websearch.run(query="Who is the boyfriend of Olivia Wilde?")

assert results["documents"]
assert results["links"]

__init__

__init__(
    api_key: Secret = Secret.from_env_var("SEARCHAPI_API_KEY"),
    top_k: int | None = 10,
    allowed_domains: list[str] | None = None,
    search_params: dict[str, Any] | None = None,
) -> None

Initialize the SearchApiWebSearch component.

Parameters:

api_key (Secret) – API key for the SearchApi API.
top_k (int | None) – Number of documents to return.
allowed_domains (list[str] | None) – List of domains to limit the search to.
search_params (dict[str, Any] | None) – Additional parameters passed to the SearchApi API. For example, you can set 'num' to 100 to increase the number of search results. See the SearchApi website for more details.

The default search engine is Google. To use a different engine, set the engine parameter in search_params.
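As a rough illustration of how search_params can override the default engine, the sketch below merges caller-supplied parameters over a base set of request parameters. The helper `build_request_params`, the `"q"` key, and the `"bing"` engine value are assumptions made for this example. This is not the component's actual request-building code, so check the SearchApi documentation for the engines and parameters it supports.

```python
from __future__ import annotations

from typing import Any


def build_request_params(query: str, search_params: dict[str, Any] | None = None) -> dict[str, Any]:
    """Hypothetical helper: merge user-supplied search_params over assumed defaults."""
    # Assumed default, mirroring the documented behaviour that Google is used
    # unless `engine` is set in search_params.
    params: dict[str, Any] = {"engine": "google", "q": query}
    # Caller-supplied values take precedence over the defaults.
    params.update(search_params or {})
    return params


default_params = build_request_params("haystack pipelines")
custom_params = build_request_params("haystack pipelines", {"engine": "bing", "num": 100})
print(default_params["engine"], custom_params["engine"], custom_params["num"])
```

With the real component, the same dictionary would be passed as the search_params argument of SearchApiWebSearch.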

to_dict

to_dict() -> dict[str, Any]

Serializes the component to a dictionary.

Returns:

dict[str, Any] – Dictionary with serialized data.

from_dict

from_dict(data: dict[str, Any]) -> SearchApiWebSearch

Deserializes the component from a dictionary.

Parameters:

data (dict[str, Any]) – The dictionary to deserialize from.

Returns:

SearchApiWebSearch – The deserialized component.

run

run(query: str) -> dict[str, list[Document] | list[str]]

Uses SearchApi to search the web.

Parameters:

query (str) – Search query.

Returns:

dict[str, list[Document] | list[str]] – A dictionary with the following keys:
"documents": List of documents returned by the search engine.
"links": List of links returned by the search engine.

Raises:

TimeoutError – If the request to the SearchApi API times out.
SearchApiError – If an error occurs while querying the SearchApi API.
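Because run can raise TimeoutError, callers may want to retry a few times before giving up. The following is a minimal sketch of one way to do that. `run_with_retry` and `FlakySearch` are hypothetical names introduced for this example, and `FlakySearch` is only a stand-in that imitates the documented `run(query=...)` call and return keys so the snippet runs without an API key. Errors other than TimeoutError, such as SearchApiError, are left to propagate.

```python
from __future__ import annotations

import time
from typing import Any


def run_with_retry(component: Any, query: str, attempts: int = 3, delay: float = 1.0) -> dict[str, Any]:
    """Hypothetical helper: retry `component.run` when it raises TimeoutError."""
    for attempt in range(1, attempts + 1):
        try:
            return component.run(query=query)
        except TimeoutError:
            # Re-raise after the last attempt; any other exception propagates immediately.
            if attempt == attempts:
                raise
            time.sleep(delay)
    raise ValueError("attempts must be at least 1")


class FlakySearch:
    """Stand-in for a web search component: times out once, then succeeds."""

    def __init__(self) -> None:
        self.calls = 0

    def run(self, query: str) -> dict[str, Any]:
        self.calls += 1
        if self.calls == 1:
            raise TimeoutError("simulated timeout")
        return {"documents": [], "links": [f"https://example.com/?q={query}"]}


flaky = FlakySearch()
retry_results = run_with_retry(flaky, "haystack", delay=0.0)
```

In real use, an instance of SearchApiWebSearch would take the place of `FlakySearch`.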

run_async

run_async(query: str) -> dict[str, list[Document] | list[str]]

Asynchronously uses SearchApi to search the web.

This is the asynchronous version of the run method with the same parameters and return values.

Parameters:

query (str) – Search query.

Returns:

dict[str, list[Document] | list[str]] – A dictionary with the following keys:
"documents": List of documents returned by the search engine.
"links": List of links returned by the search engine.

Raises:

TimeoutError – If the request to the SearchApi API times out.
SearchApiError – If an error occurs while querying the SearchApi API.
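One reason to prefer run_async is that several queries can be awaited concurrently. The sketch below shows that pattern with `asyncio.gather`. `FakeAsyncSearch` and `search_many` are hypothetical names, and the fake class only imitates the documented `run_async(query=...)` signature and return keys so the example runs offline.

```python
from __future__ import annotations

import asyncio
from typing import Any


class FakeAsyncSearch:
    """Stand-in exposing the documented `run_async(query)` shape."""

    async def run_async(self, query: str) -> dict[str, Any]:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return {"documents": [], "links": [f"https://example.com/{query}"]}


async def search_many(component: Any, queries: list[str]) -> list[dict[str, Any]]:
    # gather() schedules the coroutines concurrently and preserves input order.
    return await asyncio.gather(*(component.run_async(query=q) for q in queries))


all_results = asyncio.run(search_many(FakeAsyncSearch(), ["a", "b"]))
```

With a real component, each element of `all_results` would be the dictionary described under Returns.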

serper_dev

SerperDevWebSearch

Uses Serper to search the web for relevant documents.

See the Serper Dev website for more details.

Usage example: {/* test-ignore */}

from haystack.components.websearch import SerperDevWebSearch
from haystack.utils import Secret

serper_dev_api = Secret.from_env_var("SERPERDEV_API_KEY")

websearch = SerperDevWebSearch(top_k=10, api_key=serper_dev_api)
results = websearch.run(query="Who is the boyfriend of Olivia Wilde?")

assert results["documents"]
assert results["links"]

# Example with domain filtering - exclude subdomains
websearch_filtered = SerperDevWebSearch(
    top_k=10,
    allowed_domains=["example.com"],
    exclude_subdomains=True,  # Only results from example.com, not blog.example.com
    api_key=serper_dev_api
)
results_filtered = websearch_filtered.run(query="search query")

__init__

__init__(
    api_key: Secret = Secret.from_env_var("SERPERDEV_API_KEY"),
    top_k: int | None = 10,
    allowed_domains: list[str] | None = None,
    search_params: dict[str, Any] | None = None,
    *,
    exclude_subdomains: bool = False
) -> None

Initialize the SerperDevWebSearch component.

Parameters:

api_key (Secret) – API key for the Serper API.
top_k (int | None) – Number of documents to return.
allowed_domains (list[str] | None) – List of domains to limit the search to.
exclude_subdomains (bool) – Whether to exclude subdomains when filtering by allowed_domains. If True, only results from the exact domains in allowed_domains will be returned. If False, results from subdomains will also be included. Defaults to False.
search_params (dict[str, Any] | None) – Additional parameters passed to the Serper API. For example, you can set 'num' to 20 to increase the number of search results. See the Serper website for more details.
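To make the exclude_subdomains semantics concrete, here is a small sketch of the matching rule as described above. `is_allowed` is a hypothetical helper written for this illustration, not the component's implementation. In particular, it treats `www.example.com` as a subdomain, which may differ from how the component behaves.

```python
from __future__ import annotations

from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: list[str], exclude_subdomains: bool = False) -> bool:
    """Hypothetical check mirroring the documented exclude_subdomains semantics."""
    host = (urlparse(url).hostname or "").lower()
    for domain in allowed_domains:
        domain = domain.lower()
        # An exact domain match is always accepted.
        if host == domain:
            return True
        # Subdomains only count when exclude_subdomains is False.
        if not exclude_subdomains and host.endswith("." + domain):
            return True
    return False


print(is_allowed("https://blog.example.com/post", ["example.com"]))
print(is_allowed("https://blog.example.com/post", ["example.com"], exclude_subdomains=True))
```

The first call accepts the subdomain; the second rejects it because only the exact domain is allowed.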

to_dict

to_dict() -> dict[str, Any]

Serializes the component to a dictionary.

Returns:

dict[str, Any] – Dictionary with serialized data.

from_dict

from_dict(data: dict[str, Any]) -> SerperDevWebSearch

Deserializes the component from a dictionary.

Parameters:

data (dict[str, Any]) – The dictionary to deserialize from.

Returns:

SerperDevWebSearch – The deserialized component.

run

run(query: str) -> dict[str, list[Document] | list[str]]

Uses Serper to search the web.

Parameters:

query (str) – Search query.

Returns:

dict[str, list[Document] | list[str]] – A dictionary with the following keys:
"documents": List of documents returned by the search engine.
"links": List of links returned by the search engine.

Raises:

SerperDevError – If an error occurs while querying the SerperDev API.
TimeoutError – If the request to the SerperDev API times out.
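Since "links" is a plain list of strings, it can be post-processed with the standard library alone. As a minimal sketch, the hypothetical helper `group_links_by_host` below groups returned links by hostname. The `sample_results` dictionary is made-up data shaped like the documented return value, not real search output.

```python
from __future__ import annotations

from collections import defaultdict
from urllib.parse import urlparse


def group_links_by_host(links: list[str]) -> dict[str, list[str]]:
    """Hypothetical helper: bucket result links by their hostname."""
    grouped: defaultdict[str, list[str]] = defaultdict(list)
    for link in links:
        # Links without a parseable hostname are collected under the empty string.
        host = urlparse(link).hostname or ""
        grouped[host].append(link)
    return dict(grouped)


# Made-up data in the documented shape: {"documents": [...], "links": [...]}
sample_results = {
    "documents": [],
    "links": [
        "https://example.com/a",
        "https://blog.example.com/b",
        "https://example.com/c",
    ],
}
links_by_host = group_links_by_host(sample_results["links"])
```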

run_async

run_async(query: str) -> dict[str, list[Document] | list[str]]

Asynchronously uses Serper to search the web.

This is the asynchronous version of the run method with the same parameters and return values.

Parameters:

query (str) – Search query.

Returns:

dict[str, list[Document] | list[str]] – A dictionary with the following keys:
"documents": List of documents returned by the search engine.
"links": List of links returned by the search engine.

Raises:

SerperDevError – If an error occurs while querying the SerperDev API.
TimeoutError – If the request to the SerperDev API times out.
