Commit 8255e41

feat: add Tavily web search integration
1 parent dbf2486 commit 8255e41

2 files changed: +144 -0 lines changed
integrations/tavily.md

Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
---
layout: integration
name: Tavily
description: Search the web using Tavily's AI-powered search API, optimized for LLM applications
authors:
  - name: deepset
    socials:
      github: deepset-ai
      twitter: deepset_ai
      linkedin: https://www.linkedin.com/company/deepset-ai/
pypi: https://pypi.org/project/tavily-haystack
repo: https://github.com/deepset-ai/haystack-core-integrations
type: Search & Extraction
report_issue: https://github.com/deepset-ai/haystack-core-integrations/issues
logo: /logos/tavily.png
version: Haystack 2.0
toc: true
---

### Table of Contents

- [Overview](#overview)
- [Installation](#installation)
- [Usage](#usage)
  - [TavilyWebSearch](#tavilywebsearch)
- [License](#license)

## Overview

[Tavily](https://tavily.com) is an AI-powered search API built for LLM applications. It returns structured results with relevant content and source URLs, without the noise of traditional search engines.

This integration provides:
- [`TavilyWebSearch`](https://docs.haystack.deepset.ai/docs/tavilywebsearch): Searches the web using the Tavily API and returns results as Haystack `Document` objects along with source URLs.

You need a Tavily API key to use this integration. You can get one at [tavily.com](https://tavily.com).

## Installation

```bash
pip install tavily-haystack
```

## Usage

### TavilyWebSearch

`TavilyWebSearch` queries the Tavily Search API and returns results as Haystack `Document` objects containing the content snippets and metadata (title, URL). Source URLs are also returned separately.

#### Basic Example

```python
from haystack_integrations.components.websearch.tavily import TavilyWebSearch

web_search = TavilyWebSearch(top_k=5)

result = web_search.run(query="What is Haystack by deepset?")
documents = result["documents"]
links = result["links"]
```

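To inspect what comes back, something like the following sketch could be used. It assumes each document exposes `content` and a `meta` dictionary with `title` and `url` keys, as the description above suggests; the exact key names are an assumption, so the helper falls back to placeholders when they are missing. `summarize_results` is a hypothetical helper, not part of `tavily-haystack`, and the sample objects are stand-ins so the snippet runs without an API key.

```python
from types import SimpleNamespace


def summarize_results(documents, links, max_chars=80):
    """Return one printable line per document: title, URL, and a content preview."""
    lines = []
    for index, doc in enumerate(documents, start=1):
        meta = doc.meta or {}
        # "title" and "url" are assumed meta keys; fall back if they are absent.
        title = meta.get("title", "(untitled)")
        url = meta.get("url", "(no url)")
        preview = (doc.content or "")[:max_chars]
        lines.append(f"{index}. {title} <{url}>: {preview}")
    lines.append(f"{len(links)} source link(s)")
    return lines


# Stand-in objects so the sketch runs offline; real values would come from
# result["documents"] and result["links"].
sample_documents = [
    SimpleNamespace(
        content="Haystack is an open-source framework.",
        meta={"title": "Haystack", "url": "https://haystack.deepset.ai"},
    ),
]
sample_links = ["https://haystack.deepset.ai"]

for line in summarize_results(sample_documents, sample_links):
    print(line)
```

With a real search, the call would be `summarize_results(documents, links)` using the variables from the example above.
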
By default, the component reads the API key from the `TAVILY_API_KEY` environment variable. You can also pass it explicitly:

```python
from haystack.utils import Secret
from haystack_integrations.components.websearch.tavily import TavilyWebSearch

web_search = TavilyWebSearch(
    api_key=Secret.from_token("your-api-key"),
    top_k=5,
    search_params={"search_depth": "advanced"},
)
```

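When relying on the environment variable, it can help to fail early with a clear message if the key is missing. The sketch below is one way to do that; `require_env_var` is a hypothetical helper, and the key value in the demonstration is a placeholder, not a real credential.

```python
import os


def require_env_var(name, env=None):
    """Return the value of an environment variable or raise a clear error."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before creating the component.")
    return value


# Demonstration with an explicit mapping instead of the real environment.
print(require_env_var("TAVILY_API_KEY", env={"TAVILY_API_KEY": "placeholder-key"}))
```

Calling `require_env_var("TAVILY_API_KEY")` without the `env` argument checks the real process environment.
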
#### In a Pipeline

Here is an example of a RAG pipeline that uses `TavilyWebSearch` to retrieve web content and answer a question:

```python
from haystack import Pipeline
from haystack.utils import Secret
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.websearch.tavily import TavilyWebSearch

web_search = TavilyWebSearch(top_k=3)

# The user message is a Jinja template filled with the retrieved documents and the query.
prompt_template = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user(
        "Given the information below:\n"
        "{% for document in documents %}{{ document.content }}\n{% endfor %}\n"
        "Answer the following question: {{ query }}.\nAnswer:",
    ),
]

prompt_builder = ChatPromptBuilder(
    template=prompt_template,
    required_variables={"query", "documents"},
)

llm = OpenAIChatGenerator(
    api_key=Secret.from_env_var("OPENAI_API_KEY"),
    model="gpt-4o-mini",
)

pipe = Pipeline()
pipe.add_component("search", web_search)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)

# Search results feed the prompt; the rendered prompt feeds the LLM.
pipe.connect("search.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.messages")

query = "What is Haystack by deepset?"
result = pipe.run(data={"search": {"query": query}, "prompt_builder": {"query": query}})
# Recent Haystack 2.x releases expose the reply text as `.text`.
print(result["llm"]["replies"][0].text)
```

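Because the component also returns source URLs, an answer can be shown together with its sources. The following is a minimal sketch of that idea; `with_sources` is a hypothetical helper and the URLs are illustrative.

```python
def with_sources(answer, links):
    """Append a numbered, de-duplicated source list to an answer string."""
    unique_links = list(dict.fromkeys(links))  # drops repeats, keeps first-seen order
    if not unique_links:
        return answer
    sources = "\n".join(f"[{i}] {url}" for i, url in enumerate(unique_links, start=1))
    return f"{answer}\n\nSources:\n{sources}"


print(with_sources(
    "Haystack is an open-source framework.",
    ["https://haystack.deepset.ai", "https://haystack.deepset.ai", "https://deepset.ai"],
))
```

In the pipeline above, the links would presumably be available as `result["search"]["links"]`, since that output is not connected to another component; this is an assumption worth verifying against the run output.
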
#### Async Support

The component supports asynchronous execution via `run_async`:

```python
import asyncio
from haystack_integrations.components.websearch.tavily import TavilyWebSearch

async def main():
    web_search = TavilyWebSearch(top_k=3)
    result = await web_search.run_async(query="What is Haystack by deepset?")
    print(f"Found {len(result['documents'])} documents")

asyncio.run(main())
```

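One reason to use `run_async` is to issue several searches concurrently. The sketch below shows the pattern with `asyncio.gather`; `FakeWebSearch` is a stand-in that only mimics the `run_async(query=...)` call shape so the snippet runs offline, and `search_many` is a hypothetical helper.

```python
import asyncio


class FakeWebSearch:
    """Stand-in with the same run_async(query=...) shape; not part of tavily-haystack."""

    async def run_async(self, query):
        await asyncio.sleep(0)  # simulate awaiting a network call
        return {"documents": [f"result for {query}"], "links": []}


async def search_many(web_search, queries):
    """Run one search per query concurrently and map each query to its result."""
    results = await asyncio.gather(*(web_search.run_async(query=q) for q in queries))
    return dict(zip(queries, results))


results = asyncio.run(search_many(FakeWebSearch(), ["haystack", "tavily"]))
for query, result in results.items():
    print(query, len(result["documents"]))
```

Passing a `TavilyWebSearch` instance instead of `FakeWebSearch` should work the same way, subject to the API's rate limits.
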
#### Parameters

- **`api_key`**: API key for Tavily. Defaults to the `TAVILY_API_KEY` environment variable.
- **`top_k`**: Maximum number of results to return. Defaults to 10.
- **`search_params`**: Additional parameters passed to the Tavily Search API. Supported keys include `search_depth`, `include_answer`, `include_raw_content`, `include_domains`, `exclude_domains`. See the [Tavily API reference](https://docs.tavily.com/docs/tavily-api/rest_api) for all available options.

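Since `search_params` is a plain dictionary, it can be assembled from optional settings before the component is created. The sketch below uses three of the keys listed above; `build_search_params` is a hypothetical helper, and the domain value is only an example.

```python
def build_search_params(search_depth=None, include_domains=None, exclude_domains=None):
    """Collect only the options that were set into a search_params dict."""
    candidates = {
        "search_depth": search_depth,
        "include_domains": list(include_domains) if include_domains else None,
        "exclude_domains": list(exclude_domains) if exclude_domains else None,
    }
    # Leave out unset options so the API falls back to its own defaults.
    return {key: value for key, value in candidates.items() if value is not None}


params = build_search_params(search_depth="advanced", include_domains=["deepset.ai"])
print(params)
```

The resulting dictionary can then be passed as `TavilyWebSearch(search_params=params)`.
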
## License

`tavily-haystack` is distributed under the terms of the [Apache-2.0](https://spdx.org/licenses/Apache-2.0.html) license.

logos/tavily.png

1.07 KB