Commit 93a986b

SyedShahmeerAli12, anakin87, pandego, davidsbatista, and HaystackBot authored
docs: add TavilyWebSearch component page and external integration entry (#10954)
* docs: add TavilyWebSearch component page and external integration entry
* fix: `CountDocumentsAsyncTest`, `WriteDocumentsAsyncTest`, `WriteDocumentsAsyncTest` (#10948)
* fix: address #10917
* removing lazyimport + solving MRO conflict

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* docs: remove gpt-3.5-turbo mentions and use ChatMessage.txt (no content) (#10958)
* fix: `DeleteAllAsyncTest`, `DeleteByFilterAsyncTest`, (#10952)
* fix: address #10919
* adding delete_all_documents_async missing in InMemoryDocumentStore

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* Sync Haystack API reference on Docusaurus (#10959)

Co-authored-by: davidsbatista <7937824+davidsbatista@users.noreply.github.com>

* fix: `UpdateByFilterAsyncTest`, `CountDocumentsByFilterAsyncTest`, `CountUniqueMetadataByFilterAsyncTest` (#10953)
* fix: address #10920
* formatting

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>

* Sync Haystack API reference on Docusaurus (#10961)

Co-authored-by: davidsbatista <7937824+davidsbatista@users.noreply.github.com>

* docs: fixing code snippets syntax errors (#10955)
* fixing docs syntax errors
* fixing a few more docs syntax errors
* feat: add get_meta_data async mixin tests to haystack.testing + InMemoryDocumentStore async operations and tests (#10963)
* adding get_metadata async related Mixin tests
* adding get_metadata async methods to the InMemoryDocumentStore
* using Mixin async metadata tests to InMemoryDocumentstore tests
* adding release notes
* double ticks in release notes
* Update haystack/testing/document_store_async.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* Sync Haystack API reference on Docusaurus (#10962)

Co-authored-by: davidsbatista <7937824+davidsbatista@users.noreply.github.com>

* docs: update llama.cpp repo links from ggerganov to ggml-org (#10964)
* Sync Core Integrations API reference (nvidia) on Docusaurus (#10974)

Co-authored-by: anakin87 <44616784+anakin87@users.noreply.github.com>

* build: switch to trusted publishing (#10976)
* test: adding mixing filter async tests + implementing them in InMemoryDocumentStore tests (#10975)
* docs: address reviewer feedback on TavilyWebSearch docs
  - Fix pipeline position description (remove LinkContentFetcher reference)
  - Remove hardcoded model name to avoid future maintenance
  - Fix .content -> .text (field was removed)
  - Move Tavily entry from external-integrations-websearch.mdx to websearch.mdx
  - Copy tavilywebsearch.mdx to versioned_docs/version-2.26
  - Add tavilywebsearch to sidebars.js and version-2.26-sidebars.json

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
Co-authored-by: Miguel Miranda Dias <7780875+pandego@users.noreply.github.com>
Co-authored-by: David S. Batista <dsbatista@gmail.com>
Co-authored-by: Haystack Bot <73523382+HaystackBot@users.noreply.github.com>
Co-authored-by: davidsbatista <7937824+davidsbatista@users.noreply.github.com>
Co-authored-by: SATISH K C <157192662+satishkc7@users.noreply.github.com>
Co-authored-by: anakin87 <44616784+anakin87@users.noreply.github.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent aafc3a3 commit 93a986b

7 files changed

Lines changed: 211 additions & 1 deletion

docs-website/docs/pipeline-components/websearch.mdx

Lines changed: 1 addition & 0 deletions
@@ -14,3 +14,4 @@ Use these components to look up answers on the internet.
 | [FirecrawlWebSearch](websearch/firecrawlwebsearch.mdx) | Search engine using the Firecrawl API. |
 | [SearchApiWebSearch](websearch/searchapiwebsearch.mdx) | Search engine using Search API. |
 | [SerperDevWebSearch](websearch/serperdevwebsearch.mdx) | Search engine using SerperDev API. |
+| [TavilyWebSearch](websearch/tavilywebsearch.mdx) | Search engine using the Tavily AI-powered search API. |

docs-website/docs/pipeline-components/websearch/external-integrations-websearch.mdx

Lines changed: 1 addition & 1 deletion
@@ -13,4 +13,4 @@ External integrations that enable websearch with Haystack.
 | --- | --- |
 | [DuckDuckGo](https://haystack.deepset.ai/integrations/duckduckgo-api-websearch) | Use DuckDuckGo API for web searches. |
 | [Exa](https://haystack.deepset.ai/integrations/exa) | Search the web with Exa's AI-powered search, get content, answers, and conduct deep research. |
-| [Serpex](https://haystack.deepset.ai/integrations/serpex) | Multi-engine web search for Haystack — access Google, Bing, DuckDuckGo, Brave, Yahoo, and Yandex via Serpex API. |
+| [Serpex](https://haystack.deepset.ai/integrations/serpex) | Multi-engine web search for Haystack — access Google, Bing, DuckDuckGo, Brave, Yahoo, and Yandex via Serpex API. |
Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
+---
+title: "TavilyWebSearch"
+id: tavilywebsearch
+slug: "/tavilywebsearch"
+description: "Search engine using the Tavily AI-powered search API."
+---
+
+# TavilyWebSearch
+
+Search the web using the Tavily AI-powered search API, optimized for LLM applications.
+
+<div className="key-value-table">
+
+| | |
+| --- | --- |
+| **Most common position in a pipeline** | Before a [`ChatPromptBuilder`](../builders/chatpromptbuilder.mdx) or right at the beginning of an indexing pipeline |
+| **Mandatory init variables** | `api_key`: The Tavily API key. Can be set with the `TAVILY_API_KEY` env var. |
+| **Mandatory run variables** | `query`: A string with your search query. |
+| **Output variables** | `documents`: A list of Haystack Documents containing search result content and metadata. <br /> <br />`links`: A list of strings of resulting URLs. |
+| **API reference** | [Tavily Search API](/reference/integrations-tavily) |
+| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/blob/main/integrations/tavily/src/haystack_integrations/components/websearch/tavily/tavily_websearch.py |
+
+</div>
+
+## Overview
+
+When you give `TavilyWebSearch` a query, it uses the [Tavily](https://tavily.com) Search API to search the web and return relevant content as Haystack `Document` objects. It also returns a list of the source URLs.
+
+Tavily is an AI-powered search API built specifically for LLM applications. It returns clean, relevant snippets without the noise of traditional search engines, making it a great fit for RAG pipelines.
+
+`TavilyWebSearch` requires a Tavily API key to work. By default, it looks for a `TAVILY_API_KEY` environment variable. Alternatively, you can pass an `api_key` directly during initialization.
+
+## Usage
+
+### On its own
+
+Here is a quick example of how `TavilyWebSearch` searches the web based on a query and returns a list of Documents.
+
+```python
+from haystack_integrations.components.websearch.tavily import TavilyWebSearch
+from haystack.utils import Secret
+
+web_search = TavilyWebSearch(
+    api_key=Secret.from_env_var("TAVILY_API_KEY"),
+    top_k=5,
+)
+query = "What is Haystack by deepset?"
+
+response = web_search.run(query=query)
+
+for doc in response["documents"]:
+    print(doc.content)
+```
+
+### In a pipeline
+
+Here is an example of a Retrieval-Augmented Generation (RAG) pipeline that uses `TavilyWebSearch` to look up an answer on the web.
+
+```python
+from haystack import Pipeline
+from haystack.utils import Secret
+from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
+from haystack.components.generators.chat import OpenAIChatGenerator
+from haystack_integrations.components.websearch.tavily import TavilyWebSearch
+from haystack.dataclasses import ChatMessage
+
+web_search = TavilyWebSearch(
+    api_key=Secret.from_env_var("TAVILY_API_KEY"),
+    top_k=3,
+)
+
+prompt_template = [
+    ChatMessage.from_system("You are a helpful assistant."),
+    ChatMessage.from_user(
+        "Given the information below:\n"
+        "{% for document in documents %}{{ document.content }}\n{% endfor %}\n"
+        "Answer the following question: {{ query }}.\nAnswer:",
+    ),
+]
+
+prompt_builder = ChatPromptBuilder(
+    template=prompt_template,
+    required_variables={"query", "documents"},
+)
+
+llm = OpenAIChatGenerator(
+    api_key=Secret.from_env_var("OPENAI_API_KEY"),
+)
+
+pipe = Pipeline()
+pipe.add_component("search", web_search)
+pipe.add_component("prompt_builder", prompt_builder)
+pipe.add_component("llm", llm)
+
+pipe.connect("search.documents", "prompt_builder.documents")
+pipe.connect("prompt_builder.prompt", "llm.messages")
+
+query = "What is Haystack by deepset?"
+
+result = pipe.run(data={"search": {"query": query}, "prompt_builder": {"query": query}})
+
+print(result["llm"]["replies"][0].text)
+```
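
The `pipe.run` call in the pipeline example above passes the same query to both the `search` and `prompt_builder` components. A minimal sketch of a helper that builds that input dictionary follows. `build_run_data` is a hypothetical name and not part of Haystack; the default component names are the ones registered with `add_component` in the example.

```python
from typing import Dict, Iterable


def build_run_data(
    query: str,
    components: Iterable[str] = ("search", "prompt_builder"),
) -> Dict[str, Dict[str, str]]:
    """Build the `data` argument for `Pipeline.run`.

    Hypothetical helper: it maps every listed component name to
    {"query": query}, mirroring the dictionary written by hand in the example.
    """
    return {name: {"query": query} for name in components}


data = build_run_data("What is Haystack by deepset?")
print(data)
```

With the pipeline from the example, this would be used as `pipe.run(data=build_run_data(query))`.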

docs-website/sidebars.js

Lines changed: 1 addition & 0 deletions
@@ -613,6 +613,7 @@ export default {
 'pipeline-components/websearch/firecrawlwebsearch',
 'pipeline-components/websearch/searchapiwebsearch',
 'pipeline-components/websearch/serperdevwebsearch',
+'pipeline-components/websearch/tavilywebsearch',
 'pipeline-components/websearch/external-integrations-websearch',
 ],
 },

docs-website/versioned_docs/version-2.26/pipeline-components/websearch.mdx

Lines changed: 1 addition & 0 deletions
@@ -14,3 +14,4 @@ Use these components to look up answers on the internet.
 | [FirecrawlWebSearch](websearch/firecrawlwebsearch.mdx) | Search engine using the Firecrawl API. |
 | [SearchApiWebSearch](websearch/searchapiwebsearch.mdx) | Search engine using Search API. |
 | [SerperDevWebSearch](websearch/serperdevwebsearch.mdx) | Search engine using SerperDev API. |
+| [TavilyWebSearch](websearch/tavilywebsearch.mdx) | Search engine using the Tavily AI-powered search API. |
Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
+---
+title: "TavilyWebSearch"
+id: tavilywebsearch
+slug: "/tavilywebsearch"
+description: "Search engine using the Tavily AI-powered search API."
+---
+
+# TavilyWebSearch
+
+Search the web using the Tavily AI-powered search API, optimized for LLM applications.
+
+<div className="key-value-table">
+
+| | |
+| --- | --- |
+| **Most common position in a pipeline** | Before a [`ChatPromptBuilder`](../builders/chatpromptbuilder.mdx) or right at the beginning of an indexing pipeline |
+| **Mandatory init variables** | `api_key`: The Tavily API key. Can be set with the `TAVILY_API_KEY` env var. |
+| **Mandatory run variables** | `query`: A string with your search query. |
+| **Output variables** | `documents`: A list of Haystack Documents containing search result content and metadata. <br /> <br />`links`: A list of strings of resulting URLs. |
+| **API reference** | [Tavily Search API](/reference/integrations-tavily) |
+| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/blob/main/integrations/tavily/src/haystack_integrations/components/websearch/tavily/tavily_websearch.py |
+
+</div>
+
+## Overview
+
+When you give `TavilyWebSearch` a query, it uses the [Tavily](https://tavily.com) Search API to search the web and return relevant content as Haystack `Document` objects. It also returns a list of the source URLs.
+
+Tavily is an AI-powered search API built specifically for LLM applications. It returns clean, relevant snippets without the noise of traditional search engines, making it a great fit for RAG pipelines.
+
+`TavilyWebSearch` requires a Tavily API key to work. By default, it looks for a `TAVILY_API_KEY` environment variable. Alternatively, you can pass an `api_key` directly during initialization.
+
+## Usage
+
+### On its own
+
+Here is a quick example of how `TavilyWebSearch` searches the web based on a query and returns a list of Documents.
+
+```python
+from haystack_integrations.components.websearch.tavily import TavilyWebSearch
+from haystack.utils import Secret
+
+web_search = TavilyWebSearch(
+    api_key=Secret.from_env_var("TAVILY_API_KEY"),
+    top_k=5,
+)
+query = "What is Haystack by deepset?"
+
+response = web_search.run(query=query)
+
+for doc in response["documents"]:
+    print(doc.content)
+```
+
+### In a pipeline
+
+Here is an example of a Retrieval-Augmented Generation (RAG) pipeline that uses `TavilyWebSearch` to look up an answer on the web.
+
+```python
+from haystack import Pipeline
+from haystack.utils import Secret
+from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
+from haystack.components.generators.chat import OpenAIChatGenerator
+from haystack_integrations.components.websearch.tavily import TavilyWebSearch
+from haystack.dataclasses import ChatMessage
+
+web_search = TavilyWebSearch(
+    api_key=Secret.from_env_var("TAVILY_API_KEY"),
+    top_k=3,
+)
+
+prompt_template = [
+    ChatMessage.from_system("You are a helpful assistant."),
+    ChatMessage.from_user(
+        "Given the information below:\n"
+        "{% for document in documents %}{{ document.content }}\n{% endfor %}\n"
+        "Answer the following question: {{ query }}.\nAnswer:",
+    ),
+]
+
+prompt_builder = ChatPromptBuilder(
+    template=prompt_template,
+    required_variables={"query", "documents"},
+)
+
+llm = OpenAIChatGenerator(
+    api_key=Secret.from_env_var("OPENAI_API_KEY"),
+)
+
+pipe = Pipeline()
+pipe.add_component("search", web_search)
+pipe.add_component("prompt_builder", prompt_builder)
+pipe.add_component("llm", llm)
+
+pipe.connect("search.documents", "prompt_builder.documents")
+pipe.connect("prompt_builder.prompt", "llm.messages")
+
+query = "What is Haystack by deepset?"
+
+result = pipe.run(data={"search": {"query": query}, "prompt_builder": {"query": query}})
+
+print(result["llm"]["replies"][0].text)
+```
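
The page above notes that `TavilyWebSearch` looks for a `TAVILY_API_KEY` environment variable by default. As a rough sketch, a script could check for that variable before constructing the component. `tavily_key_available` is a hypothetical helper, not part of Haystack or the Tavily integration, and it uses only the standard library.

```python
import os
from typing import Mapping, Optional


def tavily_key_available(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if TAVILY_API_KEY is set to a non-blank value.

    Hypothetical helper: `environ` defaults to the process environment and
    can be replaced with a plain dict for testing.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("TAVILY_API_KEY", "").strip())


if not tavily_key_available():
    print("TAVILY_API_KEY is not set; export it before running the examples.")
```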

docs-website/versioned_sidebars/version-2.26-sidebars.json

Lines changed: 1 addition & 0 deletions
@@ -609,6 +609,7 @@
 "pipeline-components/websearch/firecrawlwebsearch",
 "pipeline-components/websearch/searchapiwebsearch",
 "pipeline-components/websearch/serperdevwebsearch",
+"pipeline-components/websearch/tavilywebsearch",
 "pipeline-components/websearch/external-integrations-websearch"
 ]
 },
