Commit 978d0e7

chore: add faiss integrations page (#407)
1 parent 0ec26f7 commit 978d0e7

1 file changed

Lines changed: 119 additions & 0 deletions

integrations/faiss.md

@@ -0,0 +1,119 @@
---
layout: integration
name: FAISS
description: A Document Store for vector search using FAISS
authors:
  - name: Guna Palanivel
    socials:
      github: GunaPalanivel
  - name: deepset
    socials:
      github: deepset-ai
      twitter: deepset_ai
      linkedin: https://www.linkedin.com/company/deepset-ai/
pypi: https://pypi.org/project/faiss-haystack
repo: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/faiss
type: Document Store
report_issue: https://github.com/deepset-ai/haystack-core-integrations/issues
logo: /logos/meta.png
version: Haystack 2.0
toc: true
---

The integration provides `FAISSDocumentStore`, which uses [FAISS](https://github.com/facebookresearch/faiss) (Facebook AI Similarity Search) for vector search and a simple JSON file for metadata storage. It suits small to medium-sized datasets where simplicity matters more than scalability. Persistence is optional: the store can save the FAISS index to a `.faiss` file and the documents to a `.json` file. Use `FAISSEmbeddingRetriever` for semantic retrieval in your pipelines.

## Installation

Install the package with pip:

```bash
pip install faiss-haystack
```

For GPU-accelerated FAISS, install `faiss-gpu` separately and use it in place of the default `faiss-cpu` dependency where applicable.

The examples below use [Sentence Transformers](https://www.sbert.net/) for embeddings. Install with: `pip install "sentence-transformers>=5.0.0"`.

## Usage

### In-memory document store

Create an in-memory document store (no persistence):

```python
from haystack_integrations.document_stores.faiss import FAISSDocumentStore

document_store = FAISSDocumentStore(embedding_dim=768)
```
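
A FAISS index has a fixed dimensionality, so `embedding_dim` has to match the length of the vectors your embedder produces. The sketch below is a plain-Python sanity check you could run on embeddings before writing them. `check_embedding_dim` is a hypothetical helper introduced here for illustration, not part of `faiss-haystack`.

```python
from typing import Sequence


def check_embedding_dim(embeddings: Sequence[Sequence[float]], expected_dim: int) -> None:
    """Raise ValueError if any embedding's length differs from expected_dim."""
    for position, vector in enumerate(embeddings):
        if len(vector) != expected_dim:
            raise ValueError(
                f"embedding at position {position} has {len(vector)} dimensions, "
                f"expected {expected_dim}"
            )


# Toy 4-dimensional vectors keep the example readable; real ones would be 768 long.
check_embedding_dim([[0.1, 0.2, 0.3, 0.4], [0.0, 0.0, 1.0, 0.0]], expected_dim=4)
```

Failing early with a clear message is usually easier to debug than a shape error raised deep inside the index.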

### Persisted document store

To save and load the index and documents from disk, pass `index_path`:

```python
from haystack_integrations.document_stores.faiss import FAISSDocumentStore

document_store = FAISSDocumentStore(
    index_path="./my_faiss_index",
    index_string="Flat",
    embedding_dim=768,
)
# After writing documents, persist with:
# document_store.save("./my_faiss_index")
# Later, create the store with the same index_path to load from disk.
```
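
In FAISS, a `"Flat"` index stores the raw vectors and answers queries by comparing the query against every stored vector, which is exact but scales linearly with the number of documents. The following is a minimal pure-Python sketch of that idea, not the library's implementation. It assumes squared L2 distance for illustration (the metric the integration actually configures may differ), and `flat_search` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def flat_search(
    query: Sequence[float],
    vectors: Sequence[Sequence[float]],
    top_k: int,
) -> List[Tuple[int, float]]:
    """Return (position, squared L2 distance) pairs for the top_k closest vectors."""
    scored = []
    for position, vector in enumerate(vectors):
        # Assumption: squared L2 distance; smaller means more similar.
        distance = sum((q - v) ** 2 for q, v in zip(query, vector))
        scored.append((position, distance))
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


stored = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
hits = flat_search([0.9, 0.1], stored, top_k=2)
print([position for position, _ in hits])  # positions of the two nearest vectors
```

Because every query touches every vector, this approach fits the small to medium-sized datasets the integration targets.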

### Writing documents

Use an indexing pipeline to write documents (with embeddings) to the store.
This example uses Sentence Transformers embeddings with 768 dimensions, matching `embedding_dim=768`.

```python
from haystack import Pipeline
from haystack.components.converters import TextFileToDocument
from haystack.components.writers import DocumentWriter
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack_integrations.document_stores.faiss import FAISSDocumentStore

document_store = FAISSDocumentStore(
    index_path="./my_faiss_index",
    index_string="Flat",
    embedding_dim=768,
)

# Placeholder list of text files to index; replace with your own paths.
file_paths = ["path/to/your_file.txt"]

indexing = Pipeline()
indexing.add_component("converter", TextFileToDocument())
indexing.add_component("embedder", SentenceTransformersDocumentEmbedder())
indexing.add_component("writer", DocumentWriter(document_store))
indexing.connect("converter", "embedder")
indexing.connect("embedder", "writer")
indexing.run({"converter": {"sources": file_paths}})

# If using persistence, save after indexing
# document_store.save("./my_faiss_index")
```
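
The indexing run expects `file_paths` to be a list of sources. One way to build it, sketched here with the standard library only, is to collect text files from a directory. `collect_text_files` and the `./docs` directory are hypothetical names chosen for this example.

```python
from pathlib import Path
from typing import List


def collect_text_files(root: str, pattern: str = "*.txt") -> List[Path]:
    """Return files under root (recursively) matching pattern, sorted for a stable order."""
    root_path = Path(root)
    if not root_path.is_dir():
        # A missing directory yields an empty list instead of an error.
        return []
    return sorted(path for path in root_path.rglob(pattern) if path.is_file())


file_paths = collect_text_files("./docs")  # "./docs" is a placeholder directory
print(f"found {len(file_paths)} text file(s)")
```

Sorting keeps the indexing order reproducible between runs, which helps when comparing results.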

### Retrieval with FAISSEmbeddingRetriever

Build a query pipeline using `FAISSEmbeddingRetriever` for semantic search:

```python
from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack_integrations.components.retrievers.faiss import FAISSEmbeddingRetriever

# `document_store` is the FAISSDocumentStore populated in the indexing example.
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
query_pipeline.add_component(
    "retriever",
    FAISSEmbeddingRetriever(document_store=document_store, top_k=10),
)
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

results = query_pipeline.run({"text_embedder": {"text": "your query"}})
documents = results["retriever"]["documents"]
```
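
The retriever returns a list of Haystack `Document` objects, which expose `content` and `score` attributes. A small formatting helper, sketched below, can make the results easier to inspect. `format_hits` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the snippet runs without Haystack installed.

```python
from types import SimpleNamespace
from typing import Iterable, List


def format_hits(documents: Iterable, max_chars: int = 80) -> List[str]:
    """Render one line per document: rank, score, and a content preview."""
    lines = []
    for rank, doc in enumerate(documents, start=1):
        score = getattr(doc, "score", None)
        score_text = f"{score:.3f}" if score is not None else "n/a"
        preview = (getattr(doc, "content", None) or "")[:max_chars]
        lines.append(f"{rank}. [{score_text}] {preview}")
    return lines


# Stand-in objects with the same attribute names as a Haystack Document.
sample = [
    SimpleNamespace(content="FAISS is a library for similarity search.", score=0.91),
    SimpleNamespace(content="Haystack pipelines connect components.", score=None),
]
for line in format_hits(sample):
    print(line)
```

In a real pipeline you would pass the `documents` list from the query result instead of `sample`.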

## License

`faiss-haystack` is distributed under the terms of the [Apache-2.0](https://spdx.org/licenses/Apache-2.0.html) license.
