Commit f13f7e9

Add ArangoDB integration page (#511)
* Add ArangoDB integration page
* Add ArangoDB integration page
* Add logo for ArangoDB integration
* Update ArangoDB integration logo
1 parent c124941 commit f13f7e9

2 files changed

Lines changed: 120 additions & 0 deletions

@@ -0,0 +1,120 @@
---
layout: integration
name: ArangoDB
description: Use the ArangoDB database as a Document Store with Haystack
authors:
  - name: deepset
    socials:
      github: deepset-ai
      twitter: Haystack_AI
      linkedin: https://www.linkedin.com/company/deepset-ai/
pypi: https://pypi.org/project/arangodb-haystack/
repo: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/arangodb
type: Document Store
report_issue: https://github.com/deepset-ai/haystack-core-integrations/issues
logo: /logos/arangodb.png
version: Haystack 2.0
toc: true
---

[![PyPI - Version](https://img.shields.io/pypi/v/arangodb-haystack.svg)](https://pypi.org/project/arangodb-haystack/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/arangodb-haystack.svg)](https://pypi.org/project/arangodb-haystack/)
[![test](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/arangodb.yml/badge.svg)](https://github.com/deepset-ai/haystack-core-integrations/actions/workflows/arangodb.yml)

-----

**Table of Contents**

- [Overview](#overview)
- [Installation](#installation)
- [Usage](#usage)
- [License](#license)

## Overview

[ArangoDB](https://arango.ai/) is an open-source, multi-model database that combines documents, graphs, and key/values with native vector search. This integration lets you use ArangoDB as a [Document Store](https://docs.haystack.deepset.ai/docs/document-store) in Haystack and retrieve documents with vector similarity search, which makes it a good fit for RAG and GraphRAG pipelines.

The integration provides two components:

- `ArangoDocumentStore`: a Document Store that stores documents (including their embeddings) in an ArangoDB collection and implements the [DocumentStore protocol](https://docs.haystack.deepset.ai/docs/document-store#documentstore-protocol).
- `ArangoEmbeddingRetriever`: a [retriever](https://docs.haystack.deepset.ai/docs/retrievers) that fetches the most relevant documents from an `ArangoDocumentStore` using vector similarity on embeddings.

## Installation

Vector search requires ArangoDB 3.12 or later, started with the vector index enabled (the `--vector-index` option in the command below). You can quickly start a local instance with Docker:

```bash
docker run -e ARANGO_ROOT_PASSWORD=test-password -p 8529:8529 arangodb:3.12 arangod --vector-index
```

Install the integration with `pip`:

```bash
pip install arangodb-haystack
```

## Usage

By default, the `ArangoDocumentStore` reads its credentials from the `ARANGO_USERNAME` and `ARANGO_PASSWORD` environment variables. `ARANGO_USERNAME` is optional and falls back to the `root` user:

```bash
export ARANGO_PASSWORD="test-password"
```
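
Before creating the store, it can help to check that the environment is set up as described above. The following is a minimal sketch of that check; `resolve_arango_credentials` is a hypothetical helper written for illustration and is not part of `arangodb-haystack`. It also assumes that a missing password should be treated as an error, which the text implies but does not state.

```python
import os
from typing import Mapping, Tuple


def resolve_arango_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (username, password) following the rules described above.

    Hypothetical helper for illustration; not part of arangodb-haystack.
    """
    # ARANGO_USERNAME is optional and falls back to the root user.
    username = env.get("ARANGO_USERNAME", "root")
    # Assumption: ARANGO_PASSWORD has no fallback, so fail early if it is missing.
    password = env.get("ARANGO_PASSWORD")
    if not password:
        raise RuntimeError("Set ARANGO_PASSWORD before creating the ArangoDocumentStore.")
    return username, password
```

Calling `resolve_arango_credentials()` with no arguments reads the current process environment; passing a plain dictionary instead makes the check easy to exercise in tests.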

Then initialize the Document Store:

```python
from haystack_integrations.document_stores.arangodb import ArangoDocumentStore

document_store = ArangoDocumentStore(
    host="http://localhost:8529",
    database="haystack",
    collection_name="haystack_documents",
    # Must match the dimensionality of the embeddings you write to the store.
    embedding_dimension=768,
    similarity_function="cosine",
    recreate_collection=True,
)
```
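
To make the `similarity_function="cosine"` setting concrete, here is a small pure-Python sketch of cosine similarity between two embeddings. It only illustrates the metric; it is not how ArangoDB's vector index computes or approximates it. The dimension check shows why every stored vector and every query vector need the same length as `embedding_dimension`.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 for the same direction, -1.0 for opposite."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

Because the result depends only on direction, scaling a vector does not change its similarity to another vector.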

### Writing Documents to ArangoDocumentStore

To write documents to the `ArangoDocumentStore`, create an indexing pipeline that embeds and writes documents:

```python
from haystack import Pipeline
from haystack.components.converters import TextFileToDocument
from haystack.components.writers import DocumentWriter
from haystack.components.embedders import SentenceTransformersDocumentEmbedder

indexing = Pipeline()
indexing.add_component("converter", TextFileToDocument())
indexing.add_component("embedder", SentenceTransformersDocumentEmbedder())
indexing.add_component("writer", DocumentWriter(document_store))
indexing.connect("converter", "embedder")
indexing.connect("embedder", "writer")

# file_paths is a list of paths to the text files to index; define it before running.
indexing.run({"converter": {"sources": file_paths}})
```
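
The indexing snippet expects a `file_paths` list that it does not define. One way to build it, sketched here with the standard library only, is to collect text files from a directory. The helper name `collect_text_files`, the `*.txt` pattern, and the `data` directory in the usage note are assumptions made for illustration.

```python
from pathlib import Path
from typing import List


def collect_text_files(directory: str, pattern: str = "*.txt") -> List[Path]:
    """Return all files under `directory` (recursively) that match `pattern`, sorted by path."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory} is not a directory")
    # Sorting keeps the indexing order stable between runs.
    return sorted(p for p in root.rglob(pattern) if p.is_file())
```

With a hypothetical `data` folder, `file_paths = collect_text_files("data")` yields a list that can be passed as `sources` to the converter.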

### Retrieval from ArangoDocumentStore

You can retrieve documents that are semantically similar to a query with a pipeline that uses the `ArangoEmbeddingRetriever`:

```python
from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack_integrations.components.retrievers.arangodb import ArangoEmbeddingRetriever

querying = Pipeline()
querying.add_component("embedder", SentenceTransformersTextEmbedder())
querying.add_component("retriever", ArangoEmbeddingRetriever(document_store=document_store, top_k=3))
querying.connect("embedder", "retriever")

results = querying.run({"embedder": {"text": "my query"}})
```
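
Haystack pipelines return a dictionary keyed by component name, and retrievers conventionally place their hits under a `documents` key. The sketch below reads the hits from such a result. It assumes this retriever follows that convention, and it uses stand-in objects with `content` and `score` attributes rather than a live ArangoDB instance; `summarize_hits` is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Optional, Tuple


def summarize_hits(results: Dict[str, Any], component: str = "retriever") -> List[Tuple[Optional[float], str]]:
    """Return (score, first 80 characters of content) for each retrieved document."""
    documents = results.get(component, {}).get("documents", [])
    return [(doc.score, (doc.content or "")[:80]) for doc in documents]


# Stand-in for the output of querying.run(...); real results hold Haystack Document objects.
fake_results = {
    "retriever": {
        "documents": [
            SimpleNamespace(content="ArangoDB is a multi-model database.", score=0.87),
        ]
    }
}

for score, snippet in summarize_hits(fake_results):
    print(score, snippet)
```

With the real pipeline, `summarize_hits(results)` can be called directly on the value returned by `querying.run(...)`.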

The retriever also supports [metadata filtering](https://docs.haystack.deepset.ai/docs/metadata-filtering). You can pass filters either when you initialize the retriever or at query time.
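
Filters use Haystack's dictionary-based syntax of comparison and logic conditions. The sketch below builds one such filter with two small hypothetical helpers, `eq` and `all_of`. The metadata fields `language` and `year` are made up for the example, and the `filters` parameter name in the comments is assumed from the usual Haystack retriever convention rather than confirmed for this integration.

```python
from typing import Any, Dict


def eq(field: str, value: Any) -> Dict[str, Any]:
    """Comparison condition: `field` must equal `value`."""
    return {"field": field, "operator": "==", "value": value}


def all_of(*conditions: Dict[str, Any]) -> Dict[str, Any]:
    """Logic condition: every nested condition must hold."""
    return {"operator": "AND", "conditions": list(conditions)}


# Hypothetical metadata fields; replace them with the ones your documents carry.
filters = all_of(
    eq("meta.language", "en"),
    {"field": "meta.year", "operator": ">=", "value": 2023},
)

# Assumed usage, following the common Haystack retriever convention:
#   ArangoEmbeddingRetriever(document_store=document_store, filters=filters)
#   querying.run({"embedder": {"text": "my query"}, "retriever": {"filters": filters}})
```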

## License

`arangodb-haystack` is distributed under the terms of the [Apache-2.0](https://spdx.org/licenses/Apache-2.0.html) license.

logos/arangodb.png

4.15 KB