**docs-website/docs/pipeline-components/rankers.mdx** (1 addition, 1 deletion)
```diff
@@ -14,7 +14,7 @@ Rankers are a group of components that order documents by given criteria. Their
 |[AmazonBedrockRanker](rankers/amazonbedrockranker.mdx)| Ranks documents based on their similarity to the query using Amazon Bedrock models. |
 |[CohereRanker](rankers/cohereranker.mdx)| Ranks documents based on their similarity to the query using Cohere rerank models. |
 |[FastembedRanker](rankers/fastembedranker.mdx)| Ranks documents based on their similarity to the query using cross-encoder models supported by FastEmbed. |
-|[FastembedColbertRanker](rankers/fastembedcolbertranker.mdx)| Ranks documents based on their similarity to the query using ColBERT models supported by FastEmbed. |
+|[FastembedLateInteractionRanker](rankers/fastembedlateinteractionranker.mdx)| Ranks documents based on their similarity to the query using late interaction models supported by FastEmbed. |
 |[HuggingFaceTEIRanker](rankers/huggingfaceteiranker.mdx)| Ranks documents based on their similarity to the query using a Text Embeddings Inference (TEI) API endpoint. |
 |[JinaRanker](rankers/jinaranker.mdx)| Ranks documents based on their similarity to the query using Jina AI models. |
 |[LLMRanker](rankers/llmranker.mdx)| Ranks documents for a query using a Large Language Model, which returns ranked document indices as JSON. |
```
**docs-website/docs/pipeline-components/rankers/fastembedlateinteractionranker.mdx** (32 additions, 21 deletions)
```diff
@@ -1,11 +1,11 @@
 ---
-title: "FastembedColbertRanker"
-id: fastembedcolbertranker
-slug: "/fastembedcolbertranker"
-description: "Use this component to rank documents based on ColBERT late-interaction scoring using models supported by FastEmbed."
+title: "FastembedLateInteractionRanker"
+id: fastembedlateinteractionranker
+slug: "/fastembedlateinteractionranker"
+description: "Use this component to rank documents based on late interaction scoring using models supported by FastEmbed."
 ---
 
-# FastembedColbertRanker
+# FastembedLateInteractionRanker
 
 Use this component to rank documents based on their similarity to the query using ColBERT models via FastEmbed.
```
```diff
@@ -23,11 +23,11 @@ Use this component to rank documents based on their similarity to the query usin
 
 ## Overview
 
-`FastembedColbertRanker` ranks documents using **ColBERT late-interaction scoring**. Unlike cross-encoder rankers (which encode the query and document together), ColBERT encodes the query and each document independently into token-level embeddings, then computes a **MaxSim** score: for each query token, it finds the most similar document token, and sums these maximum similarities into a final relevance score.
+`FastembedLateInteractionRanker` ranks documents using **late interaction scoring**. Unlike cross-encoder rankers (which encode the query and document together), ColBERT encodes the query and each document independently into token-level embeddings, then computes a **MaxSim** score: for each query token, it finds the most similar document token, and sums these maximum similarities into a final relevance score.
 
 This approach gives ColBERT a strong balance between accuracy and efficiency — it is more expressive than bi-encoders while being faster than cross-encoders at inference time.
 
-`FastembedColbertRanker` is most useful in query pipelines such as a retrieval-augmented generation (RAG) pipeline or a document search pipeline. Use it after a Retriever to rerank a candidate set of documents by relevance. When combining with a Retriever, set the Retriever's `top_k` higher than the Ranker's `top_k` — retrieve a broad candidate set, then let ColBERT select the best ones.
+`FastembedLateInteractionRanker` is most useful in query pipelines such as a retrieval-augmented generation (RAG) pipeline or a document search pipeline. Use it after a Retriever to rerank a candidate set of documents by relevance. When combining with a Retriever, set the Retriever's `top_k` higher than the Ranker's `top_k` — retrieve a broad candidate set, then let ColBERT select the best ones.
 
 By default, this component uses the `colbert-ir/colbertv2.0` model. For details on different initialization settings, check out the [API reference](/reference/fastembed-embedders) page.
```
```diff
@@ -52,7 +52,7 @@ pip install fastembed-haystack
 
 You can set the path where the model is stored in a cache directory. You can also set the number of threads a single `onnxruntime` session can use.
```

A later hunk in the same file renames the component in the pipeline example and adds the generator prerequisites:

````diff
-Below is an example of a full RAG pipeline that retrieves documents using embedding similarity, reranks them with `FastembedColbertRanker`, and generates an answer with an LLM.
+Below is an example of a full RAG pipeline that retrieves documents using embedding similarity, reranks them with `FastembedLateInteractionRanker`, and generates an answer with an LLM.
+
+This example uses the `HuggingFaceLocalChatGenerator`, which requires additional packages:
+
+```shell
+pip install "transformers[torch]"
+```
 
 ```python
 from haystack import Document, Pipeline
 from haystack.document_stores.in_memory import InMemoryDocumentStore
-from haystack.components.embedders import (
-    SentenceTransformersDocumentEmbedder,
-    SentenceTransformersTextEmbedder,
-)
 
 from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
 from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
 from haystack.components.generators.chat import HuggingFaceLocalChatGenerator
 from haystack.components.writers import DocumentWriter
 from haystack.dataclasses import ChatMessage
-from haystack_integrations.components.rankers.fastembed import FastembedColbertRanker
+from haystack_integrations.components.rankers.fastembed import (
+    FastembedLateInteractionRanker,
+)
 from haystack_integrations.components.embedders.fastembed import (
````
**docs-website/versioned_docs/version-2.27/pipeline-components/rankers.mdx** (1 addition, 0 deletions)
```diff
@@ -14,6 +14,7 @@ Rankers are a group of components that order documents by given criteria. Their
 |[AmazonBedrockRanker](rankers/amazonbedrockranker.mdx)| Ranks documents based on their similarity to the query using Amazon Bedrock models. |
 |[CohereRanker](rankers/cohereranker.mdx)| Ranks documents based on their similarity to the query using Cohere rerank models. |
 |[FastembedRanker](rankers/fastembedranker.mdx)| Ranks documents based on their similarity to the query using cross-encoder models supported by FastEmbed. |
+|[FastembedLateInteractionRanker](rankers/fastembedlateinteractionranker.mdx)| Ranks documents based on their similarity to the query using late interaction models supported by FastEmbed. |
 |[HuggingFaceTEIRanker](rankers/huggingfaceteiranker.mdx)| Ranks documents based on their similarity to the query using a Text Embeddings Inference (TEI) API endpoint. |
 |[JinaRanker](rankers/jinaranker.mdx)| Ranks documents based on their similarity to the query using Jina AI models. |
 |[LLMRanker](rankers/llmranker.mdx)| Ranks documents for a query using a Large Language Model, which returns ranked document indices as JSON. |
```
**docs-website/versioned_docs/version-2.27/pipeline-components/rankers/fastembedlateinteractionranker.mdx** (new page; content below)

---
title: "FastembedLateInteractionRanker"
id: fastembedlateinteractionranker
slug: "/fastembedlateinteractionranker"
description: "Use this component to rank documents based on late interaction scoring using models supported by FastEmbed."
---

# FastembedLateInteractionRanker

Use this component to rank documents based on their similarity to the query using ColBERT models via FastEmbed.

<div className="key-value-table">

|||
| --- | --- |
|**Most common position in a pipeline**| In a query pipeline, after a component that returns a list of documents such as a [Retriever](../retrievers.mdx)|
|**Mandatory run variables**|`documents`: A list of documents <br /> <br />`query`: A query string |
|**Output variables**|`documents`: A list of documents |

</div>
## Overview

`FastembedLateInteractionRanker` ranks documents using **late interaction scoring**. Unlike cross-encoder rankers (which encode the query and document together), ColBERT encodes the query and each document independently into token-level embeddings, then computes a **MaxSim** score: for each query token, it finds the most similar document token, and sums these maximum similarities into a final relevance score.
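The MaxSim computation described above can be sketched in a few lines of plain Python. The embeddings here are toy 2-D vectors chosen for illustration; real ColBERT token embeddings are higher-dimensional and produced by the model.

```python
def maxsim(query_tokens, doc_tokens):
    # For each query token embedding, take its highest dot-product
    # similarity over all document token embeddings, then sum the maxima.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

# Toy embeddings: doc_a contains tokens closely aligned with the query tokens.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5]]
doc_b = [[0.0, 0.5], [0.1, 0.2]]

print(maxsim(query, doc_a))  # 1.5 -> ranked first
print(maxsim(query, doc_b))  # 0.6
```

Note that each query token is matched independently, so a document can score well by covering different query tokens with different document tokens.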
This approach gives ColBERT a strong balance between accuracy and efficiency — it is more expressive than bi-encoders while being faster than cross-encoders at inference time.

`FastembedLateInteractionRanker` is most useful in query pipelines such as a retrieval-augmented generation (RAG) pipeline or a document search pipeline. Use it after a Retriever to rerank a candidate set of documents by relevance. When combining with a Retriever, set the Retriever's `top_k` higher than the Ranker's `top_k` — retrieve a broad candidate set, then let ColBERT select the best ones.

By default, this component uses the `colbert-ir/colbertv2.0` model. For details on different initialization settings, check out the [API reference](/reference/fastembed-embedders) page.
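The retrieve-broadly-then-rerank pattern can be illustrated with stand-in scorers in plain Python. These are not the real Haystack components — both scoring functions are placeholders standing in for embedding similarity and MaxSim, respectively.

```python
corpus = [f"doc_{i}" for i in range(100)]

def cheap_retriever_score(doc):
    # Placeholder for fast embedding similarity (e.g. a bi-encoder retriever).
    return -len(doc)

def colbert_like_score(doc):
    # Placeholder for the slower but more accurate MaxSim reranking.
    return sum(ord(c) for c in doc)

retriever_top_k = 20  # the Retriever casts a wide net...
ranker_top_k = 5      # ...and the Ranker keeps only the best few.

candidates = sorted(corpus, key=cheap_retriever_score, reverse=True)[:retriever_top_k]
reranked = sorted(candidates, key=colbert_like_score, reverse=True)[:ranker_top_k]
print(len(candidates), len(reranked))  # 20 5
```

The key point is only the relationship between the two `top_k` values: the cheap stage over-fetches so the expensive stage has enough candidates to choose from.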
:::note
ColBERT scores are **unnormalized sums** (not probabilities). Their magnitude depends on query length and document length, typically ranging from ~3 to ~30. They are meaningful for ranking within a single query but should not be compared across different queries.
:::
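A quick pure-Python illustration of why these sums depend on query length (toy embeddings, not real ColBERT output): because MaxSim sums one maximum per query token, a longer query yields a larger score for the very same document.

```python
def maxsim(query_tokens, doc_tokens):
    # Sum, over query tokens, of the best dot-product match in the document.
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

doc = [[1.0, 0.0], [0.0, 1.0]]
short_query = [[0.6, 0.8]]
long_query = [[0.6, 0.8]] * 4  # the same token repeated four times

print(maxsim(short_query, doc))  # 0.8
print(maxsim(long_query, doc))   # four times larger for the same document
```

This is why raw scores rank documents within one query but are not comparable across queries of different lengths.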
### Compatible Models

You can find the compatible ColBERT models in the [FastEmbed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
### Installation

To start using this integration with Haystack, install the package with:

```shell
pip install fastembed-haystack
```
### Parameters

You can set the path where the model is stored in a cache directory. You can also set the number of threads a single `onnxruntime` session can use.

```python
ranker = FastembedLateInteractionRanker(
    model_name="colbert-ir/colbertv2.0",
    cache_dir="/your_cache_directory",
    threads=2,
)
```
For offline encoding of large document sets, enable data-parallel processing:

```python
ranker = FastembedLateInteractionRanker(
    model_name="colbert-ir/colbertv2.0",
    batch_size=64,
    parallel=2,  # number of parallel processes; 0 = use all cores
)
```
## Usage

### On its own

This example uses `FastembedLateInteractionRanker` to rank two simple documents.

```python
from haystack import Document
from haystack_integrations.components.rankers.fastembed import (
    FastembedLateInteractionRanker,
)

# ... (ranker and document setup lines are collapsed in the source diff)

result = ranker.run(query="City in Germany", documents=docs)
print(result["documents"][0].content)
# Berlin
```
### In a pipeline

Below is an example of a full RAG pipeline that retrieves documents using embedding similarity, reranks them with `FastembedLateInteractionRanker`, and generates an answer with an LLM.

This example uses the `HuggingFaceLocalChatGenerator`, which requires additional packages:

```shell
pip install "transformers[torch]"
```
```python
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore

from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.components.generators.chat import HuggingFaceLocalChatGenerator
from haystack.components.writers import DocumentWriter
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.rankers.fastembed import (
    FastembedLateInteractionRanker,
)
from haystack_integrations.components.embedders.fastembed import (
    FastembedDocumentEmbedder,
    FastembedTextEmbedder,
)

# Set up and populate the document store
document_store = InMemoryDocumentStore()
docs = [
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
    Document(content="Madrid is the capital of Spain."),
]
# (the remainder of the pipeline example is truncated in the source)
```