diff --git a/docs-website/reference/integrations-api/fastembed.md b/docs-website/reference/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference/integrations-api/fastembed.md
+++ b/docs-website/reference/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
- TypeError – If the input is not a string.
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
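The MaxSim scoring described above can be sketched in a few lines of plain Python (toy 2-dimensional token vectors for illustration; real ColBERT embeddings are model-sized, but the ranking logic is the same):

```python
# Toy token-level embeddings: one vector per token (2-dim for illustration).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def maxsim(query_emb, doc_emb):
    # For each query token, keep only its best-matching document token,
    # then sum those maxima; the result is an unnormalized score.
    return sum(max(dot(q, d) for d in doc_emb) for q in query_emb)

score = maxsim(query_tokens, doc_tokens)
print(round(score, 2))  # 1.7
```

Documents are then sorted by this score, highest first.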
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
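A minimal sketch of how `score_threshold` and `top_k` interact, assuming the threshold filter runs before the `top_k` cut (which is how the parameter descriptions read):

```python
# Hypothetical (document, MaxSim score) pairs; scores are unnormalized.
scored = [("doc_a", 18.4), ("doc_c", 2.9), ("doc_b", 7.2)]
score_threshold = 5.0
top_k = 2

# Drop low-scoring documents, then keep the top_k best of the rest.
kept = [pair for pair in scored if pair[1] > score_threshold]
kept.sort(key=lambda pair: pair[1], reverse=True)
result = kept[:top_k]
print(result)  # [('doc_a', 18.4), ('doc_b', 7.2)]
```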
+
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
+
+- dict\[str, Any\] – Dictionary with serialized data.
+
+#### from_dict
+
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
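Together, `to_dict` and `from_dict` give the usual round trip: the serialized form stores an import path plus the init parameters, and deserialization feeds those parameters back into the constructor. A stdlib-only sketch of that shape (the exact dictionary layout is an assumption based on the common Haystack component convention):

```python
# Assumed serialized layout: a "type" import path and the init parameters.
serialized = {
    "type": (
        "haystack_integrations.components.rankers.fastembed."
        "late_interaction_ranker.FastembedLateInteractionRanker"
    ),
    "init_parameters": {"model_name": "colbert-ir/colbertv2.0", "top_k": 2},
}

# from_dict would re-create the component from these stored parameters.
restored = serialized["init_parameters"]
print(restored["top_k"])  # 2
```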
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+- `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
@@ -483,7 +613,7 @@ __init__(
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
-)
+) -> None
```
Creates an instance of the 'FastembedRanker'.
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
#### warm_up
```python
-warm_up()
+warm_up() -> None
```
Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.18/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.18/integrations-api/fastembed.md
index 9d8857b931..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.18/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.18/integrations-api/fastembed.md
@@ -5,18 +5,17 @@ description: "FastEmbed integration for Haystack"
slug: "/fastembed-embedders"
---
-
-## Module haystack\_integrations.components.embedders.fastembed.fastembed\_document\_embedder
-
-
+## haystack_integrations.components.embedders.fastembed.fastembed_document_embedder
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
+
```python
# To use this component, install the "fastembed-haystack" package.
# pip install fastembed-haystack
@@ -57,104 +56,94 @@ print(f"Document Embedding: {result['documents'][0].embedding}")
print(f"Embedding Dimension: {len(result['documents'][0].embedding)}")
```
-
-
-#### FastembedDocumentEmbedder.\_\_init\_\_
+#### __init__
```python
-def __init__(model: str = "BAAI/bge-small-en-v1.5",
- cache_dir: str | None = None,
- threads: int | None = None,
- prefix: str = "",
- suffix: str = "",
- batch_size: int = 256,
- progress_bar: bool = True,
- parallel: int | None = None,
- local_files_only: bool = False,
- meta_fields_to_embed: list[str] | None = None,
- embedding_separator: str = "\n") -> None
+__init__(
+ model: str = "BAAI/bge-small-en-v1.5",
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ prefix: str = "",
+ suffix: str = "",
+ batch_size: int = 256,
+ progress_bar: bool = True,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ embedding_separator: str = "\n",
+) -> None
```
Create a FastembedDocumentEmbedder component.
-**Arguments**:
-
-- `model`: Local path or name of the model in Hugging Face's model hub,
-such as `BAAI/bge-small-en-v1.5`.
-- `cache_dir`: The path to the cache directory.
-Can be set using the `FASTEMBED_CACHE_PATH` env variable.
-Defaults to `fastembed_cache` in the system's temp directory.
-- `threads`: The number of threads single onnxruntime session can use. Defaults to None.
-- `prefix`: A string to add to the beginning of each text.
-- `suffix`: A string to add to the end of each text.
-- `batch_size`: Number of strings to encode at once.
-- `progress_bar`: If `True`, displays progress bar during embedding.
-- `parallel`: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
-If 0, use all available cores.
-If None, don't use data-parallel processing, use default onnxruntime threading instead.
-- `local_files_only`: If `True`, only use the model files in the `cache_dir`.
-- `meta_fields_to_embed`: List of meta fields that should be embedded along with the Document content.
-- `embedding_separator`: Separator used to concatenate the meta fields to the Document content.
-
-
-
-#### FastembedDocumentEmbedder.to\_dict
+**Parameters:**
+
+- **model** (str) – Local path or name of the model in Hugging Face's model hub,
+ such as `BAAI/bge-small-en-v1.5`.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **prefix** (str) – A string to add to the beginning of each text.
+- **suffix** (str) – A string to add to the end of each text.
+- **batch_size** (int) – Number of strings to encode at once.
+- **progress_bar** (bool) – If `True`, displays a progress bar during embedding.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be embedded along with the Document content.
+- **embedding_separator** (str) – Separator used to concatenate the meta fields to the Document content.
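Based on the two parameter descriptions above, the text that actually gets embedded is the selected meta fields joined with the document content via `embedding_separator`. A sketch of that assumed string handling:

```python
# Assumed behavior inferred from the parameter descriptions; the actual
# component may differ in detail.
meta = {"title": "Paris travel guide", "lang": "en"}
content = "Paris is the capital of France."
meta_fields_to_embed = ["title"]
embedding_separator = "\n"

parts = [str(meta[f]) for f in meta_fields_to_embed if meta.get(f) is not None]
text_to_embed = embedding_separator.join(parts + [content])
print(text_to_embed)
# Paris travel guide
# Paris is the capital of France.
```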
+
+#### to_dict
```python
-def to_dict() -> dict[str, Any]
+to_dict() -> dict[str, Any]
```
Serializes the component to a dictionary.
-**Returns**:
+**Returns:**
-Dictionary with serialized data.
+- dict\[str, Any\] – Dictionary with serialized data.
-
-
-#### FastembedDocumentEmbedder.warm\_up
+#### warm_up
```python
-def warm_up() -> None
+warm_up() -> None
```
Initializes the component.
-
-
-#### FastembedDocumentEmbedder.run
+#### run
```python
-@component.output_types(documents=list[Document])
-def run(documents: list[Document]) -> dict[str, list[Document]]
+run(documents: list[Document]) -> dict[str, list[Document]]
```
Embeds a list of Documents.
-**Arguments**:
-
-- `documents`: List of Documents to embed.
+**Parameters:**
-**Raises**:
+- **documents** (list\[Document\]) – List of Documents to embed.
-- `TypeError`: If the input is not a list of Documents.
+**Returns:**
-**Returns**:
-
-A dictionary with the following keys:
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
- `documents`: List of Documents with each Document's `embedding` field set to the computed embeddings.
-
+**Raises:**
-## Module haystack\_integrations.components.embedders.fastembed.fastembed\_sparse\_document\_embedder
+- TypeError – If the input is not a list of Documents.
-
+## haystack_integrations.components.embedders.fastembed.fastembed_sparse_document_embedder
### FastembedSparseDocumentEmbedder
FastembedSparseDocumentEmbedder computes Document embeddings using Fastembed sparse models.
Usage example:
+
```python
from haystack_integrations.components.embedders.fastembed import FastembedSparseDocumentEmbedder
from haystack.dataclasses import Document
@@ -192,103 +181,93 @@ print(f"Document Sparse Embedding: {result['documents'][0].sparse_embedding}")
print(f"Sparse Embedding Dimension: {len(result['documents'][0].sparse_embedding)}")
```
-
-
-#### FastembedSparseDocumentEmbedder.\_\_init\_\_
+#### __init__
```python
-def __init__(model: str = "prithivida/Splade_PP_en_v1",
- cache_dir: str | None = None,
- threads: int | None = None,
- batch_size: int = 32,
- progress_bar: bool = True,
- parallel: int | None = None,
- local_files_only: bool = False,
- meta_fields_to_embed: list[str] | None = None,
- embedding_separator: str = "\n",
- model_kwargs: dict[str, Any] | None = None) -> None
+__init__(
+ model: str = "prithivida/Splade_PP_en_v1",
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 32,
+ progress_bar: bool = True,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ embedding_separator: str = "\n",
+ model_kwargs: dict[str, Any] | None = None,
+) -> None
```
Create a FastembedSparseDocumentEmbedder component.
-**Arguments**:
-
-- `model`: Local path or name of the model in Hugging Face's model hub,
-such as `prithivida/Splade_PP_en_v1`.
-- `cache_dir`: The path to the cache directory.
-Can be set using the `FASTEMBED_CACHE_PATH` env variable.
-Defaults to `fastembed_cache` in the system's temp directory.
-- `threads`: The number of threads single onnxruntime session can use.
-- `batch_size`: Number of strings to encode at once.
-- `progress_bar`: If `True`, displays progress bar during embedding.
-- `parallel`: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
-If 0, use all available cores.
-If None, don't use data-parallel processing, use default onnxruntime threading instead.
-- `local_files_only`: If `True`, only use the model files in the `cache_dir`.
-- `meta_fields_to_embed`: List of meta fields that should be embedded along with the Document content.
-- `embedding_separator`: Separator used to concatenate the meta fields to the Document content.
-- `model_kwargs`: Dictionary containing model parameters such as `k`, `b`, `avg_len`, `language`.
-
-
-
-#### FastembedSparseDocumentEmbedder.to\_dict
+**Parameters:**
+
+- **model** (str) – Local path or name of the model in Hugging Face's model hub,
+ such as `prithivida/Splade_PP_en_v1`.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use.
+- **batch_size** (int) – Number of strings to encode at once.
+- **progress_bar** (bool) – If `True`, displays a progress bar during embedding.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be embedded along with the Document content.
+- **embedding_separator** (str) – Separator used to concatenate the meta fields to the Document content.
+- **model_kwargs** (dict\[str, Any\] | None) – Dictionary containing model parameters such as `k`, `b`, `avg_len`, `language`.
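The `k`, `b`, and `avg_len` keys are BM25-style parameters. As a point of reference (illustrative only; Fastembed's sparse models implement their own variant), the classic BM25 term-frequency saturation they control looks like:

```python
# Classic BM25 term-frequency weight controlled by k, b, and avg_len.
def bm25_tf(tf, doc_len, k=1.2, b=0.75, avg_len=256.0):
    return (tf * (k + 1)) / (tf + k * (1 - b + b * doc_len / avg_len))

# A term appearing 3 times in an average-length document:
w = bm25_tf(tf=3, doc_len=256)
print(round(w, 3))  # 1.571
```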
+
+#### to_dict
```python
-def to_dict() -> dict[str, Any]
+to_dict() -> dict[str, Any]
```
Serializes the component to a dictionary.
-**Returns**:
-
-Dictionary with serialized data.
+**Returns:**
-
+- dict\[str, Any\] – Dictionary with serialized data.
-#### FastembedSparseDocumentEmbedder.warm\_up
+#### warm_up
```python
-def warm_up() -> None
+warm_up() -> None
```
Initializes the component.
-
-
-#### FastembedSparseDocumentEmbedder.run
+#### run
```python
-@component.output_types(documents=list[Document])
-def run(documents: list[Document]) -> dict[str, list[Document]]
+run(documents: list[Document]) -> dict[str, list[Document]]
```
Embeds a list of Documents.
-**Arguments**:
-
-- `documents`: List of Documents to embed.
-
-**Raises**:
+**Parameters:**
-- `TypeError`: If the input is not a list of Documents.
+- **documents** (list\[Document\]) – List of Documents to embed.
-**Returns**:
+**Returns:**
-A dictionary with the following keys:
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
- `documents`: List of Documents with each Document's `sparse_embedding`
-field set to the computed embeddings.
+ field set to the computed embeddings.
-
+**Raises:**
-## Module haystack\_integrations.components.embedders.fastembed.fastembed\_sparse\_text\_embedder
+- TypeError – If the input is not a list of Documents.
-
+## haystack_integrations.components.embedders.fastembed.fastembed_sparse_text_embedder
### FastembedSparseTextEmbedder
FastembedSparseTextEmbedder computes string embedding using fastembed sparse models.
Usage example:
+
```python
from haystack_integrations.components.embedders.fastembed import FastembedSparseTextEmbedder
@@ -302,95 +281,85 @@ sparse_text_embedder = FastembedSparseTextEmbedder(
sparse_embedding = sparse_text_embedder.run(text)["sparse_embedding"]
```
-
-
-#### FastembedSparseTextEmbedder.\_\_init\_\_
+#### __init__
```python
-def __init__(model: str = "prithivida/Splade_PP_en_v1",
- cache_dir: str | None = None,
- threads: int | None = None,
- progress_bar: bool = True,
- parallel: int | None = None,
- local_files_only: bool = False,
- model_kwargs: dict[str, Any] | None = None) -> None
+__init__(
+ model: str = "prithivida/Splade_PP_en_v1",
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ progress_bar: bool = True,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ model_kwargs: dict[str, Any] | None = None,
+) -> None
```
Create a FastembedSparseTextEmbedder component.
-**Arguments**:
-
-- `model`: Local path or name of the model in Fastembed's model hub, such as `prithivida/Splade_PP_en_v1`
-- `cache_dir`: The path to the cache directory.
-Can be set using the `FASTEMBED_CACHE_PATH` env variable.
-Defaults to `fastembed_cache` in the system's temp directory.
-- `threads`: The number of threads single onnxruntime session can use. Defaults to None.
-- `progress_bar`: If `True`, displays progress bar during embedding.
-- `parallel`: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
-If 0, use all available cores.
-If None, don't use data-parallel processing, use default onnxruntime threading instead.
-- `local_files_only`: If `True`, only use the model files in the `cache_dir`.
-- `model_kwargs`: Dictionary containing model parameters such as `k`, `b`, `avg_len`, `language`.
+**Parameters:**
-
+- **model** (str) – Local path or name of the model in Fastembed's model hub, such as `prithivida/Splade_PP_en_v1`.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **progress_bar** (bool) – If `True`, displays a progress bar during embedding.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **model_kwargs** (dict\[str, Any\] | None) – Dictionary containing model parameters such as `k`, `b`, `avg_len`, `language`.
-#### FastembedSparseTextEmbedder.to\_dict
+#### to_dict
```python
-def to_dict() -> dict[str, Any]
+to_dict() -> dict[str, Any]
```
Serializes the component to a dictionary.
-**Returns**:
+**Returns:**
-Dictionary with serialized data.
+- dict\[str, Any\] – Dictionary with serialized data.
-
-
-#### FastembedSparseTextEmbedder.warm\_up
+#### warm_up
```python
-def warm_up() -> None
+warm_up() -> None
```
Initializes the component.
-
-
-#### FastembedSparseTextEmbedder.run
+#### run
```python
-@component.output_types(sparse_embedding=SparseEmbedding)
-def run(text: str) -> dict[str, SparseEmbedding]
+run(text: str) -> dict[str, SparseEmbedding]
```
Embeds text using the Fastembed model.
-**Arguments**:
-
-- `text`: A string to embed.
+**Parameters:**
-**Raises**:
+- **text** (str) – A string to embed.
-- `TypeError`: If the input is not a string.
+**Returns:**
-**Returns**:
-
-A dictionary with the following keys:
+- dict\[str, SparseEmbedding\] – A dictionary with the following keys:
- `sparse_embedding`: A `SparseEmbedding` object representing the sparse embedding of the input text.
-
+**Raises:**
-## Module haystack\_integrations.components.embedders.fastembed.fastembed\_text\_embedder
+- TypeError – If the input is not a string.
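The returned `SparseEmbedding` pairs vocabulary indices with weights. A minimal stand-in with the same shape (the real class lives in `haystack.dataclasses`; here it is mimicked with a plain dict):

```python
# Stand-in for a SparseEmbedding: parallel lists of indices and values.
sparse = {"indices": [12, 4051, 9873], "values": [0.91, 0.33, 0.08]}

# Reconstruct the (index, weight) pairs, strongest first.
pairs = sorted(zip(sparse["indices"], sparse["values"]), key=lambda p: -p[1])
print(pairs[0])  # (12, 0.91)
```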
-
+## haystack_integrations.components.embedders.fastembed.fastembed_text_embedder
### FastembedTextEmbedder
FastembedTextEmbedder computes string embedding using fastembed embedding models.
Usage example:
+
```python
from haystack_integrations.components.embedders.fastembed import FastembedTextEmbedder
@@ -404,100 +373,219 @@ text_embedder = FastembedTextEmbedder(
embedding = text_embedder.run(text)["embedding"]
```
-
-
-#### FastembedTextEmbedder.\_\_init\_\_
+#### __init__
```python
-def __init__(model: str = "BAAI/bge-small-en-v1.5",
- cache_dir: str | None = None,
- threads: int | None = None,
- prefix: str = "",
- suffix: str = "",
- progress_bar: bool = True,
- parallel: int | None = None,
- local_files_only: bool = False) -> None
+__init__(
+ model: str = "BAAI/bge-small-en-v1.5",
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ prefix: str = "",
+ suffix: str = "",
+ progress_bar: bool = True,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+) -> None
```
Create a FastembedTextEmbedder component.
-**Arguments**:
-
-- `model`: Local path or name of the model in Fastembed's model hub, such as `BAAI/bge-small-en-v1.5`
-- `cache_dir`: The path to the cache directory.
-Can be set using the `FASTEMBED_CACHE_PATH` env variable.
-Defaults to `fastembed_cache` in the system's temp directory.
-- `threads`: The number of threads single onnxruntime session can use. Defaults to None.
-- `prefix`: A string to add to the beginning of each text.
-- `suffix`: A string to add to the end of each text.
-- `progress_bar`: If `True`, displays progress bar during embedding.
-- `parallel`: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
-If 0, use all available cores.
-If None, don't use data-parallel processing, use default onnxruntime threading instead.
-- `local_files_only`: If `True`, only use the model files in the `cache_dir`.
+**Parameters:**
-
+- **model** (str) – Local path or name of the model in Fastembed's model hub, such as `BAAI/bge-small-en-v1.5`.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **prefix** (str) – A string to add to the beginning of each text.
+- **suffix** (str) – A string to add to the end of each text.
+- **progress_bar** (bool) – If `True`, displays a progress bar during embedding.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
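`prefix` and `suffix` simply wrap the input string before embedding; for example, E5-style models expect a `"query: "` prefix. A sketch of the assumed string handling:

```python
# Wrap the text with the configured prefix/suffix before embedding.
prefix, suffix = "query: ", ""
text = "What is the capital of France?"
text_to_embed = prefix + text + suffix
print(text_to_embed)  # query: What is the capital of France?
```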
-#### FastembedTextEmbedder.to\_dict
+#### to_dict
```python
-def to_dict() -> dict[str, Any]
+to_dict() -> dict[str, Any]
```
Serializes the component to a dictionary.
-**Returns**:
+**Returns:**
-Dictionary with serialized data.
+- dict\[str, Any\] – Dictionary with serialized data.
-
-
-#### FastembedTextEmbedder.warm\_up
+#### warm_up
```python
-def warm_up() -> None
+warm_up() -> None
```
Initializes the component.
-
-
-#### FastembedTextEmbedder.run
+#### run
```python
-@component.output_types(embedding=list[float])
-def run(text: str) -> dict[str, list[float]]
+run(text: str) -> dict[str, list[float]]
```
Embeds text using the Fastembed model.
-**Arguments**:
+**Parameters:**
-- `text`: A string to embed.
+- **text** (str) – A string to embed.
-**Raises**:
+**Returns:**
-- `TypeError`: If the input is not a string.
+- dict\[str, list\[float\]\] – A dictionary with the following keys:
+- `embedding`: A list of floats representing the embedding of the input text.
-**Returns**:
+**Raises:**
-A dictionary with the following keys:
-- `embedding`: A list of floats representing the embedding of the input text.
+- TypeError – If the input is not a string.
+
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
-
+- dict\[str, Any\] – Dictionary with serialized data.
-## Module haystack\_integrations.components.rankers.fastembed.ranker
+#### from_dict
-
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+- `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
+## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
Usage example:
+
```python
from haystack import Document
from haystack_integrations.components.rankers.fastembed import FastembedRanker
@@ -512,109 +600,99 @@ print(output["documents"][0].content)
# Berlin
```
-
-
-#### FastembedRanker.\_\_init\_\_
+#### __init__
```python
-def __init__(model_name: str = "Xenova/ms-marco-MiniLM-L-6-v2",
- top_k: int = 10,
- cache_dir: str | None = None,
- threads: int | None = None,
- batch_size: int = 64,
- parallel: int | None = None,
- local_files_only: bool = False,
- meta_fields_to_embed: list[str] | None = None,
- meta_data_separator: str = "\n")
+__init__(
+ model_name: str = "Xenova/ms-marco-MiniLM-L-6-v2",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+) -> None
```
Creates an instance of the 'FastembedRanker'.
-**Arguments**:
-
-- `model_name`: Fastembed model name. Check the list of supported models in the [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
-- `top_k`: The maximum number of documents to return.
-- `cache_dir`: The path to the cache directory.
-Can be set using the `FASTEMBED_CACHE_PATH` env variable.
-Defaults to `fastembed_cache` in the system's temp directory.
-- `threads`: The number of threads single onnxruntime session can use. Defaults to None.
-- `batch_size`: Number of strings to encode at once.
-- `parallel`: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
-If 0, use all available cores.
-If None, don't use data-parallel processing, use default onnxruntime threading instead.
-- `local_files_only`: If `True`, only use the model files in the `cache_dir`.
-- `meta_fields_to_embed`: List of meta fields that should be concatenated
-with the document content for reranking.
-- `meta_data_separator`: Separator used to concatenate the meta fields
-to the Document content.
-
-
-
-#### FastembedRanker.to\_dict
+**Parameters:**
+
+- **model_name** (str) – Fastembed model name. Check the list of supported models in the [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+
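The concatenation controlled by `meta_fields_to_embed` and `meta_data_separator` can be sketched in plain Python. This is an illustrative sketch only; the helper name and the exact ordering of meta fields relative to the document content inside the component are assumptions:

```python
def text_for_reranking(content: str, meta: dict, fields: list[str],
                       separator: str = "\n") -> str:
    # Collect the requested meta fields, skipping missing/None values,
    # and join them with the separator ahead of the document content.
    # (Hypothetical helper; the component's internal ordering may differ.)
    parts = [str(meta[f]) for f in fields if meta.get(f) is not None]
    return separator.join([*parts, content])

print(text_for_reranking("ColBERT is a late interaction model.",
                         {"title": "ColBERT paper"}, ["title"]))
# ColBERT paper
# ColBERT is a late interaction model.
```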
+#### to_dict
```python
-def to_dict() -> dict[str, Any]
+to_dict() -> dict[str, Any]
```
Serializes the component to a dictionary.
-**Returns**:
-
-Dictionary with serialized data.
+**Returns:**
-
+- dict\[str, Any\] – Dictionary with serialized data.
-#### FastembedRanker.from\_dict
+#### from_dict
```python
-@classmethod
-def from_dict(cls, data: dict[str, Any]) -> "FastembedRanker"
+from_dict(data: dict[str, Any]) -> FastembedRanker
```
Deserializes the component from a dictionary.
-**Arguments**:
-
-- `data`: The dictionary to deserialize from.
+**Parameters:**
-**Returns**:
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
-The deserialized component.
+**Returns:**
-
+- FastembedRanker – The deserialized component.
-#### FastembedRanker.warm\_up
+#### warm_up
```python
-def warm_up()
+warm_up() -> None
```
Initializes the component.
-
-
-#### FastembedRanker.run
+#### run
```python
-@component.output_types(documents=list[Document])
-def run(query: str,
- documents: list[Document],
- top_k: int | None = None) -> dict[str, list[Document]]
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
```
Returns a list of documents ranked by their similarity to the given query, using FastEmbed.
-**Arguments**:
+**Parameters:**
-- `query`: The input query to compare the documents to.
-- `documents`: A list of documents to be ranked.
-- `top_k`: The maximum number of documents to return.
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
-**Raises**:
+**Returns:**
-- `ValueError`: If `top_k` is not > 0.
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+  - `documents`: A list of documents closest to the query, sorted from most similar to least similar.
-**Returns**:
+**Raises:**
-A dictionary with the following keys:
-- `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+- ValueError – If `top_k` is not > 0.
diff --git a/docs-website/reference_versioned_docs/version-2.19/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.19/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.19/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.19/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
- TypeError – If the input is not a string.
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
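The late interaction (MaxSim) scoring described above can be sketched in a few lines of numpy. This is an illustrative sketch of the scoring idea, not the Fastembed implementation; the toy shapes and the cosine normalization are assumptions:

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """MaxSim: for each query token, take the best-matching document token's
    cosine similarity, then sum over query tokens."""
    # L2-normalize token embeddings so dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    sim = q @ d.T                        # (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())  # best doc token per query token

# Toy token embeddings: a 2-token query against two 2-token documents.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc_a = np.array([[1.0, 0.0], [0.0, 1.0]])   # matches both query tokens
doc_b = np.array([[1.0, 1.0], [1.0, 1.0]])   # only partial matches

scores = {"a": maxsim_score(query, doc_a), "b": maxsim_score(query, doc_b)}
print(sorted(scores, key=scores.get, reverse=True))  # ['a', 'b']
```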
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+
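The interaction of `top_k` and `score_threshold` can be sketched as plain Python. This is a hypothetical sketch of the selection step; the function name and exact tie-breaking behavior are assumptions, not the component's internals:

```python
def select_documents(scored_docs, top_k, score_threshold=None):
    # scored_docs: list of (document, score) pairs, in any order.
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    if score_threshold is not None:
        # Keep only documents scoring strictly above the threshold.
        ranked = [pair for pair in ranked if pair[1] > score_threshold]
    return ranked[:top_k]

print(select_documents([("a", 12.0), ("b", 4.0), ("c", 20.0)],
                       top_k=2, score_threshold=5.0))
# [('c', 20.0), ('a', 12.0)]
```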
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
+
+- dict\[str, Any\] – Dictionary with serialized data.
+
+#### from_dict
+
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+  - `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
@@ -483,7 +613,7 @@ __init__(
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
-)
+) -> None
```
Creates an instance of the 'FastembedRanker'.
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
#### warm_up
```python
-warm_up()
+warm_up() -> None
```
Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.20/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.20/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.20/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.20/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
- TypeError – If the input is not a string.
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
+
+- dict\[str, Any\] – Dictionary with serialized data.
+
+#### from_dict
+
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+  - `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
@@ -483,7 +613,7 @@ __init__(
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
-)
+) -> None
```
Creates an instance of the 'FastembedRanker'.
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
#### warm_up
```python
-warm_up()
+warm_up() -> None
```
Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.21/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.21/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.21/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.21/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
- TypeError – If the input is not a string.
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
+
+- dict\[str, Any\] – Dictionary with serialized data.
+
+#### from_dict
+
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+  - `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
@@ -483,7 +613,7 @@ __init__(
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
-)
+) -> None
```
Creates an instance of the 'FastembedRanker'.
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
#### warm_up
```python
-warm_up()
+warm_up() -> None
```
Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.22/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.22/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.22/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.22/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
- TypeError – If the input is not a string.
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
+
+- dict\[str, Any\] – Dictionary with serialized data.
+
+#### from_dict
+
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+  - `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
@@ -483,7 +613,7 @@ __init__(
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
-)
+) -> None
```
Creates an instance of the 'FastembedRanker'.
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
#### warm_up
```python
-warm_up()
+warm_up() -> None
```
Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.23/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.23/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.23/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.23/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
- TypeError – If the input is not a string.
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets.
+ If 0, use all available cores.
+ If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
+
+- dict\[str, Any\] – Dictionary with serialized data.
+
+#### from_dict
+
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+  - `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
@@ -483,7 +613,7 @@ __init__(
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
-)
+) -> None
```
Creates an instance of the 'FastembedRanker'.
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
#### warm_up
```python
-warm_up()
+warm_up() -> None
```
Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.24/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.24/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.24/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.24/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
- TypeError – If the input is not a string.
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding is used; recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, data-parallel processing is not used and the default onnxruntime threading is used instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
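As a rough sketch of how `meta_fields_to_embed` and `meta_data_separator` interact (assumed behavior inferred from the parameter descriptions above, not the exact implementation), the selected meta values are joined with the document content before scoring:

```python
# Assumed illustration: concatenating meta fields with document content
# before reranking. The field names and values here are made up.
meta_fields_to_embed = ["title"]
meta_data_separator = "\n"

doc_meta = {"title": "European capitals", "source": "wiki"}
doc_content = "Berlin is the capital of Germany."

# Keep only the requested meta fields that are present on the document.
meta_values = [str(doc_meta[f]) for f in meta_fields_to_embed if f in doc_meta]
text_to_rank = meta_data_separator.join([*meta_values, doc_content])
print(text_to_rank)
# European capitals
# Berlin is the capital of Germany.
```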
+
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
+
+- dict\[str, Any\] – Dictionary with serialized data.
+
+#### from_dict
+
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+- `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
@@ -483,7 +613,7 @@ __init__(
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
-)
+) -> None
```
Creates an instance of the 'FastembedRanker'.
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
#### warm_up
```python
-warm_up()
+warm_up() -> None
```
Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.25/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.25/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.25/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.25/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
- TypeError – If the input is not a string.
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding is used; recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, data-parallel processing is not used and the default onnxruntime threading is used instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
+
+- dict\[str, Any\] – Dictionary with serialized data.
+
+#### from_dict
+
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+- `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
@@ -483,7 +613,7 @@ __init__(
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
-)
+) -> None
```
Creates an instance of the 'FastembedRanker'.
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
#### warm_up
```python
-warm_up()
+warm_up() -> None
```
Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.26/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.26/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.26/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.26/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
- TypeError – If the input is not a string.
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding is used; recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, data-parallel processing is not used and the default onnxruntime threading is used instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
+
+- dict\[str, Any\] – Dictionary with serialized data.
+
+#### from_dict
+
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+- `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
@@ -483,7 +613,7 @@ __init__(
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
-)
+) -> None
```
Creates an instance of the 'FastembedRanker'.
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
#### warm_up
```python
-warm_up()
+warm_up() -> None
```
Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.27/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.27/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.27/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.27/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
### FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
- TypeError – If the input is not a string.
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+ model_name: str = "colbert-ir/colbertv2.0",
+ top_k: int = 10,
+ cache_dir: str | None = None,
+ threads: int | None = None,
+ batch_size: int = 64,
+ parallel: int | None = None,
+ local_files_only: bool = False,
+ meta_fields_to_embed: list[str] | None = None,
+ meta_data_separator: str = "\n",
+ score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+ [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+ Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+ Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding is used; recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, data-parallel processing is not used and the default onnxruntime threading is used instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+ with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+ to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+ Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+
+#### to_dict
+
+```python
+to_dict() -> dict[str, Any]
+```
+
+Serializes the component to a dictionary.
+
+**Returns:**
+
+- dict\[str, Any\] – Dictionary with serialized data.
+
+#### from_dict
+
+```python
+from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker
+```
+
+Deserializes the component from a dictionary.
+
+**Parameters:**
+
+- **data** (dict\[str, Any\]) – The dictionary to deserialize from.
+
+**Returns:**
+
+- FastembedLateInteractionRanker – The deserialized component.
+
+#### warm_up
+
+```python
+warm_up() -> None
+```
+
+Initializes the component.
+
+#### run
+
+```python
+run(
+ query: str, documents: list[Document], top_k: int | None = None
+) -> dict[str, list[Document]]
+```
+
+Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring.
+
+**Parameters:**
+
+- **query** (str) – The input query to compare the documents to.
+- **documents** (list\[Document\]) – A list of documents to be ranked.
+- **top_k** (int | None) – The maximum number of documents to return.
+
+**Returns:**
+
+- dict\[str, list\[Document\]\] – A dictionary with the following keys:
+- `documents`: A list of documents closest to the query, sorted from most similar to least similar.
+
+**Raises:**
+
+- ValueError – If `top_k` is not > 0.
+
## haystack_integrations.components.rankers.fastembed.ranker
### FastembedRanker
-Ranks Documents based on their similarity to the query using
-[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+Ranks Documents based on their similarity to the query using Fastembed models.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
Documents are indexed from most to least semantically relevant to the query.
@@ -483,7 +613,7 @@ __init__(
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
-)
+) -> None
```
Creates an instance of the 'FastembedRanker'.
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
#### warm_up
```python
-warm_up()
+warm_up() -> None
```
Initializes the component.