diff --git a/docs-website/reference/integrations-api/fastembed.md b/docs-website/reference/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference/integrations-api/fastembed.md
+++ b/docs-website/reference/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
 ### FastembedDocumentEmbedder
 
 FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
 The embedding of each Document is stored in the `embedding` field of the Document.
 
 Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
 
 - TypeError – If the input is not a string.
 
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs) +print(output["documents"][0].content) + +# Berlin +``` + +#### __init__ + +```python +__init__( + model_name: str = "colbert-ir/colbertv2.0", + top_k: int = 10, + cache_dir: str | None = None, + threads: int | None = None, + batch_size: int = 64, + parallel: int | None = None, + local_files_only: bool = False, + meta_fields_to_embed: list[str] | None = None, + meta_data_separator: str = "\n", + score_threshold: float | None = None, +) -> None +``` + +Creates an instance of the 'FastembedLateInteractionRanker'. + +**Parameters:** + +- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the + [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/). +- **top_k** (int) – The maximum number of documents to return. +- **cache_dir** (str | None) – The path to the cache directory. + Can be set using the `FASTEMBED_CACHE_PATH` env variable. + Defaults to `fastembed_cache` in the system's temp directory. +- **threads** (int | None) – The number of threads single onnxruntime session can use. Defaults to None. +- **batch_size** (int) – Number of strings to encode at once. +- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. + If 0, use all available cores. + If None, don't use data-parallel processing, use default onnxruntime threading instead. +- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`. +- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated + with the document content for reranking. +- **meta_data_separator** (str) – Separator used to concatenate the meta fields + to the Document content. +- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned. + Note that ColBERT scores are unnormalized sums and typically range from 3 to 25. 
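To see why ColBERT scores are unnormalized sums (and why useful `score_threshold` values sit in the 3 to 25 range rather than the 0 to 1 range of cosine scores), here is a minimal pure-Python sketch of MaxSim scoring. It is an illustration only, not the integration's implementation, and the tiny two-dimensional token embeddings are made up:

```python
# Simplified illustration of ColBERT-style late interaction (MaxSim).
# Token embeddings are hand-made lists of floats; the real component
# obtains them from a Fastembed ColBERT model.

def dot(u: list[float], v: list[float]) -> float:
    """Plain dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_tokens: list[list[float]], doc_tokens: list[list[float]]) -> float:
    # For each query token, keep the best-matching document token, then
    # sum those maxima. The score is an unnormalized sum over query
    # tokens, so it grows with query length and is not bounded by 1.
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

query = [[1.0, 0.0], [0.0, 1.0]]   # two query token embeddings
doc_a = [[0.9, 0.1], [0.2, 0.8]]   # matches both query tokens well
doc_b = [[0.1, 0.1], [0.0, 0.2]]   # matches poorly

scores = {"a": maxsim_score(query, doc_a), "b": maxsim_score(query, doc_b)}
ranked = sorted(scores, key=scores.get, reverse=True)
# ranked == ["a", "b"]
```

With real ColBERT models the per-token maxima are larger and queries are longer, which is how document scores reach the double digits mentioned above.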
+ +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** + +- dict\[str, Any\] – Dictionary with serialized data. + +#### from_dict + +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. + +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + ## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. @@ -483,7 +613,7 @@ __init__( local_files_only: bool = False, meta_fields_to_embed: list[str] | None = None, meta_data_separator: str = "\n", -) +) -> None ``` Creates an instance of the 'FastembedRanker'. 
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary. #### warm_up ```python -warm_up() +warm_up() -> None ``` Initializes the component. diff --git a/docs-website/reference_versioned_docs/version-2.18/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.18/integrations-api/fastembed.md index 9d8857b931..46e42a35fb 100644 --- a/docs-website/reference_versioned_docs/version-2.18/integrations-api/fastembed.md +++ b/docs-website/reference_versioned_docs/version-2.18/integrations-api/fastembed.md @@ -5,18 +5,17 @@ description: "FastEmbed integration for Haystack" slug: "/fastembed-embedders" --- - -## Module haystack\_integrations.components.embedders.fastembed.fastembed\_document\_embedder - - +## haystack_integrations.components.embedders.fastembed.fastembed_document_embedder ### FastembedDocumentEmbedder FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models. + The embedding of each Document is stored in the `embedding` field of the Document. Usage example: + ```python # To use this component, install the "fastembed-haystack" package. 
# pip install fastembed-haystack @@ -57,104 +56,94 @@ print(f"Document Embedding: {result['documents'][0].embedding}") print(f"Embedding Dimension: {len(result['documents'][0].embedding)}") ``` - - -#### FastembedDocumentEmbedder.\_\_init\_\_ +#### __init__ ```python -def __init__(model: str = "BAAI/bge-small-en-v1.5", - cache_dir: str | None = None, - threads: int | None = None, - prefix: str = "", - suffix: str = "", - batch_size: int = 256, - progress_bar: bool = True, - parallel: int | None = None, - local_files_only: bool = False, - meta_fields_to_embed: list[str] | None = None, - embedding_separator: str = "\n") -> None +__init__( + model: str = "BAAI/bge-small-en-v1.5", + cache_dir: str | None = None, + threads: int | None = None, + prefix: str = "", + suffix: str = "", + batch_size: int = 256, + progress_bar: bool = True, + parallel: int | None = None, + local_files_only: bool = False, + meta_fields_to_embed: list[str] | None = None, + embedding_separator: str = "\n", +) -> None ``` Create an FastembedDocumentEmbedder component. -**Arguments**: - -- `model`: Local path or name of the model in Hugging Face's model hub, -such as `BAAI/bge-small-en-v1.5`. -- `cache_dir`: The path to the cache directory. -Can be set using the `FASTEMBED_CACHE_PATH` env variable. -Defaults to `fastembed_cache` in the system's temp directory. -- `threads`: The number of threads single onnxruntime session can use. Defaults to None. -- `prefix`: A string to add to the beginning of each text. -- `suffix`: A string to add to the end of each text. -- `batch_size`: Number of strings to encode at once. -- `progress_bar`: If `True`, displays progress bar during embedding. -- `parallel`: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. -If 0, use all available cores. -If None, don't use data-parallel processing, use default onnxruntime threading instead. -- `local_files_only`: If `True`, only use the model files in the `cache_dir`. 
-- `meta_fields_to_embed`: List of meta fields that should be embedded along with the Document content. -- `embedding_separator`: Separator used to concatenate the meta fields to the Document content. - - - -#### FastembedDocumentEmbedder.to\_dict +**Parameters:** + +- **model** (str) – Local path or name of the model in Hugging Face's model hub, + such as `BAAI/bge-small-en-v1.5`. +- **cache_dir** (str | None) – The path to the cache directory. + Can be set using the `FASTEMBED_CACHE_PATH` env variable. + Defaults to `fastembed_cache` in the system's temp directory. +- **threads** (int | None) – The number of threads single onnxruntime session can use. Defaults to None. +- **prefix** (str) – A string to add to the beginning of each text. +- **suffix** (str) – A string to add to the end of each text. +- **batch_size** (int) – Number of strings to encode at once. +- **progress_bar** (bool) – If `True`, displays progress bar during embedding. +- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. + If 0, use all available cores. + If None, don't use data-parallel processing, use default onnxruntime threading instead. +- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`. +- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be embedded along with the Document content. +- **embedding_separator** (str) – Separator used to concatenate the meta fields to the Document content. + +#### to_dict ```python -def to_dict() -> dict[str, Any] +to_dict() -> dict[str, Any] ``` Serializes the component to a dictionary. -**Returns**: +**Returns:** -Dictionary with serialized data. +- dict\[str, Any\] – Dictionary with serialized data. - - -#### FastembedDocumentEmbedder.warm\_up +#### warm_up ```python -def warm_up() -> None +warm_up() -> None ``` Initializes the component. 
- - -#### FastembedDocumentEmbedder.run +#### run ```python -@component.output_types(documents=list[Document]) -def run(documents: list[Document]) -> dict[str, list[Document]] +run(documents: list[Document]) -> dict[str, list[Document]] ``` Embeds a list of Documents. -**Arguments**: - -- `documents`: List of Documents to embed. +**Parameters:** -**Raises**: +- **documents** (list\[Document\]) – List of Documents to embed. -- `TypeError`: If the input is not a list of Documents. +**Returns:** -**Returns**: - -A dictionary with the following keys: +- dict\[str, list\[Document\]\] – A dictionary with the following keys: - `documents`: List of Documents with each Document's `embedding` field set to the computed embeddings. - +**Raises:** -## Module haystack\_integrations.components.embedders.fastembed.fastembed\_sparse\_document\_embedder +- TypeError – If the input is not a list of Documents. - +## haystack_integrations.components.embedders.fastembed.fastembed_sparse_document_embedder ### FastembedSparseDocumentEmbedder FastembedSparseDocumentEmbedder computes Document embeddings using Fastembed sparse models. 
 Usage example:
+
 ```python
 from haystack_integrations.components.embedders.fastembed import FastembedSparseDocumentEmbedder
 from haystack.dataclasses import Document
@@ -192,103 +181,93 @@ print(f"Document Sparse Embedding: {result['documents'][0].sparse_embedding}")
 print(f"Sparse Embedding Dimension: {len(result['documents'][0].sparse_embedding)}")
 ```
 
-
-
-#### FastembedSparseDocumentEmbedder.\_\_init\_\_
+#### __init__
 
 ```python
-def __init__(model: str = "prithivida/Splade_PP_en_v1",
-             cache_dir: str | None = None,
-             threads: int | None = None,
-             batch_size: int = 32,
-             progress_bar: bool = True,
-             parallel: int | None = None,
-             local_files_only: bool = False,
-             meta_fields_to_embed: list[str] | None = None,
-             embedding_separator: str = "\n",
-             model_kwargs: dict[str, Any] | None = None) -> None
+__init__(
+    model: str = "prithivida/Splade_PP_en_v1",
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 32,
+    progress_bar: bool = True,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    embedding_separator: str = "\n",
+    model_kwargs: dict[str, Any] | None = None,
+) -> None
 ```
 
 Create a FastembedSparseDocumentEmbedder component.
 
-**Arguments**:
-
-- `model`: Local path or name of the model in Hugging Face's model hub,
-such as `prithivida/Splade_PP_en_v1`.
-- `cache_dir`: The path to the cache directory.
-Can be set using the `FASTEMBED_CACHE_PATH` env variable.
-Defaults to `fastembed_cache` in the system's temp directory.
-- `threads`: The number of threads single onnxruntime session can use.
-- `batch_size`: Number of strings to encode at once.
-- `progress_bar`: If `True`, displays progress bar during embedding.
-- `parallel`: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
-If 0, use all available cores.
-If None, don't use data-parallel processing, use default onnxruntime threading instead.
-- `local_files_only`: If `True`, only use the model files in the `cache_dir`. -- `meta_fields_to_embed`: List of meta fields that should be embedded along with the Document content. -- `embedding_separator`: Separator used to concatenate the meta fields to the Document content. -- `model_kwargs`: Dictionary containing model parameters such as `k`, `b`, `avg_len`, `language`. - - - -#### FastembedSparseDocumentEmbedder.to\_dict +**Parameters:** + +- **model** (str) – Local path or name of the model in Hugging Face's model hub, + such as `prithivida/Splade_PP_en_v1`. +- **cache_dir** (str | None) – The path to the cache directory. + Can be set using the `FASTEMBED_CACHE_PATH` env variable. + Defaults to `fastembed_cache` in the system's temp directory. +- **threads** (int | None) – The number of threads single onnxruntime session can use. +- **batch_size** (int) – Number of strings to encode at once. +- **progress_bar** (bool) – If `True`, displays progress bar during embedding. +- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. + If 0, use all available cores. + If None, don't use data-parallel processing, use default onnxruntime threading instead. +- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`. +- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be embedded along with the Document content. +- **embedding_separator** (str) – Separator used to concatenate the meta fields to the Document content. +- **model_kwargs** (dict\[str, Any\] | None) – Dictionary containing model parameters such as `k`, `b`, `avg_len`, `language`. + +#### to_dict ```python -def to_dict() -> dict[str, Any] +to_dict() -> dict[str, Any] ``` Serializes the component to a dictionary. -**Returns**: - -Dictionary with serialized data. +**Returns:** - +- dict\[str, Any\] – Dictionary with serialized data. 
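The `to_dict`/`from_dict` pair follows the usual Haystack component serialization pattern: `to_dict` captures the init parameters, and `from_dict` rebuilds an equivalent instance from them. A minimal sketch with a toy class (simplified; the real methods also record the component's import path):

```python
# Toy sketch of the to_dict/from_dict round trip used by Haystack
# components (illustrative only; not the actual serialization code).

class ToySparseEmbedder:
    def __init__(self, model: str = "prithivida/Splade_PP_en_v1", batch_size: int = 32):
        self.model = model
        self.batch_size = batch_size

    def to_dict(self) -> dict:
        # Everything needed to reconstruct the instance goes here.
        return {"init_parameters": {"model": self.model, "batch_size": self.batch_size}}

    @classmethod
    def from_dict(cls, data: dict) -> "ToySparseEmbedder":
        return cls(**data["init_parameters"])

original = ToySparseEmbedder(batch_size=64)
restored = ToySparseEmbedder.from_dict(original.to_dict())
# restored carries the same init parameters as original
```

This is why a deserialized component behaves identically to the one that was serialized: only constructor arguments are stored, and the model itself is reloaded on `warm_up`.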
-#### FastembedSparseDocumentEmbedder.warm\_up +#### warm_up ```python -def warm_up() -> None +warm_up() -> None ``` Initializes the component. - - -#### FastembedSparseDocumentEmbedder.run +#### run ```python -@component.output_types(documents=list[Document]) -def run(documents: list[Document]) -> dict[str, list[Document]] +run(documents: list[Document]) -> dict[str, list[Document]] ``` Embeds a list of Documents. -**Arguments**: - -- `documents`: List of Documents to embed. - -**Raises**: +**Parameters:** -- `TypeError`: If the input is not a list of Documents. +- **documents** (list\[Document\]) – List of Documents to embed. -**Returns**: +**Returns:** -A dictionary with the following keys: +- dict\[str, list\[Document\]\] – A dictionary with the following keys: - `documents`: List of Documents with each Document's `sparse_embedding` -field set to the computed embeddings. + field set to the computed embeddings. - +**Raises:** -## Module haystack\_integrations.components.embedders.fastembed.fastembed\_sparse\_text\_embedder +- TypeError – If the input is not a list of Documents. - +## haystack_integrations.components.embedders.fastembed.fastembed_sparse_text_embedder ### FastembedSparseTextEmbedder FastembedSparseTextEmbedder computes string embedding using fastembed sparse models. 
Usage example: + ```python from haystack_integrations.components.embedders.fastembed import FastembedSparseTextEmbedder @@ -302,95 +281,85 @@ sparse_text_embedder = FastembedSparseTextEmbedder( sparse_embedding = sparse_text_embedder.run(text)["sparse_embedding"] ``` - - -#### FastembedSparseTextEmbedder.\_\_init\_\_ +#### __init__ ```python -def __init__(model: str = "prithivida/Splade_PP_en_v1", - cache_dir: str | None = None, - threads: int | None = None, - progress_bar: bool = True, - parallel: int | None = None, - local_files_only: bool = False, - model_kwargs: dict[str, Any] | None = None) -> None +__init__( + model: str = "prithivida/Splade_PP_en_v1", + cache_dir: str | None = None, + threads: int | None = None, + progress_bar: bool = True, + parallel: int | None = None, + local_files_only: bool = False, + model_kwargs: dict[str, Any] | None = None, +) -> None ``` Create a FastembedSparseTextEmbedder component. -**Arguments**: - -- `model`: Local path or name of the model in Fastembed's model hub, such as `prithivida/Splade_PP_en_v1` -- `cache_dir`: The path to the cache directory. -Can be set using the `FASTEMBED_CACHE_PATH` env variable. -Defaults to `fastembed_cache` in the system's temp directory. -- `threads`: The number of threads single onnxruntime session can use. Defaults to None. -- `progress_bar`: If `True`, displays progress bar during embedding. -- `parallel`: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. -If 0, use all available cores. -If None, don't use data-parallel processing, use default onnxruntime threading instead. -- `local_files_only`: If `True`, only use the model files in the `cache_dir`. -- `model_kwargs`: Dictionary containing model parameters such as `k`, `b`, `avg_len`, `language`. +**Parameters:** - +- **model** (str) – Local path or name of the model in Fastembed's model hub, such as `prithivida/Splade_PP_en_v1` +- **cache_dir** (str | None) – The path to the cache directory. 
+  Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+  Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads single onnxruntime session can use. Defaults to None.
+- **progress_bar** (bool) – If `True`, displays progress bar during embedding.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **model_kwargs** (dict\[str, Any\] | None) – Dictionary containing model parameters such as `k`, `b`, `avg_len`, `language`.
 
-#### FastembedSparseTextEmbedder.to\_dict
+#### to_dict
 
 ```python
-def to_dict() -> dict[str, Any]
+to_dict() -> dict[str, Any]
 ```
 
 Serializes the component to a dictionary.
 
-**Returns**:
+**Returns:**
 
-Dictionary with serialized data.
+- dict\[str, Any\] – Dictionary with serialized data.
 
-
-
-#### FastembedSparseTextEmbedder.warm\_up
+#### warm_up
 
 ```python
-def warm_up() -> None
+warm_up() -> None
 ```
 
 Initializes the component.
 
-
-
-#### FastembedSparseTextEmbedder.run
+#### run
 
 ```python
-@component.output_types(sparse_embedding=SparseEmbedding)
-def run(text: str) -> dict[str, SparseEmbedding]
+run(text: str) -> dict[str, SparseEmbedding]
 ```
 
 Embeds text using the Fastembed model.
 
-**Arguments**:
-
-- `text`: A string to embed.
+**Parameters:**
 
-**Raises**:
+- **text** (str) – A string to embed.
 
-- `TypeError`: If the input is not a string.
+**Returns:**
 
-**Returns**:
-
-A dictionary with the following keys:
+- dict\[str, SparseEmbedding\] – A dictionary with the following keys:
 - `sparse_embedding`: The sparse embedding of the input text.
- +**Raises:** -## Module haystack\_integrations.components.embedders.fastembed.fastembed\_text\_embedder +- TypeError – If the input is not a string. - +## haystack_integrations.components.embedders.fastembed.fastembed_text_embedder ### FastembedTextEmbedder FastembedTextEmbedder computes string embedding using fastembed embedding models. Usage example: + ```python from haystack_integrations.components.embedders.fastembed import FastembedTextEmbedder @@ -404,100 +373,219 @@ text_embedder = FastembedTextEmbedder( embedding = text_embedder.run(text)["embedding"] ``` - - -#### FastembedTextEmbedder.\_\_init\_\_ +#### __init__ ```python -def __init__(model: str = "BAAI/bge-small-en-v1.5", - cache_dir: str | None = None, - threads: int | None = None, - prefix: str = "", - suffix: str = "", - progress_bar: bool = True, - parallel: int | None = None, - local_files_only: bool = False) -> None +__init__( + model: str = "BAAI/bge-small-en-v1.5", + cache_dir: str | None = None, + threads: int | None = None, + prefix: str = "", + suffix: str = "", + progress_bar: bool = True, + parallel: int | None = None, + local_files_only: bool = False, +) -> None ``` Create a FastembedTextEmbedder component. -**Arguments**: - -- `model`: Local path or name of the model in Fastembed's model hub, such as `BAAI/bge-small-en-v1.5` -- `cache_dir`: The path to the cache directory. -Can be set using the `FASTEMBED_CACHE_PATH` env variable. -Defaults to `fastembed_cache` in the system's temp directory. -- `threads`: The number of threads single onnxruntime session can use. Defaults to None. -- `prefix`: A string to add to the beginning of each text. -- `suffix`: A string to add to the end of each text. -- `progress_bar`: If `True`, displays progress bar during embedding. -- `parallel`: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. -If 0, use all available cores. 
-If None, don't use data-parallel processing, use default onnxruntime threading instead. -- `local_files_only`: If `True`, only use the model files in the `cache_dir`. +**Parameters:** - +- **model** (str) – Local path or name of the model in Fastembed's model hub, such as `BAAI/bge-small-en-v1.5` +- **cache_dir** (str | None) – The path to the cache directory. + Can be set using the `FASTEMBED_CACHE_PATH` env variable. + Defaults to `fastembed_cache` in the system's temp directory. +- **threads** (int | None) – The number of threads single onnxruntime session can use. Defaults to None. +- **prefix** (str) – A string to add to the beginning of each text. +- **suffix** (str) – A string to add to the end of each text. +- **progress_bar** (bool) – If `True`, displays progress bar during embedding. +- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. + If 0, use all available cores. + If None, don't use data-parallel processing, use default onnxruntime threading instead. +- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`. -#### FastembedTextEmbedder.to\_dict +#### to_dict ```python -def to_dict() -> dict[str, Any] +to_dict() -> dict[str, Any] ``` Serializes the component to a dictionary. -**Returns**: +**Returns:** -Dictionary with serialized data. +- dict\[str, Any\] – Dictionary with serialized data. - - -#### FastembedTextEmbedder.warm\_up +#### warm_up ```python -def warm_up() -> None +warm_up() -> None ``` Initializes the component. - - -#### FastembedTextEmbedder.run +#### run ```python -@component.output_types(embedding=list[float]) -def run(text: str) -> dict[str, list[float]] +run(text: str) -> dict[str, list[float]] ``` Embeds text using the Fastembed model. -**Arguments**: +**Parameters:** -- `text`: A string to embed. +- **text** (str) – A string to embed. -**Raises**: +**Returns:** -- `TypeError`: If the input is not a string. 
+- dict\[str, list\[float\]\] – A dictionary with the following keys:
+- `embedding`: A list of floats representing the embedding of the input text.
 
-**Returns**:
 
-A dictionary with the following keys:
-- `embedding`: A list of floats representing the embedding of the input text.
+
+**Raises:**
+
+- TypeError – If the input is not a string.
+
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+    model_name: str = "colbert-ir/colbertv2.0",
+    top_k: int = 10,
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 64,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    meta_data_separator: str = "\n",
+    score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+  [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory. + Can be set using the `FASTEMBED_CACHE_PATH` env variable. + Defaults to `fastembed_cache` in the system's temp directory. +- **threads** (int | None) – The number of threads single onnxruntime session can use. Defaults to None. +- **batch_size** (int) – Number of strings to encode at once. +- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. + If 0, use all available cores. + If None, don't use data-parallel processing, use default onnxruntime threading instead. +- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`. +- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated + with the document content for reranking. +- **meta_data_separator** (str) – Separator used to concatenate the meta fields + to the Document content. +- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned. + Note that ColBERT scores are unnormalized sums and typically range from 3 to 25. + +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** - +- dict\[str, Any\] – Dictionary with serialized data. -## Module haystack\_integrations.components.rankers.fastembed.ranker +#### from_dict - +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. 
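`warm_up` follows a lazy-initialization pattern: constructing the component stays cheap, and the (potentially large) model is loaded once, on the first `warm_up` call. A minimal sketch with a string stand-in for the model load (illustrative only; the real component loads a Fastembed ColBERT model here):

```python
# Sketch of the lazy-initialization pattern behind warm_up()
# (illustrative; the stand-in string replaces a real model load).

class LazyRanker:
    def __init__(self, model_name: str = "colbert-ir/colbertv2.0"):
        self.model_name = model_name
        self._model = None              # nothing loaded at construction

    def warm_up(self) -> None:
        if self._model is None:         # idempotent: repeat calls are no-ops
            self._model = f"loaded:{self.model_name}"

ranker = LazyRanker()
ranker.warm_up()
ranker.warm_up()                        # second call changes nothing
```

In a pipeline, Haystack calls `warm_up` for you before the first `run`; calling it yourself is only needed when using the component standalone.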
+ +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + +## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. 
Usage example: + ```python from haystack import Document from haystack_integrations.components.rankers.fastembed import FastembedRanker @@ -512,109 +600,99 @@ print(output["documents"][0].content) # Berlin ``` - - -#### FastembedRanker.\_\_init\_\_ +#### __init__ ```python -def __init__(model_name: str = "Xenova/ms-marco-MiniLM-L-6-v2", - top_k: int = 10, - cache_dir: str | None = None, - threads: int | None = None, - batch_size: int = 64, - parallel: int | None = None, - local_files_only: bool = False, - meta_fields_to_embed: list[str] | None = None, - meta_data_separator: str = "\n") +__init__( + model_name: str = "Xenova/ms-marco-MiniLM-L-6-v2", + top_k: int = 10, + cache_dir: str | None = None, + threads: int | None = None, + batch_size: int = 64, + parallel: int | None = None, + local_files_only: bool = False, + meta_fields_to_embed: list[str] | None = None, + meta_data_separator: str = "\n", +) -> None ``` Creates an instance of the 'FastembedRanker'. -**Arguments**: - -- `model_name`: Fastembed model name. Check the list of supported models in the [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/). -- `top_k`: The maximum number of documents to return. -- `cache_dir`: The path to the cache directory. -Can be set using the `FASTEMBED_CACHE_PATH` env variable. -Defaults to `fastembed_cache` in the system's temp directory. -- `threads`: The number of threads single onnxruntime session can use. Defaults to None. -- `batch_size`: Number of strings to encode at once. -- `parallel`: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. -If 0, use all available cores. -If None, don't use data-parallel processing, use default onnxruntime threading instead. -- `local_files_only`: If `True`, only use the model files in the `cache_dir`. -- `meta_fields_to_embed`: List of meta fields that should be concatenated -with the document content for reranking. 
-- `meta_data_separator`: Separator used to concatenate the meta fields -to the Document content. - - - -#### FastembedRanker.to\_dict +**Parameters:** + +- **model_name** (str) – Fastembed model name. Check the list of supported models in the [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/). +- **top_k** (int) – The maximum number of documents to return. +- **cache_dir** (str | None) – The path to the cache directory. + Can be set using the `FASTEMBED_CACHE_PATH` env variable. + Defaults to `fastembed_cache` in the system's temp directory. +- **threads** (int | None) – The number of threads single onnxruntime session can use. Defaults to None. +- **batch_size** (int) – Number of strings to encode at once. +- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. + If 0, use all available cores. + If None, don't use data-parallel processing, use default onnxruntime threading instead. +- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`. +- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated + with the document content for reranking. +- **meta_data_separator** (str) – Separator used to concatenate the meta fields + to the Document content. + +#### to_dict ```python -def to_dict() -> dict[str, Any] +to_dict() -> dict[str, Any] ``` Serializes the component to a dictionary. -**Returns**: - -Dictionary with serialized data. +**Returns:** - +- dict\[str, Any\] – Dictionary with serialized data. -#### FastembedRanker.from\_dict +#### from_dict ```python -@classmethod -def from_dict(cls, data: dict[str, Any]) -> "FastembedRanker" +from_dict(data: dict[str, Any]) -> FastembedRanker ``` Deserializes the component from a dictionary. -**Arguments**: - -- `data`: The dictionary to deserialize from. 
+**Parameters:** -**Returns**: +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. -The deserialized component. +**Returns:** - +- FastembedRanker – The deserialized component. -#### FastembedRanker.warm\_up +#### warm_up ```python -def warm_up() +warm_up() -> None ``` Initializes the component. - - -#### FastembedRanker.run +#### run ```python -@component.output_types(documents=list[Document]) -def run(query: str, - documents: list[Document], - top_k: int | None = None) -> dict[str, list[Document]] +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] ``` Returns a list of documents ranked by their similarity to the given query, using FastEmbed. -**Arguments**: +**Parameters:** -- `query`: The input query to compare the documents to. -- `documents`: A list of documents to be ranked. -- `top_k`: The maximum number of documents to return. +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. -**Raises**: +**Returns:** -- `ValueError`: If `top_k` is not > 0. +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. -**Returns**: +**Raises:** -A dictionary with the following keys: -- `documents`: A list of documents closest to the query, sorted from most similar to least similar. +- ValueError – If `top_k` is not > 0. 
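The late-interaction (MaxSim) scoring that the new FastembedLateInteractionRanker documents — sum, over query tokens, of the best-matching document token — can be sketched as follows. This is an illustrative toy re-implementation with made-up 2-d token vectors, not the actual Fastembed/ColBERT code path, and the helper names are hypothetical.

```python
# Illustrative sketch of late-interaction (MaxSim) scoring, as used by
# ColBERT-style rankers. NOT the Fastembed implementation: real models
# produce one high-dimensional embedding per token; here we use toy
# 2-d vectors so the arithmetic is easy to follow.

def maxsim_score(query_emb, doc_emb):
    """Score one document: for each query token vector, take the maximum
    dot product over all document token vectors, then sum those maxima.
    Note the result is an unnormalized sum, which is why the documented
    `score_threshold` values are not in [0, 1]."""
    score = 0.0
    for q in query_emb:
        best = max(sum(qi * di for qi, di in zip(q, d)) for d in doc_emb)
        score += best
    return score

def rank_by_maxsim(query_emb, docs):
    """docs: mapping of doc name -> list of token vectors.
    Returns names sorted from most to least similar."""
    scored = {name: maxsim_score(query_emb, emb) for name, emb in docs.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Hypothetical token embeddings: two query tokens, two candidate docs.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[0.9, 0.1], [0.1, 0.9]],  # one token matches each query token
    "doc_b": [[0.2, 0.2]],              # matches neither query token well
}

print(rank_by_maxsim(query, docs))  # ['doc_a', 'doc_b']
```

The ranker then truncates this ordering to `top_k` and, if `score_threshold` is set, drops documents whose MaxSim sum falls below it.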
diff --git a/docs-website/reference_versioned_docs/version-2.19/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.19/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.19/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.19/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
 ### FastembedDocumentEmbedder
 
 FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
 The embedding of each Document is stored in the `embedding` field of the Document.
 
 Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
 
 - TypeError – If the input is not a string.
 
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+    model_name: str = "colbert-ir/colbertv2.0",
+    top_k: int = 10,
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 64,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    meta_data_separator: str = "\n",
+    score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+  [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+  Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+  Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+  with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+  to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+  Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+ +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** + +- dict\[str, Any\] – Dictionary with serialized data. + +#### from_dict + +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. + +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + ## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. @@ -483,7 +613,7 @@ __init__( local_files_only: bool = False, meta_fields_to_embed: list[str] | None = None, meta_data_separator: str = "\n", -) +) -> None ``` Creates an instance of the 'FastembedRanker'. 
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
 #### warm_up
 
 ```python
-warm_up()
+warm_up() -> None
 ```
 
 Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.20/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.20/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.20/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.20/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
 ### FastembedDocumentEmbedder
 
 FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
 The embedding of each Document is stored in the `embedding` field of the Document.
 
 Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
 
 - TypeError – If the input is not a string.
 
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+    model_name: str = "colbert-ir/colbertv2.0",
+    top_k: int = 10,
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 64,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    meta_data_separator: str = "\n",
+    score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+  [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+  Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+  Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+  with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+  to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+  Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+ +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** + +- dict\[str, Any\] – Dictionary with serialized data. + +#### from_dict + +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. + +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + ## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. @@ -483,7 +613,7 @@ __init__( local_files_only: bool = False, meta_fields_to_embed: list[str] | None = None, meta_data_separator: str = "\n", -) +) -> None ``` Creates an instance of the 'FastembedRanker'. 
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
 #### warm_up
 
 ```python
-warm_up()
+warm_up() -> None
 ```
 
 Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.21/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.21/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.21/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.21/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
 ### FastembedDocumentEmbedder
 
 FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
 The embedding of each Document is stored in the `embedding` field of the Document.
 
 Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
 
 - TypeError – If the input is not a string.
 
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+    model_name: str = "colbert-ir/colbertv2.0",
+    top_k: int = 10,
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 64,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    meta_data_separator: str = "\n",
+    score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+  [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+  Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+  Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+  with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+  to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+  Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+ +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** + +- dict\[str, Any\] – Dictionary with serialized data. + +#### from_dict + +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. + +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + ## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. @@ -483,7 +613,7 @@ __init__( local_files_only: bool = False, meta_fields_to_embed: list[str] | None = None, meta_data_separator: str = "\n", -) +) -> None ``` Creates an instance of the 'FastembedRanker'. 
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
 #### warm_up
 
 ```python
-warm_up()
+warm_up() -> None
 ```
 
 Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.22/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.22/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.22/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.22/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
 ### FastembedDocumentEmbedder
 
 FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
 The embedding of each Document is stored in the `embedding` field of the Document.
 
 Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
 
 - TypeError – If the input is not a string.
 
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+    model_name: str = "colbert-ir/colbertv2.0",
+    top_k: int = 10,
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 64,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    meta_data_separator: str = "\n",
+    score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+  [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+  Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+  Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+  with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+  to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+  Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+ +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** + +- dict\[str, Any\] – Dictionary with serialized data. + +#### from_dict + +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. + +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + ## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. @@ -483,7 +613,7 @@ __init__( local_files_only: bool = False, meta_fields_to_embed: list[str] | None = None, meta_data_separator: str = "\n", -) +) -> None ``` Creates an instance of the 'FastembedRanker'. 
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
 #### warm_up
 
 ```python
-warm_up()
+warm_up() -> None
 ```
 
 Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.23/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.23/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.23/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.23/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
 ### FastembedDocumentEmbedder
 
 FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
 The embedding of each Document is stored in the `embedding` field of the Document.
 
 Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
 
 - TypeError – If the input is not a string.
 
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+    model_name: str = "colbert-ir/colbertv2.0",
+    top_k: int = 10,
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 64,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    meta_data_separator: str = "\n",
+    score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+  [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+  Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+  Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+  with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+  to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+  Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+ +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** + +- dict\[str, Any\] – Dictionary with serialized data. + +#### from_dict + +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. + +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + ## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. @@ -483,7 +613,7 @@ __init__( local_files_only: bool = False, meta_fields_to_embed: list[str] | None = None, meta_data_separator: str = "\n", -) +) -> None ``` Creates an instance of the 'FastembedRanker'. 
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.
 #### warm_up
 
 ```python
-warm_up()
+warm_up() -> None
 ```
 
 Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.24/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.24/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.24/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.24/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"
 ### FastembedDocumentEmbedder
 
 FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+
 The embedding of each Document is stored in the `embedding` field of the Document.
 
 Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.
 
 - TypeError – If the input is not a string.
 
+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+    model_name: str = "colbert-ir/colbertv2.0",
+    top_k: int = 10,
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 64,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    meta_data_separator: str = "\n",
+    score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+  [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+  Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+  Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+  with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+  to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+  Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+ +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** + +- dict\[str, Any\] – Dictionary with serialized data. + +#### from_dict + +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. + +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + ## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. @@ -483,7 +613,7 @@ __init__( local_files_only: bool = False, meta_fields_to_embed: list[str] | None = None, meta_data_separator: str = "\n", -) +) -> None ``` Creates an instance of the 'FastembedRanker'. 
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.

#### warm_up

```python
-warm_up()
+warm_up() -> None
```

Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.25/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.25/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.25/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.25/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"

### FastembedDocumentEmbedder

FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+The embedding of each Document is stored in the `embedding` field of the Document.

Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.

- TypeError – If the input is not a string.

+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+    model_name: str = "colbert-ir/colbertv2.0",
+    top_k: int = 10,
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 64,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    meta_data_separator: str = "\n",
+    score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+  [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+  Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+  Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+  with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+  to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+  Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+ +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** + +- dict\[str, Any\] – Dictionary with serialized data. + +#### from_dict + +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. + +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + ## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. @@ -483,7 +613,7 @@ __init__( local_files_only: bool = False, meta_fields_to_embed: list[str] | None = None, meta_data_separator: str = "\n", -) +) -> None ``` Creates an instance of the 'FastembedRanker'. 
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.

#### warm_up

```python
-warm_up()
+warm_up() -> None
```

Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.26/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.26/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.26/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.26/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"

### FastembedDocumentEmbedder

FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+The embedding of each Document is stored in the `embedding` field of the Document.

Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.

- TypeError – If the input is not a string.

+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+    model_name: str = "colbert-ir/colbertv2.0",
+    top_k: int = 10,
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 64,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    meta_data_separator: str = "\n",
+    score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+  [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+  Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+  Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+  with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+  to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+  Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+ +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** + +- dict\[str, Any\] – Dictionary with serialized data. + +#### from_dict + +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. + +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + ## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. @@ -483,7 +613,7 @@ __init__( local_files_only: bool = False, meta_fields_to_embed: list[str] | None = None, meta_data_separator: str = "\n", -) +) -> None ``` Creates an instance of the 'FastembedRanker'. 
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary.

#### warm_up

```python
-warm_up()
+warm_up() -> None
```

Initializes the component.
diff --git a/docs-website/reference_versioned_docs/version-2.27/integrations-api/fastembed.md b/docs-website/reference_versioned_docs/version-2.27/integrations-api/fastembed.md
index 256b8d87bc..46e42a35fb 100644
--- a/docs-website/reference_versioned_docs/version-2.27/integrations-api/fastembed.md
+++ b/docs-website/reference_versioned_docs/version-2.27/integrations-api/fastembed.md
@@ -11,6 +11,7 @@ slug: "/fastembed-embedders"

### FastembedDocumentEmbedder

FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
+The embedding of each Document is stored in the `embedding` field of the Document.

Usage example:
@@ -445,12 +446,141 @@ Embeds text using the Fastembed model.

- TypeError – If the input is not a string.

+## haystack_integrations.components.rankers.fastembed.late_interaction_ranker
+
+### FastembedLateInteractionRanker
+
+Ranks Documents based on their similarity to the query using ColBERT models via Fastembed.
+
+Uses late interaction (MaxSim) scoring to compute token-level similarity between
+query and document embeddings, then ranks documents accordingly.
+
+See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models.
+
+Usage example:
+
+```python
+from haystack import Document
+from haystack_integrations.components.rankers.fastembed import FastembedLateInteractionRanker
+
+ranker = FastembedLateInteractionRanker(model_name="colbert-ir/colbertv2.0", top_k=2)
+
+docs = [Document(content="Paris"), Document(content="Berlin")]
+query = "What is the capital of Germany?"
+output = ranker.run(query=query, documents=docs)
+print(output["documents"][0].content)
+
+# Berlin
+```
+
+#### __init__
+
+```python
+__init__(
+    model_name: str = "colbert-ir/colbertv2.0",
+    top_k: int = 10,
+    cache_dir: str | None = None,
+    threads: int | None = None,
+    batch_size: int = 64,
+    parallel: int | None = None,
+    local_files_only: bool = False,
+    meta_fields_to_embed: list[str] | None = None,
+    meta_data_separator: str = "\n",
+    score_threshold: float | None = None,
+) -> None
+```
+
+Creates an instance of the 'FastembedLateInteractionRanker'.
+
+**Parameters:**
+
+- **model_name** (str) – Fastembed ColBERT model name. Check the list of supported models in the
+  [Fastembed documentation](https://qdrant.github.io/fastembed/examples/Supported_Models/).
+- **top_k** (int) – The maximum number of documents to return.
+- **cache_dir** (str | None) – The path to the cache directory.
+  Can be set using the `FASTEMBED_CACHE_PATH` env variable.
+  Defaults to `fastembed_cache` in the system's temp directory.
+- **threads** (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
+- **batch_size** (int) – Number of strings to encode at once.
+- **parallel** (int | None) – If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets.
+  If 0, use all available cores.
+  If None, don't use data-parallel processing, use default onnxruntime threading instead.
+- **local_files_only** (bool) – If `True`, only use the model files in the `cache_dir`.
+- **meta_fields_to_embed** (list\[str\] | None) – List of meta fields that should be concatenated
+  with the document content for reranking.
+- **meta_data_separator** (str) – Separator used to concatenate the meta fields
+  to the Document content.
+- **score_threshold** (float | None) – If provided, only documents with a score above the threshold are returned.
+  Note that ColBERT scores are unnormalized sums and typically range from 3 to 25.
+ +#### to_dict + +```python +to_dict() -> dict[str, Any] +``` + +Serializes the component to a dictionary. + +**Returns:** + +- dict\[str, Any\] – Dictionary with serialized data. + +#### from_dict + +```python +from_dict(data: dict[str, Any]) -> FastembedLateInteractionRanker +``` + +Deserializes the component from a dictionary. + +**Parameters:** + +- **data** (dict\[str, Any\]) – The dictionary to deserialize from. + +**Returns:** + +- FastembedLateInteractionRanker – The deserialized component. + +#### warm_up + +```python +warm_up() -> None +``` + +Initializes the component. + +#### run + +```python +run( + query: str, documents: list[Document], top_k: int | None = None +) -> dict[str, list[Document]] +``` + +Returns a list of documents ranked by their similarity to the given query using ColBERT MaxSim scoring. + +**Parameters:** + +- **query** (str) – The input query to compare the documents to. +- **documents** (list\[Document\]) – A list of documents to be ranked. +- **top_k** (int | None) – The maximum number of documents to return. + +**Returns:** + +- dict\[str, list\[Document\]\] – A dictionary with the following keys: +- `documents`: A list of documents closest to the query, sorted from most similar to least similar. + +**Raises:** + +- ValueError – If `top_k` is not > 0. + ## haystack_integrations.components.rankers.fastembed.ranker ### FastembedRanker -Ranks Documents based on their similarity to the query using -[Fastembed models](https://qdrant.github.io/fastembed/examples/Supported_Models/). +Ranks Documents based on their similarity to the query using Fastembed models. + +See https://qdrant.github.io/fastembed/examples/Supported_Models/ for supported models. Documents are indexed from most to least semantically relevant to the query. @@ -483,7 +613,7 @@ __init__( local_files_only: bool = False, meta_fields_to_embed: list[str] | None = None, meta_data_separator: str = "\n", -) +) -> None ``` Creates an instance of the 'FastembedRanker'. 
@@ -537,7 +667,7 @@ Deserializes the component from a dictionary. #### warm_up ```python -warm_up() +warm_up() -> None ``` Initializes the component.
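
Reviewer note: the late-interaction (MaxSim) scoring that the new `FastembedLateInteractionRanker` docstrings describe can be sketched in plain Python. This is a toy illustration with hand-picked 2-dimensional token vectors, not the actual Fastembed/ColBERT implementation; real models produce one higher-dimensional vector per token.

```python
def maxsim_score(query_emb, doc_emb):
    # Late interaction (MaxSim): for each query token vector, take the
    # maximum dot product over all document token vectors, then sum
    # those per-token maxima over the query tokens.
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    return sum(max(dot(q, d) for d in doc_emb) for q in query_emb)

# Toy per-token embeddings (2-dimensional for readability).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5]]   # matches both query tokens fairly well
doc_b = [[0.0, 1.0]]               # matches only the second query token

print(maxsim_score(query, doc_a))  # 1.5
print(maxsim_score(query, doc_b))  # 1.0 -> doc_a ranks first
```

Because the score is an unnormalized sum over query tokens, it grows with query length, which is why the `score_threshold` docstring warns that typical ColBERT scores range from 3 to 25 rather than falling in [0, 1].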