| title | Embedders |
|---|---|
| id | embedders-api |
| description | Transforms queries into vectors to look for similar or relevant Documents. |
| slug | /embedders-api |
Bases: OpenAIDocumentEmbedder
Calculates document embeddings using OpenAI models deployed on Azure.
{/* test-ignore */}

```python
from haystack import Document
from haystack.components.embedders import AzureOpenAIDocumentEmbedder

doc = Document(content="I love pizza!")

document_embedder = AzureOpenAIDocumentEmbedder()

result = document_embedder.run([doc])
print(result['documents'][0].embedding)
# [0.017020374536514282, -0.023255806416273117, ...]
```

```python
__init__(
    azure_endpoint: str | None = None,
    api_version: str | None = "2023-05-15",
    azure_deployment: str = "text-embedding-ada-002",
    dimensions: int | None = None,
    api_key: Secret | None = Secret.from_env_var("AZURE_OPENAI_API_KEY", strict=False),
    azure_ad_token: Secret | None = Secret.from_env_var("AZURE_OPENAI_AD_TOKEN", strict=False),
    organization: str | None = None,
    prefix: str = "",
    suffix: str = "",
    batch_size: int = 32,
    progress_bar: bool = True,
    meta_fields_to_embed: list[str] | None = None,
    embedding_separator: str = "\n",
    timeout: float | None = None,
    max_retries: int | None = None,
    *,
    default_headers: dict[str, str] | None = None,
    azure_ad_token_provider: AzureADTokenProvider | None = None,
    http_client_kwargs: dict[str, Any] | None = None,
    raise_on_failure: bool = False
) -> None
```

Creates an AzureOpenAIDocumentEmbedder component.
Parameters:

- azure_endpoint (`str | None`) – The endpoint of the model deployed on Azure.
- api_version (`str | None`) – The version of the API to use.
- azure_deployment (`str`) – The name of the model deployed on Azure. The default model is `text-embedding-ada-002`.
- dimensions (`int | None`) – The number of dimensions of the resulting embeddings. Only supported in `text-embedding-3` and later models.
- api_key (`Secret | None`) – The Azure OpenAI API key. You can set it with the environment variable `AZURE_OPENAI_API_KEY`, or pass it with this parameter during initialization.
- azure_ad_token (`Secret | None`) – Microsoft Entra ID token (previously called Azure Active Directory); see Microsoft's Entra ID documentation for more information. You can set it with the environment variable `AZURE_OPENAI_AD_TOKEN`, or pass it with this parameter during initialization.
- organization (`str | None`) – Your organization ID. See OpenAI's Setting Up Your Organization for more information.
- prefix (`str`) – A string to add at the beginning of each text.
- suffix (`str`) – A string to add at the end of each text.
- batch_size (`int`) – Number of documents to embed at once.
- progress_bar (`bool`) – If `True`, shows a progress bar when running.
- meta_fields_to_embed (`list[str] | None`) – List of metadata fields to embed along with the document text.
- embedding_separator (`str`) – Separator used to concatenate the metadata fields to the document text.
- timeout (`float | None`) – The timeout for `AzureOpenAI` client calls, in seconds. If not set, defaults to either the `OPENAI_TIMEOUT` environment variable or 30 seconds.
- max_retries (`int | None`) – Maximum number of retries to contact Azure OpenAI after an internal error. If not set, defaults to either the `OPENAI_MAX_RETRIES` environment variable or 5 retries.
- default_headers (`dict[str, str] | None`) – Default headers to send to the `AzureOpenAI` client.
- azure_ad_token_provider (`AzureADTokenProvider | None`) – A function that returns a Microsoft Entra ID token; it is invoked on every request.
- http_client_kwargs (`dict[str, Any] | None`) – A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`. For more information, see the HTTPX documentation.
- raise_on_failure (`bool`) – Whether to raise an exception if the embedding request fails. If `False`, the component logs the error and continues processing the remaining documents. If `True`, it raises an exception on failure.
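Several of the parameters above (`prefix`, `suffix`, `meta_fields_to_embed`, `embedding_separator`) interact when the component prepares the text to embed. Here is a minimal, standalone sketch of that preparation, assuming the usual order (metadata values joined to the content by the separator, the whole wrapped by prefix and suffix); the helper name is hypothetical and not part of the Haystack API:

```python
# Hypothetical sketch: how meta_fields_to_embed, embedding_separator,
# prefix, and suffix combine into the text that is actually embedded.
def build_text_to_embed(content, meta, meta_fields_to_embed,
                        separator="\n", prefix="", suffix=""):
    # Collect the requested metadata values that are present on the document.
    meta_values = [str(meta[f]) for f in meta_fields_to_embed
                   if meta.get(f) is not None]
    # Meta fields come first, joined to the content by the separator.
    return prefix + separator.join(meta_values + [content]) + suffix

doc_meta = {"title": "Pizza", "author": "A. Cook"}
print(build_text_to_embed("I love pizza!", doc_meta, ["title"]))
# Pizza
# I love pizza!
```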
`to_dict() -> dict[str, Any]`

Serializes the component to a dictionary.

Returns:

- `dict[str, Any]` – Dictionary with serialized data.
`from_dict(data: dict[str, Any]) -> AzureOpenAIDocumentEmbedder`

Deserializes the component from a dictionary.

Parameters:

- data (`dict[str, Any]`) – Dictionary to deserialize from.

Returns:

- `AzureOpenAIDocumentEmbedder` – Deserialized component.
Bases: OpenAITextEmbedder
Embeds strings using OpenAI models deployed on Azure.
{/* test-ignore */}

```python
from haystack.components.embedders import AzureOpenAITextEmbedder

text_to_embed = "I love pizza!"

text_embedder = AzureOpenAITextEmbedder()

print(text_embedder.run(text_to_embed))
# {'embedding': [0.017020374536514282, -0.023255806416273117, ...],
# 'meta': {'model': 'text-embedding-ada-002-v2',
#          'usage': {'prompt_tokens': 4, 'total_tokens': 4}}}
```

```python
__init__(
    azure_endpoint: str | None = None,
    api_version: str | None = "2023-05-15",
    azure_deployment: str = "text-embedding-ada-002",
    dimensions: int | None = None,
    api_key: Secret | None = Secret.from_env_var("AZURE_OPENAI_API_KEY", strict=False),
    azure_ad_token: Secret | None = Secret.from_env_var("AZURE_OPENAI_AD_TOKEN", strict=False),
    organization: str | None = None,
    timeout: float | None = None,
    max_retries: int | None = None,
    prefix: str = "",
    suffix: str = "",
    *,
    default_headers: dict[str, str] | None = None,
    azure_ad_token_provider: AzureADTokenProvider | None = None,
    http_client_kwargs: dict[str, Any] | None = None
) -> None
```

Creates an AzureOpenAITextEmbedder component.
Parameters:

- azure_endpoint (`str | None`) – The endpoint of the model deployed on Azure.
- api_version (`str | None`) – The version of the API to use.
- azure_deployment (`str`) – The name of the model deployed on Azure. The default model is `text-embedding-ada-002`.
- dimensions (`int | None`) – The number of dimensions the resulting output embeddings should have. Only supported in `text-embedding-3` and later models.
- api_key (`Secret | None`) – The Azure OpenAI API key. You can set it with the environment variable `AZURE_OPENAI_API_KEY`, or pass it with this parameter during initialization.
- azure_ad_token (`Secret | None`) – Microsoft Entra ID token (previously called Azure Active Directory); see Microsoft's Entra ID documentation for more information. You can set it with the environment variable `AZURE_OPENAI_AD_TOKEN`, or pass it with this parameter during initialization.
- organization (`str | None`) – Your organization ID. See OpenAI's Setting Up Your Organization for more information.
- timeout (`float | None`) – The timeout for `AzureOpenAI` client calls, in seconds. If not set, defaults to either the `OPENAI_TIMEOUT` environment variable or 30 seconds.
- max_retries (`int | None`) – Maximum number of retries to contact Azure OpenAI after an internal error. If not set, defaults to either the `OPENAI_MAX_RETRIES` environment variable or 5 retries.
- prefix (`str`) – A string to add at the beginning of each text.
- suffix (`str`) – A string to add at the end of each text.
- default_headers (`dict[str, str] | None`) – Default headers to send to the `AzureOpenAI` client.
- azure_ad_token_provider (`AzureADTokenProvider | None`) – A function that returns a Microsoft Entra ID token; it is invoked on every request.
- http_client_kwargs (`dict[str, Any] | None`) – A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`. For more information, see the HTTPX documentation.
`to_dict() -> dict[str, Any]`

Serializes the component to a dictionary.

Returns:

- `dict[str, Any]` – Dictionary with serialized data.
`from_dict(data: dict[str, Any]) -> AzureOpenAITextEmbedder`

Deserializes the component from a dictionary.

Parameters:

- data (`dict[str, Any]`) – Dictionary to deserialize from.

Returns:

- `AzureOpenAITextEmbedder` – Deserialized component.
Embeds documents using Hugging Face APIs.
Use it with the following Hugging Face APIs:
{/* test-ignore */}

```python
from haystack.components.embedders import HuggingFaceAPIDocumentEmbedder
from haystack.utils import Secret
from haystack.dataclasses import Document

doc = Document(content="I love pizza!")

doc_embedder = HuggingFaceAPIDocumentEmbedder(api_type="serverless_inference_api",
                                              api_params={"model": "BAAI/bge-small-en-v1.5"},
                                              token=Secret.from_token("<your-api-key>"))

result = doc_embedder.run([doc])
print(result["documents"][0].embedding)
# [0.017020374536514282, -0.023255806416273117, ...]
```

{/* test-ignore */}

```python
from haystack.components.embedders import HuggingFaceAPIDocumentEmbedder
from haystack.utils import Secret
from haystack.dataclasses import Document

doc = Document(content="I love pizza!")

doc_embedder = HuggingFaceAPIDocumentEmbedder(api_type="inference_endpoints",
                                              api_params={"url": "<your-inference-endpoint-url>"},
                                              token=Secret.from_token("<your-api-key>"))

result = doc_embedder.run([doc])
print(result["documents"][0].embedding)
# [0.017020374536514282, -0.023255806416273117, ...]
```

{/* test-ignore */}

```python
from haystack.components.embedders import HuggingFaceAPIDocumentEmbedder
from haystack.dataclasses import Document

doc = Document(content="I love pizza!")

doc_embedder = HuggingFaceAPIDocumentEmbedder(api_type="text_embeddings_inference",
                                              api_params={"url": "http://localhost:8080"})

result = doc_embedder.run([doc])
print(result["documents"][0].embedding)
# [0.017020374536514282, -0.023255806416273117, ...]
```

```python
__init__(
    api_type: HFEmbeddingAPIType | str,
    api_params: dict[str, str],
    token: Secret | None = Secret.from_env_var(["HF_API_TOKEN", "HF_TOKEN"], strict=False),
    prefix: str = "",
    suffix: str = "",
    truncate: bool | None = True,
    normalize: bool | None = False,
    batch_size: int = 32,
    progress_bar: bool = True,
    meta_fields_to_embed: list[str] | None = None,
    embedding_separator: str = "\n",
    concurrency_limit: int = 4,
) -> None
```

Creates a HuggingFaceAPIDocumentEmbedder component.
Parameters:

- api_type (`HFEmbeddingAPIType | str`) – The type of Hugging Face API to use.
- api_params (`dict[str, str]`) – A dictionary with the following keys:
  - `model`: Hugging Face model ID. Required when `api_type` is `SERVERLESS_INFERENCE_API`.
  - `url`: URL of the inference endpoint. Required when `api_type` is `INFERENCE_ENDPOINTS` or `TEXT_EMBEDDINGS_INFERENCE`.
- token (`Secret | None`) – The Hugging Face token to use as HTTP bearer authorization. Check your HF token in your account settings.
- prefix (`str`) – A string to add at the beginning of each text.
- suffix (`str`) – A string to add at the end of each text.
- truncate (`bool | None`) – Truncates the input text to the maximum length supported by the model. Applicable when `api_type` is `TEXT_EMBEDDINGS_INFERENCE`, or `INFERENCE_ENDPOINTS` if the backend uses Text Embeddings Inference. If `api_type` is `SERVERLESS_INFERENCE_API`, this parameter is ignored.
- normalize (`bool | None`) – Normalizes the embeddings to unit length. Applicable when `api_type` is `TEXT_EMBEDDINGS_INFERENCE`, or `INFERENCE_ENDPOINTS` if the backend uses Text Embeddings Inference. If `api_type` is `SERVERLESS_INFERENCE_API`, this parameter is ignored.
- batch_size (`int`) – Number of documents to process at once.
- progress_bar (`bool`) – If `True`, shows a progress bar when running.
- meta_fields_to_embed (`list[str] | None`) – List of metadata fields to embed along with the document text.
- embedding_separator (`str`) – Separator used to concatenate the metadata fields to the document text.
- concurrency_limit (`int`) – The maximum number of requests allowed to run concurrently. This parameter is only used in the `run_async` method.
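The `concurrency_limit` parameter caps how many requests `run_async` has in flight at once. A minimal, self-contained sketch of that pattern using `asyncio.Semaphore`; the function names and the simulated request are illustrative, not the component's actual implementation:

```python
# Illustrative sketch of a concurrency limit for async embedding requests:
# an asyncio.Semaphore allows at most `concurrency_limit` concurrent calls.
import asyncio

async def embed_batch(batch, semaphore):
    async with semaphore:          # at most `concurrency_limit` holders at once
        await asyncio.sleep(0.01)  # stand-in for the real HTTP call
        return [f"embedding-for-{text}" for text in batch]

async def embed_all(batches, concurrency_limit=4):
    semaphore = asyncio.Semaphore(concurrency_limit)
    # gather preserves batch order even though batches run concurrently
    results = await asyncio.gather(*(embed_batch(b, semaphore) for b in batches))
    return [e for batch in results for e in batch]

embeddings = asyncio.run(embed_all([["a", "b"], ["c"]], concurrency_limit=2))
print(embeddings)
# ['embedding-for-a', 'embedding-for-b', 'embedding-for-c']
```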
`to_dict() -> dict[str, Any]`

Serializes the component to a dictionary.

Returns:

- `dict[str, Any]` – Dictionary with serialized data.
`from_dict(data: dict[str, Any]) -> HuggingFaceAPIDocumentEmbedder`

Deserializes the component from a dictionary.

Parameters:

- data (`dict[str, Any]`) – Dictionary to deserialize from.

Returns:

- `HuggingFaceAPIDocumentEmbedder` – Deserialized component.
`run(documents: list[Document]) -> dict[str, list[Document]]`

Embeds a list of documents.

Parameters:

- documents (`list[Document]`) – Documents to embed.

Returns:

- `dict[str, list[Document]]` – A dictionary with the following keys:
  - `documents`: A list of documents with embeddings.
`run_async(documents: list[Document]) -> dict[str, list[Document]]`

Embeds a list of documents asynchronously.

Parameters:

- documents (`list[Document]`) – Documents to embed.

Returns:

- `dict[str, list[Document]]` – A dictionary with the following keys:
  - `documents`: A list of documents with embeddings.
Embeds strings using Hugging Face APIs.
Use it with the following Hugging Face APIs:
{/* test-ignore */}

```python
from haystack.components.embedders import HuggingFaceAPITextEmbedder
from haystack.utils import Secret

text_embedder = HuggingFaceAPITextEmbedder(api_type="serverless_inference_api",
                                           api_params={"model": "BAAI/bge-small-en-v1.5"},
                                           token=Secret.from_token("<your-api-key>"))

print(text_embedder.run("I love pizza!"))
# {'embedding': [0.017020374536514282, -0.023255806416273117, ...],
```

{/* test-ignore */}

```python
from haystack.components.embedders import HuggingFaceAPITextEmbedder
from haystack.utils import Secret

text_embedder = HuggingFaceAPITextEmbedder(api_type="inference_endpoints",
                                           api_params={"url": "<your-inference-endpoint-url>"},
                                           token=Secret.from_token("<your-api-key>"))

print(text_embedder.run("I love pizza!"))
# {'embedding': [0.017020374536514282, -0.023255806416273117, ...],
```

{/* test-ignore */}

```python
from haystack.components.embedders import HuggingFaceAPITextEmbedder

text_embedder = HuggingFaceAPITextEmbedder(api_type="text_embeddings_inference",
                                           api_params={"url": "http://localhost:8080"})

print(text_embedder.run("I love pizza!"))
# {'embedding': [0.017020374536514282, -0.023255806416273117, ...],
```

```python
__init__(
    api_type: HFEmbeddingAPIType | str,
    api_params: dict[str, str],
    token: Secret | None = Secret.from_env_var(["HF_API_TOKEN", "HF_TOKEN"], strict=False),
    prefix: str = "",
    suffix: str = "",
    truncate: bool | None = True,
    normalize: bool | None = False,
) -> None
```

Creates a HuggingFaceAPITextEmbedder component.
Parameters:

- api_type (`HFEmbeddingAPIType | str`) – The type of Hugging Face API to use.
- api_params (`dict[str, str]`) – A dictionary with the following keys:
  - `model`: Hugging Face model ID. Required when `api_type` is `SERVERLESS_INFERENCE_API`.
  - `url`: URL of the inference endpoint. Required when `api_type` is `INFERENCE_ENDPOINTS` or `TEXT_EMBEDDINGS_INFERENCE`.
- token (`Secret | None`) – The Hugging Face token to use as HTTP bearer authorization. Check your HF token in your account settings.
- prefix (`str`) – A string to add at the beginning of each text.
- suffix (`str`) – A string to add at the end of each text.
- truncate (`bool | None`) – Truncates the input text to the maximum length supported by the model. Applicable when `api_type` is `TEXT_EMBEDDINGS_INFERENCE`, or `INFERENCE_ENDPOINTS` if the backend uses Text Embeddings Inference. If `api_type` is `SERVERLESS_INFERENCE_API`, this parameter is ignored.
- normalize (`bool | None`) – Normalizes the embeddings to unit length. Applicable when `api_type` is `TEXT_EMBEDDINGS_INFERENCE`, or `INFERENCE_ENDPOINTS` if the backend uses Text Embeddings Inference. If `api_type` is `SERVERLESS_INFERENCE_API`, this parameter is ignored.
`to_dict() -> dict[str, Any]`

Serializes the component to a dictionary.

Returns:

- `dict[str, Any]` – Dictionary with serialized data.
`from_dict(data: dict[str, Any]) -> HuggingFaceAPITextEmbedder`

Deserializes the component from a dictionary.

Parameters:

- data (`dict[str, Any]`) – Dictionary to deserialize from.

Returns:

- `HuggingFaceAPITextEmbedder` – Deserialized component.
`run(text: str) -> dict[str, Any]`

Embeds a single string.

Parameters:

- text (`str`) – Text to embed.

Returns:

- `dict[str, Any]` – A dictionary with the following keys:
  - `embedding`: The embedding of the input text.
`run_async(text: str) -> dict[str, Any]`

Embeds a single string asynchronously.

Parameters:

- text (`str`) – Text to embed.

Returns:

- `dict[str, Any]` – A dictionary with the following keys:
  - `embedding`: The embedding of the input text.
A component for computing Document embeddings based on images using Sentence Transformers models.
The embedding of each Document is stored in the `embedding` field of the Document.
{/* test-ignore */}

```python
from haystack import Document
from haystack.components.embedders.image import SentenceTransformersDocumentImageEmbedder

embedder = SentenceTransformersDocumentImageEmbedder(model="sentence-transformers/clip-ViT-B-32")

documents = [
    Document(content="A photo of a cat", meta={"file_path": "cat.jpg"}),
    Document(content="A photo of a dog", meta={"file_path": "dog.jpg"}),
]

result = embedder.run(documents=documents)
documents_with_embeddings = result["documents"]
print(documents_with_embeddings)
# [Document(id=...,
#  content='A photo of a cat',
#  meta={'file_path': 'cat.jpg',
#        'embedding_source': {'type': 'image', 'file_path_meta_field': 'file_path'}},
#  embedding=vector of size 512),
# ...]
```

```python
__init__(
    *,
    file_path_meta_field: str = "file_path",
    root_path: str | None = None,
    model: str = "sentence-transformers/clip-ViT-B-32",
    device: ComponentDevice | None = None,
    token: Secret | None = Secret.from_env_var(["HF_API_TOKEN", "HF_TOKEN"], strict=False),
    batch_size: int = 32,
    progress_bar: bool = True,
    normalize_embeddings: bool = False,
    trust_remote_code: bool = False,
    local_files_only: bool = False,
    model_kwargs: dict[str, Any] | None = None,
    tokenizer_kwargs: dict[str, Any] | None = None,
    config_kwargs: dict[str, Any] | None = None,
    precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = "float32",
    encode_kwargs: dict[str, Any] | None = None,
    backend: Literal["torch", "onnx", "openvino"] = "torch"
) -> None
```

Creates a SentenceTransformersDocumentImageEmbedder component.
Parameters:

- file_path_meta_field (`str`) – The metadata field in the Document that contains the file path to the image or PDF.
- root_path (`str | None`) – The root directory where document files are located. If provided, file paths in document metadata are resolved relative to this path. If `None`, file paths are treated as absolute paths.
- model (`str`) – The Sentence Transformers model to use for calculating embeddings. Pass a local path or the ID of the model on Hugging Face. To be used with this component, the model must be able to embed images and text into the same vector space. Compatible models include:
  - `sentence-transformers/clip-ViT-B-32`
  - `sentence-transformers/clip-ViT-L-14`
  - `sentence-transformers/clip-ViT-B-16`
  - `sentence-transformers/clip-ViT-B-32-multilingual-v1`
  - `jinaai/jina-embeddings-v4`
  - `jinaai/jina-clip-v1`
  - `jinaai/jina-clip-v2`
- device (`ComponentDevice | None`) – The device to use for loading the model. Overrides the default device.
- token (`Secret | None`) – The API token to download private models from Hugging Face.
- batch_size (`int`) – Number of documents to embed at once.
- progress_bar (`bool`) – If `True`, shows a progress bar when embedding documents.
- normalize_embeddings (`bool`) – If `True`, the embeddings are normalized using L2 normalization, so that each embedding has a norm of 1.
- trust_remote_code (`bool`) – If `False`, allows only Hugging Face verified model architectures. If `True`, allows custom models and scripts.
- local_files_only (`bool`) – If `True`, does not attempt to download the model from the Hugging Face Hub and only looks at local files.
- model_kwargs (`dict[str, Any] | None`) – Additional keyword arguments for `AutoModelForSequenceClassification.from_pretrained` when loading the model. Refer to specific model documentation for available kwargs.
- tokenizer_kwargs (`dict[str, Any] | None`) – Additional keyword arguments for `AutoTokenizer.from_pretrained` when loading the tokenizer. Refer to specific model documentation for available kwargs.
- config_kwargs (`dict[str, Any] | None`) – Additional keyword arguments for `AutoConfig.from_pretrained` when loading the model configuration.
- precision (`Literal['float32', 'int8', 'uint8', 'binary', 'ubinary']`) – The precision to use for the embeddings. All non-float32 precisions are quantized embeddings. Quantized embeddings are smaller and faster to compute, but may have lower accuracy. They are useful for reducing the size of the embeddings of a corpus for semantic search, among other tasks.
- encode_kwargs (`dict[str, Any] | None`) – Additional keyword arguments for `SentenceTransformer.encode` when embedding documents. This parameter is provided for fine customization. Be careful not to clash with already set parameters, and avoid passing parameters that change the output type.
- backend (`Literal['torch', 'onnx', 'openvino']`) – The backend to use for the Sentence Transformers model. Choose from `torch`, `onnx`, or `openvino`. Refer to the Sentence Transformers documentation for more information on acceleration and quantization options.
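A small sketch of how `file_path_meta_field` and `root_path` could resolve a document's image path, under the stated assumption that relative paths are joined to `root_path`; the helper is hypothetical and not part of the component's API:

```python
# Hypothetical sketch of path resolution driven by file_path_meta_field
# and root_path; the real component does this internally.
from pathlib import Path

def resolve_image_path(meta, file_path_meta_field="file_path", root_path=None):
    file_path = Path(meta[file_path_meta_field])
    if root_path is not None:
        # With a root_path, stored paths are resolved relative to it.
        return Path(root_path) / file_path
    # Without a root_path, the stored path is treated as absolute.
    return file_path

print(resolve_image_path({"file_path": "cat.jpg"}, root_path="/data/images"))
# /data/images/cat.jpg
```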
`to_dict() -> dict[str, Any]`

Serializes the component to a dictionary.

Returns:

- `dict[str, Any]` – Dictionary with serialized data.
`from_dict(data: dict[str, Any]) -> SentenceTransformersDocumentImageEmbedder`

Deserializes the component from a dictionary.

Parameters:

- data (`dict[str, Any]`) – Dictionary to deserialize from.

Returns:

- `SentenceTransformersDocumentImageEmbedder` – Deserialized component.
`warm_up() -> None`

Initializes the component.
`run(documents: list[Document]) -> dict[str, list[Document]]`

Embeds a list of documents.

Parameters:

- documents (`list[Document]`) – Documents to embed.

Returns:

- `dict[str, list[Document]]` – A dictionary with the following keys:
  - `documents`: Documents with embeddings.
Computes document embeddings using OpenAI models.
{/* test-ignore */}

```python
from haystack import Document
from haystack.components.embedders import OpenAIDocumentEmbedder

doc = Document(content="I love pizza!")

document_embedder = OpenAIDocumentEmbedder()

result = document_embedder.run([doc])
print(result['documents'][0].embedding)
# [0.017020374536514282, -0.023255806416273117, ...]
```

```python
__init__(
    api_key: Secret = Secret.from_env_var("OPENAI_API_KEY"),
    model: str = "text-embedding-ada-002",
    dimensions: int | None = None,
    api_base_url: str | None = None,
    organization: str | None = None,
    prefix: str = "",
    suffix: str = "",
    batch_size: int = 32,
    progress_bar: bool = True,
    meta_fields_to_embed: list[str] | None = None,
    embedding_separator: str = "\n",
    timeout: float | None = None,
    max_retries: int | None = None,
    http_client_kwargs: dict[str, Any] | None = None,
    *,
    raise_on_failure: bool = False
) -> None
```

Creates an OpenAIDocumentEmbedder component.
Before initializing the component, you can set the `OPENAI_TIMEOUT` and `OPENAI_MAX_RETRIES` environment variables to override the `timeout` and `max_retries` parameters in the OpenAI client.
Parameters:

- api_key (`Secret`) – The OpenAI API key. You can set it with the environment variable `OPENAI_API_KEY`, or pass it with this parameter during initialization.
- model (`str`) – The name of the model to use for calculating embeddings. The default model is `text-embedding-ada-002`.
- dimensions (`int | None`) – The number of dimensions of the resulting embeddings. Only `text-embedding-3` and later models support this parameter.
- api_base_url (`str | None`) – Overrides the default base URL for all HTTP requests.
- organization (`str | None`) – Your OpenAI organization ID. See OpenAI's Setting Up Your Organization for more information.
- prefix (`str`) – A string to add at the beginning of each text.
- suffix (`str`) – A string to add at the end of each text.
- batch_size (`int`) – Number of documents to embed at once.
- progress_bar (`bool`) – If `True`, shows a progress bar when running.
- meta_fields_to_embed (`list[str] | None`) – List of metadata fields to embed along with the document text.
- embedding_separator (`str`) – Separator used to concatenate the metadata fields to the document text.
- timeout (`float | None`) – Timeout for OpenAI client calls. If not set, it defaults to either the `OPENAI_TIMEOUT` environment variable or 30 seconds.
- max_retries (`int | None`) – Maximum number of retries to contact OpenAI after an internal error. If not set, it defaults to either the `OPENAI_MAX_RETRIES` environment variable or 5 retries.
- http_client_kwargs (`dict[str, Any] | None`) – A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`. For more information, see the HTTPX documentation.
- raise_on_failure (`bool`) – Whether to raise an exception if the embedding request fails. If `False`, the component logs the error and continues processing the remaining documents. If `True`, it raises an exception on failure.
`to_dict() -> dict[str, Any]`

Serializes the component to a dictionary.

Returns:

- `dict[str, Any]` – Dictionary with serialized data.
`from_dict(data: dict[str, Any]) -> OpenAIDocumentEmbedder`

Deserializes the component from a dictionary.

Parameters:

- data (`dict[str, Any]`) – Dictionary to deserialize from.

Returns:

- `OpenAIDocumentEmbedder` – Deserialized component.
`run(documents: list[Document]) -> dict[str, Any]`

Embeds a list of documents.

Parameters:

- documents (`list[Document]`) – A list of documents to embed.

Returns:

- `dict[str, Any]` – A dictionary with the following keys:
  - `documents`: A list of documents with embeddings.
  - `meta`: Information about the usage of the model.
`run_async(documents: list[Document]) -> dict[str, Any]`

Embeds a list of documents asynchronously.

Parameters:

- documents (`list[Document]`) – A list of documents to embed.

Returns:

- `dict[str, Any]` – A dictionary with the following keys:
  - `documents`: A list of documents with embeddings.
  - `meta`: Information about the usage of the model.
Embeds strings using OpenAI models.
You can use it to embed a user query and send it to an embedding Retriever.
{/* test-ignore */}

```python
from haystack.components.embedders import OpenAITextEmbedder

text_to_embed = "I love pizza!"

text_embedder = OpenAITextEmbedder()

print(text_embedder.run(text_to_embed))
# {'embedding': [0.017020374536514282, -0.023255806416273117, ...],
# 'meta': {'model': 'text-embedding-ada-002-v2',
#          'usage': {'prompt_tokens': 4, 'total_tokens': 4}}}
```

```python
__init__(
    api_key: Secret = Secret.from_env_var("OPENAI_API_KEY"),
    model: str = "text-embedding-ada-002",
    dimensions: int | None = None,
    api_base_url: str | None = None,
    organization: str | None = None,
    prefix: str = "",
    suffix: str = "",
    timeout: float | None = None,
    max_retries: int | None = None,
    http_client_kwargs: dict[str, Any] | None = None,
) -> None
```

Creates an OpenAITextEmbedder component.
Before initializing the component, you can set the `OPENAI_TIMEOUT` and `OPENAI_MAX_RETRIES` environment variables to override the `timeout` and `max_retries` parameters in the OpenAI client.
Parameters:

- api_key (`Secret`) – The OpenAI API key. You can set it with the environment variable `OPENAI_API_KEY`, or pass it with this parameter during initialization.
- model (`str`) – The name of the model to use for calculating embeddings. The default model is `text-embedding-ada-002`.
- dimensions (`int | None`) – The number of dimensions of the resulting embeddings. Only `text-embedding-3` and later models support this parameter.
- api_base_url (`str | None`) – Overrides the default base URL for all HTTP requests.
- organization (`str | None`) – Your organization ID. See OpenAI's production best practices for more information.
- prefix (`str`) – A string to add at the beginning of each text to embed.
- suffix (`str`) – A string to add at the end of each text to embed.
- timeout (`float | None`) – Timeout for OpenAI client calls. If not set, it defaults to either the `OPENAI_TIMEOUT` environment variable or 30 seconds.
- max_retries (`int | None`) – Maximum number of retries to contact OpenAI after an internal error. If not set, it defaults to either the `OPENAI_MAX_RETRIES` environment variable or 5 retries.
- http_client_kwargs (`dict[str, Any] | None`) – A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`. For more information, see the HTTPX documentation.
`to_dict() -> dict[str, Any]`

Serializes the component to a dictionary.

Returns:

- `dict[str, Any]` – Dictionary with serialized data.
`from_dict(data: dict[str, Any]) -> OpenAITextEmbedder`

Deserializes the component from a dictionary.

Parameters:

- data (`dict[str, Any]`) – Dictionary to deserialize from.

Returns:

- `OpenAITextEmbedder` – Deserialized component.
`run(text: str) -> dict[str, Any]`

Embeds a single string.

Parameters:

- text (`str`) – Text to embed.

Returns:

- `dict[str, Any]` – A dictionary with the following keys:
  - `embedding`: The embedding of the input text.
  - `meta`: Information about the usage of the model.
`run_async(text: str) -> dict[str, Any]`

Embeds a single string asynchronously.

This is the asynchronous version of the `run` method. It has the same parameters and return values, but can be used with `await` in async code.

Parameters:

- text (`str`) – Text to embed.

Returns:

- `dict[str, Any]` – A dictionary with the following keys:
  - `embedding`: The embedding of the input text.
  - `meta`: Information about the usage of the model.
Calculates document embeddings using Sentence Transformers models.
It stores the embeddings in the `embedding` field of each document.
You can also embed documents' metadata.
Use this component in indexing pipelines to embed input documents
and send them to DocumentWriter to write into a Document Store.
{/* test-ignore */}

```python
from haystack import Document
from haystack.components.embedders import SentenceTransformersDocumentEmbedder

doc = Document(content="I love pizza!")

doc_embedder = SentenceTransformersDocumentEmbedder()

result = doc_embedder.run([doc])
print(result['documents'][0].embedding)
# [-0.07804739475250244, 0.1498992145061493, ...]
```

```python
__init__(
    model: str = "sentence-transformers/all-mpnet-base-v2",
    device: ComponentDevice | None = None,
    token: Secret | None = Secret.from_env_var(["HF_API_TOKEN", "HF_TOKEN"], strict=False),
    prefix: str = "",
    suffix: str = "",
    batch_size: int = 32,
    progress_bar: bool = True,
    normalize_embeddings: bool = False,
    meta_fields_to_embed: list[str] | None = None,
    embedding_separator: str = "\n",
    trust_remote_code: bool = False,
    local_files_only: bool = False,
    truncate_dim: int | None = None,
    model_kwargs: dict[str, Any] | None = None,
    tokenizer_kwargs: dict[str, Any] | None = None,
    config_kwargs: dict[str, Any] | None = None,
    precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = "float32",
    encode_kwargs: dict[str, Any] | None = None,
    backend: Literal["torch", "onnx", "openvino"] = "torch",
    revision: str | None = None,
) -> None
```

Creates a SentenceTransformersDocumentEmbedder component.
Parameters:
- model (
str) – The model to use for calculating embeddings. Pass a local path or ID of the model on Hugging Face. - device (
ComponentDevice | None) – The device to use for loading the model. Overrides the default device. - token (
Secret | None) – The API token to download private models from Hugging Face. - prefix (
str) – A string to add at the beginning of each document text. Can be used to prepend the text with an instruction, as required by some embedding models, such as E5 and bge. - suffix (
str) – A string to add at the end of each document text. - batch_size (
int) – Number of documents to embed at once. - progress_bar (
bool) – IfTrue, shows a progress bar when embedding documents. - normalize_embeddings (
bool) – IfTrue, the embeddings are normalized using L2 normalization, so that each embedding has a norm of 1. - meta_fields_to_embed (
list[str] | None) – List of metadata fields to embed along with the document text. - embedding_separator (
str) – Separator used to concatenate the metadata fields to the document text. - trust_remote_code (
bool) – IfFalse, allows only Hugging Face verified model architectures. IfTrue, allows custom models and scripts. - local_files_only (
bool) – IfTrue, does not attempt to download the model from Hugging Face Hub and only looks at local files. - truncate_dim (
int | None) – The dimension to truncate sentence embeddings to.Nonedoes no truncation. If the model wasn't trained with Matryoshka Representation Learning, truncating embeddings can significantly affect performance. - model_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoModelForSequenceClassification.from_pretrainedwhen loading the model. Refer to specific model documentation for available kwargs. - tokenizer_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoTokenizer.from_pretrainedwhen loading the tokenizer. Refer to specific model documentation for available kwargs. - config_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoConfig.from_pretrainedwhen loading the model configuration. - precision (
Literal['float32', 'int8', 'uint8', 'binary', 'ubinary']) – The precision to use for the embeddings. All non-float32 precisions are quantized embeddings. Quantized embeddings are smaller and faster to compute, but may have a lower accuracy. They are useful for reducing the size of the embeddings of a corpus for semantic search, among other tasks. - encode_kwargs (
dict[str, Any] | None) – Additional keyword arguments forSentenceTransformer.encodewhen embedding documents. This parameter is provided for fine customization. Be careful not to clash with already set parameters and avoid passing parameters that change the output type. - backend (
Literal['torch', 'onnx', 'openvino']) – The backend to use for the Sentence Transformers model. Choose from "torch", "onnx", or "openvino". Refer to the Sentence Transformers documentation for more information on acceleration and quantization options. - revision (
str | None) – The specific model version to use. It can be a branch name, a tag name, or a commit id of a model stored on Hugging Face.
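The `prefix`, `suffix`, `meta_fields_to_embed`, and `embedding_separator` parameters jointly determine the string that is actually embedded for each document. A minimal sketch of that composition, assuming the documented behavior; `prepare_text_to_embed` is an illustrative helper, not part of the Haystack API:

```python
# Illustrative sketch: how the documented parameters combine into the text
# that gets embedded. The real embedder's internals may differ in detail.

def prepare_text_to_embed(
    content: str,
    meta: dict[str, str],
    meta_fields_to_embed: list[str],
    embedding_separator: str = "\n",
    prefix: str = "",
    suffix: str = "",
) -> str:
    # Collect the requested metadata values that are present on the document.
    meta_values = [str(meta[k]) for k in meta_fields_to_embed if meta.get(k) is not None]
    # Metadata values and the document content are joined with the separator.
    body = embedding_separator.join(meta_values + [content])
    return prefix + body + suffix

print(prepare_text_to_embed(
    content="I love pizza!",
    meta={"title": "Food"},
    meta_fields_to_embed=["title"],
    prefix="passage: ",  # e.g. an E5-style instruction prefix
))
# Output:
# passage: Food
# I love pizza!
```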
to_dict() -> dict[str, Any]Serializes the component to a dictionary.
Returns:
dict[str, Any]– Dictionary with serialized data.
from_dict(data: dict[str, Any]) -> SentenceTransformersDocumentEmbedderDeserializes the component from a dictionary.
Parameters:
- data (
dict[str, Any]) – Dictionary to deserialize from.
Returns:
SentenceTransformersDocumentEmbedder– Deserialized component.
warm_up() -> NoneInitializes the component.
run(documents: list[Document]) -> dict[str, list[Document]]Embed a list of documents.
Parameters:
- documents (
list[Document]) – Documents to embed.
Returns:
dict[str, list[Document]]– A dictionary with the following keys:documents: Documents with embeddings.
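The `batch_size` parameter bounds how many documents are embedded per model call. A hedged sketch of the chunking this implies; `batched` is a hypothetical helper, not a Haystack function:

```python
# Illustrative sketch of batch_size: documents are processed in chunks of
# at most batch_size, with a final smaller remainder batch.
from typing import Iterator

def batched(items: list[str], batch_size: int = 32) -> Iterator[list[str]]:
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]

texts = [f"doc {i}" for i in range(70)]
print([len(b) for b in batched(texts, batch_size=32)])
# [32, 32, 6]
```

Larger batches generally improve GPU throughput at the cost of memory; the default of 32 is a middle ground.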
Calculates document sparse embeddings using sparse embedding models from Sentence Transformers.
It stores the sparse embeddings in the sparse_embedding metadata field of each document.
You can also embed documents' metadata.
Use this component in indexing pipelines to embed input documents
and send them to DocumentWriter to write them into a Document Store.
{/* test-ignore */}
from haystack import Document
from haystack.components.embedders import SentenceTransformersSparseDocumentEmbedder
doc = Document(content="I love pizza!")
doc_embedder = SentenceTransformersSparseDocumentEmbedder()
result = doc_embedder.run([doc])
print(result['documents'][0].sparse_embedding)
# SparseEmbedding(indices=[999, 1045, ...], values=[0.918, 0.867, ...])__init__(
*,
model: str = "prithivida/Splade_PP_en_v2",
device: ComponentDevice | None = None,
token: Secret | None = Secret.from_env_var(
["HF_API_TOKEN", "HF_TOKEN"], strict=False
),
prefix: str = "",
suffix: str = "",
batch_size: int = 32,
progress_bar: bool = True,
meta_fields_to_embed: list[str] | None = None,
embedding_separator: str = "\n",
trust_remote_code: bool = False,
local_files_only: bool = False,
model_kwargs: dict[str, Any] | None = None,
tokenizer_kwargs: dict[str, Any] | None = None,
config_kwargs: dict[str, Any] | None = None,
backend: Literal["torch", "onnx", "openvino"] = "torch",
revision: str | None = None
) -> NoneCreates a SentenceTransformersSparseDocumentEmbedder component.
Parameters:
- model (
str) – The model to use for calculating sparse embeddings. Pass a local path or ID of the model on Hugging Face. - device (
ComponentDevice | None) – The device to use for loading the model. Overrides the default device. - token (
Secret | None) – The API token to download private models from Hugging Face. - prefix (
str) – A string to add at the beginning of each document text. - suffix (
str) – A string to add at the end of each document text. - batch_size (
int) – Number of documents to embed at once. - progress_bar (
bool) – IfTrue, shows a progress bar when embedding documents. - meta_fields_to_embed (
list[str] | None) – List of metadata fields to embed along with the document text. - embedding_separator (
str) – Separator used to concatenate the metadata fields to the document text. - trust_remote_code (
bool) – IfFalse, allows only Hugging Face verified model architectures. IfTrue, allows custom models and scripts. - local_files_only (
bool) – IfTrue, does not attempt to download the model from Hugging Face Hub and only looks at local files. - model_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoModelForSequenceClassification.from_pretrainedwhen loading the model. Refer to specific model documentation for available kwargs. - tokenizer_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoTokenizer.from_pretrainedwhen loading the tokenizer. Refer to specific model documentation for available kwargs. - config_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoConfig.from_pretrainedwhen loading the model configuration. - backend (
Literal['torch', 'onnx', 'openvino']) – The backend to use for the Sentence Transformers model. Choose from "torch", "onnx", or "openvino". Refer to the Sentence Transformers documentation for more information on acceleration and quantization options. - revision (
str | None) – The specific model version to use. It can be a branch name, a tag name, or a commit id of a model stored on Hugging Face.
to_dict() -> dict[str, Any]Serializes the component to a dictionary.
Returns:
dict[str, Any]– Dictionary with serialized data.
from_dict(data: dict[str, Any]) -> SentenceTransformersSparseDocumentEmbedderDeserializes the component from a dictionary.
Parameters:
- data (
dict[str, Any]) – Dictionary to deserialize from.
Returns:
SentenceTransformersSparseDocumentEmbedder– Deserialized component.
warm_up() -> NoneInitializes the component.
run(documents: list[Document]) -> dict[str, list[Document]]Embed a list of documents.
Parameters:
- documents (
list[Document]) – Documents to embed.
Returns:
dict[str, list[Document]]– A dictionary with the following keys:documents: Documents with sparse embeddings under thesparse_embeddingfield.
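The `sparse_embedding` field stores parallel `indices` and `values` lists, and a sparse retriever typically scores a document against a query by a dot product over the shared indices. A minimal sketch of that format and scoring; `SparseVec` is a stand-in for illustration, not Haystack's `SparseEmbedding` class:

```python
# Sketch of the sparse embedding format (parallel indices/values lists) and
# dot-product scoring over the indices shared between document and query.
from dataclasses import dataclass

@dataclass
class SparseVec:
    indices: list[int]   # vocabulary positions with non-zero weight
    values: list[float]  # the corresponding weights

def sparse_dot(a: SparseVec, b: SparseVec) -> float:
    b_map = dict(zip(b.indices, b.values))
    # Only indices present in both vectors contribute to the score.
    return sum(v * b_map.get(i, 0.0) for i, v in zip(a.indices, a.values))

doc = SparseVec(indices=[999, 1045], values=[0.918, 0.867])
query = SparseVec(indices=[1045, 2001], values=[0.5, 0.3])
print(sparse_dot(doc, query))  # only index 1045 overlaps: 0.867 * 0.5 = 0.4335
```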
Embeds strings using sparse embedding models from Sentence Transformers.
You can use it to embed a user query and send it to a sparse embedding retriever.
Usage example: {/* test-ignore */}
from haystack.components.embedders import SentenceTransformersSparseTextEmbedder
text_to_embed = "I love pizza!"
text_embedder = SentenceTransformersSparseTextEmbedder()
print(text_embedder.run(text_to_embed))
# {'sparse_embedding': SparseEmbedding(indices=[999, 1045, ...], values=[0.918, 0.867, ...])}__init__(
*,
model: str = "prithivida/Splade_PP_en_v2",
device: ComponentDevice | None = None,
token: Secret | None = Secret.from_env_var(
["HF_API_TOKEN", "HF_TOKEN"], strict=False
),
prefix: str = "",
suffix: str = "",
trust_remote_code: bool = False,
local_files_only: bool = False,
model_kwargs: dict[str, Any] | None = None,
tokenizer_kwargs: dict[str, Any] | None = None,
config_kwargs: dict[str, Any] | None = None,
backend: Literal["torch", "onnx", "openvino"] = "torch",
revision: str | None = None
) -> NoneCreate a SentenceTransformersSparseTextEmbedder component.
Parameters:
- model (
str) – The model to use for calculating sparse embeddings. Specify the path to a local model or the ID of the model on Hugging Face. - device (
ComponentDevice | None) – Overrides the default device used to load the model. - token (
Secret | None) – An API token to use private models from Hugging Face. - prefix (
str) – A string to add at the beginning of each text to be embedded. - suffix (
str) – A string to add at the end of each text to embed. - trust_remote_code (
bool) – IfFalse, permits only Hugging Face verified model architectures. IfTrue, permits custom models and scripts. - local_files_only (
bool) – IfTrue, does not attempt to download the model from Hugging Face Hub and only looks at local files. - model_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoModelForSequenceClassification.from_pretrainedwhen loading the model. Refer to specific model documentation for available kwargs. - tokenizer_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoTokenizer.from_pretrainedwhen loading the tokenizer. Refer to specific model documentation for available kwargs. - config_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoConfig.from_pretrainedwhen loading the model configuration. - backend (
Literal['torch', 'onnx', 'openvino']) – The backend to use for the Sentence Transformers model. Choose from "torch", "onnx", or "openvino". Refer to the Sentence Transformers documentation for more information on acceleration and quantization options. - revision (
str | None) – The specific model version to use. It can be a branch name, a tag name, or a commit id of a model stored on Hugging Face.
to_dict() -> dict[str, Any]Serializes the component to a dictionary.
Returns:
dict[str, Any]– Dictionary with serialized data.
from_dict(data: dict[str, Any]) -> SentenceTransformersSparseTextEmbedderDeserializes the component from a dictionary.
Parameters:
- data (
dict[str, Any]) – Dictionary to deserialize from.
Returns:
SentenceTransformersSparseTextEmbedder– Deserialized component.
warm_up() -> NoneInitializes the component.
run(text: str) -> dict[str, Any]Embed a single string.
Parameters:
- text (
str) – Text to embed.
Returns:
dict[str, Any]– A dictionary with the following keys:sparse_embedding: The sparse embedding of the input text.
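One practical property of sparse embeddings from models like SPLADE is interpretability: each index corresponds to a vocabulary token, so sorting by value reveals which terms dominate the query representation. A small sketch under that assumption; the token ids and vocabulary mapping below are invented for illustration:

```python
# Sketch: inspect the dominant terms of a sparse query embedding by sorting
# the (index, value) pairs by weight. The vocab mapping here is hypothetical.
indices = [999, 1045, 2310]
values = [0.918, 0.867, 0.112]
vocab = {999: "pizza", 1045: "love", 2310: "food"}  # invented token ids

top_terms = sorted(zip(indices, values), key=lambda iv: iv[1], reverse=True)
print([(vocab[i], v) for i, v in top_terms[:2]])
# [('pizza', 0.918), ('love', 0.867)]
```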
Embeds strings using Sentence Transformers models.
You can use it to embed a user query and send it to an embedding retriever.
Usage example: {/* test-ignore */}
from haystack.components.embedders import SentenceTransformersTextEmbedder
text_to_embed = "I love pizza!"
text_embedder = SentenceTransformersTextEmbedder()
print(text_embedder.run(text_to_embed))
# {'embedding': [-0.07804739475250244, 0.1498992145061493, ...]}__init__(
model: str = "sentence-transformers/all-mpnet-base-v2",
device: ComponentDevice | None = None,
token: Secret | None = Secret.from_env_var(
["HF_API_TOKEN", "HF_TOKEN"], strict=False
),
prefix: str = "",
suffix: str = "",
batch_size: int = 32,
progress_bar: bool = True,
normalize_embeddings: bool = False,
trust_remote_code: bool = False,
local_files_only: bool = False,
truncate_dim: int | None = None,
model_kwargs: dict[str, Any] | None = None,
tokenizer_kwargs: dict[str, Any] | None = None,
config_kwargs: dict[str, Any] | None = None,
precision: Literal[
"float32", "int8", "uint8", "binary", "ubinary"
] = "float32",
encode_kwargs: dict[str, Any] | None = None,
backend: Literal["torch", "onnx", "openvino"] = "torch",
revision: str | None = None,
) -> NoneCreate a SentenceTransformersTextEmbedder component.
Parameters:
- model (
str) – The model to use for calculating embeddings. Specify the path to a local model or the ID of the model on Hugging Face. - device (
ComponentDevice | None) – Overrides the default device used to load the model. - token (
Secret | None) – An API token to use private models from Hugging Face. - prefix (
str) – A string to add at the beginning of each text to be embedded. You can use it to prepend the text with an instruction, as required by some embedding models, such as E5 and bge. - suffix (
str) – A string to add at the end of each text to embed. - batch_size (
int) – Number of texts to embed at once. - progress_bar (
bool) – IfTrue, shows a progress bar for calculating embeddings. IfFalse, disables the progress bar. - normalize_embeddings (
bool) – IfTrue, the embeddings are normalized using L2 normalization, so that the embeddings have a norm of 1. - trust_remote_code (
bool) – IfFalse, permits only Hugging Face verified model architectures. IfTrue, permits custom models and scripts. - local_files_only (
bool) – IfTrue, does not attempt to download the model from Hugging Face Hub and only looks at local files. - truncate_dim (
int | None) – The dimension to truncate sentence embeddings to.Nonedoes no truncation. If the model has not been trained with Matryoshka Representation Learning, truncation of embeddings can significantly affect performance. - model_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoModelForSequenceClassification.from_pretrainedwhen loading the model. Refer to specific model documentation for available kwargs. - tokenizer_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoTokenizer.from_pretrainedwhen loading the tokenizer. Refer to specific model documentation for available kwargs. - config_kwargs (
dict[str, Any] | None) – Additional keyword arguments forAutoConfig.from_pretrainedwhen loading the model configuration. - precision (
Literal['float32', 'int8', 'uint8', 'binary', 'ubinary']) – The precision to use for the embeddings. All non-float32 precisions are quantized embeddings. Quantized embeddings are smaller in size and faster to compute, but may have a lower accuracy. They are useful for reducing the size of the embeddings of a corpus for semantic search, among other tasks. - encode_kwargs (
dict[str, Any] | None) – Additional keyword arguments forSentenceTransformer.encodewhen embedding texts. This parameter is provided for fine customization. Be careful not to clash with already set parameters and avoid passing parameters that change the output type. - backend (
Literal['torch', 'onnx', 'openvino']) – The backend to use for the Sentence Transformers model. Choose from "torch", "onnx", or "openvino". Refer to the Sentence Transformers documentation for more information on acceleration and quantization options. - revision (
str | None) – The specific model version to use. It can be a branch name, a tag name, or a commit id of a model stored on Hugging Face.
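To make the `precision` options concrete: non-float32 precisions store each embedding value in a smaller integer type, trading some accuracy for a 4x (int8/uint8) or larger reduction in size. The simplified symmetric int8 scheme below is for illustration only; Sentence Transformers' actual quantization may calibrate value ranges differently:

```python
# Hedged sketch of int8 quantization: map each float value from
# [-scale, scale] onto the signed byte range [-127, 127].

def quantize_int8(embedding: list[float]) -> list[int]:
    # Use the largest absolute value as the symmetric scale.
    scale = max(abs(v) for v in embedding) or 1.0
    return [round(v / scale * 127) for v in embedding]

emb = [-0.078, 0.1499, 0.02]
print(quantize_int8(emb))  # each value now fits in one byte instead of four
# [-66, 127, 17]
```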
to_dict() -> dict[str, Any]Serializes the component to a dictionary.
Returns:
dict[str, Any]– Dictionary with serialized data.
from_dict(data: dict[str, Any]) -> SentenceTransformersTextEmbedderDeserializes the component from a dictionary.
Parameters:
- data (
dict[str, Any]) – Dictionary to deserialize from.
Returns:
SentenceTransformersTextEmbedder– Deserialized component.
warm_up() -> NoneInitializes the component.
run(text: str) -> dict[str, Any]Embed a single string.
Parameters:
- text (
str) – Text to embed.
Returns:
dict[str, Any]– A dictionary with the following keys:embedding: The embedding of the input text.
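A note on why `normalize_embeddings` is often enabled: once vectors are L2-normalized, cosine similarity reduces to a plain dot product, which many vector stores use as their default score. A pure-Python sketch of that equivalence:

```python
# Sketch: with L2-normalized vectors, dot product == cosine similarity
# of the original vectors.
import math

def l2_normalize(v: list[float]) -> list[float]:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

a = l2_normalize([1.0, 2.0, 2.0])
b = l2_normalize([2.0, 1.0, 2.0])
dot = sum(x * y for x, y in zip(a, b))
print(round(dot, 4))  # cosine similarity of the original vectors: 0.8889
```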