Add dimensions parameter to OllamaDocumentEmbedder and OllamaTextEmbedder #3295

@kevinsrq

Is your feature request related to a problem? Please describe.

The Ollama SDK (ollama-python >= 0.6.2) exposes a dimensions parameter on client.embed(...) that truncates embeddings server-side for models trained with Matryoshka Representation Learning (MRL). Models such as qwen3-embedding, nomic-embed-text-v1.5, and mxbai-embed-large natively support reduced dimensions, which lowers vector store footprint, similarity search latency, and HNSW index memory usage.

Currently, OllamaDocumentEmbedder and OllamaTextEmbedder in Haystack do not expose this parameter. Passing it through generation_kwargs (options=...) does not work, because the Ollama SDK treats dimensions as a top-level argument of the request payload, not as part of options. As a result, users always receive the full-dimension vector and must truncate + re-normalize client-side, which wastes bandwidth and adds redundant boilerplate everywhere the embedder is used.
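For reference, the client-side workaround mentioned above looks roughly like the following. This is a minimal sketch in plain Python; truncate_and_normalize is a hypothetical helper name, not part of Haystack or the Ollama SDK, and the input vector is made up for illustration.

```python
import math


def truncate_and_normalize(vector: list[float], dimensions: int) -> list[float]:
    """Keep the first `dimensions` components and rescale to unit L2 norm."""
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized; return it unchanged
    return [x / norm for x in head]


# Illustrative 4-dim vector; a real embedding would come from the embedder.
full_vector = [3.0, 4.0, 12.0, 84.0]
short_vector = truncate_and_normalize(full_vector, 2)
print(short_vector)
```

Every consumer of the embedder has to repeat this step, and the full-dimension vector is still transferred from the server before it is cut down.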

Describe the solution you'd like

Add an optional dimensions: int | None = None parameter to OllamaDocumentEmbedder.__init__ (and OllamaTextEmbedder for API symmetry), forwarded to self._client.embed(...):

from typing import Any

from tqdm import tqdm


class OllamaDocumentEmbedder:
    def __init__(
        self,
        model: str = "nomic-embed-text",
        url: str = "http://localhost:11434",
        generation_kwargs: dict[str, Any] | None = None,
        timeout: int = 120,
        keep_alive: str | int | None = None,
        dimensions: int | None = None,  # NEW
        # ... other args
    ):
        ...
        self.dimensions = dimensions
 
    def _embed_batch(
        self,
        texts_to_embed: list[str],
        batch_size: int,
        generation_kwargs: dict[str, Any] | None = None,
    ) -> list[list[float]]:
        all_embeddings = []
        for i in tqdm(
            range(0, len(texts_to_embed), batch_size),
            disable=not self.progress_bar,
            desc="Calculating embeddings",
        ):
            batch = texts_to_embed[i : i + batch_size]
            result = self._client.embed(
                model=self.model,
                input=batch,
                options=generation_kwargs,
                dimensions=self.dimensions,  # HERE
                keep_alive=self.keep_alive,
            )
            all_embeddings.extend(result["embeddings"])
        return all_embeddings

Implementation notes

  • dimensions=None preserves current behavior (full-dim) → fully backward compatible.
  • Include dimensions in to_dict() / from_dict() for pipeline serialization.
  • Apply the same change to OllamaTextEmbedder for API parity.
  • Optional validation: warn when dimensions is set but the model does not support MRL (the Ollama server already returns an error in that case, so this may be unnecessary).

Describe alternatives you've considered

  1. Truncate + re-normalize client-side: works, but wastes bandwidth (full vector is still transferred) and adds boilerplate to every consumer of the embedder.
  2. Pass via generation_kwargs / options: does not work — dimensions is not a field of Ollama's options; it is a top-level field of the /api/embed request payload.
  3. Subclass the embedder: works as a local workaround but becomes tech debt — this is a first-class parameter of the upstream SDK and belongs on the Haystack component.

Additional context

  • Ollama SDK reference: ollama-python >= 0.6.2, embed() accepts dimensions: Optional[int].
  • Ollama server: supports dimensions on the /api/embed endpoint (MRL truncation).
  • Real-world use case: running qwen3-embedding-0.6b locally via Ollama in a hybrid retrieval pipeline (dense + SPLADE). Truncating from 1024 → 512 dims roughly halves HNSW index size with minimal recall loss when a cross-encoder re-ranker is in place.
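A back-of-the-envelope check of the "roughly halves" figure, as a sketch under stated assumptions: float32 vectors, an HNSW M of 16, 32-bit neighbor ids, and up to 2 * M links per node on the base layer (upper layers ignored). These constants are illustrative assumptions, not measurements from the pipeline described above.

```python
BYTES_PER_FLOAT32 = 4
HNSW_M = 16            # assumed graph connectivity parameter
BYTES_PER_LINK = 4     # assumed 32-bit neighbor ids


def bytes_per_vector(dims: int) -> int:
    """Approximate per-vector index cost: raw floats plus base-layer links."""
    vector_bytes = dims * BYTES_PER_FLOAT32
    link_bytes = 2 * HNSW_M * BYTES_PER_LINK  # independent of dims
    return vector_bytes + link_bytes


full = bytes_per_vector(1024)  # 4096 + 128 = 4224 bytes
half = bytes_per_vector(512)   # 2048 + 128 = 2176 bytes
ratio = half / full            # about 0.515
print(full, half, round(ratio, 3))
```

The graph links do not shrink with the vector, which is why the saving is slightly less than an exact 50%.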
