
Releases: deepset-ai/haystack

v2.29.0

12 May 14:25

⭐️ Highlights

🔍 Combine Retrievers with MultiRetriever and TextEmbeddingRetriever

Two new retriever components make it easier to build hybrid search pipelines. MultiRetriever runs multiple text retrievers in parallel and merges their results into a single deduplicated list, ranked by reciprocal rank fusion by default. You can selectively enable or disable individual retrievers at runtime using the active_retrievers parameter. This is useful, for example, when you want to skip the embedding retriever for short or keyword-only queries.

TextEmbeddingRetriever wraps an embedding-based retriever together with a text embedder into a single component, making it compatible with MultiRetriever by implementing the TextRetriever protocol. Here's how to combine BM25 and embedding retrieval in a single component:

from haystack.components.retrievers import MultiRetriever, TextEmbeddingRetriever
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever, InMemoryEmbeddingRetriever
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.document_stores.in_memory import InMemoryDocumentStore

doc_store = InMemoryDocumentStore()

retriever = MultiRetriever(
    retrievers={
        "bm25": InMemoryBM25Retriever(document_store=doc_store),
        "embedding": TextEmbeddingRetriever(
            retriever=InMemoryEmbeddingRetriever(document_store=doc_store),
            text_embedder=SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"),
        ),
    },
    top_k=3,
)

# Run all retrievers
result = retriever.run(query="green energy sources")

# Run only the BM25 retriever
result = retriever.run(query="green energy sources", active_retrievers=["bm25"])

⬆️ Upgrade Notes

  • LLM.run and LLM.run_async no longer accept messages and streaming_callback as positional arguments — they must now be passed as keyword arguments. Update any direct calls accordingly:

    # Before
    llm.run([message], my_callback)
    
    # After
    llm.run(messages=[message], streaming_callback=my_callback)
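For reference, the keyword-only contract can be expressed with Python's bare `*` marker. A minimal sketch (illustrative only, not Haystack's actual signature):

```python
# Parameters after the bare * must be passed by keyword, so positional
# calls now fail fast with a TypeError instead of silently binding
# arguments to the wrong parameter.
def run(*, messages, streaming_callback=None):
    return {"messages": messages, "streaming_callback": streaming_callback}

result = run(messages=["hello"], streaming_callback=None)
```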

🚀 New Features

  • Add run_async to CacheChecker, enabling it to be used in AsyncPipeline without blocking the event loop.

⚡️ Enhancement Notes

  • Document the input ordering behavior of auto-promoted lazy variadic sockets in Pipeline.connect(). When multiple senders are connected to the same list-typed receiver socket, ordering depends on the pipeline class. With Pipeline, items are ordered alphabetically by sender component name (because Pipeline.run() schedules components in alphabetical order for deterministic execution), not by the order of connect() calls. With AsyncPipeline, no ordering is guaranteed, since components in different branches may run in parallel. The docstrings now point users to a dedicated joiner component when they need explicit ordering.
  • Add join_mode parameter to the experimental MultiRetriever component, supporting "reciprocal_rank_fusion" (default) and "concatenate". Reciprocal Rank Fusion merges the ranked result lists from all retrievers into a single deduplicated list ordered by RRF score. The underlying RRF logic is extracted into a shared utility _reciprocal_rank_fusion in haystack.utils.misc, which is now also used by DocumentJoiner.
  • LLM now supports two usage modes:
    1. Template-variable mode: provide a user_prompt with Jinja2 variables (e.g. {{ query }}).
      Those variables become pipeline inputs and messages is optional. The rendered user_prompt
      is always appended after any messages provided at runtime.
    2. Pass-through mode: omit user_prompt or provide one with no template variables. messages
      becomes a required input, allowing a fully-constructed list of ChatMessages to be passed from upstream.
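The reciprocal rank fusion merge described above can be sketched in plain Python. This is a simplified stand-in for the internal `_reciprocal_rank_fusion` utility, which operates on Document objects and scores rather than plain IDs:

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    # Each document earns 1 / (k + rank) per list it appears in; summing
    # across lists rewards documents that several retrievers rank highly.
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Deduplicated result, ordered by descending RRF score.
    return sorted(scores, key=scores.get, reverse=True)

# "b" appears in both lists, so it outranks every single-list document.
merged = reciprocal_rank_fusion([["a", "b", "c"], ["b", "d"]])
```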

🐛 Bug Fixes

  • Fixed a bug in NamedEntityExtractor where the spaCy/Thinc device state was not correctly restored after execution, potentially affecting the device configuration of other spaCy components in the same process.
  • Preserve resumable snapshots when some inputs or outputs are non-serializable. Haystack now omits only the failing top-level fields (for example non-serializable callbacks or runtime objects) instead of replacing the whole payload with an empty dictionary. This applies both to agent sub-component inputs (chat_generator and tool_invoker) and to pipeline-level inputs, original_input_data, and pipeline_outputs captured by _create_pipeline_snapshot. When every field fails to serialize, the snapshot still stores a structurally valid empty payload ({"serialization_schema": {"type": "object", "properties": {}}, "serialized_data": {}}) so that resuming the snapshot does not raise DeserializationError — for example when resuming from a ToolBreakpoint where the sub-component's inputs are not strictly required.
  • Fixed tools_strict=True in OpenAIChatGenerator to recursively apply additionalProperties: false and required to all nested objects in tool parameter schemas. Previously only the top-level object was transformed, causing OpenAI's strict mode to reject tools with nested parameters.
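The field-by-field fallback in the snapshot fix above can be sketched as follows, using plain `json` in place of Haystack's serialization-schema machinery (illustrative only):

```python
import json

def serialize_top_level(payload):
    # Try each top-level field independently; omit only the ones that fail
    # instead of discarding the whole payload.
    serialized = {}
    for key, value in payload.items():
        try:
            json.dumps(value)
        except TypeError:
            continue  # e.g. a non-serializable callback or runtime object
        serialized[key] = value
    return serialized

snapshot = serialize_top_level({"query": "hi", "callback": print})
```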
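The recursive transformation in the tools_strict fix above can be sketched like this (an illustrative reimplementation, not the actual OpenAIChatGenerator code):

```python
def enforce_strict(schema):
    # Recursively mark every object schema for OpenAI strict mode:
    # forbid extra properties and require every declared property.
    if isinstance(schema, dict):
        if schema.get("type") == "object" and "properties" in schema:
            schema["additionalProperties"] = False
            schema["required"] = sorted(schema["properties"])
        for value in schema.values():
            enforce_strict(value)
    elif isinstance(schema, list):
        for item in schema:
            enforce_strict(item)
    return schema
```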

💙 Big thank you to everyone who contributed to this release!

@Aftabbs, @albertodiazdurana, @anakin87, @ArkaD171717, @bilgeyucel, @bogdankostic, @davidsbatista, @FuturMix, @julian-risch, @kacperlukawski, @ritikraj2425, @saivedant169, @shaun0927, @sjrl, @SyedShahmeerAli12

v2.29.0-rc2

12 May 13:08

Pre-release

v2.29.0-rc1

11 May 14:37

Pre-release

v2.28.0

20 Apr 15:02

Upgrade Notes

  • As part of the migration from requests to httpx, request_with_retry and async_request_with_retry (in haystack.utils.requests_utils) no longer raise requests.exceptions.RequestException on failure; they now raise httpx.HTTPError instead. This also affects HuggingFaceTEIRanker, which relies on these utilities. Users catching requests.exceptions.RequestException should update their code to catch httpx.HTTPError.

  • The LLM component now requires user_prompt to be provided at initialization and it must contain at least one Jinja2 template variable (e.g. {{ variable_name }}). This ensures the component always exposes at least one required input socket, which is necessary for correct pipeline scheduling.

    required_variables now defaults to "*" (all variables in user_prompt are required), and passing an empty list raises a ValueError.

    If you are affected: update any code that instantiates LLM without a user_prompt, or with a user_prompt that has no template variables, to include at least one variable.

    Before:

    llm = LLM(chat_generator=OpenAIChatGenerator(), system_prompt="You are helpful.")

    After:

    llm = LLM(
        chat_generator=OpenAIChatGenerator(),
        system_prompt="You are helpful.",
        user_prompt='{% message role="user" %}{{ query }}{% endmessage %}',
    )
  • Agent.run() and Agent.run_async() now require messages as an explicit argument (no longer optional). If you were relying on the default None value in Haystack version 2.26 or 2.27, pass an empty list instead:

    agent.run(messages=[], ...)

    LLM.run() and LLM.run_async() are unaffected — they still accept None and default to an empty list internally.

New Features

  • Tools and components can now declare a State (or State | None) parameter in their signature to receive the live agent State object at invocation time — no extra wiring needed.

    For function-based tools created with @tool or create_tool_from_function, add a state parameter annotated as State:

    from haystack.components.agents import State
    from haystack.tools import tool
    
    @tool
    def my_tool(query: str, state: State) -> str:
        """Search using context from agent state."""
        history = state.get("history")
        ...

    For component-based tools created with ComponentTool, declare a State input socket on the component's run method:

    from haystack import component
    from haystack.components.agents import State
    from haystack.tools import ComponentTool
    
    @component
    class MyComponent:
        @component.output_types(result=str)
        def run(self, query: str, state: State) -> dict:
            history = state.get("history")
            ...
    
    tool = ComponentTool(component=MyComponent())

    In both cases ToolInvoker automatically injects the runtime State object before calling the tool, and State/Optional[State] parameters are excluded from the LLM-facing schema so the model is not asked to supply them.

    This is an alternative to the existing inputs_from_state and outputs_to_state options on Tool and ComponentTool, which map individual state keys to specific tool parameters and outputs declaratively. Injecting the full State object is more flexible and useful when a tool needs to read from or write to multiple keys, but it couples the tool implementation directly to State.
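The injection mechanism can be sketched with `inspect`, using a minimal `State` stand-in (the real haystack.components.agents.State is a richer object, and ToolInvoker's actual logic handles Optional[State] and schema exclusion too):

```python
import inspect

class State(dict):
    """Minimal stand-in for the agent State object, for illustration only."""

def call_with_state(func, llm_args, state):
    # Inject the live State into any parameter annotated as State. The LLM
    # never sees these parameters, so llm_args holds only model-supplied ones.
    kwargs = dict(llm_args)
    for name, param in inspect.signature(func).parameters.items():
        if param.annotation is State:
            kwargs[name] = state
    return func(**kwargs)

def my_tool(query: str, state: State) -> str:
    return f"{query}: {state.get('history', 'none')}"

result = call_with_state(my_tool, {"query": "search"}, State(history="h1"))
```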

Enhancement Notes

  • Clarify in the Markdown-producing converter documentation that DocumentCleaner with its default settings can flatten Markdown output, and update the example pipelines for PaddleOCRVLDocumentConverter, MistralOCRDocumentConverter, AzureDocumentIntelligenceConverter, and MarkItDownConverter to avoid routing Markdown content through the default cleaner configuration.
  • Made _create_agent_snapshot robust to serialization errors. If serializing agent component inputs fails, a warning is logged and an empty dictionary is used as a fallback, preventing the serialization error from masking the real pipeline runtime error.
  • Standardize HTTP request handling in Haystack by adopting httpx for both synchronous and asynchronous requests, replacing requests. Error reporting for failed requests has also been improved: exceptions now include additional details alongside the reason field.
  • Add run_async method to LLMMetadataExtractor. ChatGenerator requests now run concurrently using the existing max_workers init parameter.
  • MarkdownHeaderSplitter now accepts a header_split_levels parameter (list of integers 1–6, default all levels) to control which header depths create split boundaries. For example, header_split_levels=[1, 2] splits only on # and ## headers, merging content under deeper headers into the preceding chunk.
  • MarkdownHeaderSplitter now ignores # lines that appear inside fenced code blocks (triple-backtick or triple-tilde), preventing Python comments and other hash-prefixed lines in code from being misidentified as Markdown headers.
  • Expand the PaddleOCRVLDocumentConverter documentation with more detailed guidance on advanced parameters, common usage scenarios, and a more realistic configuration example for layout-heavy documents.
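The two MarkdownHeaderSplitter enhancements above (selective header depths and fence awareness) can be sketched together in a few lines. This is an illustrative simplification, not the component's actual implementation:

```python
import re

def chunk_opening_headers(markdown, split_levels=(1, 2, 3, 4, 5, 6)):
    # A header opens a new chunk only if its depth is in split_levels and it
    # is not inside a fenced code block (triple-backtick or triple-tilde).
    headers, in_fence = [], False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("```") or stripped.startswith("~~~"):
            in_fence = not in_fence
            continue
        match = re.match(r"^(#{1,6})\s", stripped)
        if not in_fence and match and len(match.group(1)) in split_levels:
            headers.append(stripped)
    return headers
```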

Bug Fixes

  • Fix ToolInvoker._merge_tool_outputs silently appending None to list-typed state when a tool's outputs_to_state source key is absent from the tool result. This is a common scenario with PipelineTool wrapping a pipeline that has conditional branches where not all outputs are always produced even if defined in outputs_to_state. The mapping is now skipped entirely when the source key is not present in the result dict.

  • Fixed a bug in MarkdownHeaderSplitter where, in the split chunks, a child header lost its direct parent header in the metadata. For example, if one executed the code below:

    from haystack.components.preprocessors import MarkdownHeaderSplitter
    from haystack import Document
    text = """
    # header 1
    intro text
    
    ## header 1.1
    text 1
    
    ## header 1.2
    text 2
    
    ### header 1.2.1
    text 3
    
    ### header 1.2.2
    text 4
    """
    
    document = Document(content=text)
    
    splitter = MarkdownHeaderSplitter(
            keep_headers=True,
            secondary_split="word"
    )
    result = splitter.run(documents=[document])["documents"]
    
    for doc in result:
        print(f"Header: {doc.meta['header']}, parent headers: {doc.meta['parent_headers']}")

    We would have expected this output:

    Header: header 1, parent headers: []
    Header: header 1.1, parent headers: ['header 1']
    Header: header 1.2, parent headers: ['header 1']
    Header: header 1.2.1, parent headers: ['header 1', 'header 1.2']
    Header: header 1.2.2, parent headers: ['header 1', 'header 1.2']
    

    But instead we actually got:

    Header: header 1, parent headers: []
    Header: header 1.1, parent headers: []
    Header: header 1.2, parent headers: ['header 1']
    Header: header 1.2.1, parent headers: ['header 1']
    Header: header 1.2.2, parent headers: ['header 1', 'header 1.2']
    

    The error occurred when a parent header had its own content chunk before the first child header.

    This has been fixed: even when a parent header has its own content chunk before the first child header, the parent-header chain is recorded correctly and all content is preserved.

  • Reverts the change that made Agent messages optional, as it caused issues with pipeline execution. As a consequence, the LLM component now defaults to an empty messages list unless one is provided at runtime.
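The corrected parent-header bookkeeping can be sketched with a stack of open ancestors, reproducing the expected output shown above (illustrative only, not the splitter's actual code):

```python
def parent_chains(headers):
    # headers: flat list of (level, title) pairs in document order.
    # Popping ancestors at the same or deeper level before recording each
    # header yields its correct parent chain.
    stack, result = [], []
    for level, title in headers:
        while stack and stack[-1][0] >= level:
            stack.pop()
        result.append((title, [t for _, t in stack]))
        stack.append((level, title))
    return result

chains = parent_chains([(1, "header 1"), (2, "header 1.1"), (2, "header 1.2"),
                        (3, "header 1.2.1"), (3, "header 1.2.2")])
```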

💙 Big thank you to everyone who contributed to this release!

@Aftabbs, @Amanbig, @anakin87, @bilgeyucel, @bogdankostic, @davidsbatista, @dina-deifallah, @jimmyzhuu, @julian-risch, @kacperlukawski, @maxdswain, @MechaCritter, @ritikraj2425, @sarahkiener, @sjrl, @soheinze, @srini047, @tholor

v2.28.0-rc2

20 Apr 13:11

Pre-release

v2.28.0-rc1

20 Apr 08:46

Pre-release

v2.27.0

01 Apr 13:49


⭐️ Highlights

🔌 Automatic List Joining in Pipeline

When a component expects a list as input, pipelines now automatically join multiple inputs into that list (no extra components needed), even if they come in different but compatible types. This enables patterns like combining a plain query string with a list of ChatMessage objects into a single list[ChatMessage] input.

Supported conversions:

| Source Types | Target Type | Behavior |
| --- | --- | --- |
| T + T | list[T] | Combines multiple inputs into a list of the same type. |
| T + list[T] | list[T] | Merges single items and lists into a single list. |
| str + ChatMessage | list[str] | Converts all inputs to str and combines them into a list. |
| str + ChatMessage | list[ChatMessage] | Converts all inputs to ChatMessage and combines them into a list. |

Learn more about how to simplify list joins in pipelines in 📖 Smart Pipeline Connections: Implicit List Joining
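The joining behavior in the table can be sketched in plain Python. This sketch covers the T + T and T + list[T] rows; the real pipeline additionally converts between str and ChatMessage per edge:

```python
def join_inputs(*inputs):
    # Flatten a mix of single items and lists into one combined list,
    # preserving arrival order.
    joined = []
    for item in inputs:
        if isinstance(item, list):
            joined.extend(item)
        else:
            joined.append(item)
    return joined
```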

🗄️ Better Developer Experience for DocumentStores

The metadata inspection and filtering utilities (count_documents_by_filter, count_unique_metadata_by_filter, get_metadata_field_min_max, etc.) are now available in the InMemoryDocumentStore, aligning it with other document stores.

You can prototype locally in memory and easily debug, filter, and inspect the data in the document store during development, then reuse the same logic in production. See all available methods in InMemoryDocumentStore API reference.

🚀 New Features

  • Added new operations to the InMemoryDocumentStore: count_documents_by_filter, count_unique_metadata_by_filter, get_metadata_fields_info, get_metadata_field_min_max, get_metadata_field_unique_values

  • AzureOpenAIChatGenerator now exposes a SUPPORTED_MODELS class variable listing supported model IDs, for example gpt-5-mini and gpt-4o. To view all supported models go to the API reference or run:

    from haystack.components.generators.chat import AzureOpenAIChatGenerator
    print(AzureOpenAIChatGenerator.SUPPORTED_MODELS)

    We will add this for other model providers in their respective ChatGenerator components step by step.

  • Added partial support for the image-text-to-text task in HuggingFaceLocalChatGenerator.

    This allows the use of multimodal models like Qwen 3.5 or Ministral with text-only inputs. Complete multimodal support via Hugging Face Transformers might be addressed in the future.

  • Added async filter helpers to the InMemoryDocumentStore: update_by_filter_async(), count_documents_by_filter_async(), and count_unique_metadata_by_filter_async().

⚡️ Enhancement Notes

  • Add async variants of metadata methods to InMemoryDocumentStore: get_metadata_fields_info_async(), get_metadata_field_min_max_async(), and get_metadata_field_unique_values_async(). These rely on the store's thread-pool executor, consistent with the existing async method pattern.
  • Add _to_trace_dict method to ImageContent and FileContent dataclasses. When tracing is enabled, the large base64-encoded binary fields (base64_image and base64_data) are replaced with placeholder strings (e.g. "Base64 string (N characters)"), consistent with the behavior of ByteStream._to_trace_dict.
  • Pipelines now support auto-variadic connections with type conversion. When multiple senders are connected to a single list-typed input socket, the senders no longer need to all produce the exact same type since compatible conversions are applied per edge. Supported scenarios include T + T -> list[T], T + list[T] -> list[T], str + ChatMessage -> list[str], str + ChatMessage -> list[ChatMessage], and all other str <-> ChatMessage conversion variants. This enables pipeline patterns like joining a plain query string with a list of ChatMessage objects into a single list[ChatMessage] input without any extra components.
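The trace-placeholder behavior described above can be sketched as follows (an illustrative stand-in for `_to_trace_dict`, not the dataclasses' actual method):

```python
def to_trace_dict(data, base64_fields=("base64_image", "base64_data")):
    # Replace heavy base64 payloads with short placeholders before tracing,
    # so traces stay small and readable.
    traced = dict(data)
    for field in base64_fields:
        value = traced.get(field)
        if isinstance(value, str):
            traced[field] = f"Base64 string ({len(value)} characters)"
    return traced

traced = to_trace_dict({"mime_type": "image/png", "base64_image": "A" * 5000})
```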

🔒 Security Notes

  • Fixed an issue in ChatPromptBuilder where specially crafted template variables could be interpreted as structured content (e.g., images, tool calls) instead of plain text.

    Template variables are now automatically sanitized during rendering, ensuring they are always treated as plain text.

🐛 Bug Fixes

  • Fix malformed log format string in DocumentCleaner. The warning for documents with None content used %{document_id} instead of {document_id}, preventing proper interpolation of the document ID.
  • Fix ToolInvoker._merge_tool_outputs silently appending None to list-typed state when a tool's outputs_to_state source key is absent from the tool result. This is a common scenario with PipelineTool wrapping a pipeline that has conditional branches where not all outputs are always produced even if defined in outputs_to_state. The mapping is now skipped entirely when the source key is not present in the result dict.
  • Fixed an off-by-one error in InMemoryDocumentStore.write_documents that caused the BM25 average document length to be systematically underestimated.
  • Resolve $defs/$ref in tool parameter schemas before sending them to the HuggingFace API. The HuggingFace API does not support JSON Schema $defs references, which are generated by Pydantic when tool parameters contain dataclass types. This fix inlines all $ref pointers and removes the $defs section from tool schemas in HuggingFaceAPIChatGenerator.
  • The default bm25_tokenization_regex in InMemoryDocumentStore now uses r"(?u)\b\w+\b", including single-character words (e.g., "a", "C") in BM25 scoring. Previously, the regex r"(?u)\b\w\w+\b" excluded these tokens. This change may slightly alter retrieval results. To restore the old behavior, explicitly pass the previous regex when initializing the document store.
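The effect of the regex change above can be seen directly with `re`:

```python
import re

text = "a C program"
# New default: includes single-character words in BM25 tokenization.
new_tokens = re.findall(r"(?u)\b\w+\b", text)
# Previous default: required at least two word characters per token.
old_tokens = re.findall(r"(?u)\b\w\w+\b", text)
```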
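The $defs/$ref inlining described above can be sketched like this. It is a simplified reimplementation that handles only local "#/$defs/<name>" pointers and non-recursive definitions, not the actual HuggingFaceAPIChatGenerator code:

```python
def inline_refs(schema, defs=None):
    # On the top-level call, extract and drop the $defs section; then
    # recursively replace every local $ref with its inlined definition.
    if defs is None:
        defs = schema.get("$defs", {})
        schema = {k: v for k, v in schema.items() if k != "$defs"}
    if isinstance(schema, dict):
        ref = schema.get("$ref", "")
        if isinstance(ref, str) and ref.startswith("#/$defs/"):
            return inline_refs(dict(defs[ref.split("/")[-1]]), defs)
        return {k: inline_refs(v, defs) for k, v in schema.items()}
    if isinstance(schema, list):
        return [inline_refs(v, defs) for v in schema]
    return schema
```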

💙 Big thank you to everyone who contributed to this release!

@aayushbaluni, @anakin87, @bilgeyucel, @bogdankostic, @Br1an67, @ComeOnOliver, @davidsbatista, @jnMetaCode, @julian-risch, @Krishnachaitanyakc, @maxdswain, @pandego, @RMartinWhozfoxy, @satishkc7, @sjrl, @srini047, @SyedShahmeerAli12, @v-tan, @xr843

v2.27.0-rc1

31 Mar 07:54


Pre-release

v2.26.1

20 Mar 09:44


Security Notes

  • Fixed an issue in ChatPromptBuilder where specially crafted template variables could be interpreted as structured content (e.g., images, tool calls) instead of plain text. Template variables are now automatically sanitized during rendering, ensuring they are always treated as plain text.

🐛 Bug Fixes

  • Fix malformed log format string in DocumentCleaner. The warning for documents with None content used %{document_id} instead of {document_id}, preventing proper interpolation of the document ID.

v2.26.1-rc1

19 Mar 11:13


Pre-release