deepset-ai
diff --git a/docs-website/docs/pipeline-components/generators/fallbackchatgenerator.mdx b/docs-website/docs/pipeline-components/generators/fallbackchatgenerator.mdx
Lines changed: 1 addition & 1 deletion
diff --git a/docs-website/docs/pipeline-components/generators/guides-to-generators/choosing-the-right-generator.mdx b/docs-website/docs/pipeline-components/generators/guides-to-generators/choosing-the-right-generator.mdx
Lines changed: 37 additions & 21 deletions
diff --git a/docs-website/versioned_docs/version-2.18/pipeline-components/generators/guides-to-generators/choosing-the-right-generator.mdx b/docs-website/versioned_docs/version-2.18/pipeline-components/generators/guides-to-generators/choosing-the-right-generator.mdx
Lines changed: 39 additions & 26 deletions
diff --git a/docs-website/versioned_docs/version-2.19/pipeline-components/generators/fallbackchatgenerator.mdx b/docs-website/versioned_docs/version-2.19/pipeline-components/generators/fallbackchatgenerator.mdx
Lines changed: 7 additions & 5 deletions
diff --git a/docs-website/versioned_docs/version-2.19/pipeline-components/generators/guides-to-generators/choosing-the-right-generator.mdx b/docs-website/versioned_docs/version-2.19/pipeline-components/generators/guides-to-generators/choosing-the-right-generator.mdx
Lines changed: 39 additions & 24 deletions
diff --git a/docs-website/versioned_docs/version-2.20/pipeline-components/generators/fallbackchatgenerator.mdx b/docs-website/versioned_docs/version-2.20/pipeline-components/generators/fallbackchatgenerator.mdx
Lines changed: 1 addition & 1 deletion
@@ -75,7 +75,7 @@ You can use this metadata to:
 
 ### Streaming
 
-`FallbackChatGenerator` supports streaming through the `streaming_callback` parameter. The callback is passed directly to the underlying Generators.
+`FallbackChatGenerator` supports [streaming](guides-to-generators/choosing-the-right-generator.mdx#streaming-support) through the `streaming_callback` parameter. The callback is passed directly to the underlying Generators.
 
 ## Usage
 
 
@@ -36,6 +36,7 @@ When you enable streaming, the generator calls your `streaming_callback` for eve
 - **Tool calls**: The model is building a tool/function call. Read `chunk.tool_calls`.
 - **Tool result**: A tool finished and returned output. Read `chunk.tool_call_result`.
 - **Text tokens**: Normal assistant text. Read `chunk.content`.
+- **Reasoning tokens**: Extended thinking output (for models that support it). Read `chunk.reasoning`.
 
 Only one of these fields appears per chunk. Use `chunk.start` and `chunk.finish_reason` to detect boundaries. Use `chunk.index` and `chunk.component_info` for tracing.
 
@@ -46,44 +47,59 @@ For providers that support multiple candidates, set `n=1` to stream.
 Check out the parameter details in our [API Reference for StreamingChunk](/reference/data-classes-api#streamingchunk).
 :::
 
-The simplest way is to use the built-in `print_streaming_chunk` function. It handles tool calls, tool results, and text tokens.
+The simplest way is to use the built-in `print_streaming_chunk` function. It handles all chunk types and prints formatted output to stdout:
 
 ```python
 from haystack.components.generators.utils import print_streaming_chunk
 
 generator = SomeGenerator(streaming_callback=print_streaming_chunk)
-## For ChatGenerators, pass a list[ChatMessage]. For text generators, pass a prompt string.
+# For ChatGenerators, pass a list[ChatMessage]. For text generators, pass a prompt string.
 ```
 
 ### Custom Callback
 
-If you need custom rendering, you can create your own callback.
-
-Handle the three chunk types in this order: tool calls, tool result, and text.
+If you need custom rendering, write your own callback. Handle the four chunk types in order:
 
 ```python
 from haystack.dataclasses import StreamingChunk
 
 
-def my_stream(chunk: StreamingChunk):
-    if chunk.start:
-        on_start()  # e.g., open an SSE stream
+def my_streaming_callback(chunk: StreamingChunk) -> None:
+    if chunk.start and chunk.index and chunk.index > 0:
+        print("\n\n", flush=True, end="")
 
-    # 1) Tool calls: name and JSON args arrive as deltas
+    # Tool Call streaming
     if chunk.tool_calls:
-        for t in chunk.tool_calls:
-            on_tool_call_delta(index=t.index, name=t.tool_name, args_delta=t.arguments)
-
-    # 2) Tool result: final output from the tool
-    if chunk.tool_call_result is not None:
-        on_tool_result(chunk.tool_call_result)
-
-    # 3) Text tokens
+        for tool_call in chunk.tool_calls:
+            if chunk.start:
+                if chunk.index and tool_call.index > chunk.index:
+                    print("\n\n", flush=True, end="")
+                print(
+                    f">>> Tool Call: {tool_call.tool_name}\n>>> Arguments: ",
+                    flush=True,
+                    end="",
+                )
+            if tool_call.arguments:
+                print(tool_call.arguments, flush=True, end="")
+
+    # Tool Result streaming
+    if chunk.tool_call_result:
+        print(f">>> Tool Result\n{chunk.tool_call_result.result}", flush=True, end="")
+
+    # Text streaming
     if chunk.content:
-        on_text_delta(chunk.content)
-
-    if chunk.finish_reason:
-        on_finish(chunk.finish_reason)
+        if chunk.start:
+            print(">>> Assistant\n", flush=True, end="")
+        print(chunk.content, flush=True, end="")
+
+    # Reasoning streaming
+    if chunk.reasoning:
+        if chunk.start:
+            print(">>> Reasoning\n", flush=True, end="")
+        print(chunk.reasoning.reasoning_text, flush=True, end="")
+
+    if chunk.finish_reason is not None:
+        print("\n\n", flush=True, end="")
 ```
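
If you need to keep the streamed output rather than print it, a callback can also be a callable object that accumulates chunks. The following is a minimal sketch of that idea, not part of Haystack: `FakeChunk`, `FakeReasoning`, and `StreamCollector` are hypothetical names, and the stand-in classes only mimic the `content`, `reasoning.reasoning_text`, and `finish_reason` fields described above so that the example runs without any provider.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeReasoning:
    """Hypothetical stand-in for the object exposed as chunk.reasoning."""

    reasoning_text: str


@dataclass
class FakeChunk:
    """Hypothetical stand-in for StreamingChunk with only the fields used here."""

    content: str = ""
    reasoning: Optional[FakeReasoning] = None
    finish_reason: Optional[str] = None


@dataclass
class StreamCollector:
    """Callable that stores text and reasoning deltas instead of printing them."""

    text_parts: List[str] = field(default_factory=list)
    reasoning_parts: List[str] = field(default_factory=list)
    finish_reason: Optional[str] = None

    def __call__(self, chunk) -> None:
        # Reasoning deltas are kept apart from the assistant text.
        if chunk.reasoning:
            self.reasoning_parts.append(chunk.reasoning.reasoning_text)
        # Text deltas are appended in arrival order.
        if chunk.content:
            self.text_parts.append(chunk.content)
        # Remember why the stream ended, if the chunk reports it.
        if chunk.finish_reason is not None:
            self.finish_reason = chunk.finish_reason

    @property
    def text(self) -> str:
        return "".join(self.text_parts)


collector = StreamCollector()
for chunk in [
    FakeChunk(reasoning=FakeReasoning("Think. ")),
    FakeChunk(content="Hello, "),
    FakeChunk(content="world"),
    FakeChunk(finish_reason="stop"),
]:
    collector(chunk)

print(collector.text)  # Hello, world
```

With a real generator, the assumption is that an instance of such a collector could be passed as `streaming_callback` in place of a function, since it accepts a single chunk argument.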
 
 ### Agents and Tools
 
@@ -23,7 +23,6 @@ The choice between Generators and ChatGenerators depends on your use case and th
 
 :::tip
 To learn more about this comparison, check out our [Generators vs Chat Generators](generators-vs-chat-generators.mdx) guide.
-
 :::
 
 ## Streaming Support
@@ -37,55 +36,70 @@ When you enable streaming, the generator calls your `streaming_callback` for eve
 - **Tool calls**: The model is building a tool/function call. Read `chunk.tool_calls`.
 - **Tool result**: A tool finished and returned output. Read `chunk.tool_call_result`.
 - **Text tokens**: Normal assistant text. Read `chunk.content`.
+- **Reasoning tokens**: Extended thinking output (for models that support it). Read `chunk.reasoning`.
 
 Only one of these fields appears per chunk. Use `chunk.start` and `chunk.finish_reason` to detect boundaries. Use `chunk.index` and `chunk.component_info` for tracing.
 
 For providers that support multiple candidates, set `n=1` to stream.
 
-:::note
-Parameter Details
+:::info[Parameter Details]
 
 Check out the parameter details in our [API Reference for StreamingChunk](/reference/data-classes-api#streamingchunk).
 :::
 
-The simplest way is to use the built-in `print_streaming_chunk` function. It handles tool calls, tool results, and text tokens.
+The simplest way is to use the built-in `print_streaming_chunk` function. It handles all chunk types and prints formatted output to stdout:
 
 ```python
 from haystack.components.generators.utils import print_streaming_chunk
 
 generator = SomeGenerator(streaming_callback=print_streaming_chunk)
-## For ChatGenerators, pass a list[ChatMessage]. For text generators, pass a prompt string.
+# For ChatGenerators, pass a list[ChatMessage]. For text generators, pass a prompt string.
 ```
 
 ### Custom Callback
 
-If you need custom rendering, you can create your own callback.
-
-Handle the three chunk types in this order: tool calls, tool result, and text.
+If you need custom rendering, write your own callback. Handle the four chunk types in order:
 
 ```python
 from haystack.dataclasses import StreamingChunk
 
 
-def my_stream(chunk: StreamingChunk):
-    if chunk.start:
-        on_start()  # e.g., open an SSE stream
+def my_streaming_callback(chunk: StreamingChunk) -> None:
+    if chunk.start and chunk.index and chunk.index > 0:
+        print("\n\n", flush=True, end="")
 
-    # 1) Tool calls: name and JSON args arrive as deltas
+    # Tool Call streaming
     if chunk.tool_calls:
-        for t in chunk.tool_calls:
-            on_tool_call_delta(index=t.index, name=t.tool_name, args_delta=t.arguments)
-
-    # 2) Tool result: final output from the tool
-    if chunk.tool_call_result is not None:
-        on_tool_result(chunk.tool_call_result)
-
-    # 3) Text tokens
+        for tool_call in chunk.tool_calls:
+            if chunk.start:
+                if chunk.index and tool_call.index > chunk.index:
+                    print("\n\n", flush=True, end="")
+                print(
+                    f">>> Tool Call: {tool_call.tool_name}\n>>> Arguments: ",
+                    flush=True,
+                    end="",
+                )
+            if tool_call.arguments:
+                print(tool_call.arguments, flush=True, end="")
+
+    # Tool Result streaming
+    if chunk.tool_call_result:
+        print(f">>> Tool Result\n{chunk.tool_call_result.result}", flush=True, end="")
+
+    # Text streaming
     if chunk.content:
-        on_text_delta(chunk.content)
-
-    if chunk.finish_reason:
-        on_finish(chunk.finish_reason)
+        if chunk.start:
+            print(">>> Assistant\n", flush=True, end="")
+        print(chunk.content, flush=True, end="")
+
+    # Reasoning streaming
+    if chunk.reasoning:
+        if chunk.start:
+            print(">>> Reasoning\n", flush=True, end="")
+        print(chunk.reasoning.reasoning_text, flush=True, end="")
+
+    if chunk.finish_reason is not None:
+        print("\n\n", flush=True, end="")
 ```
 
 ### Agents and Tools
@@ -105,8 +119,7 @@ We also support [Amazon Bedrock](../amazonbedrockgenerator.mdx): it provides acc
 
 When discussing open (weights) models, we're referring to models with public weights that anyone can deploy on their infrastructure. The datasets used for training are shared less frequently. One could choose to use an open model for several reasons, including more transparency and control of the model.
 
-:::note
-Commercial Use
+:::info[Commercial Use]
 
 Not all open models are suitable for commercial use. We advise thoroughly reviewing the license, typically available on Hugging Face, before considering their adoption.
 :::
 
@@ -75,7 +75,7 @@ You can use this metadata to:
 
 ### Streaming
 
-`FallbackChatGenerator` supports streaming through the `streaming_callback` parameter. The callback is passed directly to the underlying Generators.
+`FallbackChatGenerator` supports [streaming](guides-to-generators/choosing-the-right-generator.mdx#streaming-support) through the `streaming_callback` parameter. The callback is passed directly to the underlying Generators.
 
 ## Usage
 
@@ -84,10 +84,7 @@ You can use this metadata to:
 Basic usage with fallback from a primary to a backup model:
 
 ```python
-from haystack.components.generators.chat import (
-    FallbackChatGenerator,
-    OpenAIChatGenerator,
-)
+from haystack.components.generators.chat import FallbackChatGenerator, OpenAIChatGenerator
 from haystack.dataclasses import ChatMessage
 
 ## Create primary and backup generators
@@ -104,6 +101,11 @@ result = generator.run(messages=messages)
 print(result["replies"][0].text)
 print(f"Successful generator: {result['meta']['successful_chat_generator_class']}")
 print(f"Total attempts: {result['meta']['total_attempts']}")
+
+>> Natural Language Processing (NLP) is a field of artificial intelligence that
+>> focuses on the interaction between computers and humans through natural language...
+>> Successful generator: OpenAIChatGenerator
+>> Total attempts: 1
 ```
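
To make the `meta` fields in the output above more concrete, here is a rough sketch of the general fallback pattern: try each generator in order, forward the same `streaming_callback` to each, and record which one succeeded. This is an illustration under stated assumptions, not the `FallbackChatGenerator` implementation; `FailingGenerator`, `EchoGenerator`, and `run_with_fallback` are hypothetical names, and only the `successful_chat_generator_class` and `total_attempts` keys are taken from the example above.

```python
class FailingGenerator:
    """Hypothetical generator that always raises, to trigger the fallback."""

    def run(self, messages, streaming_callback=None):
        raise RuntimeError("primary unavailable")


class EchoGenerator:
    """Hypothetical backup generator that echoes the last message."""

    def run(self, messages, streaming_callback=None):
        reply = f"echo: {messages[-1]}"
        # The callback is handed to the underlying generator unchanged.
        if streaming_callback is not None:
            streaming_callback(reply)
        return {"replies": [reply]}


def run_with_fallback(generators, messages, streaming_callback=None):
    """Try generators in order and return the first successful result."""
    errors = []
    for attempt, generator in enumerate(generators, start=1):
        try:
            result = generator.run(messages, streaming_callback=streaming_callback)
        except Exception as exc:  # any failure moves on to the next generator
            errors.append(exc)
            continue
        result["meta"] = {
            "successful_chat_generator_class": type(generator).__name__,
            "total_attempts": attempt,
        }
        return result
    raise RuntimeError(f"all {len(generators)} generators failed: {errors}")


streamed = []
result = run_with_fallback(
    [FailingGenerator(), EchoGenerator()],
    ["What is NLP?"],
    streaming_callback=streamed.append,
)
print(result["meta"]["successful_chat_generator_class"])  # EchoGenerator
print(result["meta"]["total_attempts"])  # 2
```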
 
 With multiple providers:
 
@@ -23,7 +23,6 @@ The choice between Generators and ChatGenerators depends on your use case and th
 
 :::tip
 To learn more about this comparison, check out our [Generators vs Chat Generators](generators-vs-chat-generators.mdx) guide.
-
 :::
 
 ## Streaming Support
@@ -37,54 +36,70 @@ When you enable streaming, the generator calls your `streaming_callback` for eve
 - **Tool calls**: The model is building a tool/function call. Read `chunk.tool_calls`.
 - **Tool result**: A tool finished and returned output. Read `chunk.tool_call_result`.
 - **Text tokens**: Normal assistant text. Read `chunk.content`.
+- **Reasoning tokens**: Extended thinking output (for models that support it). Read `chunk.reasoning`.
 
 Only one of these fields appears per chunk. Use `chunk.start` and `chunk.finish_reason` to detect boundaries. Use `chunk.index` and `chunk.component_info` for tracing.
 
 For providers that support multiple candidates, set `n=1` to stream.
 
-:::note[Parameter Details]
+:::info[Parameter Details]
 
 Check out the parameter details in our [API Reference for StreamingChunk](/reference/data-classes-api#streamingchunk).
 :::
 
-The simplest way is to use the built-in `print_streaming_chunk` function. It handles tool calls, tool results, and text tokens.
+The simplest way is to use the built-in `print_streaming_chunk` function. It handles all chunk types and prints formatted output to stdout:
 
 ```python
 from haystack.components.generators.utils import print_streaming_chunk
 
 generator = SomeGenerator(streaming_callback=print_streaming_chunk)
-## For ChatGenerators, pass a list[ChatMessage]. For text generators, pass a prompt string.
+# For ChatGenerators, pass a list[ChatMessage]. For text generators, pass a prompt string.
 ```
 
 ### Custom Callback
 
-If you need custom rendering, you can create your own callback.
-
-Handle the three chunk types in this order: tool calls, tool result, and text.
+If you need custom rendering, write your own callback. Handle the four chunk types in order:
 
 ```python
 from haystack.dataclasses import StreamingChunk
 
 
-def my_stream(chunk: StreamingChunk):
-    if chunk.start:
-        on_start()  # e.g., open an SSE stream
+def my_streaming_callback(chunk: StreamingChunk) -> None:
+    if chunk.start and chunk.index and chunk.index > 0:
+        print("\n\n", flush=True, end="")
 
-    # 1) Tool calls: name and JSON args arrive as deltas
+    # Tool Call streaming
     if chunk.tool_calls:
-        for t in chunk.tool_calls:
-            on_tool_call_delta(index=t.index, name=t.tool_name, args_delta=t.arguments)
-
-    # 2) Tool result: final output from the tool
-    if chunk.tool_call_result is not None:
-        on_tool_result(chunk.tool_call_result)
-
-    # 3) Text tokens
+        for tool_call in chunk.tool_calls:
+            if chunk.start:
+                if chunk.index and tool_call.index > chunk.index:
+                    print("\n\n", flush=True, end="")
+                print(
+                    f">>> Tool Call: {tool_call.tool_name}\n>>> Arguments: ",
+                    flush=True,
+                    end="",
+                )
+            if tool_call.arguments:
+                print(tool_call.arguments, flush=True, end="")
+
+    # Tool Result streaming
+    if chunk.tool_call_result:
+        print(f">>> Tool Result\n{chunk.tool_call_result.result}", flush=True, end="")
+
+    # Text streaming
     if chunk.content:
-        on_text_delta(chunk.content)
-
-    if chunk.finish_reason:
-        on_finish(chunk.finish_reason)
+        if chunk.start:
+            print(">>> Assistant\n", flush=True, end="")
+        print(chunk.content, flush=True, end="")
+
+    # Reasoning streaming
+    if chunk.reasoning:
+        if chunk.start:
+            print(">>> Reasoning\n", flush=True, end="")
+        print(chunk.reasoning.reasoning_text, flush=True, end="")
+
+    if chunk.finish_reason is not None:
+        print("\n\n", flush=True, end="")
 ```
 
 ### Agents and Tools
@@ -104,7 +119,7 @@ We also support [Amazon Bedrock](../amazonbedrockgenerator.mdx): it provides acc
 
 When discussing open (weights) models, we're referring to models with public weights that anyone can deploy on their infrastructure. The datasets used for training are shared less frequently. One could choose to use an open model for several reasons, including more transparency and control of the model.
 
-:::note[Commercial Use]
+:::info[Commercial Use]
 
 Not all open models are suitable for commercial use. We advise thoroughly reviewing the license, typically available on Hugging Face, before considering their adoption.
 :::
 
@@ -75,7 +75,7 @@ You can use this metadata to:
 
 ### Streaming
 
-`FallbackChatGenerator` supports streaming through the `streaming_callback` parameter. The callback is passed directly to the underlying Generators.
+`FallbackChatGenerator` supports [streaming](guides-to-generators/choosing-the-right-generator.mdx#streaming-support) through the `streaming_callback` parameter. The callback is passed directly to the underlying Generators.
 
 ## Usage