title: AmazonBedrockChatGenerator
id: amazonbedrockchatgenerator
slug: /amazonbedrockchatgenerator
description: This component enables chat completion using models through the Amazon Bedrock service.

AmazonBedrockChatGenerator

This component enables chat completion using models through the Amazon Bedrock service.

Most common position in a pipeline: After a ChatPromptBuilder

Mandatory init variables:
  • model: The model to use
  • aws_access_key_id: AWS access key ID. Can be set with AWS_ACCESS_KEY_ID env var.
  • aws_secret_access_key: AWS secret access key. Can be set with AWS_SECRET_ACCESS_KEY env var.
  • aws_region_name: AWS region name. Can be set with AWS_DEFAULT_REGION env var.

Mandatory run variables:
  • messages: A list of ChatMessage instances

Output variables:
  • replies: A list of ChatMessage objects
  • meta: A list of dictionaries with the metadata associated with each reply, such as token count, finish reason, and so on

API reference: Amazon Bedrock

GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/amazon_bedrock

Amazon Bedrock is a fully managed service that makes high-performing foundation models from leading AI startups and Amazon available through a unified API. You can choose from various foundation models to find the one best suited for your use case.

AmazonBedrockChatGenerator enables chat completion using chat models from Amazon, Anthropic, Cohere, Meta, Mistral, and more with a single component.

Overview

This component uses AWS for authentication. You can use the AWS CLI to authenticate with your IAM identity. For more information on setting up an IAM identity-based policy, see the official documentation.

:::info Using AWS CLI

Consider using the AWS CLI as a more straightforward way to manage your AWS services. With the AWS CLI, you can quickly configure your boto3 credentials. This way, you won't need to provide detailed authentication parameters when initializing AmazonBedrockChatGenerator in Haystack.

:::

To use this component for chat completion, initialize an AmazonBedrockChatGenerator with the model name. The AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION) must be set as environment variables, configured with the AWS CLI as described above, or passed as Secret arguments. Make sure the region you set supports Amazon Bedrock.
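As a minimal sketch of the environment-variable route, the check below reports which of the three variables named above are unset before the generator is created. The helper name `missing_aws_env_vars` is hypothetical and not part of Haystack or boto3; it only inspects a mapping and makes no AWS calls.

```python
import os
from typing import List, Mapping

# Variable names taken from the paragraph above.
REQUIRED_AWS_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
)


def missing_aws_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required AWS variable names that are unset or empty in `env`."""
    return [name for name in REQUIRED_AWS_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_env_vars()
    if missing:
        print("Missing AWS settings:", ", ".join(missing))
    else:
        print("All AWS environment variables are set.")
```

If the credentials come from the AWS CLI configuration or are passed as Secret arguments instead, these variables may legitimately be absent, so treat the result as a hint rather than an error.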

Tool Support

AmazonBedrockChatGenerator supports function calling through the tools parameter, which accepts flexible tool configurations:

  • A list of Tool objects: Pass individual tools as a list
  • A single Toolset: Pass an entire Toolset directly
  • Mixed Tools and Toolsets: Combine multiple Toolsets with standalone tools in a single list

This allows you to organize related tools into logical groups while also including standalone tools as needed.

from haystack.tools import Tool, Toolset
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

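# Create individual tools ("..." stands for the remaining arguments, such as parameters and function)
weather_tool = Tool(name="weather", description="Get weather info", ...)
news_tool = Tool(name="news", description="Get latest news", ...)

# Group related tools into a toolset
# (add_tool, subtract_tool, and multiply_tool are Tool objects assumed to be defined elsewhere)
math_toolset = Toolset([add_tool, subtract_tool, multiply_tool])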

# Pass mixed tools and toolsets to the generator
generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    tools=[math_toolset, weather_tool, news_tool]  # Mix of Toolset and Tool objects
)

For more details on working with tools, see the Tool and Toolset documentation.
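The `Tool(...)` calls above leave out the remaining arguments. As a hedged sketch of what a complete definition needs, a tool pairs a plain Python function with a JSON Schema describing its arguments. The `add` function and `add_parameters` schema below are hypothetical illustrations, and the `Tool(...)` call is left as a comment so the snippet runs without Haystack installed.

```python
from typing import Any, Dict


def add(a: float, b: float) -> float:
    """Hypothetical tool function: return the sum of two numbers."""
    return a + b


# JSON Schema describing the arguments the model may pass to `add`.
add_parameters: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "a": {"type": "number", "description": "First addend"},
        "b": {"type": "number", "description": "Second addend"},
    },
    "required": ["a", "b"],
}

# A tool call carries keyword arguments; they map onto the function like this.
example_arguments = {"a": 2, "b": 3}
result = add(**example_arguments)
print(result)

# With Haystack installed, the pieces above would be wrapped as:
# from haystack.tools import Tool
# add_tool = Tool(
#     name="add",
#     description="Add two numbers",
#     parameters=add_parameters,
#     function=add,
# )
```

Keeping the function and its schema side by side makes it easier to spot a mismatch between the argument names the model is told about and the ones the function accepts.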

Streaming

This generator supports streaming tokens from the LLM directly to the output. To enable it, pass a function to the streaming_callback init parameter.
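A callback of this kind can be sketched as a plain function that receives one chunk at a time. The example below assumes each chunk exposes its text as `chunk.content`; `FakeChunk` is a hypothetical stand-in used only to exercise the callback offline, and `print_and_collect` is an illustrative name, not a Haystack function.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeChunk:
    """Stand-in for a streamed chunk, used only to try the callback without AWS."""

    content: str


# Pieces of the reply accumulate here as they arrive.
collected: List[str] = []


def print_and_collect(chunk) -> None:
    """Print each streamed piece immediately and keep it for later use."""
    # Assumption: the chunk object exposes its text as `chunk.content`.
    collected.append(chunk.content)
    print(chunk.content, end="", flush=True)


# Simulate three chunks arriving from the model.
for piece in ("Hola", ", ", "mundo"):
    print_and_collect(FakeChunk(piece))
print()

# With the generator (not run here), the function itself is passed at init:
# generator = AmazonBedrockChatGenerator(
#     model="anthropic.claude-3-5-sonnet-20240620-v1:0",
#     streaming_callback=print_and_collect,
# )
```

A plain module-level function is used rather than a lambda or a callable object, on the assumption that named functions are easier to reference if the component is later serialized as part of a pipeline.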

Prompt Caching

AmazonBedrockChatGenerator supports prompt caching to reduce inference latency and input token costs.

Prompt caching on Bedrock is available for selected models. It allows you to define cache points within a request, as long as the input meets a model-specific minimum token threshold.

Each request can contain up to four cache points.

Caching messages

This generator allows you to control cache points at the ChatMessage level via the meta field.

For example, to cache a long user message to be reused across multiple requests:

from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import (
    AmazonBedrockChatGenerator,
)

msg = ChatMessage.from_user(
    "long message...",
    meta={"cachePoint": {"type": "default", "ttl": "5m"}},
)

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-sonnet-4-5-20250929-v1:0",
)

result = generator.run(messages=[msg])

If the cache point is successfully written, the number of cached input tokens is available at:

result["replies"][0].meta["usage"]["cache_write_input_tokens"]
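Because the usage entry may be missing when nothing was cached, it can help to read it defensively. The helper below is a minimal sketch that works on a plain meta dictionary. `cache_usage` is a hypothetical name, the `cache_read_input_tokens` key is an assumed counterpart for cache hits that you should confirm against your own reply metadata, and `example_meta` is hand-written rather than real model output.

```python
from typing import Any, Dict


def cache_usage(meta: Dict[str, Any]) -> Dict[str, int]:
    """Extract cache token counts from a reply's meta dict, defaulting to 0."""
    usage = meta.get("usage") or {}
    return {
        # Key shown in the text above.
        "cache_write_input_tokens": int(usage.get("cache_write_input_tokens") or 0),
        # Assumed key for tokens served from the cache; verify before relying on it.
        "cache_read_input_tokens": int(usage.get("cache_read_input_tokens") or 0),
    }


# Hand-written example meta, not real model output.
example_meta = {
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 30,
        "cache_write_input_tokens": 1500,
    },
}
print(cache_usage(example_meta))

# With a real result, the same helper would be called as:
# cache_usage(result["replies"][0].meta)
```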

Caching tools

You can also cache tool definitions using the tools_cachepoint_config initialization parameter. When this parameter is set, all tools sent to the model are cached, as long as they exceed the minimum token threshold and the selected model supports prompt caching.

from haystack_integrations.components.generators.amazon_bedrock import (
    AmazonBedrockChatGenerator,
)

# define or load your tools as my_tools (a list of Tool objects or a Toolset)

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-sonnet-4-5-20250929-v1:0",
    tools=my_tools,
    tools_cachepoint_config={"type": "default", "ttl": "5m"},
)

# send a request to the Language Model

For more details on how prompt caching works in Amazon Bedrock, see the official documentation.

Usage

To start using Amazon Bedrock with Haystack, install the amazon-bedrock-haystack package:

pip install amazon-bedrock-haystack

On its own

Basic usage:

from haystack_integrations.components.generators.amazon_bedrock import (
    AmazonBedrockChatGenerator,
)
from haystack.dataclasses import ChatMessage

generator = AmazonBedrockChatGenerator(model="meta.llama2-70b-chat-v1")
messages = [
    ChatMessage.from_system(
        "You are a helpful assistant that answers questions in Spanish only",
    ),
    ChatMessage.from_user("What's Natural Language Processing? Be brief."),
]

response = generator.run(messages)
print(response)

With multimodal inputs:

from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.amazon_bedrock import (
    AmazonBedrockChatGenerator,
)

llm = AmazonBedrockChatGenerator(model="anthropic.claude-3-5-sonnet-20240620-v1:0")

image = ImageContent.from_file_path("apple.jpg")
user_message = ChatMessage.from_user(
    content_parts=["What does the image show? Max 5 words.", image],
)

response = llm.run([user_message])["replies"][0].text
print(response)

# Red apple on straw mat.

In a pipeline

With a ChatPromptBuilder supplying the messages:

from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import (
    AmazonBedrockChatGenerator,
)

pipe = Pipeline()
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component("llm", AmazonBedrockChatGenerator(model="meta.llama2-70b-chat-v1"))
pipe.connect("prompt_builder", "llm")

country = "Germany"
system_message = ChatMessage.from_system(
    "You are an assistant giving out valuable information to language learners.",
)
messages = [
    system_message,
    ChatMessage.from_user("What's the official language of {{ country }}?"),
]

res = pipe.run(
    data={
        "prompt_builder": {
            "template_variables": {"country": country},
            "template": messages,
        },
    },
)
print(res)