feat: add JSON structured output support to BedrockChatGenerator#3108

Open
davidsbatista wants to merge 24 commits into main from feat/add-structured-ouput-BedrockChatGenerator

davidsbatista (Contributor) commented Apr 7, 2026

Related Issues

Proposed Changes:

This PR introduces two major changes.

1. Structured output support (json_schema)

Pass a json_schema dict in generation_kwargs to request validated JSON responses from the model. The generator builds the correct outputConfig for the Bedrock Converse API and stores the parsed result in reply.meta["structured_output"].

  from haystack.dataclasses import ChatMessage

  # `generator` is assumed to be an already configured AmazonBedrockChatGenerator
  result = generator.run(
      messages=[ChatMessage.from_user("Return user data as JSON")],
      generation_kwargs={
          "json_schema": {
              "name": "user_schema",
              "schema": {"type": "object", "properties": {"name": {"type": "string"}}},
          }
      },
  )
  # result["replies"][0].meta["structured_output"] -> {"name": "..."}

2. aioboto3 → aiobotocore (async path)

The outputConfig parameter requires boto3>=1.42.84, but aioboto3 pins aiobotocore==2.25.1, which is incompatible with that version of botocore at runtime. This PR replaces aioboto3 with aiobotocore>=3.4.0 directly: it is the same underlying library, so async behaviour does not change, and the version conflict is resolved.
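
For reviewers unfamiliar with aiobotocore, here is a minimal sketch of the pattern described above: the session is created without arguments and the credentials are passed when the client is created. The helper names `build_client_kwargs` and `converse_once` are hypothetical and are not the names used in this PR; profile handling is left out.

```python
from __future__ import annotations

from typing import Any


def build_client_kwargs(
    region_name: str | None = None,
    aws_access_key_id: str | None = None,
    aws_secret_access_key: str | None = None,
    aws_session_token: str | None = None,
) -> dict[str, Any]:
    """Collect only the settings that were actually provided (hypothetical helper)."""
    candidates = {
        "region_name": region_name,
        "aws_access_key_id": aws_access_key_id,
        "aws_secret_access_key": aws_secret_access_key,
        "aws_session_token": aws_session_token,
    }
    # Dropping None values lets the default AWS credential chain fill the gaps.
    return {key: value for key, value in candidates.items() if value is not None}


async def converse_once(
    model_id: str, messages: list[dict[str, Any]], **client_kwargs: Any
) -> dict[str, Any]:
    """Send one Converse request through an aiobotocore client (illustrative only)."""
    # Imported lazily so build_client_kwargs stays usable without aiobotocore installed.
    from aiobotocore.session import AioSession

    session = AioSession()
    # create_client returns an async context manager that yields the client.
    async with session.create_client("bedrock-runtime", **client_kwargs) as client:
        return await client.converse(modelId=model_id, messages=messages)
```

Calling `converse_once` requires valid AWS credentials and model access, so it is not exercised here; `build_client_kwargs` can be checked offline.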

Files changed

  • pyproject.toml — updated deps and mypy overrides
  • common/amazon_bedrock/utils.py — async session now uses aiobotocore.session.AioSession
  • chat/chat_generator.py — run_async() uses session.create_client() with explicit credentials
  • chat/utils.py — _parse_structured_output() helper
  • tests/ — unit + integration tests for both paths

How did you test it?

  • Added new integration tests for both run() and run_async() when a JSON schema is passed in generation_kwargs
  • Added unit tests for _parse_structured_output
  • Added a few more tests covering previously untested paths in AmazonBedrockChatGenerator

Notes for the reviewer

  • Some changes are purely cosmetic; they result from applying updated ruff linting to the existing code

Checklist

davidsbatista changed the title from "Feat/add structured ouput bedrock chat generator" to "Feat: add JSON structured output support to BedrockChatGenerator" Apr 7, 2026
github-actions bot added the type:documentation label Apr 7, 2026
davidsbatista changed the title from "Feat: add JSON structured output support to BedrockChatGenerator" to "feat: add JSON structured output support to BedrockChatGenerator" Apr 7, 2026
github-actions bot (Contributor) commented Apr 7, 2026

Coverage report (amazon_bedrock)

Lines missing coverage, by file:

  • integrations/amazon_bedrock/src/haystack_integrations/common/amazon_bedrock/utils.py: none listed
  • integrations/amazon_bedrock/src/haystack_integrations/components/generators/amazon_bedrock/chat/chat_generator.py: 671-672, 688-689, 692
  • integrations/amazon_bedrock/src/haystack_integrations/components/generators/amazon_bedrock/chat/utils.py: 715

This report was generated by python-coverage-comment-action

davidsbatista removed the type:documentation label Apr 7, 2026
github-actions bot added the type:documentation label Apr 7, 2026
davidsbatista marked this pull request as ready for review April 7, 2026 15:05
davidsbatista requested a review from a team as a code owner April 7, 2026 15:05
davidsbatista requested review from anakin87 and bogdankostic and removed request for a team and bogdankostic April 7, 2026 15:05
anakin87 (Member) left a comment

I left some comments

region_name=aws_region_name,
profile_name=aws_profile_name,
)
session = aiobotocore.session.AioSession()
Member

could you explain why the aws_* variables are not passed in? I'm not familiar with aiobotocore and haven't found comprehensive docs...

Member

Ok I see we use them at client creation below. In any case, feel free to add details if relevant

Comment on lines +177 to +181
aws_access_key_id: Secret | None = None,
aws_secret_access_key: Secret | None = None,
aws_session_token: Secret | None = None,
aws_region_name: Secret | None = None,
aws_profile_name: Secret | None = None,
Member

we use this pattern across all integrations, so I don't see the need for changing it.
The docstring also generates the API reference, which makes it clear to users what the default value is.

Contributor Author

this was a mistake, thanks for catching it!

- `stopSequences`: List of stop sequences to stop generation.
- `temperature`: Sampling temperature.
- `topP`: Nucleus sampling parameter.
- `json_schema`: Request structured JSON output validated against a schema. Provide a dict with:
Member

Instead of json_schema, I'd use response_format, which is more widely adopted across integrations

Contributor Author

I can expose it like that, but then I guess for the API to process it, I have to rename it to json_schema before it is sent

Contributor Author

actually it might not be needed
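
If the rename discussed in this thread did turn out to be needed, it could be as small as the sketch below. This is only an illustration: `normalize_generation_kwargs` is a hypothetical helper, not code from this PR, and it assumes the user-facing key would be `response_format` while the internal key stays `json_schema`.

```python
from __future__ import annotations

from typing import Any


def normalize_generation_kwargs(generation_kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy in which a user-facing `response_format` key is stored as `json_schema`."""
    # Work on a shallow copy so the caller's dict is left untouched.
    normalized = dict(generation_kwargs)
    if "response_format" in normalized:
        if "json_schema" in normalized:
            raise ValueError("Pass either `response_format` or `json_schema`, not both.")
        normalized["json_schema"] = normalized.pop("response_format")
    return normalized
```

Other keys such as `temperature` pass through unchanged.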

- `temperature`: Sampling temperature.
- `topP`: Nucleus sampling parameter.
- `json_schema`: Request structured JSON output validated against a schema. See
:meth:`_prepare_request_params` for full details.
Member

  • we generally don't use :meth: syntax
  • I would not point users to a private method (not available on the API reference). If we want to avoid duplications, let's point users to __init__

return replies


def _parse_structured_output(replies: list[ChatMessage]) -> list[ChatMessage]:
Member

I'd prefer to leave JSON parsing to users, as we do in other integrations.
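
Leaving parsing to callers could look roughly like the sketch below, which works on the plain reply text. `parse_json_reply` is a hypothetical user-side helper, not part of the integration, and it assumes the model returns a single JSON object as text.

```python
from __future__ import annotations

import json
from typing import Any


def parse_json_reply(text: str) -> dict[str, Any] | None:
    """Parse a reply's text as a JSON object; return None if it is not one."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # The model returned something that is not valid JSON.
        return None
    # Only accept JSON objects, since the schema in the PR example is an object.
    return parsed if isinstance(parsed, dict) else None
```

A caller would pass the text of the first reply and handle the `None` case explicitly.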

- `temperature`: Sampling temperature.
- `topP`: Nucleus sampling parameter.
- `json_schema`: Request structured JSON output validated against a schema. See
:meth:`_prepare_request_params` for full details.
Member

same comment as in run

Labels

integration:amazon-bedrock, type:documentation (Improvements or additions to documentation)

3 participants