Skip to content

feat: Bedrock - support for FileContent + citations#2883

Merged
anakin87 merged 12 commits intomainfrom
filecontent-bedrock
Mar 3, 2026
Merged

feat: Bedrock - support for FileContent + citations#2883
anakin87 merged 12 commits intomainfrom
filecontent-bedrock

Conversation

@anakin87
Copy link
Copy Markdown
Member

@anakin87 anakin87 commented Feb 27, 2026

Related Issues

Proposed Changes:

  • support passing FileContent: pass to Bedrock Documents or Videos, depending on the MIME type.
  • support enabling citations (particularly relevant to properly use PDFs with Claude on Bedrock); build message text is split between standard text and citations text
  • small refactoring of private functions

How did you test it?

CI, new unit tests and integration tests

Checklist

@github-actions github-actions Bot added integration:amazon-bedrock type:documentation Improvements or additions to documentation labels Feb 27, 2026
@anakin87 anakin87 marked this pull request as ready for review February 27, 2026 12:37
@anakin87 anakin87 requested a review from a team as a code owner February 27, 2026 12:37
@anakin87 anakin87 requested review from bogdankostic and removed request for a team February 27, 2026 12:37
Copy link
Copy Markdown
Contributor

@bogdankostic bogdankostic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking already good, I just have two minor comments.

raw_name = os.path.splitext(file_content.filename)[0] if file_content.filename else "filename"
# Bedrock requires name to be present but is very strict about the format.
# See https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_DocumentBlock.html
sanitized_name = re.sub(r"\s+", " ", re.sub(r"[^a-zA-Z0-9\s\-\[\]()]", "", raw_name)).strip()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sanitized_name can become empty if the filename has only stripped characters, but bedrock requires a minimum length of 1. Let's handle this edge case.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. Handled better in da26239

replies.append(
ChatMessage.from_assistant(
" ".join(text_content),
"".join(text_content),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In l.485 we strip all white space from text content (text := entry.get("text", "").strip()), joining them together like this might lead to missing white space I guess.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally changed this part of the code because while experimenting with citations, I realized that " ".join() was introducing duplicate spaces (since the API already includes spacing).

Based on your comment, I propose to use "".join() and preserve the original text/citation as-is.
See ee79dc2

@anakin87 anakin87 requested a review from bogdankostic March 2, 2026 15:43
@anakin87 anakin87 merged commit 34d31b2 into main Mar 3, 2026
12 checks passed
@anakin87 anakin87 deleted the filecontent-bedrock branch March 3, 2026 09:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

integration:amazon-bedrock type:documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Amazon Bedrock Attachments

2 participants