fix: fix wrong batching in Google GenAI Document Embedder #2951

Merged
anakin87 merged 2 commits into main from fix-genai-doc-embedder-bug
Mar 11, 2026

@anakin87 anakin87 commented Mar 11, 2026

Related Issues

We noticed that the Google GenAI Document Embedder behaved differently from the Google API when called directly, producing results that were not meaningful for text retrieval.

After investigation, it turned out that the batching mechanism was wrong: it was passing 1-character texts to the embedding API.

Proposed Changes:

  • fix batching
  • add unit and integration tests to catch this behavior in the future
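
To illustrate the failure mode described above, here is a minimal sketch of how a batching loop can end up sending 1-character texts, next to a corrected version. It is not the integration's actual code: `batches_buggy` and `batches_fixed` are hypothetical helpers, and the real bug may have had a different shape.

```python
# Hypothetical illustration only; not the code from the integration.

def batches_buggy(texts: list[str], batch_size: int) -> list[list[str]]:
    """Splits each text into characters before batching (the failure mode)."""
    batches = []
    for text in texts:
        # Bug: list(text) iterates over the characters of a single string,
        # so every item sent to the embedding API is 1 character long.
        chars = list(text)
        for i in range(0, len(chars), batch_size):
            batches.append(chars[i : i + batch_size])
    return batches


def batches_fixed(texts: list[str], batch_size: int) -> list[list[str]]:
    """Slices the list of texts, so each item is a full document text."""
    return [texts[i : i + batch_size] for i in range(0, len(texts), batch_size)]


if __name__ == "__main__":
    texts = ["hello world", "foo"]
    print(batches_buggy(texts, batch_size=32))
    print(batches_fixed(texts, batch_size=32))
```

A unit test in this spirit can assert that every string handed to the embedding call equals one of the original document texts, which would catch a regression of this kind.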

How did you test it?

CI, new tests

@anakin87 anakin87 marked this pull request as ready for review March 11, 2026 11:54
@anakin87 anakin87 requested a review from a team as a code owner March 11, 2026 11:54
@anakin87 anakin87 requested review from sjrl and removed request for a team March 11, 2026 11:54
@anakin87 anakin87 self-assigned this Mar 11, 2026

@sjrl sjrl left a comment

Thanks for the fix! Apologies for missing this on the original review of this contribution.

@anakin87 anakin87 merged commit 85eeab0 into main Mar 11, 2026
11 checks passed
@anakin87 anakin87 deleted the fix-genai-doc-embedder-bug branch March 11, 2026 12:03