fix(pgvector): order retrieval by distance operator#3370

Merged
julian-risch merged 2 commits into
deepset-ai:mainfrom
2830500285:codex/pgvector-indexable-order-by
Jun 1, 2026

@2830500285
Contributor

The pgvector integration currently computes Haystack-facing scores correctly, but then orders embedding retrieval by the score alias. For cosine similarity, that wraps the pgvector distance operator in an expression (`1 - distance`) and sorts it descending. HNSW/IVFFlat indexes are only used when the ORDER BY is the bare distance operator in ascending order, so they cannot satisfy this query shape.

This keeps the selected score column unchanged, while sorting by the raw pgvector distance operator in ascending order. The result ordering stays the same for cosine similarity and inner product, l2_distance remains distance-ascending, and Postgres can use the vector index for the retrieval sort.
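
The claim that the result ordering is unchanged can be checked without a database. The sketch below is only an illustration: it reimplements the cosine distance (`<=>`) and negative inner product (`<#>`) operators in plain Python, and the sample vectors and helper names are made up for this example. It compares the old sort (score, descending) with the new sort (raw distance, ascending).

```python
import math

def cosine_distance(a, b):
    """Cosine distance, i.e. 1 - cosine similarity (what pgvector's <=> returns)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def negative_inner_product(a, b):
    """Negative inner product (what pgvector's <#> returns)."""
    return -sum(x * y for x, y in zip(a, b))

# Hypothetical sample data, chosen so that no two documents tie.
query = [1.0, 0.0]
docs = {"a": [1.0, 0.1], "b": [0.0, 1.0], "c": [0.7, 0.7], "d": [-1.0, 0.2]}

def ranked(key, reverse):
    """Return document ids sorted by the given key function."""
    return sorted(docs, key=key, reverse=reverse)

# Old shape: sort by the score expression, descending.
cos_by_score = ranked(lambda d: 1.0 - cosine_distance(docs[d], query), True)
ip_by_score = ranked(lambda d: -negative_inner_product(docs[d], query), True)

# New shape: sort by the raw distance operator, ascending.
cos_by_distance = ranked(lambda d: cosine_distance(docs[d], query), False)
ip_by_distance = ranked(lambda d: negative_inner_product(docs[d], query), False)

print(cos_by_score == cos_by_distance, ip_by_score == ip_by_distance)
```

Because each score is a strictly decreasing function of its distance, both sorts produce the same ranking whenever there are no ties.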

SELECT *, 1 - (embedding <=> '[...]') AS score
FROM schema.table
ORDER BY embedding <=> '[...]' ASC
LIMIT 5
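
The query shape above can be sketched in Python for the three vector functions. This is a hedged illustration rather than the integration's actual code: `VECTOR_FUNCTIONS` and `build_retrieval_query` are hypothetical names, and the score expressions for inner product and L2 distance are assumptions based on how the pgvector operators are defined (`<#>` returns the negative inner product, `<->` the L2 distance). Values are interpolated directly into the string only to keep the example readable; real code should pass them as query parameters.

```python
# Hypothetical mapping: vector function -> (pgvector operator, score template).
VECTOR_FUNCTIONS = {
    "cosine_similarity": ("<=>", "1 - ({distance})"),
    "inner_product": ("<#>", "({distance}) * -1"),
    "l2_distance": ("<->", "{distance}"),
}

def build_retrieval_query(table, vector_function, query_vector, top_k):
    """Build a retrieval query that sorts by the raw distance operator.

    Illustration only: string interpolation is not safe against SQL
    injection, so production code should use parameterized queries.
    """
    operator, score_template = VECTOR_FUNCTIONS[vector_function]
    vector_literal = "'[" + ",".join(str(v) for v in query_vector) + "]'"
    distance = f"embedding {operator} {vector_literal}"
    score = score_template.format(distance=distance)
    return (
        f"SELECT *, {score} AS score\n"
        f"FROM {table}\n"
        # Always the bare operator, ascending, so a vector index can serve the sort.
        f"ORDER BY {distance} ASC\n"
        f"LIMIT {int(top_k)}"
    )

cosine_query = build_retrieval_query("schema.table", "cosine_similarity", [0.1, 0.2], 5)
l2_query = build_retrieval_query("schema.table", "l2_distance", [0.1, 0.2], 5)
print(cosine_query)
```

The score column still carries the Haystack-facing value, while the ORDER BY clause is identical in form for every vector function.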

Fixes #2199

@2830500285 2830500285 requested a review from a team as a code owner June 1, 2026 04:02
@2830500285 2830500285 requested review from julian-risch and removed request for a team June 1, 2026 04:02
@CLAassistant

CLAassistant commented Jun 1, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

Coverage report (pgvector)

File  Statements  Missing  Coverage  Coverage (new stmts)  Lines missing
  integrations/pgvector/src/haystack_integrations/document_stores/pgvector
  document_store.py 1371
Project Total  

This report was generated by python-coverage-comment-action

Member

@julian-risch julian-risch left a comment

Looks good to me, and I ran the integration tests locally. Thank you for this contribution, @2830500285!

@julian-risch julian-risch merged commit 44c7e74 into deepset-ai:main Jun 1, 2026
12 checks passed

Development

Successfully merging this pull request may close these issues.

No score and good results on PgVector when using cosine_similarity because of the DESC and the 1 - cosine_distance. Wrong usage of order by
