refactor(pgvector): extract PostgreSQLDocumentStore abstract base class #3254

Open
SyedShahmeerAli12 wants to merge 4 commits into deepset-ai:main from SyedShahmeerAli12:refactor/pgvector-base-class

@SyedShahmeerAli12 SyedShahmeerAli12 commented Apr 29, 2026

Related Issues

Proposed Changes:

Splits PgvectorDocumentStore into two layers:

  • _base.py (new file) — PostgreSQLDocumentStore abstract base class containing all SQL constants and every data-layer method (sync and async). Declares _ensure_db_setup and _ensure_db_setup_async as @abstractmethod.
  • document_store.py (slimmed from ~2040 to ~230 lines) — PgvectorDocumentStore(PostgreSQLDocumentStore) now only contains the psycopg connection setup (_ensure_db_setup, _ensure_db_setup_async) and serialization (to_dict, from_dict).
  • __init__.py — exports PostgreSQLDocumentStore alongside PgvectorDocumentStore so future integrations (AlloyDB, etc.) can import the base class from pgvector-haystack.

SupabasePgvectorDocumentStore requires no changes — it inherits from PgvectorDocumentStore and gets the base class for free.
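The layering described above can be sketched roughly as follows. This is a simplified illustration, not the code in `_base.py`: the class names and the two `_ensure_db_setup*` hooks come from this PR, while `count_documents`, `count_documents_async`, the `setup_calls` counter, and the placeholder return values are assumptions made so the example runs without a database.

```python
import asyncio
from abc import ABC, abstractmethod


class PostgreSQLDocumentStore(ABC):
    """Simplified stand-in for the base class: owns the data-layer logic."""

    def __init__(self) -> None:
        # Illustrative counter only; the real store holds connections instead.
        self.setup_calls = 0

    @abstractmethod
    def _ensure_db_setup(self) -> None:
        """Subclasses establish their sync connection here."""

    @abstractmethod
    async def _ensure_db_setup_async(self) -> None:
        """Subclasses establish their async connection here."""

    def count_documents(self) -> int:
        # Hypothetical data-layer method: shared by every subclass.
        self._ensure_db_setup()
        return 0  # placeholder for the result of a SQL query

    async def count_documents_async(self) -> int:
        await self._ensure_db_setup_async()
        return 0  # placeholder for the result of a SQL query


class PgvectorDocumentStore(PostgreSQLDocumentStore):
    """Thin subclass: only the connection layer is implemented here."""

    def _ensure_db_setup(self) -> None:
        self.setup_calls += 1  # the real class would open a psycopg connection

    async def _ensure_db_setup_async(self) -> None:
        self.setup_calls += 1  # the real class would open an async connection


class SupabasePgvectorDocumentStore(PgvectorDocumentStore):
    """Inherits everything; no changes needed after the refactor."""


store = SupabasePgvectorDocumentStore()
sync_count = store.count_documents()
async_count = asyncio.run(store.count_documents_async())
```

Because both hooks are abstract, instantiating `PostgreSQLDocumentStore` directly raises `TypeError`, so a new integration cannot forget to supply its connection layer.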

How did you test it?

All existing unit tests pass (pytest -m "not integration" — 60 tests):

60 passed, 93 deselected in 0.32s

Imports verified:

from haystack_integrations.document_stores.pgvector import PgvectorDocumentStore, PostgreSQLDocumentStore
# PgvectorDocumentStore bases: ['PostgreSQLDocumentStore']

Notes for the reviewer

  • VALID_VECTOR_FUNCTIONS and related constants are re-exported from document_store.py for backward compatibility (imported by PgvectorEmbeddingRetriever).
  • Error messages that previously referenced "PgvectorDocumentStore" by name have been made generic so they stay meaningful when called from AlloyDB or other subclasses.
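For comparison, one alternative to generic wording is to derive the class name at runtime, so each subclass reports its own name. The sketch below is not what this PR does; `_check_embedding_dimension` and `AlloyDBDocumentStore` are hypothetical names used only for illustration.

```python
class PostgreSQLDocumentStore:
    """Minimal stand-in for the base class."""

    def _check_embedding_dimension(self, dimension: int) -> None:
        # Hypothetical validation helper, not taken from the PR.
        if dimension <= 0:
            # type(self).__name__ resolves to the concrete subclass at runtime.
            msg = f"{type(self).__name__}: embedding dimension must be positive, got {dimension}"
            raise ValueError(msg)


class AlloyDBDocumentStore(PostgreSQLDocumentStore):
    """Hypothetical future subclass."""


try:
    AlloyDBDocumentStore()._check_embedding_dimension(0)
    message = ""
except ValueError as err:
    message = str(err)
```

The trade-off is that messages vary per subclass, which may matter for tests that match on exact error text.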

Checklist

Move all SQL constants and data-layer methods into a new
`_base.py` abstract base class (`PostgreSQLDocumentStore`).
`PgvectorDocumentStore` is now a thin subclass that only
implements `_ensure_db_setup` and `_ensure_db_setup_async`
using psycopg.

Future PostgreSQL-backed integrations (AlloyDB, etc.) can
subclass `PostgreSQLDocumentStore` and override only the
connection layer, inheriting all SQL/data logic for free.

Closes deepset-ai#3239
@SyedShahmeerAli12 SyedShahmeerAli12 requested a review from a team as a code owner April 29, 2026 13:30
@SyedShahmeerAli12 SyedShahmeerAli12 requested review from anakin87 and removed request for a team April 29, 2026 13:30
@github-actions github-actions Bot added the integration:pgvector and type:documentation (Improvements or additions to documentation) labels Apr 29, 2026
@github-actions

Coverage report (pgvector)

Click to see where and how coverage changed

Columns: File | Statements | Missing | Coverage | Coverage (new stmts) | Lines missing

integrations/pgvector/src/haystack_integrations/document_stores/pgvector
  _base.py | lines missing: 208, 256, 290-310, 316-354, 360-404, 413-421, 431-439, 447-490, 498-528, 534-568, 579-590, 599-615, 629-652, 667-690, 696-707, 726-762, 783-814, 822-835, 843-856, 866-873, 879-886, 900-932, 942-974, 985-1024, 1035-1074, 1082-1100, 1118-1135, 1147-1164, 1177-1178, 1180-1184, 1192-1226, 1245-1259, 1274-1289, 1292-1300, 1310-1323, 1333-1348, 1370, 1383-1402, 1439-1451, 1473-1487, 1527-1528, 1537-1540, 1560-1572, 1583-1595, 1605-1661, 1681, 1693-1713, 1725-1747, 1761-1788, 1801-1803, 1820-1841, 1858-1881
  document_store.py | lines missing: 216, 247-248
Project Total

This report was generated by python-coverage-comment-action


SyedShahmeerAli12 commented Apr 29, 2026

@anakin87 @davidsbatista this PR implements the refactoring proposed in #3239. Would love your feedback when you get a chance!


davidsbatista commented Apr 30, 2026

@SyedShahmeerAli12, please focus on one PR at a time. You currently have 6 PRs open, all of which either need review or are in review and need refining, refactoring, or cleaning. I ask you not to open any more PRs until all are closed or merged.

@anakin87 anakin87 requested a review from davidsbatista May 6, 2026 12:45
Development

Successfully merging this pull request may close these issues.

Refactor PgvectorDocumentStore making it reusable for future PostgreSQL-related integrations

2 participants