perf(core): batch postgres vector sync preparation#723
perf(core): batch postgres vector sync preparation#723phernandez wants to merge 1 commit intomainfrom
Conversation
Signed-off-by: phernandez <paul@basicmachines.co>
|
Additional local verification note: Failure:
The targeted tests that passed before opening the PR were:
|
|
Claude finished @phernandez's task in 5m 32s —— View job PR Review:
|
| Item | Status | Notes |
|---|---|---|
| Python 3.12+ types and async patterns | progress_callback missing annotation on override |
|
| SQLAlchemy 2.0 best practices | ✅ | F-string SQL building is safe but should be commented |
| 100-character line length | ✅ | |
| No commented-out code | ✅ | |
| Unit tests for new code | ✅ | New TestBatchPrepareConcurrency and TestWriteEmbeddings |
| 100% test coverage | _update_timestamp_sql has # pragma: no cover on line 892 — verify that's still needed |
|
| No hardcoded secrets | ✅ | |
| Error handling / logging | ✅ | Good structured logging with loguru |
| Performance | ✅ | Core intent: latency hiding via concurrent prepare |
| Code duplication | ❌ | ~100 lines duplicated from base class orchestration loop |
| Known test regression | ❌ | test_semantic_quality[asyncio-postgres-openai] recall@5 = 0.0 |
| File path / Windows compat | ✅ | No file path operations in changed code |
| No emoji/unicode in logs | ✅ | |
| Literate programming comments | ✅ | Good Trigger/Why/Outcome comments on the new method |
| Alembic migration | N/A | No schema changes |
Bottom Line
The bulk-embedding upsert and concurrent-prepare window are good ideas. The main concerns are:
-
Unresolved test regression — please verify whether
recall@5 = 0.0is a pre-existing failure or introduced by this PR. If it's a regression, the concurrentasyncio.gatherprepare path or the new bulk-upsert chunk path may have a bug that silently loses embedding writes. -
Code duplication — the nearly verbatim copy of the base class orchestration loop should be refactored before this lands.
Summary
Testing
uv run pytest -q tests/repository/test_postgres_search_repository_unit.py tests/repository/test_semantic_search_base.py