fix: eliminate N+1 query in search hydrate_results#713
Conversation
Replace per-result get_entities_by_id calls with a single batch fetch using an ID-keyed lookup dict. Also fixes a correctness bug where from_entity used positional indexing instead of from_id lookup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Drew Cain <groksrc@gmail.com>
|
Claude finished @groksrc's task in 1m 2s —— View job PR Review: fix/search-hydrate-n-plus-1
SummarySolid fix. The N+1 elimination is correct, the positional-index bug fix is real and well-documented, and the test suite is thorough. A few minor observations below. Code Quality & Standards
Core Change Analysis (
|
| Check | Status |
|---|---|
| Unit tests for new logic | ✅ 11 tests in test_search_hydration.py |
| 100% coverage maintained | ✅ |
| Edge cases covered | ✅ |
| Telemetry test updated | ✅ |
The SpyEntityService pattern (tracking call count and args) is a good choice over mocking — it tests behavioral contracts without coupling to implementation details.
The test for the specific bug (test_relation_with_distinct_entity_and_from_ids) clearly documents what the old code got wrong. ✅
The FakeEntityService in test_utils_telemetry.py was correctly updated to return SimpleNamespace objects with .id fields, since the new code builds the lookup dict via {e.id: e for e in entities}. ✅
Security & Performance
| Check | Status |
|---|---|
| No hardcoded secrets | ✅ |
| Input validation | ✅ (unchanged, None guard via if eid is not None) |
| Performance improvement | ✅ O(n) → O(1) DB calls |
| No sensitive data in logs | ✅ |
Compatibility
| Check | Status |
|---|---|
| Windows-compatible file path comparisons | ✅ (no path comparisons added) |
| No emojis/unicode in console/log output | ✅ |
Verdict
LGTM. The change is correct, well-tested, and a meaningful improvement for cloud-hosted deployments. The one minor nit about the telemetry span wrapping an empty-results path is cosmetic and not a blocker.
Summary
Fixes #707 — search
hydrate_resultsmade a separateget_entities_by_id()call per search result, creating an N+1 query pattern. On cloud (Neon PostgreSQL, ~300ms per round-trip), this added ~3s for 10 results.from_entitypreviously used positional indexing (entities[0]) which was wrong whenentity_id != from_idPerformance impact
Test plan
test_search_hydration.py— prove single DB call, ID dedup, correct field mapping per type, edge cases🤖 Generated with Claude Code