* feat(celery Wave 7 #8): retrieval/curation cutover + 3 wiring points
Wave 7 §K.12.8 task #8 — three wiring points lit at the same time
so PR #1758's inseparability gate (alias-redirect on every indexer
write) and PR #1756's vector recall (LightRAG-style semantic search
plus 1-hop traversal) become production-alive at the same merge.
What lands
* ``worker_factory._build_lineage_graph_store`` now returns a
``LineageGraphStoreWithAliasRedirect`` decorating the raw
per-collection backend (Wave 7 §K.12 invariant #9). Callers that
need the raw inner store — the merger writes canonical names
directly and must not be intercepted — go through the new
``_build_lineage_graph_store_inner``.
* ``retrieval/pipeline._graph_search`` now composes
``GraphSearchService.search_entities`` (vector recall) +
``get_subgraph`` (1-hop traversal) +
``GraphSearchService.compose_context`` (legacy LightRAG-style
rendering). The render is byte-for-byte identical to Wave 6's
``_render_graph_context_text`` (locked by the byte-parity test in
PR #1756) so downstream RAG prompts are zero-functional-change —
only now vector recall actually happens.
* ``GraphService.merge_entities`` (the route layer for
``POST /graphs/nodes/merge``) delegates to ``LineageEntityMerger``.
Backward-compat response shape preserved (``target_entity_id`` /
``description`` / ``source_chunk_ids`` / ``edges_*``); chunk ids
are recovered from the target's lineage after step 6a re-anchors
the source parts under the canonical name. ``edges_redirected`` /
``edges_collapsed`` surface ``0`` because edge re-anchoring is
handled transparently by the alias-redirect decorator at indexer
write time, not as part of the merge action.
* New ``build_lineage_entity_merger_for(collection)`` factory in
``aperag/graph_curation/lineage_merge.py`` resolves the six
dependencies (raw inner store / alias repo / compactor /
vector connector / embedder / LLM) the merger expects, lifting
the ``_SyncEmbedderShim`` pattern out of
``MergeCandidateDetector``'s factory so merger and detector
share one shim.
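
The decorator / inner-factory split in the first bullet can be pictured with a minimal sketch. This is not the actual ``LineageGraphStoreWithAliasRedirect`` code: ``InMemoryStore``, ``AliasRedirectStore``, ``build_store`` and ``build_store_inner`` are hypothetical stand-ins, and the alias table is assumed to be a plain dict.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Hypothetical raw backend: entity name -> list of descriptions."""
    entities: dict = field(default_factory=dict)

    def upsert_entity(self, name: str, description: str) -> str:
        self.entities.setdefault(name, []).append(description)
        return name


@dataclass
class AliasRedirectStore:
    """Hypothetical decorator: rewrites alias names to canonical names on write."""
    inner: InMemoryStore
    aliases: dict  # alias name -> canonical (or intermediate) name

    def resolve(self, name: str) -> str:
        # Follow the alias chain; a repeated name means the chain loops.
        seen = set()
        while name in self.aliases:
            if name in seen:
                raise ValueError(f"alias cycle at {name!r}")
            seen.add(name)
            name = self.aliases[name]
        return name

    def upsert_entity(self, name: str, description: str) -> str:
        return self.inner.upsert_entity(self.resolve(name), description)


def build_store_inner() -> InMemoryStore:
    """Raw store for callers that write canonical names directly (the merger)."""
    return InMemoryStore()


def build_store(inner: InMemoryStore, aliases: dict) -> AliasRedirectStore:
    """Decorated store for indexer / read paths."""
    return AliasRedirectStore(inner=inner, aliases=aliases)
```

Under these assumptions, a write through the decorated store lands under the canonical name, while a write through the inner store is left untouched, which is the property the merger relies on.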
Tests
* ``tests/unit_test/indexing/test_wave7_task8_wiring.py`` — eight
integration tests pinning each of the three wiring points so a
future refactor cannot silently regress any of them: decorator
wraps inner store, inner factory still returns raw backend,
pipeline composes the three GraphSearchService calls in order,
KG gate still short-circuits, factory failures degrade to empty,
merger delegation preserves backward-compat shape, fallback to
unified description when no compaction, alias cycle surfaces as
ValueError → 400.
* Wave 6 ``test_graph_search_migration.py`` — two existing tests
updated to mock the new ``build_graph_search_service_for`` /
``search_entities`` / ``get_subgraph`` boundary instead of the
retired keyword-only path. The grep-zero migration assertion is
unchanged and passes (legacy import still 0).
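
The composition order and the two degrade-to-empty cases pinned by the wiring tests can be sketched as follows. The method names ``search_entities``, ``get_subgraph`` and ``compose_context`` come from this commit; their signatures, the return shapes, ``FakeGraphSearchService`` and ``graph_search`` are assumptions made for illustration only.

```python
from typing import Callable


class FakeGraphSearchService:
    """Hypothetical stand-in that records the order of calls."""

    def __init__(self) -> None:
        self.calls = []

    def search_entities(self, query: str) -> list:
        self.calls.append("search_entities")
        return ["alice"]  # seed entities from vector recall

    def get_subgraph(self, entity_ids: list) -> dict:
        self.calls.append("get_subgraph")
        return {"edges": [("alice", "knows", "bob")]}  # 1-hop neighbourhood

    def compose_context(self, subgraph: dict) -> str:
        self.calls.append("compose_context")
        return "\n".join(f"{s} -[{r}]-> {t}" for s, r, t in subgraph["edges"])


def graph_search(query: str, kg_enabled: bool,
                 build_service: Callable[[], FakeGraphSearchService]) -> str:
    if not kg_enabled:           # KG gate short-circuits before any service call
        return ""
    try:
        service = build_service()
    except Exception:            # factory failure degrades to empty context
        return ""
    seeds = service.search_entities(query)
    if not seeds:
        return ""
    subgraph = service.get_subgraph(seeds)
    return service.compose_context(subgraph)
```

A test in this style asserts on ``calls`` rather than on the rendered text, so a reordering or a dropped step fails loudly even when the output happens to look the same.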
12-invariant cross-check (§K.12 / huangheng msg=fcf580a6)
* #4 vector store via ``VectorStoreConnectorAdaptor`` ✅ — pipeline
composes through ``GraphSearchService`` which already abides;
no direct Qdrant import added.
* #5 payload ``indexer="graph_entity"`` filter pattern ✅ —
inherited from ``GraphSearchService.search_entities``.
* #9 ``upsert_entity_with_lineage`` alias redirect ✅ — every
indexer / read path now receives the decorated store; merger
bypasses via the inner-only factory so canonical writes are
not intercepted.
* #11 candidate detection write-only / merge read paths ✅ — the
cutover preserves the read/write boundary.
* #12 grep-zero LightRAG ✅ — code + tests stay LightRAG-clean.
``aperag/graph_curation/*`` and ``domains/knowledge_graph/service.py:get_knowledge_graph``
still hold legacy imports for graph-overview / curation-run
paths that depend on enumerate-all-entities; deferred to task #10
close-out alongside the Protocol-extension or legacy delete.
4-pattern pre-check matrix
* Pattern 1 v1: ``rg "from aperag.domains.knowledge_graph.graphindex"``
count is unchanged in this PR (drops below the threshold task #10
will assert grep-zero); the ``_graph_search`` path no longer
imports the legacy package at all.
* Pattern 1 v2: per-method matrix — pipeline cutover replaces
``query_entities_by_keyword`` + ``expand_neighbors_n_hops`` with
``search_entities`` + ``get_subgraph``; merge cutover replaces
``GraphIndexService.merge_entities`` with
``LineageEntityMerger.merge_entities``.
* Pattern 2: state binding — new merger factory binds to the same
Qdrant connector + embedder the indexer write path uses
(``_build_collection_graph_vector_writer``).
* Pattern 3: factory + decorator pattern verified by the wiring
tests so a future split cannot silently regress.
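
The Pattern 1 v1 count can also be approximated from Python, for example inside a migration test. This is a sketch under assumptions: ``count_legacy_imports`` is a hypothetical helper, it only looks at ``*.py`` files, and it matches ``from ... import`` lines at the start of a line rather than reproducing ``rg`` exactly.

```python
import re
from pathlib import Path

# Same import prefix as the rg pattern in Pattern 1 v1.
LEGACY_IMPORT = re.compile(
    r"^\s*from aperag\.domains\.knowledge_graph\.graphindex\b",
    re.MULTILINE,
)


def count_legacy_imports(root: Path) -> int:
    """Count legacy-package import lines in every .py file under root."""
    total = 0
    for path in root.rglob("*.py"):
        total += len(LEGACY_IMPORT.findall(path.read_text(encoding="utf-8")))
    return total
```

A grep-zero assertion is then ``count_legacy_imports(root) == 0`` for the package that is supposed to be clean.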
simple-stable 4 guardrails
* #1 No unbounded scope growth: three wiring swaps + one factory; no new
endpoints, no new schema, no new Protocol surface.
* #2 Ship as soon as possible: task #8 unblocks task #9 (frontend) and task #11
(e2e narrative) without a Protocol-extension prerequisite.
* #3 Simple and stable: decorator + factory split keeps the merger's
"writes canonical directly" invariant explicit; pipeline cutover
reuses the byte-parity contract from PR #1756.
* #4 Maintenance-free private deployment: no operator config required; the
``API_BASE_URL`` env already supports MCP colocation, and the
alias-redirect decorator runs on every collection automatically.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(Wave 7 #8): clarify edges_*=0 semantic on merge_nodes route
Per architect msg=4af6f66b / PM msg=e75fd00d follow-up: future
maintainers grepping merge_nodes_view shouldn't see the
hard-coded 0 and suspect a bug. The Wave 7 §K.12 invariant #9
LineageGraphStoreWithAliasRedirect decorator handles edge
re-anchoring at indexer write-time, not as part of the merge call,
so there is no explicit edge count to surface on the response.
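
The response shape this note refers to can be sketched as below. The field names are the ones listed in the first commit; ``MergeOutcome`` and ``to_legacy_response`` are hypothetical names, and the fallback from a compacted to a unified description is an assumption modelled on the wiring test of the same name.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MergeOutcome:
    """Hypothetical result handed back by the merger."""
    target_entity_id: str
    unified_description: str
    compacted_description: Optional[str] = None
    source_chunk_ids: list = field(default_factory=list)


def to_legacy_response(outcome: MergeOutcome) -> dict:
    return {
        "target_entity_id": outcome.target_entity_id,
        # Fall back to the unified description when no compaction ran.
        "description": outcome.compacted_description or outcome.unified_description,
        "source_chunk_ids": list(outcome.source_chunk_ids),
        # Intentionally 0, not a bug: edge re-anchoring happens in the
        # alias-redirect decorator at indexer write time, so the merge
        # call itself has no edge count to report.
        "edges_redirected": 0,
        "edges_collapsed": 0,
    }
```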
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>