feat(celery Wave 7 #8): retrieval/curation cutover + 3 wiring points (#1762)

earayu · claude · web-flow · commit 08d9d3b6ac3b · 2026-04-28T04:43:41.000+08:00
* feat(celery Wave 7 #8): retrieval/curation cutover + 3 wiring points

Wave 7 §K.12.8 task #8: three wiring points go live at the same time so
that PR #1758's inseparability gate (alias redirect on every indexer
write) and PR #1756's vector recall (LightRAG-style semantic search plus
1-hop traversal) reach production in the same merge.

What lands

* ``worker_factory._build_lineage_graph_store`` now returns a
  ``LineageGraphStoreWithAliasRedirect`` decorating the raw
  per-collection backend (Wave 7 §K.12 invariant #9). Callers that need
  the raw inner store go through the new
  ``_build_lineage_graph_store_inner``; the merger is one such caller,
  because it writes canonical names directly and must not be
  intercepted.
* ``retrieval/pipeline._graph_search`` now composes
  ``GraphSearchService.search_entities`` (vector recall) +
  ``get_subgraph`` (1-hop traversal) +
  ``GraphSearchService.compose_context`` (legacy LightRAG-style
  rendering). The render is byte-for-byte identical to Wave 6's
  ``_render_graph_context_text`` (locked by the byte-parity test in
  PR #1756), so downstream RAG prompts see zero functional change; the
  only difference is that vector recall now actually runs.
* ``GraphService.merge_entities`` (the route layer for
  ``POST /graphs/nodes/merge``) delegates to ``LineageEntityMerger``.
  The backward-compat response shape is preserved
  (``target_entity_id`` / ``description`` / ``source_chunk_ids`` /
  ``edges_*``); chunk ids are recovered from the target's lineage after
  step 6a re-anchors the source parts under the canonical name.
  ``edges_redirected`` / ``edges_collapsed`` surface ``0`` because edge
  re-anchoring is handled transparently by the alias-redirect decorator
  at indexer write time, not as part of the merge action.
* New ``build_lineage_entity_merger_for(collection)`` factory in
  ``aperag/graph_curation/lineage_merge.py`` resolves the six
  dependencies the merger expects (raw inner store / alias repo /
  compactor / vector connector / embedder / LLM), lifting the
  ``_SyncEmbedderShim`` pattern out of ``MergeCandidateDetector``'s
  factory so merger and detector share one shim.

Tests

* ``tests/unit_test/indexing/test_wave7_task8_wiring.py``: eight
  integration tests pinning each of the three wiring points so a future
  refactor cannot silently regress any of them:
  decorator wraps inner store, inner factory still returns raw backend,
  pipeline composes the three GraphSearchService calls in order, KG gate
  still short-circuits, factory failures degrade to empty, merger
  delegation preserves backward-compat shape, fallback to unified
  description when no compaction, alias cycle surfaces as
  ValueError → 400.
* Wave 6 ``test_graph_search_migration.py``: two existing tests updated
  to mock the new ``build_graph_search_service_for`` /
  ``search_entities`` / ``get_subgraph`` boundary instead of the retired
  keyword-only path. The grep-zero migration assertion is unchanged and
  passes (legacy import still 0).

12-invariant cross-check (§K.12 / huangheng msg=fcf580a6)

* #4 vector store via ``VectorStoreConnectorAdaptor`` ✅: the pipeline
  composes through ``GraphSearchService``, which already complies; no
  direct Qdrant import added.
* #5 payload ``indexer="graph_entity"`` filter pattern ✅: inherited
  from ``GraphSearchService.search_entities``.
* #9 ``upsert_entity_with_lineage`` alias redirect ✅: every indexer /
  read path now receives the decorated store; the merger bypasses it via
  the inner-only factory so canonical writes are not intercepted.
* #11 candidate detection write-only / merge read paths ✅: the cutover
  preserves the read/write boundary.
* #12 grep-zero LightRAG ✅: code + tests stay LightRAG-clean.
  ``aperag/graph_curation/*`` and
  ``domains/knowledge_graph/service.py:get_knowledge_graph`` still hold
  legacy imports for the graph-overview / curation-run paths that depend
  on enumerate-all-entities; these are deferred to the task #10
  close-out alongside the Protocol extension or legacy delete.

4-pattern pre-check matrix

* Pattern 1 v1: ``rg "from aperag.domains.knowledge_graph.graphindex"``
  count is unchanged in this PR (drops below the threshold task #10 will
  assert grep-zero); the ``_graph_search`` path no longer imports the
  legacy package at all.
* Pattern 1 v2: per-method matrix. The pipeline cutover replaces
  ``query_entities_by_keyword`` + ``expand_neighbors_n_hops`` with
  ``search_entities`` + ``get_subgraph``; the merge cutover replaces
  ``GraphIndexService.merge_entities`` with
  ``LineageEntityMerger.merge_entities``.
* Pattern 2: state binding. The new merger factory binds to the same
  Qdrant connector + embedder the indexer write path uses
  (``_build_collection_graph_vector_writer``).
* Pattern 3: factory + decorator pattern verified by the wiring tests so
  a future split cannot silently regress.

simple-stable 4 guardrail

* #1 No unbounded scope creep: three wiring swaps + one factory; no new
  endpoints, no new schema, no new Protocol surface.
* #2 Ship as soon as possible: task #8 unblocks task #9 (frontend) and
  task #11 (e2e narrative) without a Protocol-extension prerequisite.
* #3 Simple and stable: the decorator + factory split keeps the merger's
  "writes canonical directly" invariant explicit; the pipeline cutover
  reuses the byte-parity contract from PR #1756.
* #4 Maintenance-free private deployment: no operator config required;
  the ``API_BASE_URL`` env already supports MCP colocation, and the
  alias-redirect decorator runs on every collection automatically.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(Wave 7 #8): clarify edges_*=0 semantic on merge_nodes route

Per architect msg=4af6f66b / PM msg=e75fd00d follow-up: future
maintainers grepping merge_nodes_view shouldn't see the hard-coded 0 and
suspect a bug. The Wave 7 §K.12 invariant #9
LineageGraphStoreWithAliasRedirect decorator handles edge re-anchoring
at indexer write time, not as part of the merge call, so there is no
explicit edge count to surface on the response.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
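
To make the decorator/inner split described in the message concrete, here is a minimal, self-contained sketch of the pattern. It is an illustration under assumptions, not the code from PR #1758: ``InMemoryStore``, ``AliasRedirectStore``, and the plain ``dict`` alias map are hypothetical stand-ins, and the real ``LineageGraphStoreWithAliasRedirect`` is async and takes an ``AliasMapRepository``.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the raw per-collection backend."""

    entities: Dict[str, List[str]] = field(default_factory=dict)

    def upsert_entity(self, name: str, description: str) -> None:
        self.entities.setdefault(name, []).append(description)

    def get_entity(self, name: str) -> Optional[List[str]]:
        return self.entities.get(name)


class AliasRedirectStore:
    """Hypothetical decorator: writes are rewritten to the canonical
    name; reads are forwarded to the inner store unchanged."""

    def __init__(self, inner: InMemoryStore, aliases: Dict[str, str]) -> None:
        self._inner = inner
        self._aliases = aliases

    def _canonical(self, name: str) -> str:
        # Follow the alias chain, refusing to loop forever on a cycle.
        seen = set()
        while name in self._aliases:
            if name in seen:
                raise ValueError(f"alias cycle at {name!r}")
            seen.add(name)
            name = self._aliases[name]
        return name

    def upsert_entity(self, name: str, description: str) -> None:
        self._inner.upsert_entity(self._canonical(name), description)

    def get_entity(self, name: str) -> Optional[List[str]]:
        return self._inner.get_entity(name)


# Indexer-style callers get the decorated store ...
inner = InMemoryStore()
store = AliasRedirectStore(inner, aliases={"IBM Corp.": "IBM"})
store.upsert_entity("IBM Corp.", "seen by indexer")

# ... while a merger-style caller writes the canonical name on the raw
# inner store, so its write is never rewritten.
inner.upsert_entity("IBM", "merged description")
```

After both writes, the merged-away name ``"IBM Corp."`` has no entity of its own, which is the behaviour the message describes as keeping user merges from being silently re-created on the next sync.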
diff --git a/aperag/domains/knowledge_graph/api/routes.py b/aperag/domains/knowledge_graph/api/routes.py
@@ -151,6 +151,13 @@ async def merge_nodes_view(
     * The response echoes the merged description **after** LLM
       summarization, so the frontend can refresh the entity detail
       panel without a second fetch.
+    * ``edges_redirected`` / ``edges_collapsed`` are always ``0`` by
+      design (Wave 7 §K.12 invariant #9) — alias redirect happens at
+      indexer write-time via the
+      :class:`LineageGraphStoreWithAliasRedirect` decorator, not as
+      part of the merge call, so the merge action has no "explicit
+      edge re-anchor count" to surface. The fields are kept on the
+      response shape for backward-compat with the existing frontend.
     """
     entity_ids = payload.get("entity_ids") or []
     if not isinstance(entity_ids, list) or len(entity_ids) < 2:
diff --git a/aperag/domains/knowledge_graph/service.py b/aperag/domains/knowledge_graph/service.py
@@ -163,24 +163,51 @@ async def merge_entities(
         target_entity_id: str,
         source_entity_ids: List[str],
     ) -> Dict[str, Any]:
-        """Merge entities and return a summary payload for the UI.
-
-        The heavy lifting (structural merge + LLM summary of the merged
-        description) happens inside
-        ``GraphIndexService.merge_entities``. This layer only validates
-        the collection and reshapes the DTO into the dict the existing
-        frontend expects.
+        """Merge entities — Wave 7 §K.12.8 task #8 cutover to
+        :class:`LineageEntityMerger` (PR #1758).
+
+        The merger runs the 8-step orchestration inside the new lineage
+        layer: alias-map writes for transparent indexer redirect (steps
+        1-2), per-doc parts re-anchor under the canonical name (step 6a,
+        preserves invariant #1 lineage), unified-description LLM merge
+        + compaction (steps 4-5 + 6b), vector point upsert with the
+        canonical name + 3-field payload + uuid5 id (step 7), and
+        finally L1 + vector deletion of source entities (step 8).
+
+        Response shape is preserved byte-for-byte for backward-compat
+        with the existing frontend (cuiwenbo task #9 stays at
+        OpenAPI-regen + flow smoke):
+
+        * ``target_entity_id`` ← ``LineageMergeResult.final_target``
+        * ``description`` ← ``compacted_description`` if present, else
+          ``unified_description``
+        * ``source_chunk_ids`` ← chunk ids contributed by the source
+          entities, recovered from the target's lineage *after* the
+          merge has re-anchored them under the canonical name
+        * ``edges_redirected`` / ``edges_collapsed`` — the new merger
+          does not return per-edge stats (the alias-redirect decorator
+          handles edge re-anchoring transparently at indexer write
+          time, not at merge time). We surface ``0`` for both fields
+          to keep the response shape stable.
         """
         db_collection = await self._get_and_validate_collection(user_id, collection_id)
 
-        from aperag.domains.knowledge_graph.graphindex.integration import make_service_for_collection
+        from aperag.graph_curation.alias_map import AliasCycleError
+        from aperag.graph_curation.lineage_merge import build_lineage_entity_merger_for
+
+        merger = build_lineage_entity_merger_for(db_collection)
+        try:
+            merge_result = await merger.merge_entities(
+                target_name=target_entity_id,
+                source_names=source_entity_ids,
+                merged_by=user_id,
+            )
+        except AliasCycleError as exc:
+            # Surface as a validation error so the route returns 400
+            # (per the existing ``ValueError`` → 400 mapping in
+            # ``merge_nodes_view``).
+            raise ValueError(str(exc)) from exc
 
-        svc = make_service_for_collection(db_collection)
-        result = await svc.merge_entities(
-            collection_id=collection_id,
-            target_entity_id=target_entity_id,
-            source_entity_ids=source_entity_ids,
-        )
         try:
             from aperag.graph_curation import graph_curation_service
 
@@ -191,15 +218,63 @@ async def merge_entities(
             )
         except Exception:
             logger.exception("Failed to expire stale graph-curation suggestions after manual merge")
+
+        # Recover source chunk ids from the target's lineage after the
+        # merge — step 6a re-anchored each source's parts under the
+        # canonical name with their original ``(document_id,
+        # parse_version, chunk_ids)`` so the union of those chunks is
+        # what the UI expects.
+        source_chunk_ids = await self._collect_source_chunk_ids(
+            db_collection,
+            entity_name=merge_result.final_target,
+        )
+
+        description = merge_result.compacted_description or merge_result.unified_description or ""
+
         return {
-            "target_entity_id": result.target_entity_id,
-            "merged_source_ids": list(result.merged_source_ids),
-            "description": result.description,
-            "source_chunk_ids": list(result.source_chunk_ids),
-            "edges_redirected": result.edges_redirected,
-            "edges_collapsed": result.edges_collapsed,
+            "target_entity_id": merge_result.final_target,
+            "merged_source_ids": list(merge_result.merged_source_ids),
+            "description": description,
+            "source_chunk_ids": source_chunk_ids,
+            # Edge re-anchoring is handled transparently by the
+            # alias-redirect decorator at indexer write time (Wave 7
+            # §K.12 invariant #9), not as part of the merge action,
+            # so per-edge counts are not surfaced. ``0`` keeps the
+            # response shape stable for backward-compat.
+            "edges_redirected": 0,
+            "edges_collapsed": 0,
         }
 
+    async def _collect_source_chunk_ids(self, collection: Any, *, entity_name: str) -> List[str]:
+        """Return the union of chunk ids attached to ``entity_name``'s
+        lineage members after the merge has run. Used by
+        :meth:`merge_entities` to populate the backward-compat
+        ``source_chunk_ids`` field."""
+        import asyncio
+
+        from aperag.indexing.worker_factory import (
+            _build_lineage_graph_store_inner,
+            _resolve_graph_backend_type,
+        )
+
+        backend_type = _resolve_graph_backend_type(collection)
+        store = await asyncio.to_thread(
+            _build_lineage_graph_store_inner, backend_type=backend_type, collection=collection
+        )
+        entity = await store.get_entity(entity_name)
+        if entity is None:
+            return []
+        chunk_ids: list[str] = []
+        seen: set[str] = set()
+        for member in entity.source_lineage or ():
+            for cid in getattr(member, "chunk_ids", ()) or ():
+                cid_str = str(cid)
+                if cid_str in seen:
+                    continue
+                seen.add(cid_str)
+                chunk_ids.append(cid_str)
+        return chunk_ids
+
     # ============================================================ Wave 7 §K.12.6
     async def search_entities(
         self,
diff --git a/aperag/domains/retrieval/pipeline.py b/aperag/domains/retrieval/pipeline.py
@@ -461,42 +461,62 @@ async def _graph_search(
         query: str,
         top_k: int,
     ) -> List[DocumentWithScore]:
-        """Knowledge-graph retrieval path via the new
-        :class:`aperag.indexing.graph.LineageGraphStore` Protocol.
-
-        Wave 6 #33 chunk 3 (per architect Option C ruling msg=6fccd9ab):
-        replaces the legacy ``GraphIndexService.query_context`` flow
-        with a two-step composition:
-
-        1. ``query_entities_by_keyword(query, top_k)`` — anchor entities
-           via lexical recall on the lineage store. The retrieval
-           pipeline owns its own embedder and does not need a backend
-           vector index here (vector recall was honestly deferred per
-           chunk 2 ruling — Wave 4 lineage schema has no entity-vector
-           column).
-        2. ``expand_neighbors_n_hops(entity_names, hops=1)`` — pull the
-           direct neighbours + connecting relations so the rendered
-           context block includes both anchor entities and their
-           one-hop graph context.
-
-        A collection that hasn't been indexed yet (or yields no
-        keyword anchors) returns ``[]``; this is the correct behaviour —
-        search pipelines compose (vector + graph + fulltext), and a
-        blank graph just means "graph contributes nothing this time",
-        not "fall back to something stale".
+        """Knowledge-graph retrieval path via :class:`GraphSearchService`.
+
+        Wave 7 §K.12.5 / §K.12.8 (task #8) cutover: replaces the
+        keyword-only Wave 6 #33 chunk 3 path with the full
+        vector + 1-hop traversal composition the legacy
+        ``GraphIndexService.query_context`` always intended. Steps:
+
+        1. :meth:`GraphSearchService.search_entities` — embed the query
+           against the per-collection vector index and ANN-recall the
+           top-K entities (semantic match, not exact name match).
+        2. :meth:`GraphSearchService.get_subgraph` — pull the direct
+           neighbours + connecting relations of the anchor entities.
+        3. :meth:`GraphSearchService.compose_context` — render the
+           ``-----Entities (KG)----- / -----Relationships (KG)-----``
+           text block.
+
+        The render is byte-for-byte identical to Wave 6's
+        ``_render_graph_context_text`` (locked by
+        ``test_compose_context_matches_retrieval_pipeline_render_byte_for_byte``
+        in PR #1756) so the swap is zero-functional-change for
+        downstream RAG prompts: same context shape, same dedup rule,
+        same fallback marker. Vector recall now actually happens
+        — the Wave 4 → Wave 6 vacuum noted in §K.12.1 is closed.
+
+        A collection that hasn't been indexed yet (no vector points)
+        returns ``[]`` — ``search_entities`` swallows backend
+        embed/search faults and returns an empty list, mirroring the
+        Wave 6 graceful-degrade convention. ``enable_knowledge_graph``
+        gating preserved for backward compat.
         """
         config = parseCollectionConfig(collection.config)
         if not config.enable_knowledge_graph:
             logger.warning(f"Collection {collection.id} does not have knowledge graph enabled")
             return []
 
-        store = _build_lineage_graph_store_for(collection)
-        anchors = await store.query_entities_by_keyword(query=query, top_k=top_k)
+        from aperag.indexing.graph_search_service import (
+            GraphSearchService,
+            build_graph_search_service_for,
+        )
+
+        try:
+            service = build_graph_search_service_for(collection)
+        except Exception:
+            logger.warning(
+                "graph_search: factory failed for collection %s; degrading to empty result",
+                collection.id,
+                exc_info=True,
+            )
+            return []
+
+        anchors = await service.search_entities(query=query, top_k=top_k)
         if not anchors:
             return []
         anchor_names = [e.name for e in anchors]
-        entities, relations = await store.expand_neighbors_n_hops(entity_names=anchor_names, hops=1)
-        text = _render_graph_context_text(entities, relations)
+        entities, relations = await service.get_subgraph(entity_names=anchor_names, hops=1)
+        text = GraphSearchService.compose_context(entities, relations)
         if not text:
             return []
         return [DocumentWithScore(text=text, metadata={"recall_type": "graph_search"})]
diff --git a/aperag/graph_curation/lineage_merge.py b/aperag/graph_curation/lineage_merge.py
@@ -445,10 +445,103 @@ async def _to_thread(func, *args, **kwargs):
     return await asyncio.to_thread(func, *args, **kwargs)
 
 
+# ---------------------------------------------------------------------
+# Per-collection factory — Wave 7 §K.12.8 task #8 wiring
+# ---------------------------------------------------------------------
+
+
+class _SyncEmbedderShim:
+    """Adapt the sync ``(text -> list[float])`` callable used by the
+    graph worker into the ``embed_query`` shape the merger / detector
+    expect (mirrors :class:`EmbeddingService` surface). Lifted from
+    :class:`MergeCandidateDetector`'s factory so the merger and
+    detector share one shim."""
+
+    def __init__(self, fn: Callable[[str], list[float]]) -> None:
+        self._fn = fn
+
+    def embed_query(self, text: str) -> list[float]:
+        return self._fn(text)
+
+
+def build_lineage_entity_merger_for(collection: Any) -> "LineageEntityMerger":
+    """Build a :class:`LineageEntityMerger` for ``collection``.
+
+    Wires the six dependencies the merger expects:
+
+    * ``store`` — :func:`_build_lineage_graph_store_inner` (raw inner
+      store; the merger writes canonical names directly and must NOT
+      be intercepted by the alias-redirect decorator that
+      :func:`_build_lineage_graph_store` returns to indexer / read
+      paths).
+    * ``alias_repo`` — fresh :class:`AliasMapRepository`.
+    * ``compactor`` — :class:`GraphIndexCompactor` built from the same
+      LLM callable as ``llm``. There is no no-op fallback in this
+      factory: if the LLM cannot be resolved, the factory raises before
+      the compactor is constructed.
+    * ``vector_connector`` + ``embedder`` —
+      :func:`_build_collection_graph_vector_writer` shared with the
+      Phase 3 indexer write path. Wrapped via :class:`_SyncEmbedderShim`
+      so the ``embed_query`` interface matches.
+    * ``llm`` — :func:`build_collection_llm_callable` (same async LLM
+      callable the legacy graphindex used).
+    * ``collection_id`` — bound at construction.
+
+    Raises :class:`aperag.indexing.worker_factory.WorkerFactoryError`
+    when the embedder / vector connector / LLM cannot be resolved
+    — a user-driven merge cannot meaningfully proceed without them
+    (no unified description, no vector re-anchor).
+    """
+    # Lazy imports keep this module free of worker_factory at import
+    # time so worker_factory's own ``from .lineage_merge`` import (if
+    # ever added) wouldn't form a cycle.
+    from aperag.domains.knowledge_graph.graphindex.integration import (
+        build_collection_llm_callable,
+    )
+    from aperag.indexing.graph_compactor import GraphIndexCompactor
+    from aperag.indexing.worker_factory import (
+        WorkerFactoryError,
+        _build_collection_graph_vector_writer,
+        _build_lineage_graph_store_inner,
+        _resolve_graph_backend_type,
+    )
+
+    backend_type = _resolve_graph_backend_type(collection)
+    inner_store = _build_lineage_graph_store_inner(backend_type=backend_type, collection=collection)
+
+    vector_connector, embed_fn = _build_collection_graph_vector_writer(collection)
+    if vector_connector is None or embed_fn is None:
+        raise WorkerFactoryError(
+            f"merge_entities: vector connector / embedder unavailable for collection {collection.id!r}"
+        )
+
+    try:
+        llm = build_collection_llm_callable(collection)
+    except Exception as exc:  # noqa: BLE001 — surface as factory failure.
+        raise WorkerFactoryError(
+            f"merge_entities: LLM not configured for collection {collection.id!r}: {exc}"
+        ) from exc
+
+    # Compactor is best-effort — the merger still runs without it
+    # (description stays uncompacted; embedding falls back to unified).
+    compactor = GraphIndexCompactor(llm=llm)
+
+    return LineageEntityMerger(
+        store=inner_store,
+        alias_repo=AliasMapRepository(),
+        compactor=compactor,
+        vector_connector=vector_connector,
+        embedder=_SyncEmbedderShim(embed_fn),
+        llm=llm,
+        collection_id=str(collection.id),
+    )
+
+
 __all__ = [
     "LineageEntityMerger",
     "LineageMergeResult",
     "AliasCycleError",
     "CURATION_MERGE_DOCUMENT_ID",
     "GRAPH_ENTITY_INDEXER",
+    "build_lineage_entity_merger_for",
 ]
diff --git a/aperag/indexing/worker_factory.py b/aperag/indexing/worker_factory.py
@@ -690,10 +690,15 @@ def _resolve_graph_backend_type(collection: Any) -> str:
     return backend
 
 
-def _build_lineage_graph_store(*, backend_type: str, collection: Any) -> Any:
-    """Construct the per-collection :class:`LineageGraphStore` adapter
-    by binding the shared per-process backend client to the collection
-    id."""
+def _build_lineage_graph_store_inner(*, backend_type: str, collection: Any) -> Any:
+    """Construct the raw per-collection :class:`LineageGraphStore`
+    adapter (no alias-redirect wrapper).
+
+    Used internally by :func:`_build_lineage_graph_store` and by the
+    user-driven merger (Wave 7 task #6 :class:`LineageEntityMerger`,
+    which writes canonical names directly and must not be intercepted
+    by the alias-redirect decorator).
+    """
     if backend_type == "postgres":
         engine = _postgres_async_engine_singleton()
         from aperag.indexing.graph_storage.postgres import PostgresLineageGraphStore
@@ -720,6 +725,33 @@ def _build_lineage_graph_store(*, backend_type: str, collection: Any) -> Any:
     raise WorkerFactoryError(f"unsupported graph_backend_type {backend_type!r}")
 
 
+def _build_lineage_graph_store(*, backend_type: str, collection: Any) -> Any:
+    """Wave 7 §K.12 invariant #9 — return the alias-redirect-wrapped
+    :class:`LineageGraphStore`.
+
+    Every write goes through the per-collection
+    :class:`LineageGraphStoreWithAliasRedirect` (PR #1758) so
+    user-merged entities are transparently consolidated at indexing
+    time. Reads pass through unchanged. This makes the inseparability
+    gate of task #6 alive in production: without this swap the alias
+    map is written but never consulted, so user merges silently
+    re-created the merged-away entities on the next sync.
+
+    Callers that need the raw inner store (the merger's L1 write
+    path, which targets canonical names directly and must not be
+    redirected) should use :func:`_build_lineage_graph_store_inner`.
+    """
+    from aperag.graph_curation.alias_map import AliasMapRepository
+    from aperag.indexing.alias_redirect_store import LineageGraphStoreWithAliasRedirect
+
+    inner = _build_lineage_graph_store_inner(backend_type=backend_type, collection=collection)
+    return LineageGraphStoreWithAliasRedirect(
+        inner=inner,
+        alias_repo=AliasMapRepository(),
+        collection_id=str(collection.id),
+    )
+
+
 def _resolve_entity_lock(*, backend_type: str) -> Any:
     """Pick the EntityLock implementation appropriate for the backend.
 
diff --git a/tests/unit_test/domains/retrieval/test_graph_search_migration.py b/tests/unit_test/domains/retrieval/test_graph_search_migration.py
diff --git a/tests/unit_test/indexing/test_wave7_task8_wiring.py b/tests/unit_test/indexing/test_wave7_task8_wiring.py
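
The body of the wiring test file is not shown in this diff. As an illustration only, the "pipeline composes the three calls in order" and "factory failures degrade to empty" checks named in the commit message could be sketched as below. Everything here is a hypothetical stand-in: ``FakeGraphSearchService``, ``graph_search``, and the one-line render are assumptions, and ``compose_context`` is an instance method in this sketch (it is called on the class in the real pipeline) so that the call order can be recorded.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Entity:
    name: str
    description: str


class FakeGraphSearchService:
    """Hypothetical stub that records the order in which it is called."""

    def __init__(self, anchors: List[Entity]) -> None:
        self.anchors = anchors
        self.calls: List[str] = []

    async def search_entities(self, *, query: str, top_k: int) -> List[Entity]:
        self.calls.append("search_entities")
        return self.anchors[:top_k]

    async def get_subgraph(
        self, *, entity_names: List[str], hops: int
    ) -> Tuple[List[Entity], list]:
        self.calls.append("get_subgraph")
        entities = [e for e in self.anchors if e.name in entity_names]
        return entities, []  # this stub has no relations

    def compose_context(self, entities: List[Entity], relations: list) -> str:
        self.calls.append("compose_context")
        return "\n".join(f"{e.name}: {e.description}" for e in entities)


async def graph_search(
    build_service: Callable[[], FakeGraphSearchService], query: str, top_k: int
) -> List[str]:
    """Simplified mirror of the three-step flow with graceful degrade."""
    try:
        service = build_service()
    except Exception:
        return []  # factory failure degrades to an empty result
    anchors = await service.search_entities(query=query, top_k=top_k)
    if not anchors:
        return []
    names = [e.name for e in anchors]
    entities, relations = await service.get_subgraph(entity_names=names, hops=1)
    text = service.compose_context(entities, relations)
    return [text] if text else []


def broken_factory() -> FakeGraphSearchService:
    raise RuntimeError("backend unavailable")


svc = FakeGraphSearchService([Entity("IBM", "a company")])
result = asyncio.run(graph_search(lambda: svc, "who makes mainframes?", 5))
degraded = asyncio.run(graph_search(broken_factory, "anything", 5))

empty_svc = FakeGraphSearchService([])
no_anchors = asyncio.run(graph_search(lambda: empty_svc, "anything", 5))
```

Recording calls on the stub keeps the ordering assertion independent of what the render looks like, so a check of this kind would not need to change if the context format did.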