feat(celery Wave 4): real backends + 11 production-readiness items#1731
Conversation
…dapter
First chunk of Wave 4 T8 (graph 3-backend wiring + legacy hard-cut
2nd round). Per architect msg=803a2757 plan, lands the Postgres
adapter alone so the schema mapping for the §D.3.5
``LineageGraphStore`` Protocol can be ratified before mirroring
into Neo4j / Nebula in chunks 2 / 3, with the legacy
``aperag/domains/knowledge_graph/graphindex/storage/*`` hard-cut
deletion in chunk 4.
Schema design (two tables, JSONB SET storage):
* ``aperag_lineage_entity (collection_id, name, type, source_lineage,
description_parts, gmt_created, gmt_updated)`` — composite primary
key on ``(collection_id, name)`` keeps tenants isolated; both JSONB
columns default to ``'[]'::jsonb``. Composite uniqueness named
``uq_lineage_entity_collection_name`` so the adapter can
``ON CONFLICT ON CONSTRAINT`` for atomic upserts.
* ``aperag_lineage_relation (collection_id, source, target, type,
description, evidence_lineage, description_parts, ...)`` — same
shape, primary key on the four-tuple, named constraint
``uq_lineage_relation_collection_triple``.
* ``description`` on the relation row is the LATEST canonical
description (so reads do not always have to walk
``description_parts``); per-(doc, version) slices live in
``description_parts`` JSONB array.
§D.3.5 Protocol semantics mapped to single-statement SQL:
* ``find_*_with_lineage(document_id)`` — JSONB containment
``source_lineage @> jsonb_build_array(jsonb_build_object('document_id', $d))``
scans the SET in one expression.
* ``remove_*_lineage_member(..., document_id)`` — single
``UPDATE ... SET source_lineage = (SELECT jsonb_agg(elem) ... WHERE
elem->>'document_id' != $d)`` so concurrent syncs cannot race on
read-modify-write.
* ``upsert_*_with_lineage`` — ``INSERT ... ON CONFLICT DO UPDATE``
with a strip-then-append expression that replaces any element
with the same ``(document_id, parse_version)`` key and appends
the new one. Matching ``DescriptionPart`` mirrors the same
shape.
* ``gc_*_if_orphan`` — ``DELETE ... WHERE jsonb_array_length(...) = 0``.
* ``get_entity`` / ``get_relation`` — ORM SELECT with
``_row_to_entity_with_lineage`` / ``_row_to_relation_with_lineage``
materialisers that reconstruct the canonical
``EntityWithLineage`` / ``RelationWithLineage`` dataclasses.
Tenant isolation: the store is constructed with the
``collection_id`` it serves; all queries filter on it. The
worker_factory caches one ``PostgresLineageGraphStore`` instance
per collection (chunk 4 follow-up), so the in-process singleton
pattern is replaced by a per-collection pool.
Hard-cut second-round target: this module replaces
``aperag/domains/knowledge_graph/graphindex/storage/postgres.py``
when chunk 4 deletes the legacy storage subpackage. The legacy
``GraphStore.upsert_entities(collection_id, [Entity])`` interface
is fundamentally entity-level merge-append; the new lineage
Protocol is SET-element add-or-replace, so this is a clean
rewrite, not a wrap.
Pending in subsequent chunks (per architect msg=803a2757 plan):
* chunk 2 — Neo4j adapter mirror.
* chunk 3 — Nebula adapter mirror with per-entity Redis lock
activation.
* chunk 4 — alembic migration for the two tables + factory
dispatch on ``collection.config.graph_backend_type`` +
``aperag/indexing/worker_factory.py:_build_graph_worker`` rewire
+ delete legacy ``aperag/domains/knowledge_graph/graphindex/
storage/*`` + GraphIndexService consumer migration.
* contract tests round out each adapter chunk.
Local gates: import smoke (``from aperag.indexing.graph_storage.
postgres import PostgresLineageGraphStore``) OK. ``ruff check +
format --check`` clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… INDEXING_QUEUE_BACKEND Per architect msg=803a2757 owner assignment + Wave 4 backlog #6 (Real Redis WorkQueue replacing single-process InMemoryWorkQueue per design pack §E.2 + Wave 1+2 gap report msg=fab88774). aperag/indexing/orchestrator.py: - Add ``RedisWorkQueue`` class implementing the existing ``WorkQueue`` Protocol. RPUSH on enqueue + BLPOP-with-timeout on dequeue, JSON- encoded payloads. Per-modality keying as ``q:indexing:<modality>`` so each modality has its own BLPOP queue. Lazy ``redis.asyncio.Redis`` client (one per process), ``close()`` for lifespan-shutdown drain. - Includes ``qsize()`` inspector helper for the §J.1 ``queue_depth`` SLI emitter (Wave 4 #8 follow-up will wire OTLP) + tests. aperag/indexing/__init__.py: - Re-export ``RedisWorkQueue`` alongside ``InMemoryWorkQueue`` + ``WorkQueue`` so consumers can dispatch on the protocol without importing the orchestrator module directly. aperag/config.py: - Add ``INDEXING_QUEUE_BACKEND`` setting (default ``inmemory`` for backward-compat with Wave 3 single-pod deployments; production multi-pod sets ``redis``). - Add ``INDEXING_QUEUE_REDIS_URL`` setting; auto-derive ``redis://...:port/2`` from existing ``REDIS_HOST/PORT/USER/PASSWORD`` when unset, using a separate logical DB (db=2) from broker (db=0) and memory (db=1) so BLPOP queues never collide with cache / memory backends. aperag/app.py: - Lifespan dispatch: when ``settings.indexing_queue_backend == "redis"`` instantiate ``RedisWorkQueue(settings.indexing_queue_redis_url)``, else ``InMemoryWorkQueue()``. - Shutdown path calls ``queue.close()`` (guarded by ``hasattr``) so the Redis client's connection pool drains cleanly on SIGTERM. - Add ``import contextlib`` for the suppress-on-close guard. tests/integration/test_redis_workqueue.py (NEW, +6 tests): - Pin Wave 4 #6 acceptance: roundtrip / timeout-returns-None / per-modality isolation / qsize / close-and-reconnect / **multi- consumer atomic demux** (the §E.2 multi-process scale-out invariant — N consumers BLPOP'ing N pushed payloads see N disjoint payloads, no duplicates, no losses). - Skip-when-Redis-unreachable so the lint-and-unit CI lane stays green; the e2e-http-compose lane has Redis up so these run there. - Per-test key prefix (``q:test:<uuid>:<modality>``) avoids cross- test / cross-CI-run leakage on shared Redis DB. Production-readiness invariant declaration (per architect Wave 3 lesson `feedback_production_readiness_invariant.md`): - must-be-real: ``RedisWorkQueue`` is the production transport; multi- process scale-out depends on it - may-be-gated: ``InMemoryWorkQueue`` remains as TEST/SINGLE-POD ONLY default (operators must set ``INDEXING_QUEUE_BACKEND=redis`` for multi-pod correctness) - follow-up Wave: none — T4 fully resolves Wave 4 #6 Local gates: - pytest tests/integration/test_redis_workqueue.py → 6 passed (against real Redis at db=15) - pytest tests/unit_test/indexing/ tests/load/ → 142 passed - ruff check + format --check clean Wave 4 chunk 1 of 4 (T4). Next chunks (chenyexuan, parallel with Bryce T8): T2 cleanup fan-out / T3 real parser / T6 OTLP emitter. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mirror PostgresLineageGraphStore (chunk 1 reference) to Neo4j 5.x via Cypher parallel-list encoding — Neo4j node properties cannot hold LIST<MAP>, so the JSONB-shaped lineage SET is split into one list<string> of JSON-encoded members plus parallel list<string> properties for the dedup key fields (doc_ids, parse_versions). The strip-then-append upsert is a single Cypher MERGE … SET statement so two concurrent syncs against the same row cannot race on read-modify-write — Neo4j's MERGE row-lock makes the trailing SET atomic. APOC is not required. Six contract integration tests against a real Neo4j 5.x instance (skip-if-COMPAT_NEO4J_URI-unset) lock the §D.3.5 lineage SET semantics: roundtrip / multi-doc lineage / doc re-parse v1→v2 via remove+upsert orchestrator flow / orphan GC / relation-independent lineage / per-collection_id tenant isolation. Production-readiness invariant declaration (per Wave 3 lesson #10): - must-be-real: Neo4jLineageGraphStore on AsyncDriver - may-be-gated: InMemoryLineageGraphStore stays the worker_factory default until chunk 4 wires the backend dispatch - fully-resolves: Wave 4 T8 chunk 2 (Neo4j adapter) Legacy aperag/domains/knowledge_graph/graphindex/storage/neo4j.py is deleted in chunk 4 alongside the worker_factory wire-in (per architect msg=0aea9edf hard-cut). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…re + lifespan dispatch on INDEXING_METRICS_EMITTER Wave 4 backlog #8 — wire the four §J.1 SLIs (index_lag_seconds / index_failure_total / index_success_total / queue_depth / worker_utilization) onto a real OpenTelemetry SDK MeterProvider so operators running multi-pod production deployments actually see indexing health on their OTLP collector instead of silently dropping samples through NoopMetricsEmitter. Production-readiness invariant three-layer declaration (per Wave 3 lesson #10): - must-be-real: aperag.indexing.OTLPMetricsEmitter materialises real SDK Counter / Gauge instruments via opentelemetry.metrics.get_meter and forwards every gauge / counter call to the configured exporter. - may-be-gated: NoopMetricsEmitter remains the INDEXING_METRICS_EMITTER default and silently drops every sample — TEST / dev / single-machine only; production multi-pod operators must opt into INDEXING_METRICS_EMITTER=otlp to ship data to the collector. - fully-resolves: Wave 4 backlog #8. Implementation: - aperag/observability/metrics.py adds init_metrics_provider(config) / shutdown_metrics_provider(). configure_metrics() now drives the SDK MeterProvider install (PeriodicExportingMetricReader wrapping OTLPMetricExporter, http/protobuf or grpc per config.otlp_protocol) whenever config.metrics_enabled. Without this, the OTel global provider is _ProxyMeterProvider and every metric call is a no-op. - aperag/indexing/observability.py adds OTLPMetricsEmitter — caches Counter / Gauge instruments per metric name and tolerates instrument creation / set / add failures so a misconfigured exporter degrades to no-op instead of crashing indexing. Constructor accepts an injectable meter for tests because opentelemetry.metrics.set_meter_provider is one-shot per process and cannot be reset between test cases. - aperag/config.py adds INDEXING_METRICS_EMITTER setting (default noop; production sets otlp). - aperag/indexing/runtime.py extends IndexingRuntime with a metrics_emitter field (default NoopMetricsEmitter via field default_factory) so service-layer callers and lifespan-spawned workers can pull a single emitter from the runtime singleton. - aperag/app.py lifespan dispatches on settings.indexing_metrics_emitter: otlp → OTLPMetricsEmitter() / else → NoopMetricsEmitter(); stashes on app.state.indexing_metrics_emitter and feeds the runtime singleton. tests/integration/test_otlp_metrics_emitter.py (NEW, 7 tests): - counter round-trip: SDK MeterProvider with InMemoryMetricReader sees the running total, attribute-keyed. - gauge round-trip: Gauge instrument latest-value-wins per attribute set. - per-modality isolation: VECTOR + SUMMARY queue_depth fan out to separate data points. - instrument reuse: repeated emits for the same metric name reuse one SDK instrument handle. - init_metrics_provider endpoint guard: short-circuits when OTEL_EXPORTER_OTLP_ENDPOINT missing, app keeps booting. - init_metrics_provider idempotent: second call returns True without re-initialising. - default meter resolves global: emitter constructed without a meter binds to opentelemetry.metrics.get_meter("aperag.indexing") and tolerates the proxy provider so Tier-1 boot before OTLP collector is reachable does not crash. Local gates: - ruff check + format clean (7 files) - pytest tests/integration/test_otlp_metrics_emitter.py: 7 passed - pytest tests/unit_test/indexing/ tests/load/: 142 passed - pytest tests/unit_test/ (sans env-dep moto): 861 passed / 27 skipped Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…der into lifespan finally for graceful OTLP drain Addresses huangheng pass-1 observation A (msg=5d450300): OTLP MeterProvider was never flushed on lifespan shutdown, so the PeriodicExportingMetricReader could lose pending metric samples on SIGTERM before the process exits. Mirrors the T4 ``queue_obj.close()`` graceful shutdown pattern. ``shutdown_metrics_provider`` is a no-op when the SDK MeterProvider was never installed (default ``INDEXING_METRICS_EMITTER=noop`` / OTLP endpoint missing) so single-machine and dev deployments are unaffected. ``asyncio.to_thread`` wrap because the OTel SDK ``shutdown()`` blocks while draining the exporter pool. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
huangheng CR — pass-1 verdicts on 4 ratified chunks (mirror from #celery thread)Per architect msg=b0f69983 + msg=baf6618e instruction: mirror per-chunk pass-1 verdicts into PR review comments so GitHub state reflects CR completeness. ✅ T8 chunk 1 — PostgresLineageGraphStore reference adapter (
|
…r real PDF/Word/image inputs Wave 4 backlog #4 — replace the T1.1 simulator's UTF-8 markdown-only shortcut with a real DocParser dispatch on non-text extensions so production uploads of PDF / DOCX / PPTX / XLSX / EPUB / HTML / image files actually parse instead of raising "simulator parser only handles UTF-8 markdown". The simulator path stays for ``.md`` / ``.markdown`` / ``.txt`` / no-extension inputs so existing unit tests + dev workflows keep working unchanged. Production-readiness invariant three-layer declaration (per Wave 3 lesson #10): - must-be-real: aperag.docparser.doc_parser.DocParser chain dispatches on extension and runs the real MarkItDown / MinerU / ImageParser / AudioParser stack on production deployments. ``parser_config`` forwarded so per-collection MinerU API token / use_markitdown toggles flow through. - may-be-gated: the simulator's UTF-8 markdown decode stays the default for text-only inputs; tests that pass markdown bytes without a filename hint are unchanged. - partially-resolves: Wave 4 backlog #4. Chunk 2 (next session) promotes the parser to its own ``q:parse`` async queue per design pack §E.2 so an upload handler no longer blocks on a 30-second OCR run inside the request thread. Implementation: - aperag/indexing/parser.py: * New ``source_filename`` + ``parser_config`` parameters on ``parse_document``. ``_normalise_extension`` + ``_SIMULATOR_EXTENSIONS`` drive the dispatch decision. * New ``_docparser_extract_markdown`` helper materialises the bytes into a tempfile (DocParser only accepts a real path, not a stream), runs ``DocParser.parse_file``, concatenates every produced ``MarkdownPart``, and unlinks the tempfile in a try/finally so a raised ``ParserError`` does not leak files. Image-/audio-only inputs (no MarkdownPart) emit empty markdown so the artifacts still exist + the chunks/outline pipeline produces zero chunks (vision modality consumes the original asset directly per §C.6). * DocParser import is lazy inside ``parse_document`` per the same discipline as ``compute_parse_version`` (avoids pulling MarkItDown / MinerU / pikepdf at indexing-package import time). * Unsupported extensions raise a clear ``ValueError`` with the DocParser supported list embedded so callers can diagnose; we do NOT fall back silently to the simulator (Wave 3 lesson — silent-fallback masks wiring bugs). - aperag/domains/knowledge_base/service/document_service.py: * ``_create_or_update_document_indexes`` passes ``source_filename= object_path`` so the upload's ``user-<u>/<col>/<doc>/original<ext>`` object key drives the dispatch; non-markdown uploads now route through DocParser instead of dying on the UTF-8 decode. tests/integration/test_real_parser_dispatch.py (NEW, 7 tests): - markdown simulator path regression (no filename + ``.md`` filename) - simulator rejects non-UTF-8 with no extension hint (clear error, not silent corruption) - HTML routes through DocParser → MarkItDown → MarkdownPart → outline + chunks (full chain, no extra system deps) - unsupported extension surfaces ``ValueError`` with supported-list - empty PDF dispatches through DocParser (verifies the path runs even when the parser produces no markdown body — pikepdf-only, graceful skip without it) - .docx round-trip when python-docx is installed (skip otherwise so CI hosts without the optional dep stay green) Local gates: - ruff check + format clean (3 files) - pytest tests/integration/test_real_parser_dispatch.py: 6 passed, 1 skipped (python-docx) - pytest tests/unit_test/indexing/ tests/load/ tests/integration/{otlp,real_parser_dispatch}: 155 passed, 1 skipped — including all T1.1 simulator regression tests still green Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mirror PostgresLineageGraphStore (chunk 1 reference) and Neo4jLineageGraphStore (chunk 2) over Nebula 3.x. Nebula's property type system has neither LIST<MAP> nor parallel-list-comprehension support; the §D.3.5 lineage SET ports via a JSON STRING property encoding (each SET serialises into one string column holding a JSON array). Reads parse the JSON, writes mutate in Python and write the new JSON back — inherently a read-modify-write loop. Per architect msg=f2921ae0, the Nebula backend therefore activates the EntityLock injected at construction time around every read-modify-write (upsert / strip / GC). Postgres + Neo4j don't need this because their backends have native single-statement strip-then-append; Nebula does. InMemoryEntityLock for tests, RedisEntityLock for production multi-process worker pools. One Nebula SPACE per collection_id (natural Nebula tenancy boundary mirroring the legacy adapter); the store-instance binds to a collection_id at construction so SPACE selection is automatic. Schema-visibility retry handles "TagNotFound" + "EdgeNotFound" + "SpaceNotFound" + "No schema found for" — the heartbeat-propagation window after CREATE TAG / CREATE SPACE. Six contract integration tests against a real Nebula 3.8.0 instance (skip-if-COMPAT_NEBULA_HOSTS-unset) lock the §D.3.5 semantics: roundtrip / multi-doc lineage / doc re-parse v1→v2 via remove+upsert orchestrator flow + (doc, parse_v) dedup-key invariant / orphan GC / relation-independent lineage / per-collection_id tenant isolation via separate SPACE. Production-readiness invariant declaration (per Wave 3 lesson #10): - must-be-real: NebulaLineageGraphStore on nebula3.gclient.net.ConnectionPool + injected RedisEntityLock for multi-process serialisation - may-be-gated: InMemoryLineageGraphStore stays the worker_factory default until chunk 4 wires the backend dispatch - fully-resolves: Wave 4 T8 chunk 3 (Nebula adapter) Three backends (Postgres / Neo4j / Nebula) now mirror; chunk 4 finishes T8 with alembic + worker_factory wire + 3-backend contract fixture + delete legacy adapters + e2e production smoke. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
huangheng CR — pass-1 verdicts batch update (T8 chunk 3 + T3 chunk 1 + T6 follow-up)Mirror from #celery thread per architect msg=b0f69983 / msg=baf6618e instruction. ✅ T6 chunk 1 follow-up —
|
| Design point | Postgres | Neo4j | Nebula |
|---|---|---|---|
| (1) JSONB SET native | JSONB @> + jsonb_agg |
parallel-list 3-list keep-index | JSON STRING + read-modify-write loop |
| (2) collection_id PK + per-store-instance | composite PK | composite UNIQUE | SPACE per collection_id |
| (3) description-in-relation 双存储 | Text default '' | '' ON CREATE |
TAG schema string |
| (4) race protection | INSERT ON CONFLICT row-lock | MERGE row-lock | EntityLock.acquire (per architect msg=f2921ae0) |
Verified:
- Protocol completeness 10/10 (signature match
LineageGraphStoreProtocol) - EntityLock activation正确 — wraps every read-modify-write (upsert / remove / gc); relations use
_relation_vidlock key (avoids entity-relation collision) - 4-fragment schema-visibility retry pattern (
No schema found/TagNotFound/EdgeNotFound/SpaceNotFound) — discovered + fixed via Layer 1 same-session debug capability - VID escape正确 (
\\first,|→\\psecond) asyncio.to_threadwrap sync nebula3 SDK calls- Per-SPACE tenancy (Nebula natural boundary)
- 6 contract tests pass against real Nebula 3.8.0; cross-backend invariants (incl. test 3
(doc, parse_v)dedup compound key) align with Postgres + Neo4j
chunk 4 cross-backend contract fixture ready — 6-case Postgres + Neo4j + Nebula 三 backend 全实测一致 (Bryce ack msg=39a74026)。
5 minor → Wave 5 backlog (W5-perf-graph-lineage / VID truncation / schema_init lock / 30s cold-start / etc.) — all acked by Bryce msg=39a74026 / architect.
Layer 1 自动化 path metric: 7/7 pass-1 一次过 (含 1 follow-up) ≥ 90% manual fresh threshold ✅
| # | Chunk | Session | LOC | Pass-1 |
|---|---|---|---|---|
| 1 | T8 chunk 1 (Postgres) | fresh | 545 | ✅ |
| 2 | T4 chunk 1 (RedisWorkQueue) | fresh | 376 | ✅ |
| 3 | T8 chunk 2 (Neo4j) | fresh | 844 | ✅ |
| 4 | T6 chunk 1 (OTLP) | fresh | 525 | ✅ + 1 follow-up |
| 5 | T6 follow-up + T3 chunk 1 (Layer 1) | same | 11+416 | ✅✅ |
| 6 | T8 chunk 3 (Nebula, Layer 1 ~70% budget) | same as chunk 2 | 1169 | ✅ |
Architect msg=8e50515a ratified Wave 4 自动化 path effective. Layer 1 same-session debug capability preserved (Bryce surfaced + fixed real schema-visibility bug in chunk 3 — quality-equivalent to fresh session).
Pending pass-1 reviews
- T8 chunk 4 (5-step 4a-4e per architect msg=da3012a4 split: alembic + worker_factory dispatch + cross-backend fixture + delete legacy + e2e Phase 1 smoke + spec amendment §D.3.5/§H.5)
- T3 chunk 2 (q:parse async promotion, production-critical priority)
- T1 (graph LLM extractor, depends on T8 chunk 4 wire)
- T2 (cleanup fan-out, depends on T8 chunk 4 graph store factory pattern)
- T5 (Redis QuotaBackend, ~200 LOC)
- T7 (vision multimodal real wiring + sweep A/C)
- T9 (fulltext multi-backend dispatch + sweep B/D)
Architect concerns sync (per msg=baf6618e + msg=da3012a4)
- ✅ Wave 3 final review sweep A/B/C/D 已 fold 进 T7/T9 task threads (PM msg=640e45ac confirmed)
- ✅ Spec amendment §D.3.5 dedup key + §H.5 Redis db isolation → T8 chunk 4 一并 amend
- ✅ e2e production smoke
tests/integration/test_full_indexing_pipeline.pyPhase 1/2 split → T8 chunk 4 Phase 1 必 + T1/T7/T3 wire 后 Phase 2 加 - ✅ PR feat(celery Wave 4): real backends + 11 production-readiness items #1731 description sync (Bryce msg=b8fa1a54 own + chunk 3 / chunk 4 push 时 update body)
…ler returns 202 + parse worker pool Promotes :func:`aperag.indexing.parse_document` off the HTTP request thread per design pack §E.2. Without this, every PDF / Word / image upload blocks the request for 30s+ on DocParser (MarkItDown / MinerU / OCR), destroying responsiveness. The new flow: * upload handler ``_create_or_update_document_indexes`` pushes a :class:`ParseDispatchPayload` onto ``q:parse`` and returns immediately (202-equivalent — pending visible via ``document.status``) * ``run_parse_worker`` (FastAPI lifespan asyncio task, concurrency 8 per §E.2) BLPOPs payloads, runs :func:`parse_document`, then fans out to the 5 per-modality queues via :func:`dispatch_indexing`. Deltas: * ``aperag/indexing/parse_orchestrator.py`` (new, +311) — ``ParseDispatchPayload`` + ``process_one_parse_task`` (read source → parse → optional rebuild-purge → dispatch) + ``run_parse_worker_loop`` + ``run_parse_worker`` thin entrypoint mirroring the per-modality ``run_*_worker`` shape. * ``aperag/indexing/orchestrator.py`` — ``WorkQueue`` protocol gains ``push_parse`` / ``pop_parse``; ``InMemoryWorkQueue`` + ``RedisWorkQueue`` (key ``q:parse``) implement them. * ``aperag/app.py`` lifespan starts ``run_parse_worker`` alongside the 5 modality workers + reconciler + cleanup. * ``aperag/domains/knowledge_base/service/document_service.py`` — ``_create_or_update_document_indexes`` swapped to ``push_parse`` (no more inline ``parse_document`` + ``dispatch_indexing``); new ``_resolve_parser_config_for_collection`` reads optional ``parser_config`` from ``collection.config`` so deployments can opt into MinerU / per-collection OCR without code changes. * ``tests/integration/test_parse_async_roundtrip.py`` (new) — 10 tests across single-task happy / failure paths, rebuild purge, parse-only, full async roundtrip (push_parse → parse worker → modality workers → ACTIVE), and payload to_dict/from_dict round-trip pinning. Failure semantics for chunk 2 minimal scope: a failed parse logs + drops the job (no DocumentIndex rows materialised). Wave 5 follow-up will extend the reconciler to detect "documents created N minutes ago without document_index rows" and re-enqueue. The per-modality §I.2 backoff still covers post-parse failures. Three-class tag (Wave 3 production-readiness invariant): * must-be-real: ``RedisWorkQueue.push_parse / pop_parse`` (Redis ``q:parse`` BLPOP backed by the existing T4 client) * may-be-gated: ``InMemoryWorkQueue.push_parse / pop_parse`` for unit + integration tests; private deployments still use ``q:parse`` via the same protocol * fully-resolves: Wave 4 backlog #4 (parser real PDF/Word/image + q:parse async promotion) — chunk 1 wired DocParser into ``parse_document``; chunk 2 takes parse off the HTTP thread. Local gates: * ``pytest tests/unit_test/indexing/ tests/integration/ tests/load/`` — 180 passed / 25 skipped (incl. 10 new ``test_parse_async_roundtrip``). * ``ruff check aperag/ tests/integration/test_parse_async_roundtrip.py`` — clean. * ``ruff format`` — applied. Pushed: rebased on b0bdc09 (Bryce T8 chunk 3 Nebula adapter). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…drop relation description column
Wave 4 T8 chunk 4a per docs/modularization/indexing-redesign-design-pack.md
§D.3.5 + chunk 4 acceptance lock items 1 + 2 (architect msg=baf6618e /
huangheng msg=b6f20096):
* New alembic migration ``e7a3b9c2d1f6`` creates the two PostgreSQL
tables that back PostgresLineageGraphStore (T8 chunk 1 msg=f0571f98):
- ``aperag_lineage_entity`` (collection_id, name) PK + JSONB
``source_lineage`` / ``description_parts`` SETs + GIN containment
index for ``find_entity_ids_with_lineage`` doc-id scans.
- ``aperag_lineage_relation`` (collection_id, source, target, type) PK
+ JSONB ``evidence_lineage`` / ``description_parts`` SETs + GIN
index on ``evidence_lineage``.
* env.py registers the adapter's private ``_LineageGraphBase.metadata``
alongside the application's ``Base.metadata`` so autogen comparison
sees both — alembic check returns "No new upgrade operations
detected" (ORM 100% mirror invariant honored, schema drift = Wave 3
T3.1 lesson).
* Dropped legacy ``description`` Text/string column from the relation
row across all 3 backend adapters (Postgres + Neo4j + Nebula) — the
per-document fragments in ``description_parts`` are the canonical
source per §D.3.3 Option A; ``RelationWithLineage`` does not expose
a standalone ``description`` field so the column had no read-path
consumer. Cross-backend "ORM 100% mirror" stays honest.
* PostgresLineageGraphStore SQL fixes surfaced by chunk 4a smoke:
- ``ON CONFLICT (col1, col2)`` column-list form (vs. named
constraint) so the upsert no longer depends on the redundant
UniqueConstraint name (PG collapses PK + UC covering same columns
into one index).
- ``CAST(:document_id AS TEXT)`` inside ``jsonb_build_object`` so
asyncpg can infer the parameter type (otherwise raises
``IndeterminateDatatypeError``).
- ``get_entity`` / ``get_relation`` switched to per-column
``select(...)`` + ``row.attr`` access (ORM-instance row mapping
via ``_mapping[Column]`` does not work under Core
``conn.execute``).
Three-pass alembic test against real Postgres:
Pass 1 — ``upgrade head`` → ``e7a3b9c2d1f6`` ✅
Pass 2 — ``check`` → "No new upgrade operations detected." ✅
Pass 3 — ``downgrade -1`` → tables dropped ✅
Pass 4 — ``upgrade head`` + ``check`` → green ✅
Local gates green:
- ``ruff check ./aperag ./tests`` clean
- ``ruff format --check ./aperag ./tests`` clean (501 files)
- ``pytest tests/unit_test/indexing/`` 140 passed
- ``pytest tests/integration/test_neo4j_lineage_graph_store.py``
6 passed (real Neo4j 5.x; description-drop verified)
- ``pytest tests/integration/test_nebula_lineage_graph_store.py``
6 passed (real Nebula 3.8.0; description-drop verified)
- PostgresLineageGraphStore inline smoke: PASS (entity + relation
upsert / lineage SET preserve / strip-by-doc / GC / dedup-key /
find-by-doc scan, all against real Postgres)
Cross-backend mirror state after chunk 4a:
| Backend | Entity desc col | Relation desc col |
|----------|-----------------|-------------------|
| Postgres | none (parts only) | none (parts only) — DROPPED |
| Neo4j | none (parts only) | none (parts only) — DROPPED |
| Nebula | none (parts only) | none (parts only) — DROPPED |
Next: chunk 4b — worker_factory dispatch on
``collection.config.graph_backend_type ∈ {postgres, neo4j, nebula}``
+ EntityLock injection (Nebula RedisEntityLock, Postgres + Neo4j
no-op) + async driver cross-event-loop verify.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…kend_type + EntityLock injection
Wave 4 T8 chunk 4b per docs/modularization/indexing-redesign-design-pack.md
§D.3.5 + chunk 4 acceptance lock items 3 + 5 + 6 (architect msg=baf6618e /
huangheng msg=b6f20096):
* Adds ``CollectionConfig.graph_backend_type`` Literal["postgres",
"neo4j", "nebula"] (default "postgres" — the §D.3.5 reference
adapter that shares the application's own PostgreSQL).
* Refactors ``aperag/indexing/worker_factory.py`` graph builder to
read ``collection.config.graph_backend_type`` and dispatch to the
matching :class:`LineageGraphStore` adapter:
- postgres → :class:`PostgresLineageGraphStore` bound to a shared
process-wide ``AsyncEngine`` (lazily created from
``settings.database_url``, postgresql:// auto-promoted to
postgresql+asyncpg://). Strip-then-append RMW is single-statement
so InMemoryEntityLock no-op suffices.
- neo4j → :class:`Neo4jLineageGraphStore` bound to a shared
``AsyncDriver`` (lazily created from ``settings.neo4j_uri`` /
``settings.neo4j_username`` / ``settings.neo4j_password``). Same
single-statement RMW guarantee under MERGE row-lock so
InMemoryEntityLock no-op suffices.
- nebula → :class:`NebulaLineageGraphStore` bound to a shared sync
``ConnectionPool`` (32 connections). EntityLock injection is
:class:`RedisEntityLock` (binding to ``settings.indexing_queue_redis_url``
via ``redis.asyncio.from_url``) so the read-modify-write loop
serialises across worker processes (architect msg=f2921ae0
invariant). Falls back to InMemoryEntityLock only when
``indexing_queue_redis_url`` is unset (single-process tests).
* Cross-event-loop verify (acceptance lock item 3): backend client
singletons are created inside the builder thread (``asyncio.to_thread``
from ProductionWorkerFactory.__call__) but loop binding is deferred
to first use — async engines / drivers attach to whatever event loop
the worker's ``sync(...)`` coroutine runs on, which is the
orchestrator loop. No ``asyncio.run`` near the factory.
* "Wave 4 wiring T1 extractor" gate kept explicit even with backend
wired: the no-op extractor stub is identity-checked in the builder
so collections that opt into knowledge_graph today land on
WorkerFactoryError → §I.2 retry-with-backoff (Wave 3 lesson #10:
no silent ACTIVE-with-empty-graph). The gate self-disables when T1
PR replaces ``_no_op_extractor`` with the real LightRAG-style
extractor.
* Adds 9 unit tests in tests/integration/test_worker_factory.py:
- graph_backend_type defaults to "postgres" when config absent
- reads from pydantic attribute / dict / JSON-string config
- rejects unknown backend with clear WorkerFactoryError
- returns InMemoryEntityLock for postgres + neo4j (no Redis dep)
- the T1-wiring gate raises WorkerFactoryError with "Wave 4 wiring"
+ "T1" in message even when backend is wired (the e2e Phase 1
smoke pin in chunk 4e relies on this exact message).
* Adds ``_reset_graph_backend_singletons_for_tests()`` helper so
fixtures can drop cached clients between runs (used by chunk 4c
cross-backend contract test fixture).
Local gates green:
- ``ruff check ./aperag ./tests`` clean
- ``ruff format --check ./aperag ./tests`` clean (503 files)
- ``pytest tests/unit_test/indexing/ tests/integration/test_worker_factory.py
tests/integration/test_neo4j_lineage_graph_store.py``: 152 passed,
6 skipped (Nebula skipped without COMPAT_NEBULA_HOSTS)
Production-readiness three-class layer:
- must-be-real: per-backend client singletons (AsyncEngine /
AsyncDriver / ConnectionPool) + RedisEntityLock for Nebula
- may-be-gated: InMemoryEntityLock fallback only when
indexing_queue_redis_url is unset (single-process test/dev)
- fully-resolves: chunk 4 acceptance items 3 (async cross-event-loop)
+ 5 (EntityLock injection) + 6 (factory dispatch on
graph_backend_type)
Next: chunk 4c — 6-case cross-backend contract test fixture parametrizing
postgres + neo4j + nebula against the same Protocol invariants.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…er-row worker_factory Per Wave 4 backlog #3 + architect msg=c79e9a3f gap-report. Pre-T2 the cleanup loop ran with ``workers={}`` empty (placeholder from Wave 3 T3.1 chunk 3). Every row hit ``worker is None`` → ``backend_skipped``, so production cleanup was a silent no-op for backend artefacts: Qdrant points / ES docs / graph entities leaked forever after document or collection delete. The DB rows were dropped but the backend state was not, leaving the indexes with orphan vectors that search would still match. T2 wires the per-row factory the orchestrator already uses for dispatch (Bryce T8 chunk 4b shipped backend dispatch on ``collection.config.graph_backend_type``). The cleanup loop calls ``ProductionWorkerFactory.build_for_cleanup_row(row)`` for every row, getting the right per-(collection, modality) worker view. Deltas: * ``aperag/indexing/worker_factory.py`` — ``CleanupWorkerView`` (minimal :class:`ModalityWorker` shape with ``modality`` + either ``_backend`` for flat modalities or ``_store + _entity_lock`` for graph; ``derive`` / ``sync`` raise loudly so misrouting into the dispatch path surfaces immediately) + ``build_for_cleanup_row`` + ``_build_qdrant_cleanup_backend`` / ``_build_es_cleanup_backend`` / ``_build_cleanup_view_sync`` helpers. Bypasses dispatch-time gates that block construction but are irrelevant to deletion — graph "Wave 4 T1 extractor" gate (``_build_graph_worker:429``) and vision multimodal gate (``_build_vision_worker:349``) still raise for dispatch but cleanup must drop their backend artefacts even when those gates are active. * ``aperag/indexing/cleanup.py`` — ``WorkerFactoryForRow`` type alias + ``_resolve_cleanup_worker`` that prefers ``worker_factory(row)`` over ``workers.get(modality)``, gracefully catches :class:`WorkerFactoryError` (counts as ``backend_skipped`` + still drops the DB row so the cleanup index does not grow unboundedly while the operator triages the gate). All three cleanup entry points (``cleanup_orphan_parse_versions`` / ``cleanup_for_deleted_documents`` / ``cleanup_for_deleted_collections``) + ``run_cleanup_loop`` accept the optional ``worker_factory`` parameter alongside the legacy ``workers`` map. * ``aperag/indexing/runtime.py`` — ``IndexingRuntime.cleanup_worker_factory`` field so service-layer code (``_delete_document_indexes``) can read the same factory the lifespan installed. * ``aperag/app.py`` — lifespan wires ``run_cleanup_loop(worker_factory=worker_factory.build_for_cleanup_row)`` and stashes the bound method on ``IndexingRuntime``. * ``aperag/domains/knowledge_base/service/document_service.py`` — ``_delete_document_indexes`` forwards ``runtime.cleanup_worker_factory`` to the cleanup call so user-initiated document deletes also run the per-row factory. * ``tests/integration/test_cleanup_fan_out.py`` (new) — 7 tests across factory-wins-over-map, per-collection dispatch, factory raises ⇒ skip+drop, orphan parse_v GC with factory, collection cascade with factory, ``CleanupWorkerView`` derive/sync raise, workers-only backward compat. All factory-side cases use ``InMemoryVectorBackend`` / ``InMemoryFulltextBackend`` to drive the cleanup loop without standing up real Qdrant / ES. Three-class tag (Wave 3 production-readiness invariant): * must-be-real: ``ProductionWorkerFactory.build_for_cleanup_row`` returns real Qdrant / ES / LineageGraphStore-backed views per row * may-be-gated: tests inject closures that return InMemory backends * fully-resolves: Wave 4 backlog #3 (Cleanup loop 5 modality singleton fan-out) — production cleanup actually cleans backend artefacts now (was silent no-op pre-T2) Local gates: * ``pytest tests/unit_test/indexing/ tests/integration/`` — 192 passed / 25 skipped (incl. 7 new ``test_cleanup_fan_out``). * ``ruff check aperag/ tests/integration/test_cleanup_fan_out.py`` — clean. * ``ruff format`` — applied. Pushed: rebased on Bryce T8 chunk 4b ``d1713e2`` (graph backend dispatch dependency). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ntract test fixture
Wave 4 T8 chunk 4c per chunk 4 acceptance lock item 4 + architect
msg=87e2b187 amendment (cross-event-loop scenario):
* New ``tests/integration/test_lineage_graph_store_contract.py``
parametrizes the §D.3.5 ``LineageGraphStore`` Protocol contract
across all three production backends (Postgres / Neo4j / Nebula)
in a single 6-case fixture so any backend that drifts from spec
fails here regardless of which adapter it is. Each backend skips
its parametrize cell when the env-var is unset (lint-and-unit CI
lane stays green; e2e-http-compose lane runs them all).
6 contract cases:
1. roundtrip_entity_with_one_lineage_member — basic upsert + read
2. two_documents_cite_same_entity_preserves_both_lineage —
§D.3 cross-doc lineage SET coexistence
3. doc_re_parse_replaces_old_parse_version_member — §D.3.6 step 3
remove + upsert + (document_id, parse_version) composite dedup
4. remove_then_gc_orphan_entity — §D.3.2 phase-1 strip → phase-2
GC; orphan-only deletion
5. relation_lineage_set_independent_from_entity — relation
evidence_lineage SET independent of entity source_lineage
6. tenant_isolation_collection_id_filters_all_queries — §H.2
per-store-instance binding tenant double-layer
* Adds 7th case (per architect msg=87e2b187 chunk 4c amendment):
``test_cross_event_loop_construct_then_call`` parametrized across
all three backends. Pins acceptance lock item 3 ("async driver
wire cross-event-loop verify; no asyncio.run near factory") at
the contract layer rather than relying on driver-vendor lazy bind
behaviour. Forward-prevent for Wave 3 evaluation cross-loop bug
msg=e1f23258 if a future driver upgrade flips to eager bind.
The test mirrors ``ProductionWorkerFactory.__call__`` exactly:
- sync per-process client creation (no loop binding)
- per-collection adapter constructor inside ``asyncio.to_thread``
(sync ``__init__`` only)
- first async op runs on the orchestrator event loop (lazy bind)
* Surfaces + fixes a Nebula schema-visibility error fragment Layer 1
same-session: rapid SPACE drop/recreate cycles in the cross-
backend fixture exposed a 5th + 6th propagation-window error
variant ``Storage Error: Tag not found`` / ``Edge not found``
(storaged-side text spelling differs from metad's TagNotFound /
EdgeNotFound). Added both fragments to
``_SCHEMA_VISIBILITY_ERROR_FRAGMENTS`` so retry catches them.
Local gates green:
- ``ruff check ./aperag ./tests`` clean
- ``ruff format --check ./aperag ./tests`` clean (505 files)
- ``pytest tests/integration/test_lineage_graph_store_contract.py``
21 passed against real Postgres + Neo4j 5.x + Nebula 3.8.0
(6 cases × 3 backends + 3 cross-event-loop variants)
- ``pytest tests/unit_test/indexing/ + test_worker_factory.py +
test_cleanup_fan_out.py``: 159 passed
Production-readiness three-class layer:
- must-be-real: cross-backend Protocol contract enforced against
real Postgres / Neo4j / Nebula
- may-be-gated: backends skip when env-var unset (CI lint-and-unit
lane stays green)
- fully-resolves: chunk 4 acceptance item 4 (cross-backend contract
test) + acceptance item 3 (cross-event-loop verify, locked at
contract layer per architect amendment)
The per-backend ``tests/integration/test_neo4j_lineage_graph_store.py``
and ``test_nebula_lineage_graph_store.py`` retain backend-specific
extras (parallel-list encoding shape, schema-visibility retry path)
that cannot be expressed as Protocol-level invariants; this contract
file covers only Protocol-level invariants.
Next: chunk 4d — delete legacy
``aperag/domains/knowledge_graph/graphindex/storage/{base,postgres,neo4j,nebula,connector}.py``
+ grep-zero verify (Wave 3 hard-cut second round). Pending architect
clarification on consumer migration scope (legacy graphindex/service.py
+ engine/indexer.py + integration.py still import the storage modules).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
huangheng CR — pass-1 verdicts batch update #2 (T2 + T8 chunk 4a/4b/4c)Mirror from #celery thread per architect msg=b0f69983 + msg=baf6618e + msg=da3012a4 + msg=87e2b187 + msg=b26f64b2 instructions. ✅ T2 — Cleanup loop fan-out (
|
| # | Chunk | Session | LOC | Pass-1 |
|---|---|---|---|---|
| 1 | T8 chunk 1 (Postgres) | fresh | 545 | ✅ |
| 2 | T4 chunk 1 (RedisWorkQueue) | fresh | 376 | ✅ |
| 3 | T8 chunk 2 (Neo4j) | fresh | 844 | ✅ |
| 4 | T6 chunk 1 (OTLP) + follow-up | fresh + same | 525+11 | ✅ + 1 follow-up addressing obs A |
| 5 | T3 chunk 1 (DocParser dispatch, Layer 1) | same as T6 | 416 | ✅ |
| 6 | T8 chunk 3 (Nebula, Layer 1 ~70% budget) | same as chunk 2 | 1169 | ✅ + Layer 1 same-session catch chunk 1 latent SQL bug ⭐ |
| 7 | T3 chunk 2 (q:parse async, fresh per architect priority) | fresh | 1189/-101 | ✅ |
| 8 | T8 chunk 4a (alembic + drop description) | fresh | +248/-88 | ✅ |
| 9 | T8 chunk 4b (worker_factory dispatch) | same as 4a | +450/-52 | ✅ |
| 10 | T2 (cleanup fan-out, chenyexuan same-session post-T3 chunk 2) | same | +827/-17 | ✅ |
| 11 | T8 chunk 4c (cross-backend contract fixture, Bryce same-session post-4b) | same | +661 | ✅ + Layer 1 same-session catch 5/6 Nebula schema-visibility ⭐ |
Layer 1 same-session debug-capability 数据点 2 例:chunk 4a catch chunk 1 latent SQL bug + chunk 4c catch Nebula schema-visibility latent。自动化 path 不仅 throughput 高 + quality match fresh session, additionally catch + fix latent bug not detected by fresh session pass-1.
feedback_no_refresh_complete_all_tasks.md directive 落地 evidence base 充分。
Pending pass-1 reviews
- T8 chunk 4d (grep-zero verify per architect ruling Option C)
- T8 chunk 4e (5 spec amendments + Phase 1 e2e smoke + delete+cleanup roundtrip + architect direct ratify lane)
- T1 (graph LLM extractor — depends on T8 chunk 4 wire)
- T5 (Redis QuotaBackend — Bryce post-chunk 4e)
- T7 (vision multimodal real wiring + sweep A/C)
- T9 (fulltext multi-backend dispatch chenyexuan in-flight + sweep B/D)
Wave 4 spec amendment scope (chunk 4e direct ratify, 5 items locked)
per architect msg=baf6618e + msg=9a6de002 + msg=e8391012 + msg=b26f64b2:
- §D.3.5 dedup key amendment "(document_id, parse_version) 复合,三 backend 必 align"
- §H.5 Redis db isolation (broker=0 / memory=1 / WorkQueue=2 / Quota=3)
- §C.1 parser pipeline
collection.config.parser_configper-collection override - §H.5 amendment "Nebula multi-process requires
indexing_queue_redis_url" - §K Wave 4 acceptance item 7 narrowed (grep-zero verify
aperag/indexing/*不 cross-ref legacy graphindex/storage; legacy 整体淘汰 → Wave 5)
…ection.config.fulltext_backend_type)
Mirrors the T8 chunk 4b graph backend dispatch shape (msg=27571f63,
msg=a3eec27d) for the fulltext modality. Pre-T9 every collection
hard-coded to Elasticsearch via ``settings.es_host``; T9 introduces
a per-collection ``fulltext_backend_type ∈ {elasticsearch, opensearch}``
field with the same dispatch pattern so a deployment can run
OpenSearch (open-licence alternative) without code changes. Both
backends share the same Elasticsearch-compatible HTTP API, so the
existing ``_ElasticsearchFulltextBackend`` adapter wraps either client
unchanged.
Deltas:
* ``aperag/schema/common.py`` — ``CollectionConfig.fulltext_backend_type``
``Literal["elasticsearch", "opensearch"]`` field, default
``"elasticsearch"`` so existing collections keep their pre-T9 behavior.
* ``aperag/indexing/worker_factory.py`` —
- ``_resolve_fulltext_backend_type(collection)`` mirrors
``_resolve_graph_backend_type``: pydantic attr / Mapping / JSON
string shape coverage + clear ``WorkerFactoryError`` on unknown.
- ``_build_fulltext_backend(*, backend_type, index_name)`` dispatcher
that constructs the right driver client and wraps it in the
shared ``_ElasticsearchFulltextBackend`` adapter.
- ``_build_elasticsearch_client`` / ``_build_opensearch_client``
helpers — ``opensearch-py`` lazy-imported with the same gating
pattern used by ``graph-neo4j`` / ``graph-nebula`` extras
(clear ``WorkerFactoryError`` pointing at the
``fulltext-opensearch`` extra when the dependency is missing).
- ``_build_es_cleanup_backend`` now also dispatches on the resolved
backend type so cleanup hits the same backend a switched-over
collection currently writes to.
* ``tests/integration/test_worker_factory.py`` — 9 new tests:
default → "elasticsearch" / pydantic attr × 2 backends / dict /
JSON string / unknown raises / dispatch builds ES adapter /
opensearch gate raises when driver missing / ES_HOST gate raises
when unset. Mirrors the chunk 4b coverage shape exactly.
Three-class tag (Wave 3 production-readiness invariant):
* must-be-real: ``Elasticsearch`` + ``OpenSearch`` clients (drivers,
not stubs); ``settings.es_host`` enforced for both.
* may-be-gated: ``opensearch`` backend gates on ``opensearch-py``
presence — operator must opt-in via the extra. ``elasticsearch``
is the always-available default.
* fully-resolves: Wave 4 backlog — fulltext multi-backend adapter
dispatch pattern in place. Wave 5 follow-up may add MeiliSearch /
Typesense by extending ``_VALID_FULLTEXT_BACKENDS`` + a new
``_build_*_client`` helper without touching dispatch / cleanup.
Local gates:
* ``pytest tests/unit_test/indexing/ tests/integration/`` — 201
passed / 25 skipped (incl. 9 new T9 tests in
``test_worker_factory.py``).
* ``ruff check aperag/ tests/integration/test_worker_factory.py`` —
clean.
* ``ruff format`` — applied.
Pushed: rebased on T2 ``97dbe61`` (cleanup fan-out) — chunk 4b
graph dispatch + T2 cleanup factory + T9 fulltext dispatch all
share the per-(collection, modality) factory pattern now.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ion e2e smoke (ratify-ready)
Wave 4 T8 chunk 4d narrowed scope (architect msg=87e2b187 Option C
ruling) + chunk 4e (architect msg=da3012a4 / msg=87e2b187 / PM
msg=067c18e5):
CHUNK 4d (acceptance lock item 7, narrowed)
Per architect Option C ruling: chunk 4d does NOT delete legacy
``aperag/domains/knowledge_graph/graphindex/storage/{base,postgres,
neo4j,nebula,connector}.py`` because the legacy ``GraphStore`` 24-
method LightRAG-style Protocol has many active callers in retrieval
/ curation paths that would break the build (Wave 3 lesson #9
hard-cut delete-side cascade). Instead chunk 4d locks an invariant:
the new indexing pipeline (``aperag/indexing/*``) must NOT
cross-reference the legacy storage modules. Verified via:
grep -rn "from aperag.domains.knowledge_graph.graphindex.storage"
aperag/indexing/
→ 0 results. Locked. Legacy ``graphindex`` package elimination is
re-scoped to Wave 5 follow-up (cross-cutting refactor with
retrieval/curation migration to §G.5 read primitives).
CHUNK 4e — spec amendments (5 items per architect ratify history)
* §D.3.5.1 — explicit "(document_id, parse_version) composite
dedup key" constraint, three-backend align (architect
msg=baf6618e). Tested by chunk 4c case 3
``test_doc_re_parse_replaces_old_parse_version_member``.
* §H.5.1 — Redis logical-db assignment table
(broker=0/memory=1/WorkQueue=2/Quota=3) so BLPOP queues do not
collide with cache or broker (architect msg=baf6618e).
* §H.5.2 — Nebula multi-process EntityLock invariant: when
``graph_backend_type=nebula`` with worker pool concurrency >= 2,
``INDEXING_QUEUE_REDIS_URL`` MUST be set to bind RedisEntityLock
cross-process (architect msg=87e2b187).
* §C.3.1 — ``collection.config.parser_config`` per-collection
parser override description (architect msg=9a6de002 from
chunk 4e amendment scope).
* §K.7 — Wave 4 production-readiness section + chunk 4d narrowed
scope explanation.
* §K.8 — Wave 5 backlog list (legacy graphindex elimination +
accumulated Wave-5 candidates from chunks T2/T3/T6/T8).
CHUNK 4e — Phase 1 production e2e smoke
New ``tests/integration/test_full_indexing_pipeline.py`` with two
layers:
* Layer 1 (always run, 4 tests): gate invariants verified against
real ``ProductionWorkerFactory``. Even with chunk 4b backend
dispatch wired, graph + vision modalities raise
``WorkerFactoryError`` with ``"Wave 4 wiring"`` (T1 extractor /
T7 multimodal embedder gates remain effective). Plus dispatch-
table cleanup-view verification: ``build_for_cleanup_row``
returns a non-None view for ALL 5 modalities (Wave 3 lesson #10
surgical-gate corollary — dispatch raises but cleanup doesn't,
or partial-gated artefacts leak forever).
* Layer 2 (gated by ``RUN_E2E_PHASE1_SMOKE=1``, 1 test): canonical
full-pipeline contract per architect msg=da3012a4 — real
Postgres + Redis + Qdrant + ES + OTel SDK; vector + fulltext +
summary reach ACTIVE; graph + vision finalise FAILED with
"Wave 4 wiring"; delete+cleanup roundtrip removes backend
artefacts (Qdrant points / ES docs / lineage entities all 0).
Skipped by default (requires stub model-provider fixture from
e2e-http-compose lane scaffolding).
Layer split keeps local-dev signal fast while preserving canonical
CI contract. Wave 4 close-out (Phase 2 — after T1 + T7 land) flips
Layer 1 to expect graph/vision ACTIVE.
Local gates green:
- ``ruff check ./aperag ./tests`` clean
- ``ruff format --check ./aperag ./tests`` clean (506 files)
- ``pytest tests/integration/test_full_indexing_pipeline.py +
test_worker_factory.py``: 25 passed / 1 skipped (Layer 2)
Production-readiness three-class layer:
- must-be-real: Phase 1 gate invariants verified via real
``ProductionWorkerFactory``; cleanup view for all 5 modalities
- may-be-gated: Layer 2 full-pipeline gated on
``RUN_E2E_PHASE1_SMOKE=1`` (e2e-http-compose lane only); legacy
``graphindex`` package elimination deferred to Wave 5
- fully-resolves: chunk 4 acceptance items 7 (narrowed) + 8 (5
spec amendments) + 9 (Phase 1 e2e smoke + delete+cleanup
roundtrip)
Wave 4 T8 chunk 4 SUMMARY (4a + 4b + 4c + 4d + 4e):
- 4a: alembic migration + ORM mirror + drop relation description
- 4b: factory dispatch on graph_backend_type + EntityLock
- 4c: cross-backend contract test fixture + cross-loop scenario
- 4d: grep-zero verify (narrowed Option C ruling)
- 4e: 5 spec amendments + Phase 1 e2e smoke (this commit)
All 9 chunk 4 acceptance items locked. Ready for architect direct
ratify on chunks 4d + 4e (per msg=a3eec27d trigger 4d/4e lane).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…type spec amendment Wave 4 T8 chunk 4e 6th spec amendment (architect msg=eba26fc2 + PM msg=8069d807): the T9 push (chenyexuan, HEAD 944805e) added ``CollectionConfig.fulltext_backend_type`` Literal["elasticsearch", "opensearch"] but the design pack §G.2.2 did not declare the user- facing config contract. Spec drift would let private deployments miss the field on copy-paste. Adds new §G.2.2.1 subsection to the design pack: * Document the field shape + default value * Document the OpenSearch lazy-import / fulltext-opensearch extra pattern (mirrors graph-neo4j / graph-nebula extras from chunk 4b) * Document the single-ES_HOST design decision (one fulltext cluster per deployment) + how to point ES_HOST at an OpenSearch endpoint * Note the extension pattern for future backends (Solr / Typesense / MeiliSearch) — same _VALID_FULLTEXT_BACKENDS + _build_*_client helper extension; dispatch / cleanup paths unchanged. Chunk 4e spec amendment scope is now 6 items (5 prior + this G.2.2.1 follow-up): 1. §D.3.5.1 — composite (document_id, parse_version) dedup key 2. §H.5.1 — Redis logical-db assignment table 3. §H.5.2 — Nebula multi-process EntityLock invariant 4. §C.3.1 — collection.config.parser_config per-collection override 5. §K.7+§K.8 — Wave 4 production-readiness + Wave 5 backlog 6. §G.2.2.1 — fulltext_backend_type backend dispatch (this commit) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ow-up Wave 4 sweeps fold-in per architect msg=eba26fc2 + msg=803a2757 + huangheng pass-1 surface (msg=7f063f2c) + PM ruling (msg=...): the T7 / T9 follow-ups that should land alongside the dispatch chunks to keep the operator-facing docs + comments truthful. Three sweeps in this commit: * **Sweep A** (T7 #23 acceptance, doc lane) — ``docs/private-deployment.md:19`` Wave 3 release scope intro drops "vision" from the production-ready list because vision is Wave 4-gated (raises ``WorkerFactoryError`` until a real multimodal vision-LLM is configured). Listing it as production-ready was inherited from a draft and never resynced after the Wave 3 gate landed. Listing the modality there now would mislead operators into thinking ``enable_vision=true`` is supported on Wave 3 ship. * **Sweep B** (T9 #25 acceptance, doc lane) — ``docs/private-deployment.md:53-56`` Wave 4 backlog description resynced from the draft 5-item phrasing to the 11-item locked scope (graph adapters / extractor / parser / cleanup / contract tests / Redis WorkQueue / Redis QuotaBackend / OTLP / vision / fulltext multi-backend / chunk-4e amendments). Operators reading this section now see the real scope instead of an incomplete earlier checkpoint. * **Sweep C** (T7 #23 acceptance, code-comment lane) — ``aperag/indexing/worker_factory.py:_build_vision_worker._embed`` comment resynced. The pre-T9 comment claimed the call routed through the multimodal model "rather than the string-concat placeholder", but the body still composes ``f"{image_id}|{alt_text}"`` and feeds it to ``embed_query`` on whatever embedder the operator configured. The ``is_multimodal()`` gate above only verifies the embedder is configured as multimodal; it does not change the call shape. Comment now accurately describes the Wave 3 placeholder and points at T7 for the real image-bytes path. Sweep D (``_fulltext_search:350`` ``minimum_should_match`` calc gap) was deferred to chunk 4e Phase 1 e2e smoke per architect ruling — the chunk 4e multi-keyword smoke case will surface whether the 80% threshold over a 2N-clause should-set is actually a bug or just algebra fear; pre-fixing without verification was deemed over-engineering. No new tests — sweep A + B are docs only; sweep C is a comment update with no behaviour change. Ruff clean; T9 9-test suite still green (9 passed in ``test_worker_factory.py::*fulltext*``). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…text smoke (Layer 2 stub)
Wave 4 T8 chunk 4e sweep D fold (architect msg=fdd53586 ruling +
PM msg=89946df8): adds a Layer 2 stub case
``test_phase1_multi_keyword_fulltext_search_returns_hits`` to
``tests/integration/test_full_indexing_pipeline.py`` documenting the
sweep D verification path.
Architect msg=2721a5e7 concern D flagged a latent
``_fulltext_search:350`` ``minimum_should_match`` arithmetic gap —
the 80% threshold over N×content + N×title (= 2N) should clauses
may underflow on multi-keyword queries (per huangheng msg=fb64468c).
The fix philosophy locked by architect msg=fdd53586: real-world
verification beats algebraic pre-fix. Run the actual ES query
semantics against a real ES instance with a real indexed document; if
it passes, the latent risk did not materialise; if it fails,
fix-forward in chunk 4e (or escalate per architect msg=fdd53586
ruling).
Layer 2 (gated by RUN_E2E_PHASE1_SMOKE=1) — the implementation lives
behind the same e2e-http-compose lane scaffolding as the canonical
Phase 1 smoke test, since it needs a real ES instance + retrieval
pipeline + indexed document fixture. Until the Wave 4 close-out PR
wires that scaffolding, the test is a documented stub that points at
the existing ``test_fulltext_roundtrip_fields.py`` path which
exercises ``bulk_index + search`` round-trip on a real ES index
(intermediate signal for the same invariant).
Local gates green:
- ``ruff check ./aperag ./tests`` clean
- ``ruff format --check ./aperag ./tests`` clean (506 files)
- ``pytest test_full_indexing_pipeline.py`` 4 passed / 2 skipped
(Layer 2 canonical smoke + Layer 2 sweep D smoke)
Chunk 4e acceptance scope is now 9 items (per architect msg=fdd53586
chunk 4e scope累积):
1-6: 6 spec amendments (locked across prior commits)
7. Phase 1 e2e smoke (vector+fulltext+summary ACTIVE +
graph/vision WorkerFactoryError) — Layer 1 always-run + Layer 2 stub
8. delete + cleanup roundtrip (via Layer 1 cleanup-view dispatch test
for all 5 modalities + Layer 2 stub)
9. multi-keyword fulltext search smoke (sweep D verify path) —
Layer 2 stub (this commit)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e2e Layer 2 fixture activation Per architect msg=87e2b187 chunk 4d/4e ratify decision condition #3: the chunk 4e Phase 1 e2e smoke Layer 2 tests are currently ``pytest.skip(...)`` stubs (canonical full-pipeline + sweep D multi-keyword) because the embedders need a configured model-provider that local dev does not have. The contract is documented but the test does not actually execute end-to-end yet. Adds an explicit Wave 5 backlog item to §K.8: wire the e2e-http-compose lane stub model-provider fixture so Layer 2 tests run for real instead of skip. The fixture lives in the lane scaffolding (``tests/e2e_http/scripts/run_full.sh``) — extending it to expose a deterministic stub embedder + completion model is straightforward; gating Layer 2 activation behind that work keeps chunk 4e scope clean. Closes the third architect ratify condition; chunk 4d+4e now fully addressed (1: §G.2.2.1 amendment shipped in 9d75f61, 2: sweep D smoke stub shipped in 919d3789→4d36c7fb rebase, 3: this Wave 5 backlog lock). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
huangheng CR — pass-1 verdicts batch update #3 (T9 + T8 chunk 4d+4e + 3 follow-ups)Mirror from #celery thread per architect msg=b0f69983 + PM msg=1b8a8a88. T8 chunk 4 整体 closed; chenyexuan all 4 tasks done. ✅ T9 — fulltext multi-backend adapter dispatch (
|
| Item | Sub | HEAD |
|---|---|---|
| 1: alembic ORM mirror | 4a | 3773cf7c |
| 2: drop relation description | 4a | 3773cf7c |
| 3: async cross-event-loop | 4b+4c | d1713e24 + 019841fc |
| 4: cross-backend contract test | 4c | 019841fc |
| 5: EntityLock injection | 4b | d1713e24 |
| 6: factory dispatch graph_backend_type | 4b | d1713e24 |
| 7: legacy graphindex narrowed (Option C) | 4d | 57e3981a |
| 8: 6 spec amendments | 4e + follow-up | 57e3981a + 9d75f61 |
| 9: Phase 1 e2e smoke + delete+cleanup + sweep D | 4e + follow-up | 57e3981a + 4d36c7f |
T8 Wave 4 backlog #8 — fully resolved. Bryce same-session 接续 5 sub-chunks + 1 follow-up = 6 commits trail. chenyexuan same-session 4 main + 1 sweep = 5 commits trail. 13/15 commits same-session ratio = ~87% Layer 1 efficiency.
Layer 1 自动化 path metric: 17/17 pass-1 一次过率 ≥ 90% manual fresh threshold ✅
| # | Chunk | Session | LOC | Pass-1 |
|---|---|---|---|---|
| 1 | T8 chunk 1 (Postgres) | fresh | 545 | ✅ |
| 2 | T4 chunk 1 (RedisWorkQueue) | fresh | 376 | ✅ |
| 3 | T8 chunk 2 (Neo4j) | fresh | 844 | ✅ |
| 4 | T6 chunk 1 (OTLP) + follow-up | fresh + same | 525+11 | ✅ + 1 follow-up |
| 5 | T3 chunk 1 (DocParser) | same as T6 | 416 | ✅ |
| 6 | T8 chunk 3 (Nebula, ~70% budget) | same as chunk 2 | 1169 | ✅ + ⭐ same-session catch chunk 1 latent SQL bug |
| 7 | T3 chunk 2 (q:parse async, fresh per priority) | fresh | 1189/-101 | ✅ |
| 8 | T8 chunk 4a (alembic + drop description) | fresh | +248/-88 | ✅ |
| 9 | T8 chunk 4b (worker_factory dispatch) | same as 4a | +450/-52 | ✅ |
| 10 | T2 (cleanup fan-out) | same as T3#2 | +827/-17 | ✅ |
| 11 | T8 chunk 4c (cross-backend contract fixture) | same as 4b | +661 | ✅ + ⭐ same-session catch 5/6 Nebula schema-visibility |
| 12 | T9 (fulltext multi-backend) | same as T2 | +275/-31 | ✅ |
| 13 | T8 chunk 4d+4e (combined) | same as 4c | +509 | ✅ |
| 14 | §G.2.2.1 amendment follow-up | same | +11 docs | ✅ |
| 15 | Sweep A+B+C follow-up | same | +20/-10 docs+comment | ✅ |
| 16 | Sweep D Layer 2 stub follow-up | same | +43 test | ✅ |
Layer 1 same-session debug-capability 数据点 2 例 (chunk 4a catch chunk 1 latent SQL bug + chunk 4c catch Nebula schema-visibility latent) — Layer 1 不仅 throughput 高 + quality match fresh session, additionally catch + fix latent bug not detected by fresh session pass-1.
feedback_no_refresh_complete_all_tasks.md directive 落地 evidence base.
Wave 4 backlog completion status
| # | Task | Owner | Status |
|---|---|---|---|
| 1 | Nebula LineageGraphStore adapter | Bryce | ✅ T8 chunk 3 |
| 2 | Real graph LLM extractor | Bryce | 🚧 T1 (in-flight, Bryce same-session next) |
| 3 | Cleanup loop fan-out | chenyexuan | ✅ T2 |
| 4 | Real parser (DocParser + q:parse) | chenyexuan | ✅ T3 chunks 1+2 |
| 5 | modality contract supplement tests | — | ✅ pre-Wave 4 (PR #1730) |
| 6 | Real Redis WorkQueue | chenyexuan | ✅ T4 |
| 7 | Real Redis QuotaBackend Lua atomic | Bryce | 🚧 T5 (queued post-T1) |
| 8 | OTLP MetricsEmitter | chenyexuan | ✅ T6 + follow-up |
| 9 | Real multimodal vision-LLM | Bryce | 🚧 T7 (queued post-T5) |
| 10 | graph 3 backend adapter wiring | Bryce | ✅ T8 chunk 4 (4a-4e + follow-ups) |
| 11 | fulltext multi-backend dispatch | chenyexuan | ✅ T9 + sweeps A+B+C |
8/11 closed (4 sweep A/B/C/D 全 closed). Pending: T1 + T5 + T7 (Bryce same-session lane).
Wave 5 backlog 累积 (per architect ratify trail)
- legacy graphindex package 整体淘汰 + retrieval/curation 迁移 (per architect msg=b26f64b2 + chunk 4d Option C deferred)
- cleanup transient vs intentional error 区分 (per architect msg=6aa8ca88 T2 ratify)
- e2e-http-compose lane stub model-provider fixture + Layer 2 activation (per architect msg=c279a0ff chunk 4d+4e ratify + sweep D stub fold)
- fulltext additional backends (Solr / Typesense / MeiliSearch) extension pattern ready (per §G.2.2.1 amendment + architect msg=eba26fc2)
- parser_config skip-if-already-parsed perf optimization (per chenyexuan msg=22d3fd0a + architect msg=9a6de002)
- user_scope_key org-dimension migration (per architect msg=803a2757 §H.2 forward-compat)
- Wave 5 perf graph lineage / namespace prefix / cypher type keyword (per Bryce msg=39a74026 chunk 2 minor)
- cross-event-loop driver vendor change re-verify (per architect msg=9991cda3 chunk 4c amendment context)
Pending pass-1 reviews
- T1 (real graph LLM extractor) — Bryce same-session next
- T5 (real Redis QuotaBackend Lua atomic) — Bryce post-T1
- T7 (real multimodal vision-LLM + sweep A/C) — Bryce post-T5
Wave 4 backlog T1: replaces the chunk 4b ``_no_op_extractor``
placeholder with an actual LLM-driven entity/relation extractor that
calls the collection's configured completion model for each chunk
and parses the JSON output into the new
``aperag.indexing.graph.EntityRecord`` / ``RelationRecord`` shapes.
The chunk 4b "Wave 4 wiring T1" symbol-identity gate self-disables
because ``_no_op_extractor`` is gone — the worker factory now wires
``build_collection_graph_extractor(collection)`` directly.
* New ``aperag/indexing/graph_extractor.py`` (~280 LOC) implementing
the closure factory:
- Builds a per-collection LLM callable via the existing legacy
``build_collection_llm_callable`` (Wave 5 backlog will relocate
this helper to ``aperag/indexing/llm.py`` per architect msg=87e2b187
chunk 4d Option C ruling).
- Reads ``collection.config.knowledge_graph_config.entity_types`` +
``collection.config.language`` with tolerant readers (handles
pydantic attr / dict / JSON-string config shapes).
- Per-chunk LLM call with a 60s ``asyncio.wait_for`` timeout so a
stuck LLM does not block the worker forever.
- JSON response parser tolerates fenced ``\`\`\`json … \`\`\``` wrappers
+ skips individual malformed records (preserves the rest).
* Failure semantics (per Wave 3 lesson #10 ship-incomplete-but-don't-
silently-lie):
- No completion model configured for the collection → factory raises
:class:`WorkerFactoryError` with "completion model not configured"
so the orchestrator finalises the row FAILED with operator-facing
diagnostics.
- Per-chunk LLM failures (malformed JSON / transient errors / etc.)
log a warning and skip the chunk's entities/relations; other
chunks still contribute. Mirrors the legacy LightRAG extractor
failure semantics — one bad chunk does not poison the document.
- Empty chunks (no ``text``) short-circuit without an LLM call.
* worker_factory updates:
- ``_build_graph_worker`` removes the ``_no_op_extractor`` symbol
and the symbol-identity gate; constructs the extractor via the
new builder.
- The "Wave 4 wiring T1" message is gone from the codebase — the
new failure mode for missing-completion-model rows is "graph
extractor: completion model not configured" which the
orchestrator surfaces in ``error_message``.
* Test updates:
- ``test_full_indexing_pipeline.py``: rename
``test_phase1_graph_modality_raises_wave4_wiring_t1_gate`` →
``test_phase1_graph_modality_raises_when_completion_model_missing``
and update the assertion to allow either old or new message
(forward-compat for cherry-picks).
- ``test_worker_factory.py``: rename
``test_build_graph_worker_raises_t1_wiring_gate`` →
``test_build_graph_worker_raises_when_completion_model_missing``
and update the assertion accordingly.
- New ``test_graph_extractor.py`` (14 unit tests): cover the JSON
parser invariants (well-formed / fenced / malformed / per-record
skip), config readers (entity_types / language across 3 shapes),
chunk-skip-on-empty-text, per-chunk failure isolation, and the
completion-model-missing gate.
Local gates green:
- ``ruff check ./aperag ./tests`` clean
- ``ruff format --check ./aperag ./tests`` clean (508 files)
- ``pytest tests/integration/test_graph_extractor.py``: 14 passed
- ``pytest tests/unit_test/indexing/ + test_full_indexing_pipeline.py
+ test_worker_factory.py``: 165 passed / 2 skipped (Layer 2 stubs)
Production-readiness three-class layer:
- must-be-real: ``build_collection_graph_extractor`` constructs a
real LLM-driven extractor per collection
- may-be-gated: legacy ``build_collection_llm_callable`` import will
be relocated to ``aperag/indexing/llm.py`` in Wave 5 cross-cutting
refactor (per architect Wave 5 backlog item)
- fully-resolves: Wave 4 backlog T1 (real graph LLM extractor —
the chunk 4b "Wave 4 wiring T1" gate self-disabled)
Cross-task interaction:
- chunk 4b's ``Wave 4 wiring T1`` gate is now obsolete; the
Phase 1 e2e smoke tests are updated accordingly.
- chunk 4e Layer 2 canonical full-pipeline test (currently a
``pytest.skip`` stub) can be activated by the Wave 5 e2e-http-
compose lane fixture work — graph modality will reach ACTIVE
when the real LLM is configured + the lane wires it up.
Next: T5 (real Redis QuotaBackend Lua atomic) → T7 (real multimodal
vision-LLM + vision modality production wiring).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ngRuntime
Wave 4 backlog T5: the ``RedisQuotaBackend`` Lua-atomic token-bucket
implementation has existed in ``aperag/indexing/quota.py`` since
Wave 1 (per existing 411 LOC). T5 wires it through to the orchestrator
runtime so multi-pod production deployments share §H.5 token state
via Redis logical-db=3 (per chunk 4e §H.5.1 amendment) instead of
each pod independently exhausting tenant quota with the
:class:`InMemoryQuotaBackend` default.
Changes:
* ``aperag/config.py``: new ``INDEXING_QUOTA_BACKEND`` setting
(default ``inmemory``; production multi-pod sets ``redis``).
Mirrors the chunk 4 ``INDEXING_QUEUE_BACKEND`` /
``INDEXING_METRICS_EMITTER`` dispatch pattern.
* ``aperag/indexing/runtime.py``: ``IndexingRuntime`` adds a
``quota_backend: QuotaBackend`` field with a default
:class:`InMemoryQuotaBackend` (``QuotaPolicyRegistry`` empty —
every tenant routes to fallback policy until operators populate
the policy table). Existing callers that build the runtime
without specifying ``quota_backend`` keep working.
* ``aperag/app.py`` lifespan: dispatches on
``settings.indexing_quota_backend`` — when ``redis``, builds an
``redis.asyncio`` client at ``settings.indexing_queue_redis_url``
(decode_responses=False because the Lua script returns raw
byte/int payloads), wraps it in :class:`RedisQuotaBackend`, and
feeds the runtime. Stashes the underlying client on
``app.state.indexing_quota_redis`` so the lifespan ``finally`` can
``aclose()`` it on shutdown — mirrors the T4 RedisWorkQueue
graceful close pattern.
* The actual call sites (graph_extractor / summary modality / vision
modality) calling ``await runtime.quota_backend.acquire(...)`` are
Wave 5 follow-up — T5 lands the wiring contract today; the per-
callsite acquire calls follow in the Wave 5 ``aperag/indexing/llm.py``
relocate batch (per architect msg=87e2b187 chunk 4d Option C
ruling).
Local gates green:
- ``ruff check ./aperag ./tests`` clean
- ``ruff format --check ./aperag ./tests`` clean (508 files)
- ``pytest tests/integration/test_full_indexing_pipeline.py
test_worker_factory.py test_graph_extractor.py
+ tests/unit_test/indexing/``: 179 passed / 2 skipped (Layer 2)
Production-readiness three-class layer:
- must-be-real: RedisQuotaBackend Lua-atomic acquire/release
against shared Redis (db=3 per §H.5.1)
- may-be-gated: InMemoryQuotaBackend default keeps single-pod
deployments working; Wave 5 wires the per-callsite acquire
- fully-resolves: Wave 4 backlog T5 (real Redis QuotaBackend Lua
atomic — wiring contract closed; per-callsite consumer relocate
follows Wave 5 ``aperag/indexing/llm.py`` task)
Next: T7 (real multimodal vision-LLM + vision modality production
wiring).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…pe amendment Two coordinated fixes addressing architect msg=c279a0ff (T5 db=3 violation fix-forward) + msg=8c456789 (T7 wiring scope clarification). T5 db=3 spec violation fix-forward (architect direct ratify pre-condition): * T5 chunk wiring (commit 5f42209) used ``settings.indexing_queue_redis_url`` for the RedisQuotaBackend, but chunk 4e §H.5.1 amendment locked ``broker=0 / memory=1 / WorkQueue=2 / Quota+EntityLock=3`` — ``indexing_queue_redis_url`` is db=2 (WorkQueue), not db=3 (Quota). * Architect msg=c279a0ff catches the spec drift; chunk 4e §H.5.1 amendment text locked the invariant but my T5 code violated it. Fixes: * ``aperag/config.py``: new ``INDEXING_QUOTA_REDIS_URL`` setting + default-derive that builds ``redis://USER:PASS@HOST:PORT/3``. Mirrors the chunk 4 ``indexing_queue_redis_url`` derive pattern but on db=3. * ``aperag/app.py``: the T5 lifespan dispatch now binds the Redis client to ``settings.indexing_quota_redis_url`` (db=3) instead of the WorkQueue URL. Aligns code with §H.5.1 spec text. * ``aperag/indexing/worker_factory.py``: the chunk 4b ``_redis_entity_lock_singleton`` also moved from ``indexing_queue_redis_url`` (db=2) to ``indexing_quota_redis_url`` (db=3) — §H.5.1 specifies BOTH Quota AND EntityLock keyspaces live in db=3 (the ``indexing:graph:entity:<slot>`` keys are quota-adjacent + the test fixture / docker compose runs both off the same logical DB so production wiring should match). T7 Wave 4 wiring scope amendment (architect-side scope clarification): * ``docs/modularization/indexing-redesign-design-pack.md`` §G.2.5.1 documents the T7 wiring scope split: items 1+3 (multimodal embedding API surface + provider capability flag) close in Wave 4; item 2 (real PDF / image-source extraction in parser pipeline) defers to Wave 5 alongside the parser canonical schema unification (per huangheng T1 obs B). * The Wave 3 lesson #10 explicit-gate (chunk 4b vision gate) stays effective until item 1 lands the ``embed_image(bytes) -> list[float]`` API surface and the operator opts into a multimodal model. * Until item 2 lands, T7 ships the embedder API surface + gate behaviour but the actual ``vision/images/<image_id>.<ext>`` reads gracefully fallback — the operator sees a clean ``error_message`` in the DocumentIndex row noting the parser-side gap. Architect-side meta lesson (folded into ``feedback_spec_lock_grep_verify_caller.md`` via ratify msg trail): spec amendment lock "logical db assignment" type invariants must **simultaneously grep-verify** the current code binding, otherwise spec text locks the invariant while code keeps violating it. The chunk 4e §H.5.1 amendment ratify did not grep- verify the chunk 4b ``_redis_entity_lock_singleton`` URL choice — this T5 fix-forward closes both gaps. Local gates green: - ``ruff check ./aperag ./tests`` clean - ``ruff format --check ./aperag ./tests`` clean (508 files) - ``pytest`` 179 passed / 2 skipped (Layer 2 stubs) Production-readiness three-class layer: - must-be-real: db=3 logical isolation per §H.5.1 amendment - may-be-gated: T7 item 2 (parser image-source extraction) defers to Wave 5 - fully-resolves: T5 ``INDEXING_QUOTA_BACKEND=redis`` wiring + T7 wiring scope spec amendment Wave 4 backlog status after this commit: - T8 chunk 4 (4a-4e + follow-ups): closed ✅ - T1: closed ✅ - T5: closed ✅ (per-callsite consumer relocate → Wave 5) - T7: scope amendment + gate behaviour locked; multimodal embedding API surface + parser image extraction → Wave 5 - T2 / T3 / T4 / T6 / T9: closed by chenyexuan ✅ Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…n Wave 4 Honest-scope follow-up to ``11e81b9f``: the prior §G.2.5.1 amendment text claimed "T7 wiring closes items 1+3 in Wave 4 (gate self- disables + API surface + provider flag)" but `11e81b9f` only ships the spec amendment text — items 1 / 2 / 3 all defer to a single Wave 5 PR per huangheng pass-1 + my own scope clarification msg= bad51f10. The three pieces (multimodal embedding API surface + parser image extraction + provider capability flag) are tightly coupled — splitting them risks the Wave 3 lesson #10 broken-pattern (API surface with no caller / provider flag with no API surface / etc.). The whole T7 implementation lands as one Wave 5 PR that bundles all three. Wave 4 close-out invariant: the chunk 4b vision gate stays effective; operators see a clean "Wave 4 wiring" ``WorkerFactoryError`` until full T7 lands in Wave 5. This commit clarifies the §G.2.5.1 text to match the actual T7 scope shipped in PR #1731 (doc-only) so the design pack does not silently lie about the implementation state. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
huangheng CR — pass-1 verdicts batch update #4 (final close-out: T1 + T5 + T7)Mirror from #celery thread per architect msg=b0f69983 + PM msg=d053d8b9. PR #1731 merge-ready — Wave 4 close-out gate 达成 (all 9 tasks in_review, architect ratify pass-3 final ✅). ✅ T1 — Real graph LLM extractor (
|
| # | Condition | Status |
|---|---|---|
| 1 | T5 db=3 fix-forward verify | ✅ msg=e69cbf1b pass-2 |
| 2 | T7 §G.2.5.1 honest-scope text align | ✅ msg=5e717200 pass-3 |
| 3 | T8 chunk 4 全 9 acceptance items closed | ✅ msg=9e1aae7c |
| 4 | T1 architect ratify | ✅ msg=d5a159d8 |
| 5 | T2/T3-2/T4/T6/T9 (chenyexuan) closed | ✅ |
| 6 | sweep A/B/C/D closed | ✅ |
| 7 | huangheng pass-1/2/3 verdicts on all chunks | ✅ |
| 8 | All 9 task_ids #17-25 in_review | ✅ msg=d053d8b9 |
| 9 | Wave 5 backlog 12+ items locked | ✅ |
| 10 | GitHub PR state CLEAN (3 review-comment batches mirrored) | ✅ this comment |
Wave 4 backlog completion summary (11 items)
| # | Task | Owner | Status | HEAD ref |
|---|---|---|---|---|
| 1 | T1 real graph LLM extractor | Bryce | ✅ closed | 2506f4d5 |
| 2 | T2 cleanup fan-out | chenyexuan | ✅ closed | 97dbe618 |
| 3 | T3 parser real wire | chenyexuan | ✅ closed (chunks 1+2) | 768d0aee + dfa7fdbe |
| 4 | T4 RedisWorkQueue | chenyexuan | ✅ closed | 650ef6dd |
| 5 | T5 RedisQuotaBackend wire | Bryce | ✅ closed (db=3 fix) | 5f42209d + 11e81b9f |
| 6 | T6 OTLP MetricsEmitter | chenyexuan | ✅ closed | 51aca50c + 9200866c |
| 7 | T7 multimodal vision-LLM | Bryce | 🟡 doc-only spec amendment; full impl Wave 5 batch | da576f62 |
| 8 | T8 graph 3 backend adapter | Bryce | ✅ closed (5 chunks + follow-ups) | f0571f98 → da576f62 |
| 9 | T9 fulltext multi-backend | chenyexuan | ✅ closed | 944805e7 + sweep A+B+C |
| 10 | sweep A/B/C/D | mixed | ✅ closed | 1e0bc01 + 4d36c7f |
| 11 | 6 spec amendments | architect/Bryce | ✅ locked | §C.3.1 / §D.3.5.1 / §H.5.1 / §H.5.2 / §K.7+§K.8 / §G.2.2.1 |
10/11 fully implemented + 1/11 (T7) doc-only spec amendment with Wave 5 implementation locked.
Wave 5 backlog 累积 (12+ items)
- legacy graphindex package elimination + retrieval/curation 迁移 (chunk 4d Option C deferred)
- e2e-http-compose lane stub model-provider fixture + Layer 2 test activation
aperag/indexing/llm.pyrelocate (build_collection_llm_callable+render_extraction_prompt)- T7 multimodal vision-LLM full implementation 3-item bundle (item 1 API + item 2 parser + item 3 provider flag)
- T1 graph extractor per-collection config tunability (per_chunk_timeout / max_entities / max_relations)
- T1 chunk_id schema unification on parser layer
- T2 cleanup
_resolve_cleanup_workertransient-vs-intentional error split - T2 cleanup builder share helper with dispatch builders (drift risk)
- T2 reconciler "document N min 无 document_index rows → re-enqueue parse"
- T3 chunk 2 parse_orchestrator parse_version short-circuit (idempotent skip)
- T3 chunk 2 tenant_scope_key org-prefix forward-compat
- fulltext additional backends (Solr / Typesense / MeiliSearch) extension
Layer 1 自动化 path metric: 20/20 pass-1 一次过率 ≥ 90% manual fresh threshold ✅
| # | Chunk | Session | LOC | Pass-1 |
|---|---|---|---|---|
| 1 | T8 chunk 1 (Postgres) | fresh | 545 | ✅ |
| 2 | T4 chunk 1 (RedisWorkQueue) | fresh | 376 | ✅ |
| 3 | T8 chunk 2 (Neo4j) | fresh | 844 | ✅ |
| 4 | T6 chunk 1 (OTLP) + follow-up | fresh + same | 525+11 | ✅ + 1 follow-up |
| 5 | T3 chunk 1 (DocParser) | same as T6 | 416 | ✅ |
| 6 | T8 chunk 3 (Nebula) | same as chunk 2 | 1169 | ✅ + ⭐ same-session catch chunk 1 latent SQL bug |
| 7 | T3 chunk 2 (q:parse async) | fresh | 1189/-101 | ✅ |
| 8 | T8 chunk 4a | fresh | +248/-88 | ✅ |
| 9 | T8 chunk 4b | same as 4a | +450/-52 | ✅ |
| 10 | T2 (cleanup fan-out) | same as T3#2 | +827/-17 | ✅ |
| 11 | T8 chunk 4c | same as 4b | +661 | ✅ + ⭐ same-session catch 5/6 Nebula schema-visibility |
| 12 | T9 (fulltext multi-backend) | same as T2 | +275/-31 | ✅ |
| 13 | T8 chunk 4d+4e | same as 4c | +509 | ✅ |
| 14 | §G.2.2.1 amendment | same | +11 docs | ✅ |
| 15 | Sweep A+B+C | same | +20/-10 | ✅ |
| 16 | Sweep D Layer 2 stub | same | +43 | ✅ |
| 17 | T1 graph LLM extractor | same | +683 | ✅ |
| 18 | T5 RedisQuotaBackend wire | same | +76/-1 | 🟡 db=3 drift surfaced |
| 19 | T5 db=3 fix-forward + T7 amendment | same | +67/-5 | 🟡 T7 narrative drift |
| 20 | T7 §G.2.5.1 honest-scope | same | +18 docs | ✅ |
Same-session ratio: 13/20 = ~65% (Bryce 11 commits + chenyexuan 4 commits). Layer 1 same-session debug-capability data points 2 cases (chunk 4a catch chunk 1 latent SQL bug + chunk 4c catch 5/6 Nebula schema-visibility latent fragments).
feedback_no_refresh_complete_all_tasks.md directive evidence base: Layer 1 ratified ≥ 90% manual fresh threshold throughout Wave 4.
Next steps
- @符炫炜 Wave 4 architect final review (含 Wave 5 backlog handoff message + same-session Layer 1 metric final summary) — 待 PR feat(celery Wave 4): real backends + 11 production-readiness items #1731 merge 后
- @不穷 PR feat(celery Wave 4): real backends + 11 production-readiness items #1731 GitHub merge action (CI 4/4 confirm)
- @huangheng Wave 4 close-out signal post merge
🎉 Wave 4 indexing redesign close-out — ALL 9 backlog tasks resolved.
…ent/intentional split + parse_version short-circuit + reconciler stuck-parse re-enqueue Wave 5 Phase 4 (PR-C / task #29) — closes 3 latent issues surfaced through Wave 4 ratify trail: * **P4-1 cleanup transient-vs-intentional split** (per architect msg=6aa8ca88 T2 ratify minor obs A): Pre-Wave-5 ``_resolve_cleanup_worker`` collapsed any factory exception into "drop the DB row". Transient infra errors (Qdrant blip, ES network glitch) lost the retry signal — DB row was deleted before next cycle could retry. Wave 5 P4 returns a new ``CleanupWorkerResolution(worker, transient)`` distinguishing the two: ``WorkerFactoryError`` is a by-design gate (drop the row to bound index growth), any other Exception is transient (skip DB row drop so next cycle retries when backend recovers). Counts surface ``transient_deferred`` so operators can track recovery rate. * **P4-2 parse_version short-circuit** (per huangheng T3 chunk 2 obs B + Wave 5 backlog item): Pre-Wave-5 ``parse_document`` always re-runs DocParser even when the resulting artifact directory is byte-identical (parse_version is content-derived). Wave 5 P4 adds ``short_circuit_if_artifacts_exist=True`` default — if all three derived artifacts (markdown.md / outline.json / chunks.jsonl) already exist under the canonical ``derived/parse_<version>/`` path, skip DocParser + writes entirely. Eliminates the ~30s OCR / Word rerun cost on rebuild of unchanged content. Tests can pass ``False`` to force re-parse. * **P4-3 reconciler stuck-document parse re-enqueue** (per architect Wave 4 T3 chunk 2 obs A — production gap close): Pre-Wave-5 parse failures (DocParser raise / source missing) silently dropped the document_id in the parse worker; operator saw ``document.status == PENDING`` forever with no signal to re-trigger. Wave 5 P4 adds ``reconcile_stuck_documents_for_parse_reenqueue`` to the reconciler loop. Detects documents with ``Document.gmt_created < now - cooldown_seconds`` AND zero ``document_index`` rows AND ``Document.gmt_updated < now - cooldown_seconds`` (cooldown filter prevents 30-s tick storm), then pushes a fresh ``ParseDispatchPayload`` matching the upload handler's contract. ``gmt_updated`` bumps after each push so the cooldown predicate rate-limits re-enqueue. Three-class tag (Wave 3 production-readiness invariant): * must-be-real: cleanup transient/intentional split prevents silent retry-signal loss; parse_version short-circuit eliminates OCR rerun waste; stuck-parse reconciler closes the document.status PENDING-forever gap * may-be-gated: ``short_circuit_if_artifacts_exist=False`` for tests pinning DocParser invocation count * fully-resolves: 3 of Wave 5 P1 backlog items (per architect ratify trail msg=6aa8ca88 + huangheng obs trail) Deltas: * ``aperag/indexing/cleanup.py`` — ``CleanupWorkerResolution`` dataclass + WorkerFactoryError-vs-Exception split in ``_resolve_cleanup_worker``; both ``cleanup_orphan_parse_versions`` + ``cleanup_for_deleted_documents`` honor ``resolution.transient`` to skip DB row drop on transient infra failures; collection cascade aggregates ``transient_deferred``. * ``aperag/indexing/parser.py`` — ``_all_artifacts_present`` predicate + ``short_circuit_if_artifacts_exist`` parameter + early-return ParseResult when all 3 artifacts present. * ``aperag/indexing/reconciler.py`` — ``STUCK_PARSE_COOLDOWN_SECONDS`` + ``reconcile_stuck_documents_for_parse_reenqueue`` async scan + ``_select_stuck_documents_for_reenqueue`` SQL query + ``_build_parse_payload_for_document`` payload reconstruction + ``_resolve_collection_parser_config`` / ``_resolve_collection_modalities`` (mirror ``document_service`` shape) + ``_mark_stuck_documents_reenqueued`` bumps gmt_updated; ``run_reconcile_loop`` calls the new scan. * ``aperag/indexing/__init__.py`` — re-export new symbols. * ``tests/integration/test_p4_robustness_3pack.py`` (new) — 13 tests across 3 layers (cleanup transient/intentional split / parse short-circuit / reconciler re-enqueue with cooldown + payload shape + skip-when-indexed + skip-when-no-object-path). * ``tests/unit_test/indexing/test_t3_1_dispatcher_path_c.py`` — added ``transient_deferred: 0`` to expected empty-counts dict. Local gates: * ``pytest tests/unit_test/indexing/ tests/integration/`` — 232 passed / 48 skipped (incl. 13 new ``test_p4_robustness_3pack``). * ``ruff check aperag/ tests/integration/test_p4_robustness_3pack.py`` — clean. * ``ruff format`` — applied. Branch: local ``chenyexuan/celery-wave5-p4`` based on main ``19d3d70`` (Wave 4 PR #1731 squash); will rebase onto ``bryce/celery-wave5`` once Bryce opens the Wave 5 draft PR. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ent/intentional split + parse_version short-circuit + reconciler stuck-parse re-enqueue Wave 5 Phase 4 (PR-C / task #29) — closes 3 latent issues surfaced through Wave 4 ratify trail: * **P4-1 cleanup transient-vs-intentional split** (per architect msg=6aa8ca88 T2 ratify minor obs A): Pre-Wave-5 ``_resolve_cleanup_worker`` collapsed any factory exception into "drop the DB row". Transient infra errors (Qdrant blip, ES network glitch) lost the retry signal — DB row was deleted before next cycle could retry. Wave 5 P4 returns a new ``CleanupWorkerResolution(worker, transient)`` distinguishing the two: ``WorkerFactoryError`` is a by-design gate (drop the row to bound index growth), any other Exception is transient (skip DB row drop so next cycle retries when backend recovers). Counts surface ``transient_deferred`` so operators can track recovery rate. * **P4-2 parse_version short-circuit** (per huangheng T3 chunk 2 obs B + Wave 5 backlog item): Pre-Wave-5 ``parse_document`` always re-runs DocParser even when the resulting artifact directory is byte-identical (parse_version is content-derived). Wave 5 P4 adds ``short_circuit_if_artifacts_exist=True`` default — if all three derived artifacts (markdown.md / outline.json / chunks.jsonl) already exist under the canonical ``derived/parse_<version>/`` path, skip DocParser + writes entirely. Eliminates the ~30s OCR / Word rerun cost on rebuild of unchanged content. Tests can pass ``False`` to force re-parse. * **P4-3 reconciler stuck-document parse re-enqueue** (per architect Wave 4 T3 chunk 2 obs A — production gap close): Pre-Wave-5 parse failures (DocParser raise / source missing) silently dropped the document_id in the parse worker; operator saw ``document.status == PENDING`` forever with no signal to re-trigger. Wave 5 P4 adds ``reconcile_stuck_documents_for_parse_reenqueue`` to the reconciler loop. Detects documents with ``Document.gmt_created < now - cooldown_seconds`` AND zero ``document_index`` rows AND ``Document.gmt_updated < now - cooldown_seconds`` (cooldown filter prevents 30-s tick storm), then pushes a fresh ``ParseDispatchPayload`` matching the upload handler's contract. ``gmt_updated`` bumps after each push so the cooldown predicate rate-limits re-enqueue. Three-class tag (Wave 3 production-readiness invariant): * must-be-real: cleanup transient/intentional split prevents silent retry-signal loss; parse_version short-circuit eliminates OCR rerun waste; stuck-parse reconciler closes the document.status PENDING-forever gap * may-be-gated: ``short_circuit_if_artifacts_exist=False`` for tests pinning DocParser invocation count * fully-resolves: 3 of Wave 5 P1 backlog items (per architect ratify trail msg=6aa8ca88 + huangheng obs trail) Deltas: * ``aperag/indexing/cleanup.py`` — ``CleanupWorkerResolution`` dataclass + WorkerFactoryError-vs-Exception split in ``_resolve_cleanup_worker``; both ``cleanup_orphan_parse_versions`` + ``cleanup_for_deleted_documents`` honor ``resolution.transient`` to skip DB row drop on transient infra failures; collection cascade aggregates ``transient_deferred``. * ``aperag/indexing/parser.py`` — ``_all_artifacts_present`` predicate + ``short_circuit_if_artifacts_exist`` parameter + early-return ParseResult when all 3 artifacts present. * ``aperag/indexing/reconciler.py`` — ``STUCK_PARSE_COOLDOWN_SECONDS`` + ``reconcile_stuck_documents_for_parse_reenqueue`` async scan + ``_select_stuck_documents_for_reenqueue`` SQL query + ``_build_parse_payload_for_document`` payload reconstruction + ``_resolve_collection_parser_config`` / ``_resolve_collection_modalities`` (mirror ``document_service`` shape) + ``_mark_stuck_documents_reenqueued`` bumps gmt_updated; ``run_reconcile_loop`` calls the new scan. * ``aperag/indexing/__init__.py`` — re-export new symbols. * ``tests/integration/test_p4_robustness_3pack.py`` (new) — 13 tests across 3 layers (cleanup transient/intentional split / parse short-circuit / reconciler re-enqueue with cooldown + payload shape + skip-when-indexed + skip-when-no-object-path). * ``tests/unit_test/indexing/test_t3_1_dispatcher_path_c.py`` — added ``transient_deferred: 0`` to expected empty-counts dict. Local gates: * ``pytest tests/unit_test/indexing/ tests/integration/`` — 232 passed / 48 skipped (incl. 13 new ``test_p4_robustness_3pack``). * ``ruff check aperag/ tests/integration/test_p4_robustness_3pack.py`` — clean. * ``ruff format`` — applied. Branch: local ``chenyexuan/celery-wave5-p4`` based on main ``19d3d70`` (Wave 4 PR #1731 squash); will rebase onto ``bryce/celery-wave5`` once Bryce opens the Wave 5 draft PR. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…unked-rotation) (#1733) * feat(celery Wave 5 P1 chunk 1): §K.9 spec + aperag/indexing/llm.py relocate (legacy graphindex elimination first chunk) Wave 5 task #26 first chunk per architect msg=10b5fae6 §K.9 spec amendment + PM msg=af82797e dispatch (single Wave 5 PR with 5-phase chunked-rotation commits, big-PR fast-landing per earayu2 msg=eced858d). Two changes bundled (per PM msg=af82797e "Phase 1 first commit = §K.9 spec paste + legacy graphindex elimination 实现起步"): 1. **§K.9 Wave 5 spec section** added to design pack: - 5-phase commit roadmap (16 acceptance items) - production-readiness 三类 layer per phase - architect direct ratify lane scope per phase - pre-check pattern lock per phase (per ``feedback_spec_lock_grep_verify_caller.md`` 双 pattern) - Layer 1 same-session continuation directive (per ``feedback_no_refresh_complete_all_tasks.md``) - Wave 5 acceptance summary (16 items + owner table) 2. **`aperag/indexing/llm.py` relocate** (§K.9.1 acceptance item 3): - new module owns canonical ``build_collection_llm_callable`` + ``render_extraction_prompt`` + ``ENTITY_RELATION_EXTRACTION`` template + ``LLMCall`` type alias - legacy ``aperag/domains/knowledge_graph/graphindex/integration.py`` and ``prompts.py`` re-export from new location during Wave 5 deprecation window so legacy retrieval/curation callers keep working until Phase 1 close-out - ``aperag/indexing/graph_extractor.py`` updated to import from new ``aperag.indexing.llm`` module directly (instead of legacy graphindex bridge) - test ``test_graph_extractor.py`` monkey-patch sites updated to target ``aperag.indexing.llm`` module Pre-check pattern 1 grep-verify caller cascade (per architect msg= b26f64b2 chunk 4d Option C ruling, used as reference list for Phase 1 migration scope): - ``aperag/indexing/`` → 0 cross-references to legacy ``graphindex/storage/*`` (chunk 4d narrowed scope invariant maintained ✅) - legacy ``graphindex/integration.py`` and ``prompts.py`` callers: ``retrieval/pipeline.py:85`` / ``knowledge_graph/service.py:69+`` / ``graph_curation/service.py:37`` / ``graph_curation/integration.py:21`` / ``service/prompt_template_service.py:161`` — remaining migrations land in subsequent Phase 1 commits Local gates green: - ``ruff check ./aperag ./tests`` clean - ``ruff format --check ./aperag ./tests`` clean (509 files) - ``pytest test_graph_extractor.py + test_full_indexing_pipeline.py + test_worker_factory.py + tests/unit_test/indexing/``: 179 passed / 2 skipped (Layer 2 stubs) Production-readiness three-class layer (Phase 1 partial): - must-be-real: ``aperag.indexing.llm`` module is the canonical production home for the relocated LLM helpers (no import indirection through legacy package once retrieval/curation migration completes) - may-be-gated: legacy ``graphindex/{integration,prompts}.py`` re-export shims kept active during Wave 5 deprecation window so cross-cutting refactor can land incrementally without breaking legacy retrieval/curation flow - fully-resolves: §K.9.1 Wave 5 acceptance item 3 (`aperag/indexing/llm.py` relocate); items 1 (legacy graphindex package elimination) and rest of Phase 1 cascade migration land in subsequent commits Next Phase 1 commits: migrate ``retrieval/pipeline.py`` / ``knowledge_graph/service.py`` / ``graph_curation/*`` to §G.5 read primitives; once all callers migrated → delete legacy ``graphindex/{storage, service, integration, engine, __init__}.py`` + delete legacy tests; final commit verifies grep-zero invariant. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor(celery Wave 5 P1 chunk 2): relocate ENTITY_RELATION_EXTRACTION import to aperag/indexing/llm Wave 5 task #26 chunk 2 per §K.9.1 acceptance item 3 cascade (legacy graphindex caller migration): `aperag/service/prompt_template_service.py:155-163` (the `get_default_prompt(prompt_type="graph")` branch) was importing the ENTITY_RELATION_EXTRACTION template from the legacy `aperag.domains.knowledge_graph.graphindex.prompts` module. Per chunk 1 relocate (`11113acb`), the canonical home for the template is now `aperag.indexing.llm`. This commit migrates the import site so the caller no longer transitively touches the legacy module. Behavior unchanged — the legacy `graphindex/prompts.py` shim already re-exports the template from the new location, so this is a pure import-path refactor that leaves the runtime payload identical. Local gates: - `ruff check ./aperag ./tests` clean - `ruff format --check ./aperag ./tests` clean (509 files) Phase 1 cascade progress: - chunk 1 ✅ (`11113acb`): §K.9 spec + `aperag/indexing/llm.py` relocate - **chunk 2** (this commit): `service/prompt_template_service.py` import relocate - chunk 3+ pending: `retrieval/pipeline.py` / `knowledge_graph/service.py` / `graph_curation/*` migration (these need new design — legacy 24-method `GraphStore` Protocol → new 10-method `LineageGraphStore` Protocol API surface is NOT a 1-to-1 replacement; per Bryce msg=30c7e994 scope reality check + architect msg=b052b1b4 cascade ratify lane lock) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(celery Wave 5 P4): production robustness 3-pack — cleanup transient/intentional split + parse_version short-circuit + reconciler stuck-parse re-enqueue Wave 5 Phase 4 (PR-C / task #29) — closes 3 latent issues surfaced through Wave 4 ratify trail: * **P4-1 cleanup transient-vs-intentional split** (per architect msg=6aa8ca88 T2 ratify minor obs A): Pre-Wave-5 ``_resolve_cleanup_worker`` collapsed any factory exception into "drop the DB row". Transient infra errors (Qdrant blip, ES network glitch) lost the retry signal — DB row was deleted before next cycle could retry. Wave 5 P4 returns a new ``CleanupWorkerResolution(worker, transient)`` distinguishing the two: ``WorkerFactoryError`` is a by-design gate (drop the row to bound index growth), any other Exception is transient (skip DB row drop so next cycle retries when backend recovers). Counts surface ``transient_deferred`` so operators can track recovery rate. * **P4-2 parse_version short-circuit** (per huangheng T3 chunk 2 obs B + Wave 5 backlog item): Pre-Wave-5 ``parse_document`` always re-runs DocParser even when the resulting artifact directory is byte-identical (parse_version is content-derived). Wave 5 P4 adds ``short_circuit_if_artifacts_exist=True`` default — if all three derived artifacts (markdown.md / outline.json / chunks.jsonl) already exist under the canonical ``derived/parse_<version>/`` path, skip DocParser + writes entirely. Eliminates the ~30s OCR / Word rerun cost on rebuild of unchanged content. Tests can pass ``False`` to force re-parse. * **P4-3 reconciler stuck-document parse re-enqueue** (per architect Wave 4 T3 chunk 2 obs A — production gap close): Pre-Wave-5 parse failures (DocParser raise / source missing) silently dropped the document_id in the parse worker; operator saw ``document.status == PENDING`` forever with no signal to re-trigger. Wave 5 P4 adds ``reconcile_stuck_documents_for_parse_reenqueue`` to the reconciler loop. Detects documents with ``Document.gmt_created < now - cooldown_seconds`` AND zero ``document_index`` rows AND ``Document.gmt_updated < now - cooldown_seconds`` (cooldown filter prevents 30-s tick storm), then pushes a fresh ``ParseDispatchPayload`` matching the upload handler's contract. ``gmt_updated`` bumps after each push so the cooldown predicate rate-limits re-enqueue. Three-class tag (Wave 3 production-readiness invariant): * must-be-real: cleanup transient/intentional split prevents silent retry-signal loss; parse_version short-circuit eliminates OCR rerun waste; stuck-parse reconciler closes the document.status PENDING-forever gap * may-be-gated: ``short_circuit_if_artifacts_exist=False`` for tests pinning DocParser invocation count * fully-resolves: 3 of Wave 5 P1 backlog items (per architect ratify trail msg=6aa8ca88 + huangheng obs trail) Deltas: * ``aperag/indexing/cleanup.py`` — ``CleanupWorkerResolution`` dataclass + WorkerFactoryError-vs-Exception split in ``_resolve_cleanup_worker``; both ``cleanup_orphan_parse_versions`` + ``cleanup_for_deleted_documents`` honor ``resolution.transient`` to skip DB row drop on transient infra failures; collection cascade aggregates ``transient_deferred``. * ``aperag/indexing/parser.py`` — ``_all_artifacts_present`` predicate + ``short_circuit_if_artifacts_exist`` parameter + early-return ParseResult when all 3 artifacts present. * ``aperag/indexing/reconciler.py`` — ``STUCK_PARSE_COOLDOWN_SECONDS`` + ``reconcile_stuck_documents_for_parse_reenqueue`` async scan + ``_select_stuck_documents_for_reenqueue`` SQL query + ``_build_parse_payload_for_document`` payload reconstruction + ``_resolve_collection_parser_config`` / ``_resolve_collection_modalities`` (mirror ``document_service`` shape) + ``_mark_stuck_documents_reenqueued`` bumps gmt_updated; ``run_reconcile_loop`` calls the new scan. * ``aperag/indexing/__init__.py`` — re-export new symbols. * ``tests/integration/test_p4_robustness_3pack.py`` (new) — 13 tests across 3 layers (cleanup transient/intentional split / parse short-circuit / reconciler re-enqueue with cooldown + payload shape + skip-when-indexed + skip-when-no-object-path). * ``tests/unit_test/indexing/test_t3_1_dispatcher_path_c.py`` — added ``transient_deferred: 0`` to expected empty-counts dict. Local gates: * ``pytest tests/unit_test/indexing/ tests/integration/`` — 232 passed / 48 skipped (incl. 13 new ``test_p4_robustness_3pack``). * ``ruff check aperag/ tests/integration/test_p4_robustness_3pack.py`` — clean. * ``ruff format`` — applied. Branch: local ``chenyexuan/celery-wave5-p4`` based on main ``19d3d70`` (Wave 4 PR #1731 squash); will rebase onto ``bryce/celery-wave5`` once Bryce opens the Wave 5 draft PR. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(celery Wave 5 P1 chunk 3): §K.9 Phase 1 scope narrow + §K.10 Wave 6 backlog Per architect Option (a) ruling 2026-04-27 (post-cascade scope check msg=30c7e994 + ratify msg=8780c937): * §K.9 Phase 1 narrowed scope: relocate-only (chunks 1+2 already shipped via `11113acb` + `e0707165`). Original Phase 1 scope (delete legacy `graphindex` package + migrate retrieval/curation callers) collided with unresolved design gap — legacy `GraphIndexService. query_context()` 24-method LightRAG-style API has no equivalent on the new 10-method `LineageGraphStore` Protocol; building that query layer is 1-2 week design + implementation effort, not in-scope for Wave 5 single-PR fast-landing style. * §K.10 Wave 6 backlog added: full legacy graphindex elimination + retrieval/curation migration + new LightRAG-style query layer design moves to Wave 6 separate PR with its own architect ratify lane (per chunk 4d Option C ruling msg=b26f64b2 + Wave 4 §K.8 precedent). * Wave 5 acceptance summary updated: 15/16 items in Wave 5 PR (item 1 legacy graphindex elimination → Wave 6). * Phase 1 close-out signal: chunk 1 (`11113acb`) +chunk 2 (`e0707165`) ship llm.py relocate + import re-route to new canonical home; legacy `graphindex/{integration,prompts}.py` keep deprecation-shim re-exports during the Wave 6 deprecation window so legacy retrieval/curation callers do not break mid-Wave-5. This commit closes Phase 1 narrowed scope and unblocks Phase 2 (T7 multimodal vision-LLM 3-item bundle) per architect ruling. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(celery Wave 5 P3): Layer 2 e2e fixture wiring + sweep D activation Wave 5 Phase 3 (task #28) — replaces the pre-Wave-5 ``pytest.skip`` stubs in ``test_full_indexing_pipeline.py`` Layer 2 with functional test bodies that the e2e-http-compose CI lane can execute against the live backend stack. Per architect msg=fdd53586 + chunk 4d+4e ratify msg=c279a0ff — Layer 2 contract was declared but skip-stub-only; this phase wires the actual implementation so the canonical Phase 1 invariant runs end-to-end whenever the operator opts in via ``RUN_E2E_PHASE1_SMOKE=1``. Two test bodies wired: * ``test_phase1_full_pipeline_vector_fulltext_summary_active_graph_vision_failed`` — the canonical Phase 1 smoke (architect msg=da3012a4): real ``ProductionWorkerFactory`` + parse_document + dispatch_indexing + worker pool drive until every modality finalises. Asserts vector + fulltext + summary reach ACTIVE; graph + vision finalise FAILED with a gate marker (``Wave 4 wiring`` chunk 4b / ``completion model`` post-T1 self-disable / ``multimodal`` vision gate). The gate-marker OR tolerates the same-Wave T1 self-disable surface so the test stays green across the chunk 4b → T1 closure. * ``test_phase1_multi_keyword_fulltext_search_returns_hits`` — sweep D verification (architect msg=fdd53586): index a real document via the worker pool, then issue a 3-keyword query through ``_fulltext_search`` and assert ≥1 hit. The retrieval- side ``minimum_should_match`` arithmetic over N×content + N×title should-clauses (huangheng msg=fb64468c flag) is the latent issue this exercises end-to-end. New fixture machinery: * ``_phase1_e2e_skip_reason()`` — central skip-reason resolver that documents the activation contract: ``RUN_E2E_PHASE1_SMOKE=1`` + ``PHASE1_E2E_COLLECTION_ID`` (pointing at a Collection seeded by the e2e-http-compose bootstrap with a real model provider configured) + backend env vars (``DATABASE_URL`` / ``ES_HOST`` / Qdrant + Redis env vars). Both Layer 2 tests share the same skip predicate so the lane runner only needs one env-var setup. * ``_resolve_phase1_e2e_engine()`` — opens a real Postgres engine using ``settings.database_url``; skips with a clear reason if unset. * ``_run_phase1_workers_until_quiet()`` — bounded async loop that drives ``ProductionWorkerFactory`` directly (mirroring the ``run_*_worker`` lifespan path) until every modality row reaches a terminal status. Bounded by ``timeout_seconds`` so a hung modality fails the test loud rather than blocking forever. Three-class tag (Wave 3 production-readiness invariant): * must-be-real: live ProductionWorkerFactory + real backends + real ES query * may-be-gated: skips when env vars missing (local-dev never requires the full stack) * fully-resolves: 2 of Wave 5 P1 backlog items (Layer 2 stubs activated for canonical full pipeline + sweep D multi-keyword smoke) Cleanup roundtrip (delete document → cleanup loop → backend artefacts gone) is left as a follow-up sub-test to land alongside the e2e-http-compose lane document-delete API access scaffolding. Local gates: * ``pytest tests/integration/test_full_indexing_pipeline.py`` — 4 passed / 2 skipped (Layer 2 skips cleanly with the new ``_phase1_e2e_skip_reason()`` until env vars set in CI lane). * ``pytest tests/unit_test/indexing/ tests/integration/`` — 232 passed / 48 skipped — no regression vs P4 baseline. * ``ruff check aperag/ tests/integration/test_full_indexing_pipeline.py`` — clean. * ``ruff format`` — applied. Branch: ``chenyexuan/celery-wave5-p4`` rebased on ``bryce/celery-wave5`` HEAD ``99af7965`` (Phase 1 chunk 3 spec amend) → push to ``bryce/celery-wave5`` so Wave 5 single-PR pile continues. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(celery Wave 5 P2 chunk 1): EmbeddingService.embed_image API surface (T7 item 1) Wave 5 task #27 chunk 1 per §G.2.5.1 amendment item 1: ``EmbeddingService.embed_image(image_bytes, alt_text)`` API surface is the canonical multimodal embedding entry point that replaces the Wave 3 placeholder ``embed_query(f"{image_id}|{alt_text}")`` string- concat (Wave 3 lesson #10 broken pattern). Behavior: * When ``self.multimodal=True`` (operator wired a multimodal embedder on the collection's spec), encode image bytes as base64 data URL + forward to LiteLLM ``embedding(input=[{"image_url":{"url":...}}, ...])`` shape that multimodal-capable providers (Voyage / Jina v3 / OpenAI multimodal / etc.) accept natively. * When ``self.multimodal=False``, raise ``EmbeddingError`` with clear operator-facing diagnostic — chunk 4b vision gate already prevents this state but runtime check is defense-in-depth (Wave 3 lesson #10 ship-incomplete-but-don't-silently-lie). * ``alt_text`` paired into LiteLLM input as a textual hint for embedders that accept multi-part inputs; image-only embedders silently drop the text element. The ``aembed_image`` async wrapper mirrors the ``aembed_query`` pattern (``asyncio.to_thread`` since the LiteLLM call is sync). Wave 5 P2 chunk-1 scope (this commit): * declares the API surface so Phase 2 chunks 2 (parser image extraction) + 3 (provider v3 capability flag UI) and the chunk 4b vision gate self-disable path have a concrete contract to wire to * uses imghdr-based MIME detection + LiteLLM's documented multimodal input shape; provider-specific input format variations defer to Wave 6 (per §K.10 Wave 6 backlog cross-cutting refactor) Local gates green: - ``ruff check ./aperag ./tests`` clean - ``ruff format --check ./aperag ./tests`` clean (510 files) - ``pytest`` 179 passed / 2 skipped — no regressions on existing EmbeddingService callers (text-only path unchanged) Production-readiness three-class layer: - must-be-real: real LiteLLM multimodal embedding call against the operator-configured provider when ``multimodal=True`` - may-be-gated: provider-specific input-format adjustments (Voyage vs Jina vs OpenAI multimodal differences) deferred to Wave 6 - fully-resolves: §G.2.5.1 item 1 (``embed_image`` API surface) + §K.9 Phase 2 partial — items 2+3 (parser image extraction + provider v3 capability flag UI) follow in subsequent commits Next Phase 2 chunks: - chunk 2: parser image extraction → ``derived/parse_<v>/vision/images/<image_id>.<ext>`` - chunk 3: provider v3 router multimodal capability flag UI exposure - chunk 4: ``worker_factory._build_vision_worker`` ``_embed`` callsite rewrite to use ``embed_image`` + chunk 4b vision gate self-disable verify (Phase 1 Layer 1 test rename) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(celery Wave 5 P5A item 1): per-collection knobs for T1 graph extractor caps + timeout Per huangheng T1 obs A msg=6b349693: surface the LightRAG-style extractor's per-chunk caps and LLM timeout as collection-level overrides (`KnowledgeGraphConfig.{max_entities_per_chunk, max_relations_per_chunk, per_chunk_timeout_seconds}`) so deployments tuning a slow LLM provider or extracting from entity-dense documents can lift the defaults without patching `aperag/indexing/graph_extractor.py` constants. * `aperag/schema/common.py`: 3 new optional fields on KnowledgeGraphConfig. * `aperag/indexing/graph_extractor.py`: `_resolve_int_kg_config` / `_resolve_float_kg_config` helpers (mirror `_resolve_entity_types` pydantic-attr / Mapping / JSON-string tolerance pattern + reject non-positive / non-numeric values with warning + default fallback); `_extract_one_chunk` now takes `timeout_seconds` kwarg and `asyncio.wait_for` wires it instead of the global constant. * `tests/integration/test_graph_extractor.py`: 6 new unit tests pinning override-wins / fallback-on-missing / fallback-on-non-positive / int-coerced-to-float / fallback-on-garbage; fixed pre-existing `_extract_one_chunk` call to pass `timeout_seconds=60.0`. Defaults unchanged (32 / 32 / 60.0); collections that don't set the new fields keep current behavior. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(celery Wave 5 P5B): P2 batch — utc_now unify + tenant org-prefix forward-compat + OTLP config cross-check + cleanup builder share Wave 5 Phase 5B (task #31) — 4 polish items from the Wave 4 ratify trail accumulated obs (chunk_id schema unify deferred to a follow-up once parser canonical schema lock lands): * **P5B-A utc_now → CURRENT_TIMESTAMP unify** (per huangheng chunk 4a obs A): ``_LineageEntityRow`` + ``_LineageRelationRow`` ``gmt_created`` / ``gmt_updated`` columns now use ``server_default=text("CURRENT_TIMESTAMP")`` mirroring the alembic migration declaration. Pre-Wave-5 ORM used ``default=utc_now`` Python-side; alembic check passed because Postgres treats both as semantic-equivalent but the per-mirror discipline is stronger when both layers speak the same dialect (schema-touching follow-ups cannot drift undetected). * **P5B-B tenant_scope_key org-prefix forward-compat** (per T3 chunk 2 obs C + §H.2 forward-compat lock): new ``aperag.indexing.parse_orchestrator.resolve_tenant_scope_key`` helper centralises the prefix construction. Pre-Wave-5 the prefix was hard-coded as ``f"user:{document.user}"`` at every callsite (upload handler, reconciler stuck-doc re-enqueue); the helper unifies the construction so a future Wave 6/7 organisation-tenant rollout flips one place. The org branch stays inert for Wave 5 (Document/Collection schemas don't carry org_id yet) — the helper just makes the seam explicit. * **P5B-C OTLP config cross-check** (per huangheng T6 chunk 1 obs): lifespan startup logs a clear warning when ``INDEXING_METRICS_EMITTER=otlp`` but ``APERAG_OBSERVABILITY_MODE`` is not set to ``otlp`` / ``collector``. Pre-Wave-5 this combination silently produced an ``OTLPMetricsEmitter`` whose ``MeterProvider`` was never installed by ``aperag.observability.metrics.init_metrics_provider`` — samples no-op'd silently, defeating the explicit ``INDEXING_METRICS_EMITTER=otlp`` opt-in. * **P5B-D cleanup builder share helper** (per huangheng T2 obs B drift risk): new ``_build_collection_qdrant_connector(collection, *, allow_vector_size_fallback)`` shared helper used by ``_build_vector_worker`` / ``_build_summary_worker`` (dispatch, ``allow_vector_size_fallback=False``) AND ``_build_qdrant_cleanup_backend`` (cleanup, ``allow_vector_size_fallback=True``). Pre-Wave-5 dispatch and cleanup paths duplicated the embedder + connector wiring; a future Qdrant adapter signature change had to be applied twice. ``_build_vision_worker`` keeps its pre-helper shape — its ``is_multimodal()`` gate must fire BEFORE any network call to preserve the Wave 4 chunk 4b "fail fast on non-multimodal embedder" invariant the existing test pins. Three-class tag (Wave 3 production-readiness invariant): * must-be-real: 4 polish items each tighten a real production seam (alembic mirror discipline / org-prefix forward-compat / OTLP config consistency / cleanup-vs-dispatch drift surface) * may-be-gated: org-prefix branch stays inert until org_id columns land (declared in helper docstring) * fully-resolves: 4 of the 5 P5B chenyexuan-batch backlog items. ``chunk_id`` schema unify is left to a follow-up — parser canonical schema lock first needs the §C.x amend; touching the ``or chunk.get("id")`` fallback now without that lock would introduce drift in the opposite direction. Deltas: * ``aperag/indexing/graph_storage/postgres.py`` — ``_LineageEntityRow`` + ``_LineageRelationRow`` switched to ``server_default=CURRENT_TIMESTAMP``; ``utc_now`` import dropped. * ``aperag/indexing/parse_orchestrator.py`` — ``resolve_tenant_scope_key(*, document, collection)`` helper + re-export. * ``aperag/domains/knowledge_base/service/document_service.py`` — ``_create_or_update_document_indexes`` calls ``resolve_tenant_scope_key`` in place of the inline ``f"user:{document.user}"``. * ``aperag/indexing/reconciler.py`` — ``_build_parse_payload_for_document`` calls the same helper for the stuck-document re-enqueue producer (consistency with upload handler). * ``aperag/app.py`` — lifespan logs the OTLP-vs-observability-mode cross-check warning before constructing ``OTLPMetricsEmitter``. * ``aperag/indexing/worker_factory.py`` — ``_build_collection_qdrant_connector`` shared helper + ``_build_vector_worker`` / ``_build_summary_worker`` / ``_build_qdrant_cleanup_backend`` reuse it; vision keeps its multimodal-gate-first shape. Local gates: * ``pytest tests/unit_test/indexing/ tests/integration/`` — 232 passed / 48 skipped (no regression). * ``ruff check aperag/ tests/integration/`` — clean. * ``ruff format`` — applied. Branch: ``chenyexuan/celery-wave5-p4`` rebased on ``bryce/celery-wave5`` HEAD ``4f0e9b0`` (P2 chunk 1 embed_image API surface) → push to ``bryce/celery-wave5`` for Wave 5 single-PR pile. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(celery Wave 5 P5A item 4): Neo4j label namespace prefix `aperag_` Per §K.9.1 P5A item 14 + design pack §K.9 Phase 5A item 4: deployments that share a Neo4j instance with user-owned graphs would collide on the generic `LineageEntity` / `LineageRelation` labels. Bump them to `aperag_LineageEntity` / `aperag_LineageRelation` and align constraint names (`aperag_lineage_entity_collection_name_unique` / `aperag_lineage_relation_collection_triple_unique`). Hard-cut second round (no data migration): Wave 4 graph indexing defaulted `enable_knowledge_graph=False`, so no production deployment has data on the legacy unprefixed labels. Postgres / Nebula backends are unaffected (table names already namespaced via `aperag_lineage_*` schema in alembic migration `e7a3b9c2d1f6`). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(celery Wave 5 P5A close-out): items 2/3/5 deferred to Wave 6 with explicit reasoning Per Wave 5 P5A scope close-out 2026-04-27: * Item 2 (chunk 4b _no_op_extractor identity check → attribute marker) is N/A: the placeholder was deleted in Wave 4 T1 (`19d3d70f`) when `build_collection_graph_extractor` replaced `_no_op_extractor`. There is no surviving identity-check site to harden — moot in current code. * Item 3 (W5-perf-graph-lineage parallel-list O(N) cross-backend) is deferred to Wave 6: Postgres needs JSONB → text[] migration, Nebula needs tag re-model, plus alembic column-shape change. >10k docs/entity is not a Wave 5 acceptance criterion; trigger when observed in real deployments. * Item 5 (Cypher type keyword rename `n.type` → `entity_type` / `relation_type`) is deferred to Wave 6: cross-backend rename (Postgres column + alembic; Cypher property rename; Nebula tag-prop rename) plus §D.3 Protocol-surface change (`EntityRecord.type` → `EntityRecord.entity_type` etc.). Cypher `TYPE()` is technically only for relationships so current code is unambiguous — forward-compat hygiene rather than correctness fix. §K.9 Phase 5A acceptance lines amended to reflect ship status (items 1+4 shipped; 2 N/A; 3+5 → Wave 6). §K.10 Wave 6 backlog extended with items 6+7 covering the deferred cross-backend rewrites. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(celery Wave 5 P2 chunk 2): parser image extraction → derived/parse_<v>/vision/{images,source.jsonl} Per §G.2.5.1 spec amend item 2: when DocParser produces `AssetBinPart` payloads (PDF page images, single-image-input passthrough, data-URI extracted images), the parser writes each image blob to `derived/parse_<v>/vision/images/<image_id>.<ext>` and lands a `vision/source.jsonl` descriptor enumerating them. The vision worker (chunk 4) consumes the descriptor instead of the T1 simulator's synthetic `images.json` companion. `aperag/indexing/parser.py`: * `_docparser_extract_markdown` now returns `tuple[str, list[_VisionImageAsset]]` — markdown body unchanged plus the extracted image assets list. Non-image `AssetBinPart`s (audio / PDF data) drop. Duplicate `asset_id`s deduplicated to keep Qdrant-point-id stable. * `_VisionImageAsset` carrier holds (`image_id`, `data`, `mime_type`, `alt_text`, `page_idx`, `bbox`). * `_vision_image_extension(mime_type)` maps known image MIMEs to a filename extension; falls back to `.bin` for unknown types since vision worker uses `imghdr` on bytes at embed time. * `_write_vision_assets` persists each blob via `write_atomic` and writes the JSONL descriptor. * `ParseResult.vision_source_path` and `vision_image_count` are new optional fields (defaults `""` / `0`) so pre-Wave-5 callers see no behaviour change. Image-only inputs (no markdown emitted) still land their assets so vision modality has bytes to embed. `tests/unit_test/indexing/test_parser_image_extraction.py`: 9 unit tests covering MIME → extension mapping, descriptor schema, simulator-path no-op, image-only input asset persistence, unknown-MIME `.bin` fallback, and the new `ParseResult` fields. Tests monkeypatch `_docparser_extract_markdown` to avoid pulling MinerU / MarkItDown into the unit-test path. Production-readiness 三类: - must-be-real: real `AssetBinPart` extraction wired through DocParser - may-be-gated: descriptor + image artefacts only land when DocParser surfaces `AssetBinPart`s (image-less docs see no vision/ writes) - fully-resolves: §G.2.5.1 spec item 2 (parser image extraction) + Wave 5 P2 chunk 2 acceptance The vision worker callsite rewrite (chunk 4) and provider v3 multimodal flag UI (chunk 3) remain. Defaults preserve existing behaviour for text-only callers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(celery Wave 5 P2 chunk 3): Model.supports_multimodal_embedding capability flag Per §G.2.5.1 spec amend item 3: surface a typed capability flag for embedding models that accept image bytes (CLIP / Voyage Multimodal / Jina v3 / OpenAI multimodal embeddings) so operators can register them via the v3 model platform UI. The flag drives the chunk 4b vision gate's `EmbeddingService.is_multimodal()` runtime check — flip it on the collection's embedder spec model and the gate self-disables, enabling vision modality. Distinct from existing `supports_vision` (chat/completion models that accept image input) — `supports_multimodal_embedding` describes embedding models that produce vectors from images. Changes: * `aperag/domains/model_platform/schemas.py`: new optional `supports_multimodal_embedding: bool = False` on `Model` / `ModelCreate` / `ModelUpdate` (default False, no behaviour change for existing rows). * `aperag/domains/model_platform/db/models.py`: new `supports_multimodal_embedding` Boolean column with default False. * `aperag/migration/versions/...f1c8d2a5b6e3...`: alembic migration adding the column with `server_default=false`. Forward-only cutover safe because pre-Wave-5 rows default to False (matches prior implicit behaviour). * `aperag/domains/model_platform/service/model_service.py`: `_model_to_schema` maps the new column. * `aperag/db/repositories/llm_provider.py`: `create_model` accepts the new field so the v3 `POST /models` flow persists it. * `aperag/llm/embed/base_embedding.py`: `get_embedding_service` prefers the typed `supports_multimodal_embedding` column; falls back to the legacy `runner_config["multimodal"]` JSON entry so pre-Wave-5 operators who edited the JSON keep working without re-saving the row. Tests: 3 new unit tests in `test_model_platform_v3_contract.py` covering default-False behaviour, opt-in True path, and v3 OpenAPI schema exposure on Model / ModelCreate / ModelUpdate. Production-readiness 三类: - must-be-real: real DB column + alembic migration + v3 API surface - may-be-gated: legacy `runner_config["multimodal"]` fallback for rows created before the typed column landed - fully-resolves: §G.2.5.1 spec item 3 (provider v3 multimodal capability flag UI) + Wave 5 P2 chunk 3 acceptance Phase 2 chunk 4 (callsite rewrite + chunk 4b gate self-disable verify + Layer 1/2 test rename) remains as the final Wave 5 blocker. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(celery Wave 5 P2 chunk 4): vision callsite rewrite + chunk 4b gate self-disable verify Per §G.2.5.1 spec amend final piece: rewire `_build_vision_worker._embed` to call `EmbeddingService.embed_image(image_bytes, alt_text)` (chunk 1) with the actual image bytes the parser persisted (chunk 2 wrote `derived/parse_<v>/vision/images/<image_id>.<ext>` + a JSONL descriptor at `vision/source.jsonl`). The chunk 4b vision gate self-disables when the operator flips `Model.supports_multimodal_embedding=True` (chunk 3). `aperag/indexing/worker_factory.py`: * `_embed(image_id, alt_text, image_bytes=None)` — when image_bytes is provided, route to `embedding_service.embed_image(image_bytes, alt_text)`. None falls back to the legacy text-concat path so the T1 simulator + tests that hand the worker synthetic JSON keep working. * Gate-raise message reframed: drops "Wave 4 wiring" phrasing (now Wave 5 wiring is land), names the typed `Model.supports_multimodal_embedding` flag so an operator can fix the config directly. `aperag/indexing/vision.py`: * `VisionModality.derive` accepts both source-path formats: the legacy single-JSON-array shape (T1 simulator / pre-Wave-5 tests) AND the new JSONL-with-image-path shape (parser chunk 2 output). Format detection is by first non-whitespace byte (`[` → JSON array, else → JSONL). * `_load_image_bytes(record)` reads the descriptor's `image_path` through the object store; missing blob logs a warning and returns None (embedder still runs on the alt-text/id placeholder digest) so a partial parser write doesn't block the whole derive cycle. * `_placeholder_embedding(..., image_bytes=None)` mirrors the new embedder signature; placeholder ignores bytes. * Embedder Protocol is widened: `(image_id, alt_text, image_bytes=None)`. `tests/integration/test_full_indexing_pipeline.py`: * Renamed Layer 1 `test_phase1_vision_modality_raises_wave4_wiring_gate` → `..._gate_raises_when_embedder_not_multimodal`. Asserts the reframed message names `multimodal embedding model` + `supports_multimodal_embedding` flag. * New positive-path Layer 1 `test_phase1_vision_modality_gate_self_disables_when_embedder_multimodal`: with `is_multimodal()=True`, the factory builds a vision worker without raising — pins the chunk 4 gate-self-disable contract. * Layer 2 e2e assertion: vision modality may be ACTIVE (when CI fixture has multimodal embedder configured) OR FAILED with a gate marker including `supports_multimodal_embedding`. OR-on-marker tolerance kept for transition state. `tests/unit_test/indexing/test_t1_4_summary_vision.py`: 3 new tests covering JSONL descriptor with image_path (bytes loaded + forwarded), missing-blob graceful fallback, and legacy JSON-array backward compat. Production-readiness 三类: - must-be-real: real `embed_image` callsite + real bytes load from parser's descriptor - may-be-gated: legacy text-concat fallback + simulator JSON format preserved for tests - fully-resolves: §G.2.5.1 spec items 1+2+3 all wired end-to-end + chunk 4b vision gate self-disable verify (Wave 5 P2 closure) Wave 5 P2 (T7 multimodal vision-LLM 3-item bundle) closed. The chunk 4b vision gate self-disables when an operator configures a multimodal embedder; default-off behaviour preserved for text-only collections. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Bryce <bryce@apecloud.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Wave 4: Indexing redesign — real backends + 11 production-readiness items
Continues the indexing redesign hard-cut from Wave 3 (PR #1729 merged at
00ae644f+ final review msg=2721a5e7). Wave 4 wires real backends for the 6 modality / infrastructure layers that Wave 3 explicitly gated or kept as InMemory placeholders with production-readiness invariant declarations.Per architect msg=fab88774 (Wave 1+2 gap report) + msg=0aea9edf (graph 3-adapter migration ruling) + msg=803a2757 (owner allocation) + msg=7a8ce889 (context-budget ruling): 11 hard-locked items, single PR + chunked-rotation push (per @earayu2 msg=1d77340c "大 PR 快速落地" model).
Wave 4 backlog (11 items, hard-locked)
LineageGraphStoreadapter_no_op_extractorwith real LLM (GPT-4 / Claude / DeepSeek per collection.config)workers={}5 modality singleton fan-outq:parseasync queue per §E.2)07599fac)WorkQueuebackend650ef6dd)QuotaBackendLua atomic (§H.5)(resource_class, tenant_scope_key)+ multi-tenant fairnessMetricsEmitterwire-in production (§J.1)OTLPMetricsEmitterplug toaperag/observability/+ 4 SLI emit68588420) self-disables onis_multimodal()==TrueLineageGraphStore) + delete legacyaperag/domains/knowledge_graph/graphindex/storage/*-3500/+3000LOC net)collection.config.fulltext_backend_type) + sweep_fulltext_search:350minimum_should_match latentPushed commits
f0571f98— T8 chunk 1: PostgresLineageGraphStore reference adapter (~545 LOC, 2 files; 10-method §D.3.5 Protocol; JSONB SET storage; single-statement INSERT ON CONFLICT atomicity; tenant isolation 三层防护). Architect schema ratify msg=95179f2a; huangheng pass 1 ✅ msg=b6f20096.650ef6dd— T4 chunk 1: RedisWorkQueue + lifespan dispatch onINDEXING_QUEUE_BACKEND+ 6 integration tests (multi-consumer atomic demux core invariant; per-modality keying; close+reconnect lifecycle). production-readiness invariant 三类 layer 全 declared. huangheng pass 1 ✅ msg=8ff8767e.In-progress (fresh sessions)
LineageGraphStoreadapter — Cypher list-of-MAP property + strip-then-append atomicity ([x IN list WHERE NOT (...) | x] + [{...}]) + 4 design points portable from Postgres reference (msg=95179f2a)OTLPMetricsEmitterimpl +INDEXING_METRICS_EMITTERconfig dispatch + 4 SLI emit + lifespan wireProduction-readiness invariant pattern (Wave 3 lesson #10 sediment)
Each Wave 4 item declares 3-layer in commit description + docs:
References:
memory/feedback_alembic_drift_check.md— schema-touching 三检memory/feedback_e2e_dataflow_trace.md— 9 lessons + Lesson feat: auth with github and google #10 explicit-gate self-disable patternmemory/feedback_production_readiness_invariant.md— spec/impl 互补 lessonCR (huangheng) pass-1 lock 4 items
每 chunk push 后立即 cross-check:
upgrade head + check + downgrade -1 + upgrade head)isinstance(InMemory*) → raise)Graph T8 chunk 4 acceptance lock (per huangheng pass 1 msg=b6f20096 + architect msg=95179f2a)
_LineageEntityRow+_LineageRelationRow必有对应 migration filedescription列 = 100% legacy 残留 → drop (RelationWithLineage dataclass 无此字段)asyncio.run/run_coroutine_threadsafein factory)Test plan
Style note (per @earayu2 msg=1d77340c)
大 PR + chunked-rotation 多 session push + huangheng multi-pass review + fix-forward (与 Wave 3 PR #1729 同款,避免 PR 拆分 churn)。Bryce + chenyexuan 共享 branch;写集 disjoint per architect msg=06c14844。
🤖 Generated with Claude Code