feat(celery Wave 6 #34): chunk_id schema canonical unification#1738
Merged
Conversation
Per docs/modularization/indexing-redesign-design-pack.md §K.11.1 #34 (huangheng T1 obs B + Wave 5 P5B chunk_id 5th item deferred). ## Changes aperag/indexing/vector.py - VectorBackend.upsert_point(chunk_id=) → upsert_point(point_id=) (canonical Qdrant naming, aligned with summary/vision protocols) - InMemoryVectorBackend record key renamed chunk_id → point_id - Vector worker callsite passes point_id=chunk["chunk_id"]; chunk_id remains in payload for hybrid-dedup with fulltext (§C.6 trade-off lock) aperag/indexing/worker_factory.py - _QdrantPointBackend.upsert_point(): single point_id keyword (drop pre-Wave-6 dual chunk_id|point_id transition shim) - Drop merged_payload.setdefault("chunk_id", identifier) — adapter no longer injects misleading chunk_id into summary/vision payloads (their points are not chunks); each modality controls its payload aperag/indexing/reconciler.py - utc_now audit comment: utc_now usage in reconciler is application-level wall-clock for lease comparison + gmt_updated stamps, distinct from Wave 5 P5B server_default=CURRENT_TIMESTAMP ORM-creation defaults. No further migration needed. tests/unit_test/indexing/test_chunk_id_schema_canonical.py (new) - 7 contract tests pin canonical naming: * VectorBackend / SummaryBackend / VisionBackend Protocols use point_id * InMemoryVectorBackend round-trips point_id at record level + chunk_id at payload level * Legacy chunk_id= keyword raises TypeError on InMemoryVectorBackend * _QdrantPointBackend.upsert_point uses single point_id param * _QdrantPointBackend does not inject chunk_id into payload * Parser chunks.jsonl chunk_id field naming preserved tests/integration/test_cleanup_fan_out.py tests/unit_test/indexing/test_t1_3_vector_fulltext.py tests/unit_test/indexing/test_t2_1_runtime.py tests/unit_test/indexing/test_t3_1_dispatcher_path_c.py - backend.upsert_point(chunk_id=...) → point_id=... - record reads p["chunk_id"] → p["point_id"] / p["payload"]["chunk_id"] ## Production-readiness 三类 layer (per §K.11.4) - must-be-real: parser layer chunk_id field schema unified across vector/summary/vision backend protocol surfaces; remaining utc_now usage audited (reconciler.py only, application-level, distinct from ORM defaults Wave 5 P5B already migrated) - may-be-gated: vector worker still keeps chunk_id in payload for hybrid-dedup with fulltext (§C.6 contract preserved) - fully-resolves: huangheng T1 obs B + Wave 5 P5B chunk_id 5th item deferred (per spec §K.11.1 row 34) ## simple-stable 4 guardrail (per feedback_simple_stable_zero_maintenance.md) 1. 不无限扩范围 ✅ — scope limited to vector protocol + adapter, no cross-cutting touch 2. 功能做实 ✅ — real schema rename, no transitional shim left behind 3. 简单稳定 ✅ — drops dual-API polymorphism + drops hidden payload side-effect (setdefault chunk_id), each layer's contract is now self-evident from signature 4. 私有化免维护 ✅ — operator/dev reading code sees canonical naming without needing to track which arg name belongs to which modality ## hard-cut policy (per earayu2 msg=30c81478) No production data → schema breaking change applied directly. No backward-compat shim, no deprecation window. Test that the legacy chunk_id= keyword raises TypeError catches accidental regression. ## Pre-check (per K.11.5 Pattern 2) grep upsert_point across aperag/indexing/ + tests/ — all callsites audited, all callers migrated, contract test pins canonical name. ## Local gates - 152/152 indexing unit tests pass - 21/21 indexing integration tests pass (cleanup_fan_out + dispatch_with_parse + inline_mode_smoke + parse_async_roundtrip) - 7/7 new contract tests pass - ruff check + ruff format clean Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…WARN log Per docs/modularization/indexing-redesign-design-pack.md §K.11.1 #38 (huangheng PR #1734 sync point 7 + feedback_announce_equals_landed.md narrative-truth invariant). ## Problem (pre-Wave-6 narrative-truth violation) `aperag.cache.application_runtime.get_application_cache()` keyed the cached `ApplicationCache` instance on the running asyncio event loop (the Redis async client is loop-bound). When the running loop changed (test process restart, worker reinit, asyncio re-initialisation, etc.), the function silently swapped the cache for a `NoopApplicationCacheBackend` with `enabled=False`. From that point on the worker paid the full LiteLLM / embedding round-trip on every request, with no log line, no metric, and no operator-visible signal that caching had degraded to zero. ## Fix When the loop-switch branch fires: - emit a WARN log with operator-actionable phrasing - bump `application_cache_metrics["application_runtime"]["loop_switch_rebuild"]` (uses the existing ApplicationCacheMetrics instance — no new metrics infra) - reset `_async_cache=None` and fall through to the normal initialisation path (which re-establishes the real Redis client on the new loop) The Noop fallback is now reached only when Redis is genuinely unreachable on the new loop — never as a silent loop-switch downgrade. ## Tests (4 new) tests/unit_test/cache/test_application_cache.py - test_application_cache_rebuilds_on_loop_switch_instead_of_silent_noop: two `_run_in_new_loop()` calls produce distinct ApplicationCache instances, second wraps real `ApplicationRedisCacheBackend` (not `NoopApplicationCacheBackend`); metric counter incremented to 1; WARN log captured. - test_application_cache_does_not_increment_loop_switch_metric_on_first_build: first-ever build does not bump the loop-switch counter. - test_application_cache_repeat_call_in_same_loop_is_singleton_no_metric_bump: same-loop repeat returns identical singleton, no metric bump. - test_application_cache_falls_back_to_noop_when_redis_unreachable_on_new_loop: if Redis genuinely fails on the new loop, rebuild attempts the real client, falls back to Noop, but metric still records the loop switch (operators see the rebuild attempt + the Redis failure WARN that was already there). ## Production-readiness 三类 layer (per §K.11.4) - must-be-real: real WARN log + real metric counter + real Redis client rebuild on cross-loop call (no silent zero-cache) - may-be-gated: first cross-loop request may pay rebuild cost (re-ping Redis) — observable via metric, not silent - fully-resolves: huangheng PR #1734 sync point 7 + feedback_announce_equals_landed.md narrative-truth lock ## simple-stable 4 guardrail 1. 不无限扩范围 ✅ — single-function fix in application_runtime.py, reuses existing `application_cache_metrics` infra, no new public API 2. 功能做实 ✅ — real cache rebuild not a placeholder 3. 简单稳定 ✅ — fewer paths than before (one rebuild path replaces the silent-Noop branch), each branch logged 4. 私有化免维护 ✅ — operator can grep `loop_switch_rebuild` in metrics + log "rebuilding for new event loop" to diagnose worker setup issues ## Local gates - 43/43 cache unit tests pass (39 existing + 4 new) - ruff check + ruff format clean Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Wave 6 task #34 — parser canonical schema unification (per spec
docs/modularization/indexing-redesign-design-pack.md§K.11.1 row 34).Canonical name
point_idfor the Qdrant point identifier across the three modalities that share the{delete_by_filter, upsert_point}adapter (vector / summary / vision):VectorBackend.upsert_point(chunk_id=)→upsert_point(point_id=)(matches summary + vision protocol shape)_QdrantPointBackend.upsert_point()drops dualchunk_id | point_idtransition shim → singlepoint_idparamchunk_idinto summary / vision payloads (their points are not chunks); each modality now controls its own payload schemachunk_idin payload (parser-emitted, what hybrid retrieval reads) — §C.6 trade-off preservedutc_now audit: only
aperag/indexing/reconciler.pyusesutc_now, and those uses are application-level (lease comparison +gmt_updatedstamps), distinct from Wave 5 P5B'sserver_default=CURRENT_TIMESTAMPmigration which covers ORM creation defaults. Documented inline; no further migration needed.Production-readiness 三类 layer (per §K.11.4)
simple-stable 4 guardrail check
hard-cut policy
Per earayu2 msg=30c81478: no production data → schema breaking change applied directly. No backward-compat shim, no deprecation window. New contract test asserts that
chunk_id=keyword now raisesTypeErrorto catch accidental regression to the pre-Wave-6 dual API.Files changed
aperag/indexing/vector.py— protocol param + InMemoryVectorBackend record key + worker callsiteaperag/indexing/worker_factory.py—_QdrantPointBackend.upsert_point()simplificationaperag/indexing/reconciler.py— utc_now audit commenttests/unit_test/indexing/test_chunk_id_schema_canonical.py(new, 7 contract tests)tests/integration/test_cleanup_fan_out.py(5 callsite updates)tests/unit_test/indexing/test_t1_3_vector_fulltext.py(hybrid-dedup test reads payload chunk_id now)tests/unit_test/indexing/test_t2_1_runtime.py(3 callsite updates)tests/unit_test/indexing/test_t3_1_dispatcher_path_c.py(3 callsite updates + 1 record read)Test plan
tests/unit_test/indexing/passtest_cleanup_fan_out+test_dispatch_with_parse+test_inline_mode_smoke+test_parse_async_roundtrip)test_chunk_id_schema_canonical.py)🤖 Generated with Claude Code