
docs(modularization): add current-state architecture doc (#27) #1636

Merged. earayu merged 5 commits into `main` from `huangheng/modularization-architecture-doc` on Apr 24, 2026
earayu (Collaborator) commented Apr 24, 2026

Summary

Why now

The Phase 6 merge left three docs (`README.md` / `roadmap.md` / `target-domain-map.md`) as the de facto entry point, but all three are historical plan docs written before Phase 6. Task #27 (`启动发射!`, "Launch!") asked the supporting architect to write a complete current-state architecture doc so that the 12-domain layout, the 20 boundary tests, the 2 permanent CRITICAL_WIRINGS, and the canonical rules codified over the 49-commit Phase 3+4+5 run (plus the 5-commit Phase 6 cleanup) have one authoritative reference.

What the doc covers (8 sections per PM spec)

  1. Executive summary — Phase 0→6 PR + merge commit table + steady-state facts.
  2. Domain map — per-domain inventory for all 12 backend domains (DB symbols, schemas count, services, consumer-owned Protocols, API routes) + top-level infra modules + the strict `aperag/schema/common.py` shared-primitive admission criterion + high-level `indexing` / `retrieval` / `knowledge_graph` structural summary (deeper SME input invited).
  3. Canonical rules — direct import vs `Protocol + DI`; the two subclasses (legacy-not-moved-yet vs standalone-infra permanent); dual-hook Scenario A re-export with the `sys.modules.get(...)` trick; per-domain `AuthenticatedUser(Protocol)` + User write hierarchy (lesson 9a-sexdec: facade > text SQL > forbidden).
  4. Boundary gates — G1 / G4 / G10 / G14 / G15 / G16 / G17 / G18 alt / G19 catalog with the exact pytest function backing each gate.
  5. Runtime seams — the two permanent `CRITICAL_WIRINGS` entries + why `dispatch_fn` is intentionally excluded.
  6. Legacy shim lifecycle — full file inventory for `aperag/service/*.py`, `aperag/views/*.py`, `aperag/db/models.py` re-exports, `aperag/schema/view_models.py` dual-hook blocks, and the residual `aperag/agent_runtime/` + `aperag/evaluation_v2/` top-level packages.
  7. Historical index — links to every phase2/3/4/5 breaking-changes doc + the three pre-Phase-6 roadmap docs.
  8. Future candidates — legacy shim hard-delete, `_enum_column` consolidation, residual legacy service/view extraction, etc., explicitly listed as non-commitments.
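
To make the `Protocol + DI` rule in item 3 concrete, here is a minimal sketch, not the repository's code. Only the `_quota_ops` slot name and `AuthenticatedUser(Protocol)` appear in this PR; `QuotaOps`, `BotService`, `InMemoryQuota`, `DemoUser`, and every method name are hypothetical stand-ins.

```python
from typing import Optional, Protocol


class AuthenticatedUser(Protocol):
    """Per-domain view of a user: only the attributes this domain reads."""

    id: str


class QuotaOps(Protocol):
    """Consumer-owned port (hypothetical): declared by the consuming domain."""

    def consume(self, user_id: str, amount: int) -> bool: ...


class BotService:
    """Hypothetical consumer; depends on the port, never on the provider module."""

    def __init__(self) -> None:
        self._quota_ops: Optional[QuotaOps] = None  # filled in by app wiring

    def create_bot(self, user: AuthenticatedUser, name: str) -> str:
        if self._quota_ops is None:
            raise RuntimeError("_quota_ops was not wired")
        if not self._quota_ops.consume(user.id, 1):
            raise PermissionError("bot quota exhausted")
        return f"{user.id}/{name}"


class InMemoryQuota:
    """Stand-in provider; satisfies QuotaOps structurally, without inheritance."""

    def __init__(self, limit: int) -> None:
        self.remaining = limit

    def consume(self, user_id: str, amount: int) -> bool:
        if amount > self.remaining:
            return False
        self.remaining -= amount
        return True


class DemoUser:
    """Stand-in for whatever user object the caller passes in."""

    def __init__(self, id: str) -> None:
        self.id = id


service = BotService()
service._quota_ops = InMemoryQuota(limit=1)  # the DI step, done once at start-up
created = service.create_bot(DemoUser("u1"), "helper")
```

The consumer never imports the provider; the only coupling is the method shape declared in the Protocol, which is what lets a static import scan stay green.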

Inputs integrated

  • SME content from @cuiwenbo (dual-hook Scenario A technical detail + `schema/common.py` strict criteria).
  • SME guidance from the architect (@符炫炜) and PM (@架构师 / 燧木) in the task support sftp #27 thread.
  • Exhaustive code inventory against `28a9f531` for all 12 domains + 20 boundary tests + shim files.

Test plan

  • Doc-only PR — `make test-unit` should continue to pass without change (verifying no accidental code coupling).
  • Peer review by the participating SMEs for factual correctness (Phase 4 domain details by @bryce, Phase 5 domain details by @chenyexuan, Phase 3 KB / dual-hook by @cuiwenbo, canonical rulings by @符炫炜).
  • Architect + PM final pass on Section 3 canonical rules (ensure no drift from the accumulated rulings recorded in `notes/modularization-v2.md`).
  • Verify every internal link in Section 7 resolves to an actually-present file.
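
The link check in the last item could be scripted along these lines. This is a sketch under two assumptions: the doc uses standard inline Markdown links, and relative targets resolve against the doc's own directory. `find_broken_links` is a hypothetical helper, not something this PR ships.

```python
import re
from pathlib import Path

# Matches the target of an inline Markdown link: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_links(doc_path: Path) -> list[str]:
    """Return relative link targets in doc_path that do not exist on disk."""
    broken = []
    for target in LINK_RE.findall(doc_path.read_text(encoding="utf-8")):
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue  # external links and in-page anchors are out of scope
        file_part = target.split("#", 1)[0]  # drop any heading fragment
        if not (doc_path.parent / file_part).exists():
            broken.append(target)
    return broken
```

For example, `find_broken_links(Path("docs/modularization/architecture.md"))` run from the repository root would list any Section 7 target that is missing from the checkout.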

🤖 Generated with Claude Code

earayu and others added 5 commits April 24, 2026 20:15
…se-6 end state (#27)

Adds `docs/modularization/architecture.md` as the authoritative source-of-truth for
the post-Phase-6 modularization end state. Scope is current-state only; the
existing `README.md` / `roadmap.md` / `target-domain-map.md` stay as historical
plan references.

The document covers, based on `origin/main @ 28a9f53`:

- Executive summary with the Phase 0→6 PR + merge commit table.
- Per-domain inventory for all 12 backend domains (identity / governance /
  model_platform / marketplace / knowledge_base / indexing / retrieval /
  knowledge_graph / conversation / agent_runtime / evaluation / web_access),
  listing DB symbols, schemas, services, consumer-owned Protocols, and API
  routes for each.
- Top-level infrastructure modules (`aperag/app.py`, `aperag/db/base.py`,
  `aperag/schema/common.py` with its strict shared-primitive admission
  criterion, `aperag/llm/`, etc.).
- The canonical rules governing every cross-domain seam:
  direct import vs `Protocol + DI` (two subclasses: legacy-not-moved-yet vs
  standalone-infra permanent); dual-hook Scenario A re-export with the
  `sys.modules.get(...)` trick that lets domain schemas rebind onto
  `aperag.schema.view_models` without tripping G1; per-domain
  `AuthenticatedUser(Protocol)` + the User write hierarchy (identity-owned
  facade > inline text SQL > forbidden cross-domain User ORM).
- Boundary gates catalog with the exact pytest function backing each of G1 /
  G4 / G10 / G14 / G15 / G16 / G17 / G18 alt / G19 plus the KB consumer-owned
  Protocol + DI smoke tests.
- Runtime-seams steady state: the two permanent `CRITICAL_WIRINGS` entries
  (`conversation.bot_service._quota_ops` and
  `agent_runtime.runtime._prompt_template_ops`), and why `dispatch_fn` is
  intentionally excluded.
- Legacy-shim inventory by file (service, view, db.models, view_models,
  top-level `agent_runtime` / `evaluation_v2`).
- Historical index linking the phase2/3/4/5 breaking-changes docs.
- Future candidates (legacy shim hard-delete, `_enum_column` consolidation,
  `AuthenticatedUser` shared Protocol, residual legacy service/view
  extraction, `web_access` deepening, URL prefix unification, deeper SME
  write-ups) listed but explicitly marked as non-commitments.
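
The commit message names the `sys.modules.get(...)` trick without showing it. The sketch below is one plausible reading, not the repository's code: a domain schema module looks the legacy module up in `sys.modules` instead of importing it, so a static import scan sees no domain-to-legacy import. The module name `demo_legacy_view_models`, the class `CollectionView`, and the helper `rebind_onto_legacy` are all hypothetical.

```python
import sys
import types


class CollectionView:
    """Stand-in for a schema class now owned by a domain package."""


def rebind_onto_legacy(legacy_name: str, **symbols: object) -> bool:
    """Attach symbols to the legacy module if it is already loaded.

    Uses sys.modules.get rather than an import statement, so the domain
    module never imports the legacy module itself.
    """
    legacy = sys.modules.get(legacy_name)
    if legacy is None:
        # Legacy module not loaded yet; its own re-export covers that order.
        return False
    for name, obj in symbols.items():
        setattr(legacy, name, obj)
    return True


# Load order 1: the legacy module is absent, so there is nothing to rebind.
before = rebind_onto_legacy("demo_legacy_view_models", CollectionView=CollectionView)

# Load order 2: simulate the legacy module having been imported first.
sys.modules["demo_legacy_view_models"] = types.ModuleType("demo_legacy_view_models")
after = rebind_onto_legacy("demo_legacy_view_models", CollectionView=CollectionView)
```

Under this reading, "dual-hook" means both load orders end with the legacy name and the domain name pointing at the same class object.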

Doc-only change per msg=d54eb823 scope lock (no code or test changes).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…re doc (#27)

Per SME correction in the #27 thread (chenyexuan msg=e1f70eaa, architect
msg=c1face5c, PM msg=288cc691), the phrase "2 permanent CRITICAL_WIRINGS"
only refers to the Phase 5 G18-alt registry. G17 still holds 7 live
entries (4 KB consumer-owned Protocol slots + 3 identity `*InitOps`
adapters) and has not been collapsed by Phase 5/6.

Adjustments:

- Executive summary (Section 1) now states "two separate runtime-wiring
  registries": G17 with 7 entries + G18-alt with 2 permanent entries,
  instead of the ambiguous "2 permanent CRITICAL_WIRINGS" phrasing.
- Section 5 opens with a precision paragraph that calls out the two
  registries explicitly and warns against collapsing them into a single
  "2-entry" summary.
- Section 5.1 header renamed "The Phase 5 permanent two-entry registry
  (G18 alt)" so the scope is unambiguous.
- Section 5.2 expanded to explain why G18 alt collapsed to two over the
  Phase 5/6 run, and why G17 did not collapse: KB `_marketplace_ops` /
  `_search_pipeline_ops` / `_quota_ops` sit on standalone-infra providers
  that were never in domain-move scope; `_marketplace_collection_ops`
  stays as a narrow consumer-Protocol seam (with a candidate
  simplification in Section 8); identity `*InitOps` adapters still host
  the identity-side contract and have no retirement path without a
  separate design decision.

No other sections changed.
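
For readers unfamiliar with a runtime-wiring registry, the following sketch shows how a boundary test could check one. The registry shape (label, owner, attribute) and the helpers `unwired_slots` and `wire_app` are assumptions made for illustration; only the two slot labels come from this PR.

```python
from types import SimpleNamespace

# Stand-ins for the two modules that own the slots; both start unwired.
bot_service = SimpleNamespace(_quota_ops=None)
runtime = SimpleNamespace(_prompt_template_ops=None)

# Assumed registry shape: (label, owner object, slot attribute name).
CRITICAL_WIRINGS = [
    ("conversation.bot_service._quota_ops", bot_service, "_quota_ops"),
    ("agent_runtime.runtime._prompt_template_ops", runtime, "_prompt_template_ops"),
]


def unwired_slots(registry):
    """Return the labels of registry entries whose slot is still None."""
    return [
        label
        for label, owner, attr in registry
        if getattr(owner, attr, None) is None
    ]


def wire_app():
    """Hypothetical start-up wiring; real providers would be injected here."""
    bot_service._quota_ops = object()
    runtime._prompt_template_ops = object()


missing_before = unwired_slots(CRITICAL_WIRINGS)
wire_app()
missing_after = unwired_slots(CRITICAL_WIRINGS)
```

A test built this way fails loudly when a slot is left at `None` after start-up. Keeping the G17 and G18-alt registries as separate lists would let each gate report its own count.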

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per architect Section 8 audit (msg=5ce5f260) + PM guidance (msg=19eb93b6),
restructure Section 8 into:

- 8.1 High-value candidates (F1/F2/F3) with a paragraph each:
  * Cross-domain integration test coverage
  * `aperag/service/*.py` legacy shim hard-delete audit
  * `aperag/views/*.py` migration/retirement map
- 8.2 Compressed catalog (F4–F13) — one-line entries, effort/value
  tags, no roadmap tense:
  * F4 `_enum_column` helper consolidation
  * F5 `AuthenticatedUser(Protocol)` consolidation
  * F6 `aperag/app.py` DI wire-up extraction
  * F7 `aperag/schema/view_models.py` residual re-exports
  * F8 G-gate to boundary test mapping
  * F9 `aperag/agent_runtime/` + `aperag/evaluation_v2/` shim deletion
  * F10 `aperag/platform/` layering
  * F11 `web_access` depth
  * F12 `/api/v1` vs `/api/v2` unification
  * F13 Deeper SME write-up for indexing/retrieval/knowledge_graph

Language stays "may be worth doing" (PM hard boundary msg=19eb93b6:
don't write future candidates as "already decided to do").

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…h_pipeline_service classification (#27)

Independent CR corrections from architect (msg=c1face5c / 20:21 block)
and chenyexuan (msg=ef864605) both flagged:

1. Section 2.1 evaluation row listed `ChatSessionOps` and
   `AgentTurnDispatchOps` as Consumer ports, but both classes have zero
   runtime callers (seeded in Phase 5 5-S1 on a pre-rebase assumption
   that chat_service / agent_runtime would stay legacy; after 5-S4b +
   5-S5b domain moves + `9162ec4` rebase, `worker::dispatch_fn` switched
   to late-import direct cross-domain access and the seams were never
   wired). Dead Protocol literals only; shape identical to
   `ChatDocumentOps` before Phase 6 entry 4 deleted it.

2. Section 6.1 classified `aperag/service/search_pipeline_service.py`
   as "standalone-infra permanent" but that classification was never
   locked by a Phase 6 canonical decision — only `quota_service` and
   `prompt_template_service` got that lock (Phase 6 entry 3,
   msg=ebce5e1e). `search_pipeline_service` classification is pending.

Fixes:

- Section 2.1 evaluation Consumer ports narrowed to `AuthenticatedUser`
  with a new footnote that documents why the Protocol class-defs
  persist.
- Section 6.1 `search_pipeline_service.py` line re-labelled "legacy
  provider for KB `SearchPipelineOps`; classification … pending future
  decision"; `quota_service` / `prompt_template_service` lines upgraded
  to "Phase 6 entry 3 canonical" wording for symmetry.
- Section 5.2 parallel statement softened: `_search_pipeline_ops` no
  longer described as "standalone-infra or cross-cutting"; the
  classification-pending status is spelled out.
- Section 8.2 gains two new entries: F14 (dead Protocol class literal
  sweep for `ChatSessionOps` + `AgentTurnDispatchOps`) and F15
  (`search_pipeline_service` classification decision).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…versation topology (#27)

Per architect refined correction (msg=25eaffc1) and chenyexuan P5-1 patch
(msg=ef864605 + PM msg=0c7486de):

- Section 2.1 evaluation row now keeps `ChatSessionOps` and
  `AgentTurnDispatchOps` visible in the Consumer ports cell but
  explicitly annotates them as "dead Protocol literals (zero runtime
  callers — see footnote and Section 8 F14)". Matches the architect's
  preferred treatment: do not delete from the cell (reader would wonder
  why `ports.py` still has class defs); annotate inline so the reader
  sees both the file reality and the architectural status in one glance.
- New Section 2.5 `conversation` intra-domain dependency topology
  captures the six-service call graph (bot_service + turn_feedback_service
  as leaves; chat_title_service / chat_collection_service /
  chat_document_service / chat_service as the dependency-carrying
  services) plus the late-import rule used by agent_runtime and
  evaluation consumers to avoid evaluation → agent_runtime → conversation
  module-import-time cycles. Domain-edge map renumbered to Section 2.6.
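
The late-import rule can be illustrated with a small sketch. It is not taken from the repository: `dispatch_turn` is a hypothetical consumer function, and the standard-library `json` module stands in for the cross-domain module that would otherwise be imported at the top of the file.

```python
def dispatch_turn(payload: dict) -> str:
    """Hypothetical consumer entry point using a late import.

    The dependency is resolved when the function runs, not when this
    module is imported, so importing this module cannot start a cycle.
    """
    # Late import: `json` stands in for a cross-domain module such as a
    # conversation service that itself imports (directly or not) this module.
    import json as conversation_stub

    return conversation_stub.dumps(payload, sort_keys=True)


result = dispatch_turn({"turn": 1, "chat": "c1"})
```

The trade-off is that a missing or broken dependency surfaces at first call rather than at import time, which is one reason a wiring or smoke test is useful alongside this pattern.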

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@earayu earayu marked this pull request as ready for review April 24, 2026 12:25
@earayu earayu merged commit 522610e into main Apr 24, 2026
3 checks passed
@earayu earayu deleted the huangheng/modularization-architecture-doc branch April 24, 2026 13:15