Commit 8676a27

docs: add architecture explanation and adrs (#353)
1 parent 2256119 commit 8676a27

6 files changed

Lines changed: 362 additions & 0 deletions

AGENTS.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
1+
# AGENTS.md
2+
3+
Guidance for AI coding agents working in this repository.
4+
5+
## Goal
6+
7+
Implement features and fix bugs with minimal regression risk, while preserving memU's architecture:
8+
9+
- `MemoryService` as composition root
10+
- workflow-based execution (`memorize`, `retrieve`, CRUD/patch)
11+
- pluggable storage backends (`inmemory`, `sqlite`, `postgres`)
12+
- profile-based LLM routing (`default`, `embedding`, custom profiles)
13+
14+
See `docs/architecture.md` for the current architectural view.
15+
16+
## Where to Change Code
17+
18+
- Service/runtime wiring: `src/memu/app/service.py`
19+
- Memorize flow: `src/memu/app/memorize.py`
20+
- Retrieve flow: `src/memu/app/retrieve.py`
21+
- CRUD/Patch flow: `src/memu/app/crud.py`
22+
- Config models/defaults: `src/memu/app/settings.py`
23+
- Workflow engine: `src/memu/workflow/*`
24+
- Storage abstraction/factory: `src/memu/database/interfaces.py`, `src/memu/database/factory.py`
25+
- In-memory: `src/memu/database/inmemory/*`
26+
- SQLite: `src/memu/database/sqlite/*`
27+
- Postgres: `src/memu/database/postgres/*`
28+
- LLM clients/wrappers/interceptors: `src/memu/llm/*`
29+
- Integrations: `src/memu/integrations/*`, `src/memu/client/*`
30+
- Tests: `tests/*`
31+
32+
## Implementation Rules
33+
34+
- Keep changes small and localized.
35+
- Do not change public API signatures unless explicitly required.
36+
- Preserve async behavior and existing workflow step contracts (`requires`/`produces` keys).
37+
- If adding a new capability, prefer integrating through an existing pipeline step or a new clearly named step.
38+
- Maintain backend parity where appropriate (if a repository contract changes, update all relevant backends).
39+
- Validate `where`/scope behavior against `UserConfig.model`; do not bypass scope filtering.
40+
- Keep type hints and mypy compatibility intact.
41+
42+
## Feature Work Checklist
43+
44+
1. Locate affected flow(s): memorize, retrieve, CRUD, or integration layer.
45+
2. Update config models/defaults if behavior is configurable.
46+
3. Wire behavior through `MemoryService` pipelines and step config (LLM profiles/capabilities).
47+
4. Implement backend/repository changes for all impacted providers.
48+
5. Add/extend tests for happy path and edge cases.
49+
6. Update docs when behavior changes (`README.md`, `docs/*`, examples if needed).
50+
7. If the change is architectural, add/update ADRs under `docs/adr/`.
51+
52+
## Bug Fix Checklist
53+
54+
1. Reproduce with an existing or new failing test.
55+
2. Implement the smallest safe fix at the correct layer.
56+
3. Add a regression test that fails before and passes after.
57+
4. Check cross-backend effects (`inmemory`, `sqlite`, `postgres`) and retrieval modes (`rag`, `llm`) when relevant.
58+
5. Verify no unintended API/output shape changes.
59+
60+
## Testing and Validation
61+
62+
Use `uv` for all local runs.
63+
64+
- Setup: `make install`
65+
- Run all tests: `make test`
66+
- Run focused tests: `uv run python -m pytest tests/<target_test>.py`
67+
- Full quality checks: `make check`
68+
69+
At minimum, run targeted tests for touched code. Run `make check` for broad or cross-cutting changes.
70+
If you cannot run a required check, state it explicitly in your final summary.
71+
72+
## Done Criteria
73+
74+
Before finishing, ensure:
75+
76+
- Code compiles and tests for changed behavior pass.
77+
- New behavior is covered by tests.
78+
- Docs are updated for user-visible or architectural changes.
79+
- No unrelated files were modified.

docs/adr/0001-workflow-pipeline-architecture.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
# ADR 0001: Use Workflow Pipelines for Core Operations
2+
3+
- Status: Accepted
4+
- Date: 2026-02-24
5+
6+
## Context
7+
8+
memU has multiple high-level operations (`memorize`, `retrieve`, and CRUD/patch operations) that each require multi-stage execution, LLM calls, storage writes, and optional short-circuit behavior.
9+
10+
A single monolithic function per operation would make these flows hard to extend, observe, and customize.
11+
12+
## Decision
13+
14+
Model each core operation as a named workflow pipeline composed of ordered `WorkflowStep` units.
15+
16+
- Register pipelines centrally in `MemoryService` via `PipelineManager`
17+
- Validate required/produced state keys at pipeline registration/mutation time
18+
- Execute through a `WorkflowRunner` abstraction (`local` by default)
19+
- Support runtime customization by step-level config and structural mutation (insert/replace/remove)
20+
- Provide before/after/on_error step interceptors for instrumentation and control
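
The registration-time key validation described in this decision can be illustrated with a minimal sketch. It assumes a simplified `Step` dataclass and a `find_contract_errors` helper, both hypothetical names invented for illustration; memU's real `WorkflowStep` and `PipelineManager` may be shaped differently.

```python
from dataclasses import dataclass
from typing import Any, Callable

State = dict[str, Any]


@dataclass(frozen=True)
class Step:
    """Hypothetical stand-in for a workflow step (not memU's actual class)."""
    name: str
    run: Callable[[State], State]
    requires: frozenset = frozenset()
    produces: frozenset = frozenset()


def find_contract_errors(steps: list[Step], initial_keys: set[str]) -> list[str]:
    """Walk the steps in order and report any required key nothing has produced yet."""
    available = set(initial_keys)
    errors = []
    for step in steps:
        missing = step.requires - available
        if missing:
            errors.append(f"{step.name}: missing {sorted(missing)}")
        # Keys produced by this step become available to later steps.
        available |= step.produces
    return errors


pipeline = [
    Step("extract_items", lambda s: {**s, "items": [s["resource_text"]]},
         requires=frozenset({"resource_text"}), produces=frozenset({"items"})),
    Step("build_response", lambda s: {**s, "response": len(s["items"])},
         requires=frozenset({"items"}), produces=frozenset({"response"})),
]

print(find_contract_errors(pipeline, {"resource_text"}))
print(find_contract_errors(list(reversed(pipeline)), {"resource_text"}))
```

A manager built on this idea would presumably reject a registration or mutation whenever the error list is non-empty, which is what makes a reordered or partially removed pipeline fail early rather than at run time.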
21+
22+
## Consequences
23+
24+
Positive:
25+
26+
- uniform execution model across memorize/retrieve/CRUD
27+
- explicit, inspectable stage boundaries
28+
- extension points for custom runners and step customization
29+
- easier interception and observability around stage execution
30+
31+
Negative:
32+
33+
- dict-based workflow state relies on key naming discipline
34+
- pipeline mutation can increase behavioral variance between deployments
35+
- more framework code compared to direct function calls

docs/adr/0002-pluggable-storage-and-vector-strategy.md

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
1+
# ADR 0002: Use Pluggable Storage with Backend-Specific Vector Search
2+
3+
- Status: Accepted
4+
- Date: 2026-02-24
5+
6+
## Context
7+
8+
memU must support:
9+
10+
- zero-setup local development
11+
- lightweight persisted deployments
12+
- production deployments that need scalable vector similarity
13+
14+
No single storage engine fits all three cases.
15+
16+
## Decision
17+
18+
Adopt a repository-based storage abstraction behind a `Database` protocol, with selectable providers:
19+
20+
- `inmemory`: in-process state, brute-force similarity
21+
- `sqlite`: file-based persistence, embeddings stored as JSON text, brute-force similarity
22+
- `postgres`: SQL persistence, pgvector-enabled similarity when configured
23+
24+
Vector behavior is backend-aware:
25+
26+
- brute-force cosine search remains available for portability
27+
- Postgres can use pgvector distance queries when vector support is enabled
28+
- salience ranking (reinforcement/recency-aware) uses local scoring logic
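
As a rough illustration of the portable fallback, the sketch below runs a brute-force cosine search over embeddings stored as JSON text, as the SQLite provider is described to do. The function names and the in-memory `stored` dict are assumptions made for this example, not memU's repository API.

```python
import json
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], stored: dict[str, str], k: int) -> list[tuple[str, float]]:
    """Score every stored embedding against the query and keep the k best."""
    scored = [
        (item_id, cosine_similarity(query, json.loads(raw)))
        for item_id, raw in stored.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical rows: item id -> embedding serialized as JSON text.
stored = {"a": "[1.0, 0.0]", "b": "[0.0, 1.0]", "c": "[1.0, 1.0]"}
print(top_k([1.0, 0.0], stored, k=2))
```

Every query touches every row, so cost grows linearly with the number of items. That is the scaling tradeoff this ADR accepts for the `inmemory` and `sqlite` providers.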
29+
30+
## Consequences
31+
32+
Positive:
33+
34+
- one service API works across local and production footprints
35+
- clear backend contracts through repository interfaces
36+
- predictable fallback behavior when native vector index is unavailable
37+
38+
Negative:
39+
40+
- duplicate repository logic across backends
41+
- behavior/performance differences between providers
42+
- SQLite and in-memory vector search does not scale as well as indexed pgvector

docs/adr/0003-user-scope-in-data-model.md

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
1+
# ADR 0003: Model User Scope as First-Class Fields on Memory Records
2+
3+
- Status: Accepted
4+
- Date: 2026-02-24
5+
6+
## Context
7+
8+
memU retrieval and writes need scoped operation (for example per `user_id`, `agent_id`, or session) for multi-user and multi-agent scenarios.
9+
10+
Keeping scope outside stored records would force ad-hoc filtering logic and weaken data isolation.
11+
12+
## Decision
13+
14+
Embed scope directly into all persisted entities by merging a configurable `UserConfig.model` with core record models.
15+
16+
- Scope fields are part of resource/category/item/relation models
17+
- Repositories accept `user_data` on writes and `where` filters on reads
18+
- API-level `where` filters are validated against configured scope fields before execution
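
The validation of `where` filters against configured scope fields could look roughly like the sketch below. It uses a plain dataclass as a stand-in for the configured scope model; `ScopeModel` and `validate_where` are hypothetical names, and the real implementation may rely on a different model library.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class ScopeModel:
    """Hypothetical scope model standing in for the configured `UserConfig.model`."""
    user_id: Optional[str] = None
    agent_id: Optional[str] = None


def validate_where(where: dict[str, Any], scope_model: type) -> dict[str, Any]:
    """Return a copy of the filter, or raise if it names a field outside the scope model."""
    allowed = {f.name for f in fields(scope_model)}
    unknown = sorted(set(where) - allowed)
    if unknown:
        raise ValueError(f"unknown scope fields in where filter: {unknown}")
    return dict(where)


print(validate_where({"user_id": "u1"}, ScopeModel))
```

Rejecting unknown keys before the query runs means a misspelled scope field fails loudly instead of silently returning unscoped or empty results.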
19+
20+
## Consequences
21+
22+
Positive:
23+
24+
- consistent filtering model across memorize/retrieve/CRUD APIs
25+
- backend-independent scoping semantics
26+
- supports multi-tenant and multi-agent patterns without separate storage stacks
27+
28+
Negative:
29+
30+
- schema/model generation complexity increases
31+
- schema and index shape can vary by chosen scope model
32+
- callers must keep `where` and `user` payloads aligned with configured scope fields

docs/adr/README.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
# Architecture Decision Records
2+
3+
- [0001: Use Workflow Pipelines for Core Operations](0001-workflow-pipeline-architecture.md)
4+
- [0002: Use Pluggable Storage with Backend-Specific Vector Search](0002-pluggable-storage-and-vector-strategy.md)
5+
- [0003: Model User Scope as First-Class Fields on Memory Records](0003-user-scope-in-data-model.md)

docs/architecture.md

Lines changed: 169 additions & 0 deletions
@@ -0,0 +1,169 @@
1+
# memU Architecture
2+
3+
## Purpose and scope
4+
5+
This document describes the self-hosted `memu` Python package architecture as implemented in this repository.
6+
7+
The repository also describes a hosted Cloud product in `README.md`, but this document focuses on the local `MemoryService` runtime and its code paths.
8+
9+
## System overview
10+
11+
memU follows the "memory as file system" concept from the README and implements it with three persistent layers, plus a relation record that links items to categories:
12+
13+
- `Resource`: raw source artifacts (conversation/document/image/video/audio)
14+
- `MemoryItem`: extracted atomic memories with embeddings
15+
- `MemoryCategory`: grouped topic summaries
16+
- `CategoryItem`: item-category relation edges
17+
18+
At runtime, `MemoryService` orchestrates ingestion, retrieval, and manual CRUD over these layers.
19+
20+
```mermaid
21+
flowchart TD
22+
A["Input Resource or Query"] --> B["MemoryService"]
23+
B --> C["Workflow Pipelines"]
24+
C --> D["LLM Clients"]
25+
C --> E["Database Repositories"]
26+
E --> F["Resources"]
27+
E --> G["Memory Items"]
28+
E --> H["Memory Categories"]
29+
E --> I["Category Relations"]
30+
```
31+
32+
## Core runtime components
33+
34+
### `MemoryService` as composition root
35+
36+
`src/memu/app/service.py` constructs and owns:
37+
38+
- typed configs (`LLMProfilesConfig`, `DatabaseConfig`, `MemorizeConfig`, `RetrieveConfig`, `UserConfig`)
39+
- storage backend (`build_database(...)`)
40+
- resource filesystem fetcher (`LocalFS`)
41+
- LLM client cache and wrappers
42+
- workflow and LLM interceptor registries
43+
- workflow runner (`local` by default, pluggable)
44+
- named workflow pipelines via `PipelineManager`
45+
46+
Public APIs are assembled by mixins:
47+
48+
- `MemorizeMixin`: `memorize(...)`
49+
- `RetrieveMixin`: `retrieve(...)`
50+
- `CRUDMixin`: list/clear/create/update/delete memory operations
51+
52+
### Workflow engine
53+
54+
All major operations execute as workflows (`WorkflowStep`) with:
55+
56+
- explicit required/produced state keys
57+
- declared capability tags (`llm`, `vector`, `db`, `io`, `vision`)
58+
- per-step config (for profile selection)
59+
60+
`PipelineManager` validates step dependencies at registration/mutation time and supports runtime pipeline revisioning (`config_step`, `insert_before/after`, `replace_step`, `remove_step`).
61+
62+
`WorkflowRunner` is a protocol; the default `LocalWorkflowRunner` executes steps sequentially with `run_steps(...)`.
63+
64+
### Interception and observability hooks
65+
66+
Two interceptor systems exist:
67+
68+
- workflow step interceptors: before/after/on_error around each step
69+
- LLM call interceptors: before/after/on_error around `chat/summarize/vision/embed/transcribe`
70+
71+
LLM wrappers also extract best-effort usage metadata from raw provider responses.
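
The before/after/on_error pattern around each step can be sketched as follows. This is an illustrative sequential async runner with a recording interceptor; `RecordingInterceptor` and `run_with_interceptor` are hypothetical names, and the actual interceptor registries may expose different signatures.

```python
import asyncio
from typing import Any, Awaitable, Callable

State = dict[str, Any]
StepFn = Callable[[State], Awaitable[State]]


class RecordingInterceptor:
    """Hypothetical interceptor that records which hook fired for which step."""

    def __init__(self) -> None:
        self.events: list[tuple[str, str]] = []

    def before(self, step: str, state: State) -> None:
        self.events.append(("before", step))

    def after(self, step: str, state: State) -> None:
        self.events.append(("after", step))

    def on_error(self, step: str, state: State, error: Exception) -> None:
        self.events.append(("on_error", step))


async def run_with_interceptor(
    steps: list[tuple[str, StepFn]], state: State, interceptor: RecordingInterceptor
) -> State:
    """Run steps in order, calling the hooks around each one; errors are re-raised."""
    for name, fn in steps:
        interceptor.before(name, state)
        try:
            state = await fn(state)
        except Exception as error:
            interceptor.on_error(name, state, error)
            raise
        interceptor.after(name, state)
    return state


async def ok_step(state: State) -> State:
    return {**state, "ok": True}


async def failing_step(state: State) -> State:
    raise RuntimeError("boom")


recorder = RecordingInterceptor()
result = asyncio.run(run_with_interceptor([("ok_step", ok_step)], {}, recorder))
print(result, recorder.events)
```

In this sketch `after` is skipped when a step raises, so a failed step is reported exactly once through `on_error`.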
72+
73+
## Ingestion architecture (`memorize`)
74+
75+
`memorize(...)` executes the `memorize` pipeline:
76+
77+
1. `ingest_resource`: fetch local/remote resource into `blob_config.resources_dir` via `LocalFS`
78+
2. `preprocess_multimodal`: modality-specific preprocessing for conversation/document/audio (text-oriented path) and image/video (vision-oriented path)
79+
3. `extract_items`: per-memory-type LLM extraction into structured entries
80+
4. `dedupe_merge`: placeholder stage (currently pass-through)
81+
5. `categorize_items`: persist resource + memory items + item-category relations and embeddings
82+
6. `persist_index`: update category summaries; optionally persist item references
83+
7. `build_response`: return resource(s), items, categories, relations
84+
85+
Category bootstrap is lazy and scoped: categories are initialized when needed with embeddings, and mapped by normalized category name.
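
A minimal sketch of lazy, scoped category bootstrap is shown below. The normalization rule (collapse whitespace, case-fold), the `LazyCategories` class, and the `embed` callable are all assumptions for illustration; the document does not specify how names are normalized or how scopes are keyed.

```python
from typing import Callable


def normalize_name(name: str) -> str:
    """Assumed normalization: collapse whitespace and ignore case."""
    return " ".join(name.split()).casefold()


class LazyCategories:
    """Hypothetical cache that embeds category names once per scope, on first use."""

    def __init__(self, names: list[str], embed: Callable[[str], list[float]]) -> None:
        self._names = names
        self._embed = embed
        self._by_scope: dict[str, dict[str, list[float]]] = {}

    def get(self, scope_key: str) -> dict[str, list[float]]:
        if scope_key not in self._by_scope:
            # First access for this scope: embed and index by normalized name.
            self._by_scope[scope_key] = {
                normalize_name(n): self._embed(n) for n in self._names
            }
        return self._by_scope[scope_key]


calls: list[str] = []


def fake_embed(text: str) -> list[float]:
    """Stand-in embedder that records calls instead of contacting a model."""
    calls.append(text)
    return [float(len(text))]


categories = LazyCategories(["  Work  Life ", "Hobbies"], fake_embed)
first = categories.get("user:u1")
second = categories.get("user:u1")
print(sorted(first), len(calls))
```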
86+
87+
## Retrieval architecture (`retrieve`)
88+
89+
`retrieve(...)` chooses one of two pipelines from config:
90+
91+
- `retrieve_rag` (embedding-driven ranking)
92+
- `retrieve_llm` (LLM-driven ranking)
93+
94+
Both use the same staged pattern:
95+
96+
1. route intention + optional query rewrite
97+
2. category recall
98+
3. sufficiency check (optional)
99+
4. item recall
100+
5. sufficiency check (optional)
101+
6. resource recall
102+
7. response build
103+
104+
Key behavior:
105+
106+
- `where` filters are validated against `user_model` fields before querying
107+
- RAG path uses vector similarity (and optional salience ranking for items)
108+
- LLM path ranks IDs from formatted category/item/resource context
109+
- each stage can stop early if sufficiency check decides context is enough
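
The early-stop behavior can be illustrated with a simplified staged loop. This sketch checks sufficiency after every stage, whereas the list above makes the check optional and places it only after category and item recall; `staged_retrieve` and the stage callables are hypothetical.

```python
from typing import Callable

Recall = Callable[[str], list[str]]
Context = dict[str, list[str]]


def staged_retrieve(
    query: str,
    stages: list[tuple[str, Recall]],
    is_sufficient: Callable[[str, Context], bool],
) -> Context:
    """Run recall stages in order and stop as soon as the context is judged sufficient."""
    context: Context = {}
    for name, recall in stages:
        context[name] = recall(query)
        if is_sufficient(query, context):
            break
    return context


stages = [
    ("categories", lambda q: ["preferences"]),
    ("items", lambda q: ["likes green tea"]),
    ("resources", lambda q: ["chat-2024-01.json"]),
]

# Assumed policy for the example: stop once any item has been recalled.
context = staged_retrieve("what does the user drink?", stages,
                          lambda q, ctx: bool(ctx.get("items")))
print(context)
```

Skipping later stages saves the embedding or LLM calls they would have made, at the cost of depending on how well the sufficiency check judges the context.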
110+
111+
## Data and storage architecture
112+
113+
### Repository contracts
114+
115+
Storage is abstracted through a `Database` protocol with four repositories:
116+
117+
- `ResourceRepo`
118+
- `MemoryItemRepo`
119+
- `MemoryCategoryRepo`
120+
- `CategoryItemRepo`
121+
122+
### Backends
123+
124+
`build_database(...)` selects backend by `database_config.metadata_store.provider`:
125+
126+
- `inmemory`: in-process dict/list state
127+
- `sqlite`: SQLModel persistence, embeddings stored as JSON text, brute-force cosine search
128+
- `postgres`: SQLModel persistence with pgvector support (when enabled), local fallback ranking when needed
129+
130+
For Postgres, startup runs migration bootstrap and attempts `CREATE EXTENSION IF NOT EXISTS vector` in `ddl_mode="create"`.
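
Provider selection of this kind is commonly implemented as a small registry keyed by provider name. The sketch below is an assumption-laden illustration: `StoreInfo`, `_BUILDERS`, and `build_store` are invented names, and the real `build_database(...)` takes configuration objects and returns repository-backed stores rather than descriptors.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class StoreInfo:
    """Hypothetical descriptor summarizing what each provider offers."""
    provider: str
    persistent: bool
    vector_search: str


_BUILDERS: dict[str, Callable[[], StoreInfo]] = {
    "inmemory": lambda: StoreInfo("inmemory", False, "brute-force"),
    "sqlite": lambda: StoreInfo("sqlite", True, "brute-force"),
    "postgres": lambda: StoreInfo("postgres", True, "pgvector when enabled"),
}


def build_store(provider: str) -> StoreInfo:
    """Look up the builder for a provider name and fail clearly on unknown names."""
    try:
        builder = _BUILDERS[provider]
    except KeyError:
        raise ValueError(
            f"unknown provider {provider!r}; expected one of {sorted(_BUILDERS)}"
        ) from None
    return builder()


print(build_store("sqlite"))
```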
131+
132+
### Scope model propagation
133+
134+
`UserConfig.model` is merged into record/table models so scope fields (for example `user_id`) become first-class columns/attributes across resources, items, categories, and relations.
135+
136+
This is why `where` filters and `user_data` writes are consistently available across APIs.
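
One way to picture the merge of a scope model into a record model is with dynamically built dataclasses, as sketched below. `MemoryItemBase`, `UserScope`, and `merge_scope` are hypothetical, and the sketch only handles scope fields with plain defaults; the repository's own merge logic may work quite differently.

```python
from dataclasses import dataclass, field, fields, make_dataclass
from typing import Optional


@dataclass
class MemoryItemBase:
    """Hypothetical core record model without any scope fields."""
    id: str
    summary: str


@dataclass
class UserScope:
    """Hypothetical scope model; real deployments may add agent or session fields."""
    user_id: Optional[str] = None


def merge_scope(base: type, scope: type, name: str) -> type:
    """Build a new dataclass that extends `base` with every field declared on `scope`."""
    scope_fields = [(f.name, f.type, field(default=f.default)) for f in fields(scope)]
    return make_dataclass(name, scope_fields, bases=(base,))


ScopedMemoryItem = merge_scope(MemoryItemBase, UserScope, "ScopedMemoryItem")
item = ScopedMemoryItem(id="m1", summary="likes green tea", user_id="u1")
print([f.name for f in fields(ScopedMemoryItem)], item.user_id)
```

Because the scope fields become ordinary attributes of the merged model, the same filter and write paths can be reused for resources, items, categories, and relations.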
137+
138+
## LLM/provider architecture
139+
140+
LLM access is profile-based (`llm_profiles`):
141+
142+
- `default` profile for chat-like tasks
143+
- `embedding` profile for embedding tasks (auto-derived from default if not set)
144+
145+
Per-step profile routing happens through step config (`chat_llm_profile`, `embed_llm_profile`, or `llm_profile`).
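
The routing described above can be sketched as a lookup with fallbacks. The precedence used here (task-specific key, then `llm_profile`, then the built-in profile for the task) is an assumption, as are the helper names and the example model strings; only the three config keys and the two built-in profile names come from this document.

```python
from typing import Any

Profiles = dict[str, dict[str, Any]]


def with_derived_embedding(profiles: Profiles) -> Profiles:
    """Copy the profiles and derive `embedding` from `default` when it is not set."""
    merged = dict(profiles)
    merged.setdefault("embedding", dict(profiles["default"]))
    return merged


def resolve_profile_name(step_config: dict[str, str], task: str) -> str:
    """Pick a profile name for a step; `task` is either "chat" or "embed"."""
    specific_key = {"chat": "chat_llm_profile", "embed": "embed_llm_profile"}[task]
    fallback = {"chat": "default", "embed": "embedding"}[task]
    # Assumed precedence: task-specific key, then generic key, then built-in profile.
    return step_config.get(specific_key) or step_config.get("llm_profile") or fallback


# Model names are placeholders, not recommendations.
profiles = with_derived_embedding({"default": {"model": "example-chat-model"}})
print(resolve_profile_name({}, "chat"))
print(resolve_profile_name({"llm_profile": "fast"}, "embed"))
print(resolve_profile_name({"chat_llm_profile": "careful", "llm_profile": "fast"}, "chat"))
```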
146+
147+
Client backends:
148+
149+
- `sdk`: official OpenAI SDK wrapper
150+
- `httpx`: provider-adapted HTTP backend (OpenAI, Doubao, Grok, OpenRouter)
151+
- `lazyllm_backend`: LazyLLM adapter
152+
153+
## Integration surfaces
154+
155+
- `memu.client.openai_wrapper`: opt-in OpenAI client wrapper that auto-retrieves memories and injects them into system context
156+
- `memu.integrations.langgraph`: LangChain/LangGraph tool adapter (`save_memory`, `search_memory`)
157+
158+
## Current constraints and tradeoffs
159+
160+
- workflow state is dict-based, so step contracts are validated by key names rather than static types
161+
- SQLite/inmemory vector search is brute-force (portable but less scalable)
162+
- category update quality and extraction quality are prompt/LLM dependent
163+
- some extension hooks exist as placeholders (for example dedupe/merge stage)
164+
165+
## Related ADRs
166+
167+
- `docs/adr/0001-workflow-pipeline-architecture.md`
168+
- `docs/adr/0002-pluggable-storage-and-vector-strategy.md`
169+
- `docs/adr/0003-user-scope-in-data-model.md`
