|
| 1 | +# Public API |
| 2 | + |
| 3 | +## Architecture |
| 4 | + |
| 5 | +`CosmosMemoryClient` and `AsyncCosmosMemoryClient` are thin orchestrators. They keep local-buffer state and Cosmos connection lifecycle, then delegate persistence to `MemoryStore` / `AsyncMemoryStore` and higher-level behavior to four services: |
| 6 | + |
| 7 | +- `LLMService` / `AsyncLLMService` for chat completions and embeddings. |
| 8 | +- `RetrievalService` / `AsyncRetrievalService` for filtering, vector search, and episodic context. |
| 9 | +- `PipelineService` for extraction, summaries, procedural synthesis, and reconciliation. |
| 10 | +- `ProcessorService` / `AsyncProcessorService` for immediate processing and threshold-driven auto-triggering. |
| 11 | + |
| 12 | +## CosmosMemoryClient (sync) |
| 13 | + |
| 14 | +### Connection |
| 15 | + |
| 16 | +- `__init__(cosmos_endpoint=None, cosmos_credential=None, cosmos_key=None, cosmos_database=None, cosmos_container=None, cosmos_counter_container=None, cosmos_lease_container=None, cosmos_throughput_mode=None, cosmos_autoscale_max_ru=None, ai_foundry_endpoint=None, ai_foundry_credential=None, ai_foundry_api_key=None, embedding_deployment_name='text-embedding-3-large', embedding_dimensions=None, chat_deployment_name='gpt-4o-mini', use_default_credential=True, processor=None) -> None` — configure local state, model clients, optional Cosmos auto-connect, and optional processing backend. |
| 17 | +- `close() -> None` — close Cosmos/model clients and owned credentials. |
| 18 | +- `connect_cosmos(endpoint=None, credential=None, key=None, database=None, container=None) -> None` — connect to an existing memory container. |
| 19 | +- `create_memory_store(database=None, container=None, counter_container=None, lease_container=None, endpoint=None, credential=None, key=None, embedding_dimensions=None, embedding_data_type=None, distance_function=None, full_text_language=None, throughput_mode=None, autoscale_max_ru=None) -> None` — create/connect the memory, counter, and lease containers. |
| 20 | + |
| 21 | +### Memory CRUD |
| 22 | + |
| 23 | +- `add_local(user_id, role, content, memory_type='turn', agent_id=None, metadata=None, thread_id=None, tags=None, ttl=None, salience=None) -> None` — append a memory to the local buffer. |
| 24 | +- `get_local(memory_id=None, user_id=None, role=None, memory_types=None) -> list[dict]` — filter local buffered memories. |
| 25 | +- `update_local(memory_id, content=None, role=None, memory_type=None, metadata=None) -> None` — update a local buffered memory. |
| 26 | +- `delete_local(memory_id) -> None` — remove a local buffered memory. |
| 27 | +- `add_cosmos(user_id, role, content, memory_type='turn', metadata=None, thread_id=None, tags=None, ttl=None, salience=None, embedding=None, embed=None) -> str` — upsert one memory to Cosmos and return its id. |
| 28 | +- `push_to_cosmos(batch_size=25) -> None` — flush local buffered memories to Cosmos. |
| 29 | +- `get_memories(memory_id=None, user_id=None, thread_id=None, role=None, memory_types=None, recent_k=None, tags=None, any_tags=None, exclude_tags=None, include_superseded=False, min_salience=None, min_confidence=None) -> list[dict]` — retrieve memories with filters. |
| 30 | +- `update_cosmos(memory_id, content=None, role=None, memory_type=None, metadata=None) -> None` — update a Cosmos memory. |
| 31 | +- `delete_cosmos(memory_id, thread_id, user_id) -> None` — delete a Cosmos memory. |
| 32 | +- `get_thread(thread_id, user_id=None, memory_types=None, recent_k=None, tags=None, exclude_tags=None, include_superseded=False) -> list[dict]` — retrieve a thread oldest-first. |
| 33 | +- `get_user_summary(user_id) -> Optional[dict]` — retrieve the active user-summary document. |
| 34 | + |
| 35 | +### Retrieval |
| 36 | + |
| 37 | +- `search_cosmos(search_terms, memory_id=None, user_id=None, role=None, memory_types=None, thread_id=None, hybrid_search=False, top_k=5, tags=None, any_tags=None, exclude_tags=None, include_superseded=False, min_salience=None, min_confidence=None) -> list[dict]` — vector or hybrid search memories. |
| 38 | +- `get_procedural_prompt(user_id) -> Optional[str]` — read the active procedural prompt. |
| 39 | +- `get_procedural_history(user_id, limit=10) -> list[dict]` — read procedural prompt history. |
| 40 | +- `get_procedural_memories(user_id, priority=None, category=None, min_salience=None, include_superseded=False) -> list[dict]` — retrieve procedural memory documents. |
| 41 | +- `search_episodic_memories(user_id, search_terms, top_k=5, min_salience=None, include_superseded=False) -> list[dict]` — search episodic memories. |
| 42 | +- `build_procedural_context(user_id) -> str` — format procedural context for prompts. |
| 43 | +- `build_episodic_context(user_id, query, top_k=3) -> str` — format relevant episodic context. |
| 44 | + |
| 45 | +### Processing |
| 46 | + |
| 47 | +- `extract_memories(user_id, thread_id, recent_k=None) -> dict[str, int]` — extract facts/episodic memories from a thread. |
| 48 | +- `synthesize_procedural(user_id, *, force=False) -> dict` — synthesize the procedural prompt. |
| 49 | +- `generate_thread_summary(user_id, thread_id, recent_k=None, **kwargs) -> dict` — generate and persist a thread summary. |
| 50 | +- `generate_user_summary(user_id, thread_ids=None, recent_k=None, **kwargs) -> dict` — generate and persist a user summary. |
| 51 | +- `reconcile(user_id, n=None) -> dict[str, int]` — reconcile duplicate or contradictory facts. |
| 52 | +- `process_now(*, user_id, thread_id) -> ProcessThreadResult` — run the configured processor immediately. |
| 53 | +- `process_now_and_wait(*, user_id, thread_id, timeout=30.0) -> bool` — process and wait for a summary. |
| 54 | + |
| 55 | +### Tagging |
| 56 | + |
| 57 | +- `add_tags(memory_id, user_id, thread_id, tags) -> None` — add tags to a memory. |
| 58 | +- `remove_tags(memory_id, user_id, thread_id, tags) -> None` — remove tags from a memory. |
| 59 | + |
| 60 | +## AsyncCosmosMemoryClient |
| 61 | + |
| 62 | +Local-buffer methods remain synchronous in-memory operations; Cosmos, retrieval, and processing methods are `async` and must be awaited. |
| 63 | + |
| 64 | +### Connection |
| 65 | + |
| 66 | +- `__init__(cosmos_endpoint=None, cosmos_credential=None, cosmos_key=None, cosmos_database=None, cosmos_container=None, cosmos_counter_container=None, cosmos_lease_container=None, cosmos_throughput_mode=None, cosmos_autoscale_max_ru=None, ai_foundry_endpoint=None, ai_foundry_credential=None, ai_foundry_api_key=None, embedding_deployment_name='text-embedding-3-large', embedding_dimensions=None, chat_deployment_name='gpt-4o-mini', use_default_credential=True, processor=None) -> None` — configure async local state, model clients, and optional processing backend. |
| 67 | +- `async close() -> None` — close async/sync resources and owned credentials. |
| 68 | +- `async connect_cosmos(endpoint=None, credential=None, key=None, database=None, container=None) -> None` — connect to an existing memory container. |
| 69 | +- `async create_memory_store(database=None, container=None, counter_container=None, lease_container=None, endpoint=None, credential=None, key=None, embedding_dimensions=None, embedding_data_type=None, distance_function=None, full_text_language=None, throughput_mode=None, autoscale_max_ru=None) -> None` — create/connect memory, counter, and lease containers. |
| 70 | + |
| 71 | +### Memory CRUD |
| 72 | + |
| 73 | +- `add_local(user_id, role, content, memory_type='turn', agent_id=None, metadata=None, thread_id=None, tags=None, ttl=None, salience=None) -> None` — append a memory to the local buffer. |
| 74 | +- `get_local(memory_id=None, user_id=None, role=None, memory_types=None) -> list[dict]` — filter local buffered memories. |
| 75 | +- `update_local(memory_id, content=None, role=None, memory_type=None, metadata=None) -> None` — update a local buffered memory. |
| 76 | +- `delete_local(memory_id) -> None` — remove a local buffered memory. |
| 77 | +- `async add_cosmos(user_id, role, content, memory_type='turn', metadata=None, thread_id=None, tags=None, ttl=None, salience=None, embedding=None, embed=None) -> str` — upsert one memory to Cosmos and return its id. |
| 78 | +- `async push_to_cosmos(batch_size=25) -> None` — flush local buffered memories to Cosmos. |
| 79 | +- `async get_memories(memory_id=None, user_id=None, thread_id=None, role=None, memory_types=None, recent_k=None, tags=None, any_tags=None, exclude_tags=None, include_superseded=False, min_salience=None, min_confidence=None) -> list[dict]` — retrieve memories with filters. |
| 80 | +- `async update_cosmos(memory_id, content=None, role=None, memory_type=None, metadata=None) -> None` — update a Cosmos memory. |
| 81 | +- `async delete_cosmos(memory_id, thread_id, user_id) -> None` — delete a Cosmos memory. |
| 82 | +- `async get_thread(thread_id, user_id=None, memory_types=None, recent_k=None, tags=None, exclude_tags=None, include_superseded=False) -> list[dict]` — retrieve a thread oldest-first. |
| 83 | +- `async get_user_summary(user_id) -> Optional[dict]` — retrieve the active user-summary document. |
| 84 | + |
| 85 | +### Retrieval |
| 86 | + |
| 87 | +- `async search_cosmos(search_terms, memory_id=None, user_id=None, role=None, memory_types=None, thread_id=None, hybrid_search=False, top_k=5, tags=None, any_tags=None, exclude_tags=None, include_superseded=False, min_salience=None, min_confidence=None) -> list[dict]` — vector or hybrid search memories. |
| 88 | +- `async get_procedural_prompt(user_id) -> Optional[str]` — read the active procedural prompt. |
| 89 | +- `async get_procedural_history(user_id, limit=10) -> list[dict]` — read procedural prompt history. |
| 90 | +- `async get_procedural_memories(user_id, priority=None, category=None, min_salience=None, include_superseded=False) -> list[dict]` — retrieve procedural memory documents. |
| 91 | +- `async search_episodic_memories(user_id, search_terms, top_k=5, min_salience=None, include_superseded=False) -> list[dict]` — search episodic memories. |
| 92 | +- `async build_procedural_context(user_id) -> str` — format procedural context for prompts. |
| 93 | +- `async build_episodic_context(user_id, query, top_k=3) -> str` — format relevant episodic context. |
| 94 | + |
| 95 | +### Processing |
| 96 | + |
| 97 | +- `async extract_memories(user_id, thread_id, recent_k=None) -> dict[str, int]` — extract facts/episodic memories from a thread. |
| 98 | +- `async synthesize_procedural(user_id, *, force=False) -> dict` — synthesize the procedural prompt. |
| 99 | +- `async generate_thread_summary(user_id, thread_id, recent_k=None, **kwargs) -> dict` — generate and persist a thread summary. |
| 100 | +- `async generate_user_summary(user_id, thread_ids=None, recent_k=None, **kwargs) -> dict` — generate and persist a user summary. |
| 101 | +- `async reconcile(user_id, n=None) -> dict[str, int]` — reconcile duplicate or contradictory facts. |
| 102 | +- `async process_now(*, user_id, thread_id) -> ProcessThreadResult` — run the configured processor immediately. |
| 103 | +- `async process_now_and_wait(*, user_id, thread_id, timeout=30.0) -> bool` — process and wait for a summary. |
| 104 | + |
| 105 | +### Tagging |
| 106 | + |
| 107 | +- `async add_tags(memory_id, user_id, thread_id, tags) -> None` — add tags to a memory. |
| 108 | +- `async remove_tags(memory_id, user_id, thread_id, tags) -> None` — remove tags from a memory. |
| 109 | + |
| 110 | +## Extension Points |
| 111 | + |
| 112 | +Advanced users can subclass or swap service implementations by matching these protocols from `agent_memory_toolkit.services`: |
| 113 | + |
| 114 | +- `LLMServiceProtocol` / `AsyncLLMServiceProtocol`: `chat`, `embed`, `embed_one`. |
| 115 | +- `RetrievalServiceProtocol` / `AsyncRetrievalServiceProtocol`: `search`, `get_memories`, `search_episodic`, `build_episodic_context`. |
| 116 | +- `PipelineServiceProtocol` / `AsyncPipelineServiceProtocol`: extraction, procedural synthesis, summaries, reconciliation, procedural context. |
| 117 | +- `ProcessorServiceProtocol` / `AsyncProcessorServiceProtocol`: processor access, counter access, immediate processing, wait, and auto-triggering. |
| 118 | +- Store layer protocols (`MemoryStoreProtocol`, `AsyncMemoryStoreProtocol`) define the persistence primitives consumed by services. |
0 commit comments