- Embeddings and chat clients can now be injected via the new `embeddings_client` and `chat_client` constructor arguments on `CosmosMemoryClient` and `AsyncCosmosMemoryClient`. See PR: #27
- Raw conversation turns can now be embedded and vector-searched. Set `enable_turn_embeddings=True` (env `ENABLE_TURN_EMBEDDINGS`) to generate an embedding when each turn is written, then call `search_turns()` (sync and async, on both the client and store) to semantically search the raw turn log. See PR: #22
- The memories container's vector index type is now configurable instead of being hard-coded to `diskANN`. Set it via the `vector_index_type` argument to `create_memory_store(...)` or the `AI_FOUNDRY_EMBEDDING_VECTOR_INDEX_TYPE` environment variable. See PR: #24
- `ai_foundry_endpoint` now accepts a project-scoped Azure AI Foundry URL (`https://<resource>.services.ai.azure.com/api/projects/<name>`) in addition to the account-level inference endpoint. See PR: #23
- Hardened memory extraction: stops emitting phantom/synthesized facts the user never asserted, stops extracting facts from `[assistant]:` turns, stops re-processing already-extracted turns (which previously produced reversed `CONTRADICT` decisions and meta-facts like `"X is contradicted by Y"`), and stops storing near-duplicate episodic memories for the same scope. Episodic memories also now embed the actual content instead of a boilerplate `"intent recorded"` string. See PR: #20
- Fixed `add_cosmos` + `process_now` silently bypassing the cadence subsystem: cadence env vars (`THREAD_SUMMARY_EVERY_N`, `FACT_EXTRACTION_EVERY_N`, `USER_SUMMARY_EVERY_N`, etc.) had no effect, and procedural / user-summary synthesis never ran. `add_cosmos` now triggers cadence on turn writes; `process_now` now runs the full 5-step pipeline on the in-process processor. See PR: #20
- `ProcessThreadResult` gains `procedural` and `user_summary` fields. `extract_memories` returns a `dropped_episodic_count` for monitoring LLM-extraction quality. See PR: #20
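
The cadence fix above may be easier to picture with a small sketch. This is not the library's implementation; it is a minimal, self-contained illustration of an "every N turns" trigger keyed on the env var names listed in that entry. The `CadenceTracker` class, its method names, the step labels, and the rule that a missing or non-positive value disables a step are all assumptions made for illustration.

```python
import os


class CadenceTracker:
    """Hypothetical helper: decides which steps are due after each turn write."""

    # Env var names come from the changelog entry; the step labels are illustrative.
    ENV_VARS = {
        "thread_summary": "THREAD_SUMMARY_EVERY_N",
        "fact_extraction": "FACT_EXTRACTION_EVERY_N",
        "user_summary": "USER_SUMMARY_EVERY_N",
    }

    def __init__(self, environ=None):
        env = os.environ if environ is None else environ
        # Assumption: a missing or non-positive value disables the step.
        self.every_n = {
            step: int(env.get(var, "0")) for step, var in self.ENV_VARS.items()
        }
        self.turn_count = 0

    def on_turn_written(self):
        """Count one turn and return the steps whose cadence is now due."""
        self.turn_count += 1
        return [
            step
            for step, n in self.every_n.items()
            if n > 0 and self.turn_count % n == 0
        ]


# Example: extract facts every 2 turns, summarize the thread every 4.
tracker = CadenceTracker(
    {"FACT_EXTRACTION_EVERY_N": "2", "THREAD_SUMMARY_EVERY_N": "4"}
)
due_per_turn = [tracker.on_turn_written() for _ in range(4)]
```

In this sketch the write path calls `on_turn_written()` on every turn, which mirrors the described behavior of `add_cosmos` triggering cadence on turn writes. A write path that never calls it would leave the env vars with no effect, which is the symptom the entry describes.
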
Initial public preview release.
This is a beta release. The public surface may evolve in
backward-incompatible ways before the 1.0.0 general-availability cut.
Pin a specific version when integrating.
- Sync (`CosmosMemoryClient`) and async (`AsyncCosmosMemoryClient`) clients for storing, retrieving, and transforming agent memories backed by Azure Cosmos DB.
- Typed memory record hierarchy (Pydantic): `TurnRecord`, `FactRecord`, `EpisodicRecord`, `ProceduralRecord`, `ThreadSummaryRecord`, `UserSummaryRecord`.
- Vector + full-text + hybrid search over memories with metadata filters, tag filters, and per-type scoping.
- Built-in memory processing pipeline: fact extraction, thread/user summarization, procedural-memory synthesis, contradiction handling, and deduplication, all driven by versioned `.prompty` prompts.
- Two processor backends: `InProcessProcessor` (default, runs in your application process) and `DurableFunctionProcessor` (offloads work to a sibling Azure Function app via Cosmos DB change feed).
- One-command `azd up` deployment that provisions Cosmos DB (with vector + full-text search enabled), Azure AI Foundry (chat + embedding deployments), Azure Function app (Flex Consumption), Storage, App Insights, and the User-Assigned Managed Identity wiring all of it together.
- Focused exception hierarchy: `AgentMemoryError`, `ConfigurationError`, `ValidationError`, `CosmosNotConnectedError`, `CosmosOperationError`, `MemoryNotFoundError`, `MemoryTypeMismatchError`, `LLMError`.
- Structured JSON logging via `azure.cosmos.agent_memory.logging` (`configure_logging`, `JsonFormatter`).
- Distribution name: `azure-cosmos-agent-memory` (PyPI)
- Import path: `azure.cosmos.agent_memory`
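
Because the beta surface may change between releases, an integration might want to verify at startup that the installed distribution matches the version it was tested against. The sketch below uses only the standard-library `importlib.metadata` module and the distribution name given above. The helper names `installed_version` and `check_pin` are hypothetical, and the expected version is left as a parameter because this changelog excerpt does not state one.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution name as listed above (the PyPI name, not the import path).
DIST_NAME = "azure-cosmos-agent-memory"


def installed_version(dist_name=DIST_NAME):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def check_pin(expected, dist_name=DIST_NAME):
    """Return True only when the installed version equals the expected pin."""
    return installed_version(dist_name) == expected
```

Calling `check_pin(...)` with the version your application pinned and failing fast on `False` is one way to surface an accidental upgrade before it reaches the memory pipeline.
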