4 changes: 3 additions & 1 deletion Docs/README.md
@@ -10,10 +10,12 @@ This folder contains the main project documentation for Agent Memory Toolkit.
| [local_testing.md](local_testing.md) | Covers local setup, environment configuration, RBAC, Cosmos provisioning, running the toolkit and Azure Functions locally, and testing change feed auto-processing with serverless or autoscale container provisioning. |
| [azure_testing.md](azure_testing.md) | Covers Azure deployment, cloud configuration, required services, change feed settings, throughput mode configuration, and validation steps for running the toolkit in Azure. |
| [design_patterns.md](design_patterns.md) | Shows when and how to call CRUD operations, summarization, fact extraction, and memory retrieval in chat and multi-agent applications, including automatic processing via the change feed. |
| [troubleshooting.md](troubleshooting.md) | Helps diagnose common setup, authentication, Cosmos DB, embeddings, Durable Functions, vector search, and change feed issues. |

## Recommended Reading Order

1. Start with [concepts.md](concepts.md) to understand the data model and memory lifecycle.
2. Use [local_testing.md](local_testing.md) to get the toolkit running and validated on your machine.
3. Use [azure_testing.md](azure_testing.md) when you are ready to deploy or validate the full stack in Azure.
4. See [design_patterns.md](design_patterns.md) for integration patterns in real applications.
5. Use [troubleshooting.md](troubleshooting.md) when setup, processing, search, or automatic change feed behavior does not work as expected.
168 changes: 168 additions & 0 deletions Docs/troubleshooting.md
@@ -0,0 +1,168 @@
# Troubleshooting Agent Memory Toolkit

Use this guide when local memory works but Cosmos DB, embeddings, Durable Functions, or automatic change feed processing does not.

---

## Quick Triage

| Symptom | First checks |
|---------|--------------|
| Import errors | Install with `pip install -e ".[dev]"` and import `CosmosMemoryClient` or `AsyncCosmosMemoryClient`. |
| Missing configuration | Verify `.env`, `azure_functions/local.settings.json`, and Azure Function App settings use the same endpoint, database, and container values. |
| Cosmos 401 or 403 | Run `az login` and confirm Cosmos DB data-plane RBAC is assigned. |
| Cosmos operations fail before connecting | Call `create_memory_store()` or `connect_cosmos()` before cloud operations. |
| Search returns no vector results | Confirm embeddings are generated and `EMBEDDING_DIMENSIONS` matches the container vector policy. |
| Durable Function calls fail | Start the Functions host and check `ADF_ENDPOINT`, `ADF_KEY`, and the orchestrator route. |
| Change feed does not create summaries or facts | Confirm change feed settings, thresholds, lease container, counter container, and that inserted documents have `type: "turn"`. |

---

## 1. Environment And Imports

Install the package from the repository root:

```bash
pip install -e ".[dev]"
pip install -r azure_functions/requirements.txt
```

The public clients are:

```python
from agent_memory_toolkit import CosmosMemoryClient
from agent_memory_toolkit.aio import AsyncCosmosMemoryClient
```

If notebooks cannot import the package, run them from the `Samples` folder or add the repository root to `sys.path`.

---

## 2. Configuration And Authentication

For local runs, keep `.env` and `azure_functions/local.settings.json` aligned:

```env
COSMOS_DB_ENDPOINT=https://<account>.documents.azure.com:443/
COSMOS_DB_DATABASE=ai_memory
COSMOS_DB_CONTAINER=memories
COSMOS_DB_COUNTERS_CONTAINER=counter
COSMOS_DB_LEASE_CONTAINER=leases
AI_FOUNDRY_ENDPOINT=https://<project>.services.ai.azure.com/
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_DIMENSIONS=1536
ADF_ENDPOINT=http://localhost:7071/api
ADF_KEY=
```

Run `az login` before using `DefaultAzureCredential`.

Required roles:

| Service | Role |
|---------|------|
| Cosmos DB | Cosmos DB Built-in Data Contributor |
| Azure OpenAI / AI Services | Cognitive Services OpenAI User |

RBAC changes can take several minutes to propagate.

---

## 3. Cosmos DB Store Creation

Run `create_memory_store()` before relying on cloud operations. It creates the database plus the `memories`, `counter`, and `leases` containers.

The memories container is created with:

- hierarchical partition key on `user_id` and `thread_id`
- vector index on `/embedding`
- full-text index on `/content`

If vector or full-text search fails after you change dimensions or indexing settings, create a fresh container with the desired configuration. The container vector policy is a creation-time infrastructure choice, so an existing container keeps the policy it was created with.

Use `COSMOS_DB_THROUGHPUT_MODE=serverless` for the default setup. Use `autoscale` with `COSMOS_DB_AUTOSCALE_MAX_RU` when you need provisioned autoscale throughput.

---

## 4. Embeddings And Search

Embedding failures usually mean one of these is wrong:

- `AI_FOUNDRY_ENDPOINT`
- `EMBEDDING_MODEL`
- `EMBEDDING_DIMENSIONS`
- Azure OpenAI / AI Services RBAC

Hybrid search requires `search_terms`: pass it whenever `hybrid_search=True`.

If search returns documents but scores look poor, check that records have an `embedding` field and that the query uses similar language to the stored memory content.
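
One way to run the `embedding` check is to pull a sample of stored documents and audit them offline. The sketch below assumes the documents are available as plain dictionaries with an `id` and an `embedding` list; how you fetch them is up to you. The function name `audit_embeddings` is hypothetical and not part of the toolkit.

```python
def audit_embeddings(docs: list[dict], expected_dim: int) -> dict[str, list[str]]:
    """Group document ids by embedding problem."""
    report: dict[str, list[str]] = {"missing": [], "wrong_dimension": []}
    for doc in docs:
        doc_id = str(doc.get("id", "<no id>"))
        embedding = doc.get("embedding")
        if not embedding:
            # Field absent, None, or an empty list.
            report["missing"].append(doc_id)
        elif len(embedding) != expected_dim:
            # Length differs from the configured EMBEDDING_DIMENSIONS value.
            report["wrong_dimension"].append(doc_id)
    return report
```

Pass the same number you set in `EMBEDDING_DIMENSIONS` as `expected_dim`. Any id listed under `wrong_dimension` suggests the record was embedded under a different configuration.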

---

## 5. Durable Functions Processing

Thread summaries, fact extraction, and user summaries require the Functions host.

Start local dependencies:

```bash
azurite --silent --location /tmp/azurite --debug /tmp/azurite/debug.log
cd azure_functions
func start
```

The SDK posts to:

```text
<ADF_ENDPOINT>/orchestrators/memory_orchestrator
```

For local testing, `ADF_ENDPOINT` is usually `http://localhost:7071/api` and `ADF_KEY` is blank. For Azure, use the deployed Function App URL and set `ADF_KEY` if function-key auth is enabled.

If orchestration polling times out, check the Functions logs first. The orchestration may still be running, or an activity may be waiting on Cosmos DB or the LLM endpoint.
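
To tell "still running" apart from "failed" when debugging a timeout, it can help to poll the orchestration status yourself. The sketch below is not the SDK's polling code. It assumes you supply a `fetch_status` callable (hypothetical) that returns the status payload as a dictionary with a `runtimeStatus` field, which is the shape the Durable Functions status query endpoint returns.

```python
import time
from typing import Callable

# Statuses after which the orchestration will not change again.
TERMINAL_STATES = {"Completed", "Failed", "Terminated"}


def poll_until_done(
    fetch_status: Callable[[], dict],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until runtimeStatus is terminal or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        last = fetch_status()
        if last.get("runtimeStatus") in TERMINAL_STATES:
            return last
        if time.monotonic() >= deadline:
            # Not an error by itself: the orchestration may still be running.
            return {**last, "timed_out": True}
        sleep(interval_s)
```

A result with `timed_out` set and `runtimeStatus` still `Running` or `Pending` points at a slow activity rather than a crash, which is the case the Functions logs help narrow down.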

---

## 6. Change Feed Processing

Automatic processing requires these settings in the Functions app or `local.settings.json`:

```json
"COSMOS_DB__accountEndpoint": "https://<account>.documents.azure.com:443/",
"COSMOS_DB_COUNTERS_CONTAINER": "counter",
"COSMOS_DB_LEASE_CONTAINER": "leases",
"THREAD_SUMMARY_EVERY_N": "5",
"FACT_EXTRACTION_EVERY_N": "3",
"USER_SUMMARY_EVERY_N": "10"
```

Set a threshold to `"0"` to disable that processing type.

Only documents with `type: "turn"` increment counters. Derived memories such as `summary`, `fact`, and `user_summary` do not trigger threshold counts.
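
The counting rule can be modelled offline to predict when each pipeline should fire for a given sequence of inserts. This is a simplified sketch, not the toolkit's implementation: it assumes a single shared turn counter and that a pipeline fires on every multiple of its threshold. The real counters may be scoped per thread or per user and may reset differently. The function name `pipelines_to_fire` and the pipeline labels are hypothetical.

```python
def pipelines_to_fire(
    docs: list[dict], thresholds: dict[str, int]
) -> list[tuple[int, str]]:
    """Return (turn_number, pipeline) pairs in the order they would fire."""
    fired: list[tuple[int, str]] = []
    turn_count = 0
    for doc in docs:
        if doc.get("type") != "turn":
            continue  # derived memories are not counted
        turn_count += 1
        for pipeline, every_n in thresholds.items():
            # A threshold of 0 disables the pipeline (assumed: fire on multiples).
            if every_n > 0 and turn_count % every_n == 0:
                fired.append((turn_count, pipeline))
    return fired
```

If the model predicts a pipeline should have fired for your test data and no derived document appears, the problem is more likely in the trigger, lease, or counter setup than in the thresholds.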

If nothing fires:

- verify the Functions host shows the Cosmos DB trigger
- confirm the `leases` container exists
- confirm the `counter` container is writable
- insert enough new turn documents to cross the configured threshold
- check for generated documents by querying with `memory_type="summary"` or `memory_type="fact"`, or by calling `get_user_summary(user_id=...)`

---

## 7. Async Client Notes

Use async Azure credentials with the async client:

```python
from azure.identity.aio import DefaultAzureCredential
from agent_memory_toolkit.aio import AsyncCosmosMemoryClient
```

Always `await` cloud operations and close the client when done:

```python
await memory.close()
```

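In notebooks, top-level `await` is supported, so do not wrap cells with `asyncio.run()`. The notebook kernel already runs an event loop, and `asyncio.run()` raises a `RuntimeError` when called from inside a running loop.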
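
The "always close" advice is easiest to follow with `try`/`finally`. The sketch below shows the pattern with a stand-in class, because the real constructor arguments are not covered here. `StandInAsyncClient`, `add_turn`, and `store_and_close` are hypothetical names used only for illustration and are not the toolkit's API.

```python
import asyncio


class StandInAsyncClient:
    """Hypothetical stand-in for an async client; not the toolkit's API."""

    def __init__(self) -> None:
        self.closed = False

    async def add_turn(self, text: str) -> str:
        await asyncio.sleep(0)  # stands in for an awaited network call
        return f"stored: {text}"

    async def close(self) -> None:
        self.closed = True


async def store_and_close(memory: StandInAsyncClient, text: str) -> str:
    try:
        return await memory.add_turn(text)
    finally:
        # Runs on success and on error, so the client is always closed.
        await memory.close()
```

In a script, run it with `asyncio.run(store_and_close(client, "hello"))`. In a notebook cell, use `await store_and_close(client, "hello")` directly.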
Binary file added Overview.png
97 changes: 41 additions & 56 deletions README.md
@@ -1,4 +1,4 @@
# Azure Cosmos DB Agent Memory Toolkit - Public Preview
# Azure Cosmos DB Agent Memory Toolkit

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
@@ -8,48 +8,25 @@
[![YouTube](https://img.shields.io/badge/YouTube-Azure%20Cosmos%20DB-FF0000?logo=youtube&logoColor=white)](https://www.youtube.com/@AzureCosmosDB)


Agent Memory Toolkit is a Python SDK for storing, retrieving, and transforming agent memories on Azure Cosmos DB. It gives your agent both raw conversation history and higher-value derived memory — thread summaries, extracted facts, and cross-thread user profiles — all searchable semantically. The processing pipeline can run **in-process** (zero infra) or in a sibling **Azure Durable Function app** that watches the Cosmos DB change feed. Sync (`CosmosMemoryClient`) and async (`AsyncCosmosMemoryClient`) APIs are mirror-images of each other.
Agent Memory Toolkit is a Python library and Azure-backed reference implementation for storing, retrieving, and transforming agent memories over time. It combines a simple SDK for local and Cosmos DB operations with Durable Functions pipelines that generate thread summaries, extract facts, and build cross-thread user profiles. The toolkit also supports automatic processing via a Cosmos DB change feed trigger that fires these pipelines in the background when configurable message count thresholds are crossed. The toolkit is designed for agent applications that need both raw conversation history and higher-value derived memory that can be searched semantically later. It provides matching sync (`CosmosMemoryClient`) and async (`AsyncCosmosMemoryClient`) APIs so the same memory model can be used in scripts, services, notebooks, and larger agent systems.

```
┌──────────────────────────────────────────────────────────────────────────────────────┐
│ YOUR AGENTIC APP │
│ Uses CosmosMemoryClient / AsyncCosmosMemoryClient │
└─────────────────────────────────────────┬────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────────────────────┐
│ AGENT MEMORY TOOLKIT (Python SDK) │
│ │
│ • Local in-memory CRUD │
│ • Cosmos DB storage and retrieval │
│ • Pluggable processor: in-process or remote Durable Function app │
└──────────────────────────────────────────┬──────────────────────────────┬────────────┘
│ │
│ read / write │ Invoke processing pipeline
▼ ▼
┌───────────────────────────────────┐ ┌──────────────────────────────────┐
│ AZURE COSMOS DB (NoSQL) │ │ AZURE DURABLE FUNCTIONS │
│ │ │ │
│ Stores: │ │ Orchestrates memory processing: │
│ • turns │ │ • thread summaries │
│ • summaries │◄─── memory management ───►│ • fact extraction │
│ • facts │ │ • user summaries │
│ • user summaries │ │ │
│ │ │ On-demand (SDK) or automatic │
│ Supports query, vector, text │ change feed trigger │ (Cosmos DB change feed trigger). │
│ search over stored memories. │───────────────────────────►│ │
└───────────────────────┬───────────┘ └──────────────────┬───────────────┘
│ embeddings and LLM-based processing │
└──────────────────────┬───────────────────────────────────┘
┌──────────────────────────────────┐
│ MICROSOFT FOUNDRY │
│ │
│ • Embeddings for search │
│ • Chat/LLM generation │
│ │
└──────────────────────────────────┘
```
![Agent Memory Toolkit overview](Overview.png)

---

## Features

| Feature | Description |
|---------|-------------|
| **Local memory store** | In-memory CRUD — no Azure needed for development |
| **Cosmos DB integration** | CRUD, `push_to_cosmos()` bulk upload, semantic search, hierarchical partition key, vector + full-text indexes |
| **Thread summaries** | `generate_thread_summary()` — LLM-generated, incrementally updated, embedded and stored |
| **Fact extraction** | `extract_facts()` — discrete, independently searchable assertions from a thread |
| **User summaries** | `generate_user_summary()` — cross-thread user profile, incrementally updated |
| **Incremental updates** | Thread and user summaries use point-read + time-filtering to merge new data with existing summaries |
| **Automatic processing** | Cosmos DB change feed trigger fires thread summaries, fact extraction, and user summaries when configurable message count thresholds are crossed |
| **Externalized prompts** | LLM prompts live in editable Markdown files (`azure_functions/prompts/`) |
| **Entra ID auth** | `DefaultAzureCredential` everywhere — `az login`, managed identities |

---

@@ -163,6 +140,20 @@ See [`Samples/`](Samples/) for end-to-end scenarios (chat memory, RAG, multi-age

---

## Azure Resources

| Resource | Purpose |
|----------|---------|
| **Cosmos DB for NoSQL** | Memory store with hierarchical partition key, vector index, full-text index |
| **Azure OpenAI / AI Foundry** | Embedding model + chat model for summarization / fact extraction |
| **Azure Functions** | Durable Functions orchestrator and activity functions |

Automatic change feed processing stores lightweight counter documents in a dedicated `counter` container and also uses a `leases` container that is provisioned by `create_memory_store()`. Throughput defaults to `serverless`; set `COSMOS_DB_THROUGHPUT_MODE=autoscale` to apply the shared `COSMOS_DB_AUTOSCALE_MAX_RU` cap to the memories, counter, and lease containers. See [concepts.md](Docs/concepts.md#automatic-processing-change-feed) for details.

All services use **Entra ID** auth via `DefaultAzureCredential`.

---

## Concepts in 60 seconds

| Concept | What it is | API |
@@ -290,16 +281,6 @@ Async equivalents (`AsyncInProcessProcessor`, `AsyncDurableFunctionProcessor`) l

---

## Documentation

- **[Docs/concepts.md](Docs/concepts.md)** — Memory types, threads, roles, embeddings, processing pipeline
- **[Docs/design_patterns.md](Docs/design_patterns.md)** — Integration patterns for chat apps and multi-agent systems
- **[Docs/local_testing.md](Docs/local_testing.md)** — Prerequisites, environment setup, running locally, debugging
- **[Docs/azure_testing.md](Docs/azure_testing.md)** — Azure deployment, RBAC, cloud validation
- **[infra/README.md](infra/README.md)** — `azd` deployment, Bicep modules, BYOR settings, counter-trigger tuning

---

## Project structure

```
@@ -315,11 +296,15 @@ tests/ Unit + integration tests (pytest)

---

## Migration notes
## Documentation

- **`agent_memory_toolkit.processing.ProcessingClient` is removed.** Drop the import and call `client.process_now()` (or `client.process_now_and_wait()`) instead. Same for the async `AsyncProcessingClient`.
- **New `processor=` kwarg.** Defaults to `InProcessProcessor()` — existing code keeps its current behavior with no edits.
- **`adf_endpoint` / `adf_key` constructor kwargs are gone.** The SDK no longer makes HTTP calls to the Function app at runtime; the Function app reads from the Cosmos change feed.
- **[concepts.md](Docs/concepts.md)** — Memory types, threads, roles, embeddings, processing pipeline
- **[design_patterns.md](Docs/design_patterns.md)** — Integration patterns for chat apps and multi-agent systems
- **[local_testing.md](Docs/local_testing.md)** — Prerequisites, environment setup, running locally, debugging
- **[azure_testing.md](Docs/azure_testing.md)** — Azure deployment, RBAC, cloud validation
- **[troubleshooting.md](Docs/troubleshooting.md)** — Common setup, auth, Cosmos DB, Durable Functions, search, and change feed failure modes

---

## Trademark notice
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft’s Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties’ policies.