| 1 | +# Troubleshooting Agent Memory Toolkit |
| 2 | + |
| 3 | +Use this guide when local memory works but Cosmos DB, embeddings, Durable Functions, or automatic change feed processing does not. |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## Quick Triage |
| 8 | + |
| 9 | +| Symptom | First checks | |
| 10 | +|---------|--------------| |
| 11 | +| Import errors | Install with `pip install -e ".[dev]"` and import `CosmosMemoryClient` or `AsyncCosmosMemoryClient`. | |
| 12 | +| Missing configuration | Verify `.env`, `function_app/local.settings.json`, and Azure Function App settings use the same endpoint, database, container, and AI deployment values. | |
| 13 | +| Cosmos 401 or 403 | Run `az login` and confirm Cosmos DB data-plane RBAC is assigned. | |
| 14 | +| Cosmos operations fail before connecting | Call `create_memory_store()` or `connect_cosmos()` before cloud operations. | |
| 15 | +| Search returns no vector results | Confirm embeddings are generated and `AI_FOUNDRY_EMBEDDING_DIMENSIONS` matches the container vector policy. | |
| 16 | +| Durable Functions processing fails | Start the Functions host and check `function_app/local.settings.json`, the change feed trigger, and the orchestrator logs. | |
| 17 | +| Change feed does not create summaries or facts | Confirm change feed settings, thresholds, lease container, counter container, and that inserted documents have `type: "turn"`. | |
| 18 | + |
| 19 | +--- |
| 20 | + |
| 21 | +## 1. Environment And Imports |
| 22 | + |
| 23 | +Install the package from the repository root: |
| 24 | + |
| 25 | +```bash |
| 26 | +pip install -e ".[dev]" |
| 27 | +pip install -r function_app/requirements.txt |
| 28 | +``` |
| 29 | + |
| 30 | +The public clients are: |
| 31 | + |
| 32 | +```python |
| 33 | +from agent_memory_toolkit import CosmosMemoryClient |
| 34 | +from agent_memory_toolkit.aio import AsyncCosmosMemoryClient |
| 35 | +``` |
| 36 | + |
| 37 | +If notebooks cannot import the package, run them from the repo root with paths such as `Samples/Notebooks/Demo.ipynb`, or add the repository root to `sys.path`. |
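
The `sys.path` option above can be sketched as a small helper for the first notebook cell. It assumes the repository root can be recognized by a `pyproject.toml` file. That marker is a guess, so swap in another file name if the repo uses a different layout. The name `add_repo_root_to_path` is illustrative and not part of the toolkit.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(start: Path, marker: str = "pyproject.toml") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found,
    then put that directory at the front of sys.path."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            root = str(candidate)
            if root not in sys.path:
                sys.path.insert(0, root)
            return candidate
    raise FileNotFoundError(f"No {marker} found at or above {start}")


# In a notebook cell, before importing the package:
# add_repo_root_to_path(Path.cwd())
```
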
| 38 | + |
| 39 | +--- |
| 40 | + |
| 41 | +## 2. Configuration And Authentication |
| 42 | + |
| 43 | +For local runs, keep `.env`, `function_app/local.settings.json`, and deployed Function App settings aligned: |
| 44 | + |
| 45 | +```env |
| 46 | +COSMOS_DB_ENDPOINT=https://<account>.documents.azure.com:443/ |
| 47 | +COSMOS_DB__accountEndpoint=https://<account>.documents.azure.com:443/ |
| 48 | +COSMOS_DB_KEY= |
| 49 | +COSMOS_DB_DATABASE=ai_memory |
| 50 | +COSMOS_DB_CONTAINER=memories |
| 51 | +COSMOS_DB_COUNTERS_CONTAINER=counter |
| 52 | +COSMOS_DB_LEASE_CONTAINER=leases |
| 53 | +COSMOS_DB_THROUGHPUT_MODE=serverless |
| 54 | +COSMOS_DB_AUTOSCALE_MAX_RU=1000 |
| 55 | + |
| 56 | +AI_FOUNDRY_ENDPOINT=https://<account>.openai.azure.com/ |
| 57 | +AI_FOUNDRY_API_KEY= |
| 58 | +AI_FOUNDRY_EMBEDDING_DEPLOYMENT_NAME=text-embedding-3-large |
| 59 | +AI_FOUNDRY_EMBEDDING_DIMENSIONS=1536 |
| 60 | +AI_FOUNDRY_EMBEDDING_DATA_TYPE=float32 |
| 61 | +AI_FOUNDRY_EMBEDDING_DISTANCE_FUNCTION=cosine |
| 62 | +AI_FOUNDRY_CHAT_DEPLOYMENT_NAME=<chat-deployment-name> |
| 63 | +``` |
| 64 | + |
| 65 | +The notebooks and samples pass these values into the client like this: |
| 66 | + |
| 67 | +| `.env` setting | Client argument | |
| 68 | +|---|---| |
| 69 | +| `COSMOS_DB_ENDPOINT` | `cosmos_endpoint` | |
| 70 | +| `COSMOS_DB_DATABASE` | `cosmos_database` | |
| 71 | +| `COSMOS_DB_CONTAINER` | `cosmos_container` | |
| 72 | +| `COSMOS_DB_COUNTERS_CONTAINER` | `cosmos_counter_container` | |
| 73 | +| `COSMOS_DB_LEASE_CONTAINER` | `cosmos_lease_container` | |
| 74 | +| `COSMOS_DB_KEY` | `cosmos_key` | |
| 75 | +| `AI_FOUNDRY_ENDPOINT` | `ai_foundry_endpoint` | |
| 76 | +| `AI_FOUNDRY_API_KEY` | `ai_foundry_api_key` | |
| 77 | +| `AI_FOUNDRY_EMBEDDING_DEPLOYMENT_NAME` | `embedding_deployment_name` | |
| 78 | +| `AI_FOUNDRY_CHAT_DEPLOYMENT_NAME` | `chat_deployment_name` | |
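
As a rough sketch, the mapping in the table above can be turned into keyword arguments with a small helper. The argument names are copied from the table. The helper name `client_kwargs_from_env` is hypothetical, and skipping blank values (such as an empty `COSMOS_DB_KEY`) is an assumption about how you want keyless, credential-based runs to behave.

```python
import os

# .env setting -> client keyword argument, copied from the table above.
ENV_TO_CLIENT_ARG = {
    "COSMOS_DB_ENDPOINT": "cosmos_endpoint",
    "COSMOS_DB_DATABASE": "cosmos_database",
    "COSMOS_DB_CONTAINER": "cosmos_container",
    "COSMOS_DB_COUNTERS_CONTAINER": "cosmos_counter_container",
    "COSMOS_DB_LEASE_CONTAINER": "cosmos_lease_container",
    "COSMOS_DB_KEY": "cosmos_key",
    "AI_FOUNDRY_ENDPOINT": "ai_foundry_endpoint",
    "AI_FOUNDRY_API_KEY": "ai_foundry_api_key",
    "AI_FOUNDRY_EMBEDDING_DEPLOYMENT_NAME": "embedding_deployment_name",
    "AI_FOUNDRY_CHAT_DEPLOYMENT_NAME": "chat_deployment_name",
}


def client_kwargs_from_env(env=None) -> dict:
    """Collect client keyword arguments from settings that are set and non-empty."""
    env = os.environ if env is None else env
    return {
        arg: env[name]
        for name, arg in ENV_TO_CLIENT_ARG.items()
        if env.get(name)  # blank values are skipped (assumption)
    }


# Hypothetical usage, assuming the constructor accepts these keywords:
# memory = CosmosMemoryClient(**client_kwargs_from_env())
```

Printing the resulting dictionary (with keys redacted) is a quick way to spot a setting that never reached the client.
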
| 79 | + |
| 80 | +`AI_FOUNDRY_EMBEDDING_DIMENSIONS`, `AI_FOUNDRY_EMBEDDING_DATA_TYPE`, and `AI_FOUNDRY_EMBEDDING_DISTANCE_FUNCTION` are read by the toolkit when creating the Cosmos DB vector policy. The Function App also reads `COSMOS_DB__accountEndpoint` for its identity-based Cosmos DB trigger binding; set it to the same value as `COSMOS_DB_ENDPOINT`. |
| 81 | + |
| 82 | +Run `az login` before using `DefaultAzureCredential`. |
| 83 | + |
| 84 | +Required roles: |
| 85 | + |
| 86 | +| Service | Role | |
| 87 | +|---------|------| |
| 88 | +| Cosmos DB | Cosmos DB Built-in Data Contributor | |
| 89 | +| Azure OpenAI / AI Services | Cognitive Services OpenAI User | |
| 90 | + |
| 91 | +RBAC changes can take several minutes to propagate. |
| 92 | + |
| 93 | +--- |
| 94 | + |
| 95 | +## 3. Cosmos DB Store Creation |
| 96 | + |
| 97 | +Run `create_memory_store()` before relying on cloud operations. It creates the database plus the `memories`, `counter`, and `leases` containers. |
| 98 | + |
| 99 | +The memories container is created with: |
| 100 | + |
| 101 | +- hierarchical partition key on `user_id` and `thread_id` |
| 102 | +- vector index on `/embedding` |
| 103 | +- full-text index on `/content` |
| 104 | + |
| 105 | +If vector or full-text search fails after you change dimensions or indexing settings, create a fresh container with the desired configuration. The vector policy is fixed when a container is created, so editing the settings afterward does not update an existing container. |
| 106 | + |
| 107 | +Use `COSMOS_DB_THROUGHPUT_MODE=serverless` for the default setup. Use `autoscale` with `COSMOS_DB_AUTOSCALE_MAX_RU` when you need provisioned autoscale throughput. |
| 108 | + |
| 109 | +--- |
| 110 | + |
| 111 | +## 4. Embeddings And Search |
| 112 | + |
| 113 | +Embedding failures usually mean one of these is wrong: |
| 114 | + |
| 115 | +- `AI_FOUNDRY_ENDPOINT` |
| 116 | +- `AI_FOUNDRY_EMBEDDING_DEPLOYMENT_NAME` |
| 117 | +- `AI_FOUNDRY_EMBEDDING_DIMENSIONS` |
| 118 | +- Azure OpenAI / AI Services RBAC |
| 119 | + |
| 120 | +Hybrid search requires `search_terms`: pass it whenever you set `hybrid_search=True`. |
| 121 | + |
| 122 | +If search returns documents but scores look poor, check that records have an `embedding` field and that the query uses similar language to the stored memory content. |
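
To check stored records for the `embedding` problems described above, a helper like the following can be run over documents you have already read from the container. It is a minimal sketch that treats each document as a plain dictionary. `find_embedding_problems` is a hypothetical name, and `expected_dimensions` should be the value you configured in `AI_FOUNDRY_EMBEDDING_DIMENSIONS`.

```python
def find_embedding_problems(documents, expected_dimensions):
    """Return (document id, problem) pairs for missing or mis-sized embeddings."""
    problems = []
    for doc in documents:
        embedding = doc.get("embedding")
        if not embedding:
            problems.append((doc.get("id"), "missing embedding"))
        elif len(embedding) != expected_dimensions:
            problems.append(
                (doc.get("id"),
                 f"expected {expected_dimensions} dimensions, got {len(embedding)}")
            )
    return problems
```

An empty result means every document carries a vector of the configured length. Poor scores are then more likely a wording mismatch between query and content.
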
| 123 | + |
| 124 | +--- |
| 125 | + |
| 126 | +## 5. Durable Functions Processing |
| 127 | + |
| 128 | +Durable Functions processing requires the Functions host. |
| 129 | + |
| 130 | +Start local dependencies: |
| 131 | + |
| 132 | +```bash |
| 133 | +azurite --silent --location /tmp/azurite --debug /tmp/azurite/debug.log |
| 134 | +cd function_app |
| 135 | +func start |
| 136 | +``` |
| 137 | + |
| 138 | +The SDK does not post to a Function endpoint. With `DurableFunctionProcessor`, the SDK writes turns to Cosmos DB and the deployed Function App picks them up from the Cosmos DB change feed. For local testing, keep `function_app/local.settings.json` aligned with `.env` and confirm the Functions host starts the change feed trigger. |
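
One way to confirm that `function_app/local.settings.json` and `.env` are aligned is to compare the keys they share. The sketch below assumes the standard Azure Functions layout, where settings live under the top-level `"Values"` object, and a simple `KEY=VALUE` `.env` file. The function names are illustrative.

```python
import json


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are ignored."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_mismatches(dotenv_text, settings_text):
    """Return {key: (env_value, settings_value)} for shared keys that differ."""
    env = parse_dotenv(dotenv_text)
    settings = json.loads(settings_text).get("Values", {})
    return {
        key: (env[key], settings[key])
        for key in sorted(env.keys() & settings.keys())
        if env[key] != settings[key]
    }


# Typical use from the repository root:
# from pathlib import Path
# print(find_mismatches(Path(".env").read_text(),
#                       Path("function_app/local.settings.json").read_text()))
```
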
| 139 | + |
| 140 | +If orchestration polling times out, check the Functions logs first. The orchestration may still be running, or an activity may be waiting on Cosmos DB or the LLM endpoint. |
| 141 | + |
| 142 | +--- |
| 143 | + |
| 144 | +## 6. Change Feed Processing |
| 145 | + |
| 146 | +Automatic processing requires these settings in the Functions app or `local.settings.json`: |
| 147 | + |
| 148 | +```json |
| 149 | +"COSMOS_DB__accountEndpoint": "https://<account>.documents.azure.com:443/", |
| 150 | +"COSMOS_DB_ENDPOINT": "https://<account>.documents.azure.com:443/", |
| 151 | +"COSMOS_DB_DATABASE": "ai_memory", |
| 152 | +"COSMOS_DB_CONTAINER": "memories", |
| 153 | +"COSMOS_DB_COUNTERS_CONTAINER": "counter", |
| 154 | +"COSMOS_DB_LEASE_CONTAINER": "leases", |
| 155 | +"AI_FOUNDRY_ENDPOINT": "https://<account>.openai.azure.com/", |
| 156 | +"AI_FOUNDRY_CHAT_DEPLOYMENT_NAME": "gpt-4o-mini", |
| 157 | +"AI_FOUNDRY_EMBEDDING_DEPLOYMENT_NAME": "text-embedding-3-large", |
| 158 | +"THREAD_SUMMARY_EVERY_N": "5", |
| 159 | +"FACT_EXTRACTION_EVERY_N": "3", |
| 160 | +"USER_SUMMARY_EVERY_N": "10" |
| 161 | +``` |
| 162 | + |
| 163 | +Set a threshold to `"0"` to disable that processing type. |
| 164 | + |
| 165 | +Cosmos DB memory documents store their category in the JSON `type` field. Only documents with `type: "turn"` increment counters. Derived memories with `type: "summary"`, `type: "fact"`, or `type: "user_summary"` do not trigger threshold counts. |
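
To reason about whether enough turns have been inserted, the rules above can be modeled in a few lines. This is only an approximation: it assumes a plain "every N turns" modulo rule and a single counter, which may not match how the Function App actually tracks counts per thread or per user. Both function names are hypothetical.

```python
def count_turns(documents):
    """Count only documents whose `type` field is "turn"."""
    return sum(1 for doc in documents if doc.get("type") == "turn")


def due_processing(turn_count, thresholds):
    """Return the settings whose threshold is reached at this turn count.

    A threshold of 0 disables that processing type. The modulo rule is an
    assumption, not the toolkit's confirmed counter logic.
    """
    if turn_count <= 0:
        return []
    return [
        name
        for name, every_n in thresholds.items()
        if every_n > 0 and turn_count % every_n == 0
    ]


thresholds = {
    "THREAD_SUMMARY_EVERY_N": 5,
    "FACT_EXTRACTION_EVERY_N": 3,
    "USER_SUMMARY_EVERY_N": 10,
}
```

Under this model, a thread with three turns and one summary document has a turn count of 3, so only fact extraction would be due.
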
| 166 | + |
| 167 | +If nothing fires: |
| 168 | + |
| 169 | +- verify the Functions host shows the Cosmos DB trigger |
| 170 | +- confirm the `leases` container exists |
| 171 | +- confirm the `counter` container is writable |
| 172 | +- insert enough new turn documents to cross the configured threshold |
| 173 | +- check for generated documents whose JSON `type` field is `"summary"`, `"fact"`, or `"user_summary"` |
| 174 | + |
| 175 | +--- |
| 176 | + |
| 177 | +## 7. Async Client Notes |
| 178 | + |
| 179 | +Use async Azure credentials with the async client: |
| 180 | + |
| 181 | +```python |
| 182 | +from azure.identity.aio import DefaultAzureCredential |
| 183 | +from agent_memory_toolkit.aio import AsyncCosmosMemoryClient |
| 184 | +``` |
| 185 | + |
| 186 | +Always `await` cloud operations and close the client when done: |
| 187 | + |
| 188 | +```python |
| 189 | +await memory.close() |
| 190 | +``` |
| 191 | + |
| 192 | +Notebooks support top-level `await`, so call `await` directly in a cell. Do not wrap cells in `asyncio.run()`, which raises an error when an event loop is already running. |
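
The close-when-done advice can be sketched as a `try`/`finally` wrapper so the client is closed even when an operation raises. `StubMemoryClient` below is a hypothetical stand-in used only to make the example runnable; the only thing it shares with the real async client is an awaitable `close()`, as shown in this guide.

```python
import asyncio


class StubMemoryClient:
    """Hypothetical stand-in for the async client; only close() mirrors the guide."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


async def run_with_client(client, work):
    """Await work(client) and always close the client afterward."""
    try:
        return await work(client)
    finally:
        await client.close()


async def example_work(client):
    # Placeholder for awaited cloud operations on the client.
    await asyncio.sleep(0)
    return "done"


if __name__ == "__main__":
    # In a script, asyncio.run() starts the event loop.
    # In a notebook cell, use: await run_with_client(client, example_work)
    client = StubMemoryClient()
    print(asyncio.run(run_with_client(client, example_work)))
```
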