Commit 539fd8a

feat: feedback system, adaptive retrieval, governance audit trail, API hardening, and comprehensive e2e tests (#65)
## What type of PR is this?

- [ ] feat (new feature)
- [ ] fix (bug fix)
- [ ] docs (documentation)
- [ ] style (formatting, no code change)
- [ ] refactor (code change that neither fixes a bug nor adds a feature)
- [ ] perf (performance improvement)
- [ ] test (adding or updating tests)
- [ ] chore (maintenance, tooling)
- [ ] build / ci (build or CI changes)

## Which issue(s) this PR fixes

Fixes #

## What this PR does / why we need it

#### 1. Feedback & Adaptive Retrieval System

- New `mem_retrieval_feedback` table for explicit relevance signals (useful/irrelevant/outdated/wrong)
- New `mem_user_retrieval_params` table for per-user adaptive scoring parameters
- `record_feedback()` validates signals, verifies memory ownership, updates denormalized counters
- `search_hybrid_from_scored()` applies feedback adjustment: `(1 + fw * (useful - 0.5*negative)).clamp(0.5, 2.0)`
- `DefaultScoringPlugin` auto-tunes feedback_weight based on signal ratios (≥10 feedback threshold)
- REST endpoints: `POST /v1/memories/:id/feedback`, `GET /v1/feedback/stats`, `GET /v1/feedback/by-tier`, `GET/PUT /v1/retrieval-params`, `POST /v1/retrieval-params/tune`

#### 2. Governance Audit Trail Enhancement

- Governance operations now record structured payloads: `{"quarantined": N}`, `{"cleaned_stale": N}`, etc.
- All 5 governance operations (archive_working, cleanup_stale, quarantine, compress_redundant, cleanup_orphaned_incrementals) include detailed audit logs
- `mem_edit_log` redesigned: `target_ids JSON` → `memory_id VARCHAR(64)` + `payload JSON`, no PK, `CLUSTER BY`, UUID v7 for edit_id

#### 3. API Error Handling Improvements

- New `MemoriaError::Validation` variant for input validation errors
- New `api_err_typed()` function maps error variants to proper HTTP status codes:
  - `NotFound` → 404
  - `Validation/InvalidMemoryType/InvalidTrustTier` → 422
  - `Blocked` → 403
  - Others → 500
- Applied to `record_feedback` and `store_memory` handlers

#### 4. Prometheus Metrics & Admin Config

- `GET /metrics` endpoint: Prometheus text exposition format with memoria_memories_total, memoria_users_total, memoria_auth_failures_total, etc.
- `GET /admin/config` (master-key-only): runtime config view with redacted DB password

#### 5. MCP Tool Surface Reduction

- 5 tools hidden from `list()` but still callable via REST/direct invocation: `memory_rebuild_index`, `memory_get_retrieval_params`, `memory_tune_params`, `memory_extract_entities`, `memory_link_entities`
- Tool count: 18 → 13 in public listing

#### 6. Comprehensive E2E Test Coverage

- **Existing fixes**: `test_api_feedback_invalid_signal` (422 for invalid signal), `test_api_tune_retrieval_params` (COALESCE fix for empty feedback)
- **New API tests**: `/metrics`, `/v1/snapshots/:name/rollback`, `/v1/entities`, `/admin/config` (with master-key auth)
- **Concurrency tests**: parallel stores, entity extraction race condition, concurrent feedback
- **Pressure tests**: batch store at 100-item limit
- **Graceful degradation**: nonexistent snapshot, feedback on deleted memory, correct after purge

#### 7. Documentation & Templates Sync

- All 8 markdown templates (Kiro, Cursor, Claude) updated with feedback/adaptive retrieval guidance
- Steering rules synchronized across all 3 agent platforms
- API reference and architecture skills updated

### Bug Fixes

- Fixed `get_feedback_stats()` NULL handling with COALESCE for empty feedback tables
- Fixed race condition in `upsert_entity()`: INSERT-first, catch "Duplicate entry" error
- Fixed `batch_upsert_memory_entity_links()` to use multi-row INSERT with ON DUPLICATE KEY UPDATE
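
As a rough illustration of the feedback adjustment quoted in the commit message, `(1 + fw * (useful - 0.5*negative)).clamp(0.5, 2.0)`, the sketch below restates the formula in Python. The function name `feedback_multiplier` is hypothetical, and treating `negative` as a single combined count of irrelevant/outdated/wrong signals is an assumption. The weight `0.1` used in the usage note is also an assumption, chosen only because it is consistent with the "~30% higher for 3 `useful` signals" figure in the steering docs. The commit message does not state the actual default.

```python
def feedback_multiplier(feedback_weight: float, useful: int, negative: int) -> float:
    """Score multiplier from feedback counts, clamped to the range [0.5, 2.0].

    Mirrors the expression in the commit message; `negative` is assumed to be
    one combined count of irrelevant/outdated/wrong signals.
    """
    raw = 1.0 + feedback_weight * (useful - 0.5 * negative)
    # Equivalent of Rust's .clamp(0.5, 2.0)
    return min(max(raw, 0.5), 2.0)
```

With an assumed weight of `0.1`, three `useful` signals and no negative ones give a multiplier of about 1.3, while very large counts in either direction saturate at the clamp bounds.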
1 parent 1701094 commit 539fd8a
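
The status-code mapping that the commit message describes for `api_err_typed()` can be pictured as a lookup with a 500 fallback. This is a Python sketch of that table only, not the Rust implementation. The enum `ErrorKind`, its member names, and `status_for` are hypothetical stand-ins for the `MemoriaError` variants named in the commit message.

```python
from enum import Enum, auto


class ErrorKind(Enum):
    """Hypothetical stand-ins for the MemoriaError variants in the commit message."""
    NOT_FOUND = auto()
    VALIDATION = auto()
    INVALID_MEMORY_TYPE = auto()
    INVALID_TRUST_TIER = auto()
    BLOCKED = auto()
    DATABASE = auto()  # example of an "other" variant with no explicit mapping


# Variant -> HTTP status, as listed in the commit message.
_STATUS_BY_KIND = {
    ErrorKind.NOT_FOUND: 404,
    ErrorKind.VALIDATION: 422,
    ErrorKind.INVALID_MEMORY_TYPE: 422,
    ErrorKind.INVALID_TRUST_TIER: 422,
    ErrorKind.BLOCKED: 403,
}


def status_for(kind: ErrorKind) -> int:
    """Return the HTTP status for an error kind; anything unmapped becomes 500."""
    return _STATUS_BY_KIND.get(kind, 500)
```

Keeping the fallback in one place means a newly added variant degrades to 500 until it is mapped explicitly.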

62 files changed

Lines changed: 5170 additions & 587 deletions

.kiro/steering/goal-driven-evolution.md

Lines changed: 36 additions & 23 deletions
@@ -1,18 +1,28 @@
 ---
-inclusion: agent_requested
+inclusion: always
 ---

 <!-- memoria-version: 0.1.0-->

-# Goal-Driven Iterative Evolution via Memory
+# Goal-Driven Evolution + Plan Integration

-Track goals, plans, progress, lessons, and user feedback across conversations. All content in English for consistent retrieval.
+Track goals, plans, progress, lessons, and user feedback across conversations. Integrates with plan panels (Kiro Shift+Tab, Cursor Composer, Claude multi-step).

-## Workflow
+## Before Starting Any Multi-Step Task

-### 1. Register Goal
+Query memory first:

-Check for duplicates first, then store:
+```
+memory_search(query="GOAL [topic]") # existing related goals
+memory_search(query="LESSON [topic]") # past learnings
+memory_search(query="CORRECTION ANTIPATTERN [topic]") # what NOT to do
+```
+
+If an active goal exists, continue it instead of creating a new one.
+
+## Register Goal
+
+For multi-session work (skip for trivial single-session tasks < 3 steps):

 ```
 memory_search(query="GOAL [keywords]")
@@ -22,14 +32,7 @@ memory_store(
 )
 ```

-### 2. Plan & Execute
-
-Before acting, search for past failures and user corrections to avoid repeating mistakes:
-
-```
-memory_search(query="CORRECTION ANTIPATTERN [goal name]")
-memory_search(query="❌ STEP for GOAL [name]")
-```
+## Plan & Execute

 Store the plan, then track each step:

@@ -41,16 +44,18 @@ memory_store(content="✅ STEP [N/total] for GOAL [name] (#X)\nAction: [done]\nR
 memory_store(content="❌ STEP [N/total] for GOAL [name] (#X)\nAction: [tried]\nError: [wrong]\nRoot Cause: [why]\nNext: [adjust]", memory_type="working")
 ```

+Only store non-obvious insights. Don't store "ran tests, passed".
+
 For high-risk iterations, isolate on a branch:
 ```
 memory_branch(name="goal_[name]_iter_[N]")
 memory_checkout(name="goal_[name]_iter_[N]")
 # work on branch... then validate and merge (see Iteration Review)
 ```

-### 3. Capture User Feedback (immediately, any time)
+## Capture User Feedback (immediately)

-User corrections are the highest-value signal — always store as `procedural`:
+User corrections are highest-value — always store as `procedural`:

 ```
 # User corrects direction
@@ -66,7 +71,7 @@ memory_store(content="⚠️ ANTIPATTERN for GOAL [name]: [what went wrong]. Rul
 memory_correct(query="GOAL: [name]", new_content="🎯 GOAL: [name]\n...\nPivot: [old] → [new]. Reason: [why]", reason="User changed direction")
 ```

-### 4. Iteration Review
+## Iteration Review

 When an iteration completes or is blocked:

@@ -78,7 +83,7 @@ memory_store(
 memory_type="procedural"
 )

-# If the insight is reusable beyond this goal, extract it now — don't wait for completion
+# If the insight is reusable beyond this goal, extract it now
 memory_store(content="💡 LESSON from [goal] iter #X: [cross-goal reusable insight]", memory_type="procedural")

 memory_correct(query="GOAL: [name]", new_content="🎯 GOAL: [name]\nStatus: ITERATION #X COMPLETE — [progress %]\nNext: [plan]", reason="iteration complete")
@@ -92,13 +97,13 @@ memory_merge(source="goal_[name]_iter_[N]", strategy="replace")
 memory_branch_delete(name="goal_[name]_iter_[N]")
 ```

-Starting the next iteration? The new PLAN must reference the previous RETRO's improvements:
+Starting the next iteration? Reference the previous RETRO's improvements:
 ```
 memory_search(query="RETRO for GOAL [name]")
-# Incorporate "Next: [improvements]" into the new plan — never repeat the same plan unchanged
+# Incorporate "Next: [improvements]" into the new plan
 ```

-### 5. New Conversation Bootstrap
+## New Conversation Bootstrap

 ```
 memory_search(query="GOAL ACTIVE")
@@ -108,7 +113,7 @@ memory_search(query="CORRECTION ANTIPATTERN [name]")

 Summarize to user: active goals, last progress, and any corrections to respect.

-### 6. Goal Completion & Cleanup
+## Goal Completion & Cleanup

 ```
 memory_correct(query="GOAL: [name]", new_content="🎯 GOAL: [name] — ✅ ACHIEVED\nIterations: [N]\nFinal approach: [what worked]", reason="Goal achieved")
@@ -120,10 +125,18 @@ memory_store(content="💡 LESSON from [goal]: [reusable insight for future work
 memory_purge(topic="STEP for GOAL [name]", reason="Goal achieved, archived in RETRO")
 ```

+## When Goal is Abandoned
+
+```
+memory_store(content="⚠️ ANTIPATTERN: [what didn't work]. Reason: [why abandoned]", memory_type="procedural")
+memory_correct(query="GOAL: [name]", new_content="🎯 GOAL: [name] — ❌ ABANDONED\nReason: [why]", reason="abandoned")
+```
+
 ## Rules

 - **Search before acting**: always check past failures, corrections, and antipatterns before proposing a plan
 - **User corrections override all**: if user corrected something, that correction has highest priority forever
-- **Be specific**: "Tests failed" is useless; "pytest fixtures don't work with async DB, use factory pattern" is valuable
+- **Be specific**: "pytest fixtures don't work with async DB, use factory pattern" > "tests failed"
+- **Don't create goals for quick fixes** (< 3 tasks, single session)
 - **Emoji prefixes**: 🎯 goal, 📋 plan, ✅❌ steps, 🔄 retro, 💡 lesson, 🔧 correction, 👍 feedback, ⚠️ antipattern
 - **Type discipline**: GOAL/PLAN/RETRO/LESSON/CORRECTION → `procedural`; STEP logs → `working`
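
The "Before Starting Any Multi-Step Task" step added in this file issues three searches per topic. A minimal sketch of that pre-flight check follows. The helper name `preflight`, the result-dictionary keys, and the `search` callable (standing in for the `memory_search` tool) are all hypothetical; the three query prefixes are taken from the diff above.

```python
from typing import Callable, Dict, List


def preflight(topic: str, search: Callable[[str], List[str]]) -> Dict[str, List[str]]:
    """Run the three pre-flight queries for a topic and group the results.

    `search` is a stand-in for the memory_search tool; it takes a query
    string and returns a list of matching memories.
    """
    queries = {
        "goals": f"GOAL {topic}",                         # existing related goals
        "lessons": f"LESSON {topic}",                     # past learnings
        "avoid": f"CORRECTION ANTIPATTERN {topic}",       # what NOT to do
    }
    return {label: search(query) for label, query in queries.items()}
```

If the `goals` entry comes back non-empty, the steering rule says to continue that goal rather than register a new one.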

.kiro/steering/memory-branching-patterns.md

Lines changed: 3 additions & 1 deletion
@@ -1,5 +1,7 @@
 ---
-inclusion: agent_requested
+inclusion: auto
+name: memory-branching
+description: Git-like branching for memory - isolated experiments, tech evaluation, A/B comparison. Use when exploring alternatives or risky changes.
 ---

 <!-- memoria-version: 0.1.0-->

.kiro/steering/memory-hygiene.md

Lines changed: 4 additions & 3 deletions
@@ -1,5 +1,7 @@
 ---
-inclusion: agent_requested
+inclusion: auto
+name: memory-hygiene
+description: Memory health management - governance triggers, contradiction resolution, snapshot cleanup. Use when memory seems noisy or contradictory.
 ---

 <!-- memoria-version: 0.1.0-->
@@ -17,7 +19,6 @@ Run `memory_governance` (1h cooldown) when you notice ANY of these:

 After governance, check the response for:
 - `snapshot_health.auto_ratio > 50%` → suggest `memory_snapshot_delete(prefix="auto:")`
-- `needs_rebuild = True` → run `memory_rebuild_index`
 - Quarantined memories → inform user what was quarantined and why

 ## Contradiction Resolution
@@ -52,7 +53,7 @@ Keep named snapshots the user created explicitly.

 ## Entity Graph Maintenance

-See entity graph triggers in [memory.md](memory.md) (proactive section). After extraction, if mode returns candidates, extract entities yourself and call `memory_link_entities` with the correct JSON format (see memory.md tool reference).
+Entity extraction is automatic — every `memory_store` triggers regex-based extraction, with LLM extraction as a fallback when configured. No manual intervention needed.

 ## Reflection Cadence
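
After this change, the post-governance checks in memory-hygiene.md reduce to two conditions. The sketch below shows one way to express them. The response shape is an assumption: a dict with a `snapshot_health.auto_ratio` fraction between 0 and 1 and a `quarantined` count. The helper name `governance_followups` is hypothetical, and the real `memory_governance` response format is not shown in this commit.

```python
from typing import Any, Dict, List


def governance_followups(response: Dict[str, Any]) -> List[str]:
    """Suggest follow-up actions from an (assumed) governance response dict."""
    actions: List[str] = []
    health = response.get("snapshot_health", {})
    # Assumed: auto_ratio is a fraction, so "> 50%" becomes > 0.5.
    if health.get("auto_ratio", 0.0) > 0.5:
        actions.append('suggest memory_snapshot_delete(prefix="auto:")')
    # Assumed: quarantined is a count of memories quarantined in this run.
    if response.get("quarantined", 0) > 0:
        actions.append("inform user what was quarantined and why")
    return actions
```

There is no `needs_rebuild` branch because the diff removes that check from the steering rule.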

.kiro/steering/memory.md

Lines changed: 40 additions & 28 deletions
@@ -9,19 +9,17 @@ inclusion: always
 You have persistent memory via MCP tools. Memory survives across conversations.

 ## 🔴 MANDATORY: Every conversation start
-Follow the bootstrap protocol in [session-lifecycle](session-lifecycle.md) — multi-query retrieval before your first response.

-At minimum, call `memory_retrieve` with a **semantic query** derived from the user's message BEFORE responding.
+Call `memory_retrieve` with a **semantic query** derived from the user's message BEFORE responding.

-**Query construction rules:**
--**DO**: Extract key concepts from user's question → "benchmark optimization", "graph retrieval bug", "active goals"
--**DON'T**: Use meta-queries → "all memories", "everything", "list all", "show me data"
-- When user asks "what do I know" or "我有哪些记忆", query the most recent active context instead (e.g., "recent goals tasks projects")
+**Query rules:**
+- ✅ Extract key concepts → "benchmark optimization", "graph retrieval bug"
+- ❌ Don't use meta-queries → "all memories", "everything", "list all"

 **After retrieval:**
-- If results come back → use them as **reference only**. Treat retrieved memories as potentially stale or incomplete — always verify against current context before acting on them. Do NOT blindly trust memory content as ground truth.
-- If "No relevant memories found" → this is normal for new users, proceed without.
-- If ⚠️ health warnings appear → inform the user and offer to run `memory_governance`.
+- Results → use as reference, verify against current context
+- "No relevant memories" → normal for new users, proceed
+- ⚠️ warnings → inform user, offer `memory_governance`

 ## 🔴 MANDATORY: Every conversation turn
 After responding, decide if anything is worth remembering:
@@ -91,6 +89,35 @@ Before storing a new memory, consider:
 | `memory_retrieve` | Conversation start, or when context is needed | `query`, `top_k` (default 5), `session_id` (optional), `explain` (false = no debug, true = show timing) |
 | `memory_search` | User asks "what do you know about X" or you need to browse | `query`, `top_k` (default 10), `explain` (false = no debug, true = show timing) |
 | `memory_profile` | User asks "what do you know about me" ||
+| `memory_feedback` | After using a retrieved memory, record if it was helpful | `memory_id`, `signal` (useful/irrelevant/outdated/wrong), `context` (optional) |
+
+**`memory_feedback`**: Call this after retrieval when you can assess whether a memory was helpful. Signals:
+- `useful` — memory helped answer the question or complete the task
+- `irrelevant` — memory was retrieved but not relevant to the query
+- `outdated` — memory contains stale information (consider `memory_correct` instead if you know the new value)
+- `wrong` — memory contains incorrect information (consider `memory_correct` instead if you know the correct value)
+
+**When to call feedback vs other tools**:
+- Memory helped → `memory_feedback(signal="useful")`
+- Memory irrelevant but correct → `memory_feedback(signal="irrelevant")`
+- Memory outdated and you know new value → `memory_correct` (not feedback)
+- Memory outdated but you don't know new value → `memory_feedback(signal="outdated")`
+- Memory wrong and you know correct value → `memory_correct` (not feedback)
+- Memory should be deleted → `memory_purge` (not feedback)
+
+**Example flow**:
+```
+# 1. Retrieve memories
+memories = memory_retrieve(query="database config")
+
+# 2. Use memories to answer user's question
+# ... (memory about "Uses PostgreSQL" helped answer)
+
+# 3. Record feedback for the helpful memory
+memory_feedback(memory_id="abc123", signal="useful", context="answered DB question")
+```
+
+**Impact**: Feedback accumulates over time. With default settings, a memory with 3 `useful` signals ranks ~30% higher in future retrievals. Don't call for every memory — only when you have clear signal.

 **`memory_retrieve` vs `memory_search`**: In MCP mode, both use the same retrieval pipeline (graph → hybrid vector+fulltext → fulltext fallback). The differences are:
 - `memory_retrieve` accepts `session_id` for session-scoped boosting; `memory_search` does not
@@ -124,33 +151,18 @@ When `memory_governance` reports snapshot_health with high auto_ratio (>50%), su
 ### Branches (isolated experiments)
 Git-like workflow for memory. `memory_branch(name)` creates, `memory_checkout(name)` switches, `memory_diff(source)` previews changes, `memory_merge(source)` merges back, `memory_branch_delete(name)` cleans up. `memory_branches()` lists all.

-### Entity graph (proactive — call when conditions are met)
-| Tool | When to call | Key params |
-|------|-------------|------------|
-| `memory_extract_entities` | **Proactively** after storing ≥ 5 new memories in a session, OR when user discusses a new project/technology/person not yet in the graph | `mode` (default: auto) |
-| `memory_link_entities` | After `extract_entities(mode='candidates')` returns memories — extract entities yourself, then call this | `entities` (JSON string) |
-
-**Trigger heuristics — call `memory_extract_entities` when ANY of these are true:**
-- You stored ≥ 5 memories this session and haven't extracted entities yet
-- User mentions a project, technology, or person by name that you haven't seen in previous `memory_retrieve` results
-- User asks about relationships between concepts ("how does X relate to Y")
-- User starts working on a new codebase or topic area
-
-**Do NOT extract entities when:**
-- Conversation is short (< 3 turns) and no new named entities appeared
-- User is only asking questions, not sharing new information
-- You already ran extraction this session
+### Entity graph
+Entity extraction is automatic — every `memory_store` triggers regex-based extraction, with LLM extraction as a fallback when configured. No manual intervention needed.

 ### Maintenance (proactive triggers in [memory-hygiene](memory-hygiene.md), manual triggers below)
 | Tool | Trigger phrase | Cooldown |
 |------|---------------|----------|
 | `memory_governance` | "clean up memories", "check memory health", or proactively per [memory-hygiene](memory-hygiene.md) | 1 hour |
 | `memory_consolidate` | "check for contradictions", "fix conflicts" | 30 min |
 | `memory_reflect` | "find patterns", "summarize what you know" | 2 hours |
-| `memory_rebuild_index` | Only when governance reports `needs_rebuild=True` ||
 | `memory_snapshot_delete` | When governance reports high snapshot auto_ratio, or user asks to clean snapshots ||

-`memory_reflect` and `memory_extract_entities` support `mode` parameter:
+`memory_reflect` supports `mode` parameter:
 - `auto` (default): uses Memoria's internal LLM if configured, otherwise returns candidates for YOU to process
-- `candidates`: always returns raw data for YOU to synthesize/extract, then store results via `memory_store` or `memory_link_entities`
+- `candidates`: always returns raw data for YOU to synthesize, then store results via `memory_store`
 - `internal`: always uses Memoria's internal LLM (fails if not configured)
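
The "When to call feedback vs other tools" rules added to memory.md amount to a small decision table. One possible encoding follows. The function name `route_assessment`, the `"delete"` assessment label, and the `(tool, signal)` return shape are hypothetical; the tool names and signal values come from the diff. Routing `wrong` without a known correction to `memory_feedback(signal="wrong")` is inferred from the signal descriptions, since the rules list does not name that case.

```python
from typing import Optional, Tuple

FEEDBACK_SIGNALS = ("useful", "irrelevant", "outdated", "wrong")


def route_assessment(assessment: str, know_replacement: bool = False) -> Tuple[str, Optional[str]]:
    """Map an assessment of a retrieved memory to (tool name, feedback signal).

    The signal is None for tools that are not memory_feedback.
    """
    if assessment == "delete":
        return ("memory_purge", None)
    # Outdated or wrong with a known new value: correct it instead of giving feedback.
    if assessment in ("outdated", "wrong") and know_replacement:
        return ("memory_correct", None)
    if assessment in FEEDBACK_SIGNALS:
        return ("memory_feedback", assessment)
    raise ValueError(f"unknown assessment: {assessment!r}")
```

For example, `route_assessment("outdated", know_replacement=True)` selects `memory_correct`, while the same assessment without a replacement falls through to `memory_feedback` with the `outdated` signal.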

.kiro/steering/session-lifecycle.md

Lines changed: 3 additions & 1 deletion
@@ -1,5 +1,7 @@
 ---
-inclusion: always
+inclusion: auto
+name: session-lifecycle
+description: Detailed session lifecycle management - bootstrap, mid-session re-retrieval, wrap-up cleanup. Use when starting conversations or managing session state.
 ---

 <!-- memoria-version: 0.1.0-->

README.md

Lines changed: 14 additions & 5 deletions
@@ -29,7 +29,7 @@ Every memory change is tracked, auditable, and reversible — snapshots, branche
 - **Self-maintaining** — built-in governance detects contradictions, quarantines low-confidence memories
 - **Private by default** — local embedding model option, no data leaves your machine

-**Supported Agents:** [Kiro](https://kiro.dev) · [Cursor](https://cursor.sh) · [Claude Code](https://docs.anthropic.com/en/docs/claude-code) · [OpenClaw](plugins/openclaw/README.md) · Any MCP-compatible agent
+**Supported Agents:** [Kiro](https://kiro.dev) · [Cursor](https://cursor.sh) · [Claude Code](https://docs.anthropic.com/en/docs/claude-code) · [Codex](https://openai.com/index/introducing-codex/) · [OpenClaw](plugins/openclaw/README.md) · Any MCP-compatible agent

 **Storage Backend:** [MatrixOne](https://github.com/matrixorigin/matrixone) — Distributed database with native vector indexing

@@ -74,7 +74,7 @@ cd your-project
 memoria init -i # Interactive wizard (recommended)
 ```

-This creates MCP config + steering rules for your AI tool (Kiro, Cursor, or Claude).
+This creates MCP config + steering rules for your AI tool (Kiro, Cursor, Claude, or Codex).

 ### 🦞 OpenClaw Plugin (Already Using OpenClaw?)

@@ -179,6 +179,7 @@ AI: → memory_branch(name="eval_sqlite")
 - Kiro: `.kiro/steering/*.md`
 - Cursor: `.cursor/rules/*.mdc`
 - Claude: `.claude/rules/*.md`
+- Codex: `AGENTS.md`

 ### Update Rules

@@ -197,20 +198,27 @@ memoria rules --force
 |------|-------------|
 | `memory_store` | Store a new memory |
 | `memory_retrieve` | Retrieve relevant memories (call at conversation start) |
+| `memory_search` | Semantic search across all memories |
 | `memory_correct` | Update an existing memory |
 | `memory_purge` | Delete by ID or topic keyword |
-| `memory_search` | Semantic search across all memories |
+| `memory_list` | List active memories |
 | `memory_profile` | Get user's memory-derived profile |
+| `memory_feedback` | Record relevance feedback (useful/irrelevant/outdated/wrong) |
+| `memory_capabilities` | List available memory tools |

 ### Snapshots & Branches

 | Tool | Description |
 |------|-------------|
 | `memory_snapshot` | Create named snapshot |
+| `memory_snapshots` | List snapshots with pagination |
+| `memory_snapshot_delete` | Delete snapshots by name, prefix, or age |
 | `memory_rollback` | Restore to snapshot |
 | `memory_branch` | Create isolated branch |
+| `memory_branches` | List all branches |
 | `memory_checkout` | Switch branch |
 | `memory_merge` | Merge branch back |
+| `memory_branch_delete` | Delete a branch |
 | `memory_diff` | Preview merge changes |

 ### Maintenance
@@ -220,7 +228,8 @@ memoria rules --force
 | `memory_governance` | Quarantine low-confidence memories (1h cooldown) |
 | `memory_consolidate` | Detect contradictions (30min cooldown) |
 | `memory_reflect` | Synthesize insights (2h cooldown) |
-| `memory_extract_entities` | Build entity graph |
+
+> `memory_rebuild_index`, `memory_observe`, `memory_get_retrieval_params`, `memory_tune_params`, `memory_extract_entities`, and `memory_link_entities` are available via REST API but hidden from MCP tool listing — they are ops/debug tools not intended for agent use.

 Full API details: [API Reference Skill](skills/api-reference/SKILL.md)

@@ -257,7 +266,7 @@ If you're an AI agent helping a user set up Memoria:

 1. **Load the [Setup Skill](skills/setup/SKILL.md)** — it has step-by-step instructions
 2. **Ask before acting**:
-- Which AI tool? (Kiro / Cursor / Claude)
+- Which AI tool? (Kiro / Cursor / Claude / Codex)
 - MatrixOne database? (Docker / Cloud / existing)
 - Embedding service? (OpenAI / SiliconFlow / local)
 3. **Run `memoria init -i`** in the user's project directory
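
The README note about tools that are "hidden from MCP tool listing" but still callable describes a listing filter rather than an access restriction. The sketch below illustrates that idea in general terms. `ToolRegistry`, `list_tools`, and `call` are hypothetical names, the two lambda tools are placeholders, and none of this reflects how Memoria actually registers its tools.

```python
from typing import Any, Callable, Dict, FrozenSet, List


class ToolRegistry:
    """Hypothetical registry: hidden tools are left out of listings but stay callable."""

    def __init__(self, tools: Dict[str, Callable[..., Any]], hidden: FrozenSet[str]) -> None:
        self._tools = dict(tools)
        self._hidden = hidden

    def list_tools(self) -> List[str]:
        """Names shown to agents; hidden names are filtered out."""
        return sorted(name for name in self._tools if name not in self._hidden)

    def call(self, name: str, **params: Any) -> Any:
        """Direct invocation works for any registered tool, hidden or not."""
        return self._tools[name](**params)


registry = ToolRegistry(
    tools={
        "memory_store": lambda content: f"stored: {content}",      # placeholder behavior
        "memory_rebuild_index": lambda: "index rebuilt",           # placeholder behavior
    },
    hidden=frozenset({"memory_rebuild_index"}),
)
```

Here `registry.list_tools()` omits `memory_rebuild_index`, yet `registry.call("memory_rebuild_index")` still runs it, which matches the listed-versus-callable distinction the note draws.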
