
Commit 68d618a

feat(memory): conflict surfacing Phases 2+3+4
Merge conflict surfacing, cloud sync hardening, semantic judge, and beta testing infrastructure.
2 parents 5f3329f + f51520d commit 68d618a

81 files changed

Lines changed: 20462 additions & 87 deletions

DOCS.md

Lines changed: 217 additions & 4 deletions
@@ -14,7 +14,7 @@ This is the complete technical reference for Engram. For getting started, see th
1414
|---------|-----------------|
1515
| [Database Schema](#database-schema) | Tables, FTS5, SQLite config |
1616
| [HTTP API](#http-api-endpoints) | All REST endpoints with request/response details |
17-
| [MCP Tools](#mcp-tools-17-tools) | Detailed reference for all 17 memory tools |
17+
| [MCP Tools](#mcp-tools-18-tools) | Detailed reference for all 18 memory tools |
1818
| [MCP Project Resolution](#mcp-project-resolution) | Auto-detection algorithm, response envelope, tool categories |
1919
| [Memory Protocol](#memory-protocol) | When/how agents should use the tools |
2020
| [Project Name Normalization](#project-name-normalization) | Auto-detection, normalization, similar-project warnings |
@@ -46,6 +46,8 @@ For other docs:
4646
- **user_prompts**: `id` (INTEGER PK AUTOINCREMENT), `session_id` (FK), `content`, `project`, `created_at`
4747
- **prompts_fts** — FTS5 virtual table synced via triggers (`content`, `project`)
4848
- **sync_chunks**: `target_key` (TEXT), `chunk_id` (TEXT), `imported_at`; composite PK (`target_key`, `chunk_id`) for target-scoped chunk tracking
49+
- **memory_relations** — stores conflict-surfacing verdicts from `mem_judge`; columns include `sync_id` (TEXT PK), `source_id`, `target_id`, `relation`, `judgment_status` (`pending` | `judged`), `reason`, `evidence`, `confidence`, `marked_by_actor`, `marked_by_kind`, `marked_by_model`, `session_id`, `project`. Syncs across machines via cloud autosync when the project is enrolled.
50+
- **sync_apply_deferred** — holds pulled mutations that could not be applied locally due to a missing FK dependency (e.g. relation references an observation not yet present); columns: `sync_id` (TEXT PK), `entity`, `payload`, `apply_status` (`deferred` | `applied` | `dead`), `retry_count`, `last_error`, `last_attempted_at`, `first_seen_at`. Rows with `apply_status='dead'` have exceeded the retry cap (5 attempts) and will not be retried automatically.
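
The retry-cap behavior described for `sync_apply_deferred` can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not Engram's actual migration: the column names come from the entry above, but the column types, the `CHECK` constraint, and the `record_failure` helper are assumptions made for the example.

```python
import sqlite3

RETRY_CAP = 5  # documented cap: rows go 'dead' after 5 failed attempts

# Assumed DDL: column names are from the reference; types and CHECK are guesses.
SCHEMA = """
CREATE TABLE sync_apply_deferred (
    sync_id           TEXT PRIMARY KEY,
    entity            TEXT NOT NULL,
    payload           TEXT NOT NULL,
    apply_status      TEXT NOT NULL DEFAULT 'deferred'
                      CHECK (apply_status IN ('deferred', 'applied', 'dead')),
    retry_count       INTEGER NOT NULL DEFAULT 0,
    last_error        TEXT,
    last_attempted_at TEXT,
    first_seen_at     TEXT
)
"""


def record_failure(conn, sync_id, error):
    """Hypothetical helper: count one failed attempt and mark the row dead at the cap."""
    conn.execute(
        """
        UPDATE sync_apply_deferred
           SET retry_count = retry_count + 1,
               last_error = ?,
               last_attempted_at = datetime('now'),
               apply_status = CASE WHEN retry_count + 1 >= ?
                                   THEN 'dead' ELSE 'deferred' END
         WHERE sync_id = ? AND apply_status = 'deferred'
        """,
        (error, RETRY_CAP, sync_id),
    )


def demo():
    """Fail the same row six times and return (apply_status, retry_count) after each."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO sync_apply_deferred (sync_id, entity, payload) "
        "VALUES ('rel-1', 'relation', '{}')"
    )
    history = []
    for _ in range(6):
        record_failure(conn, "rel-1", "source FK not found")
        history.append(
            conn.execute(
                "SELECT apply_status, retry_count FROM sync_apply_deferred "
                "WHERE sync_id = 'rel-1'"
            ).fetchone()
        )
    return history
```

In this sketch the fifth failure flips the row to `dead`, and the sixth call changes nothing because the `WHERE` clause only touches rows that are still `deferred`.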
4951

5052
### SQLite Configuration
5153

@@ -119,7 +121,7 @@ Engram is local-first: local SQLite is authoritative; cloud features are optiona
119121
- `200` when deleted
120122
- `404` when session does not exist
121123
- `409` when session still has observations (delete/migrate observations first)
122-
- `409` when the session's project is enrolled for cloud sync (session deletion is blocked to avoid local/cloud divergence)
124+
- For cloud-enrolled projects: returns `200` and additionally enqueues a `session/delete` mutation that propagates the deletion to cloud replicas
123125

124126
### Observations
125127

@@ -172,6 +174,139 @@ Engram is local-first: local SQLite is authoritative; cloud features are optiona
172174

173175
- `POST /projects/migrate` — Migrate observations between project names. Body: `{old_project, new_project}`
174176

177+
### Conflict Audit (admin — local runtime only)
178+
179+
These endpoints are served by `engram serve` on the local runtime only. They are not exposed on the cloud runtime. All routes are additive — no existing routes changed.
180+
181+
#### GET /conflicts
182+
183+
List `memory_relations` rows with optional filters.
184+
185+
Query params: `project` (string), `status` (string — `pending` | `judged`), `since` (RFC3339), `limit` (int, default 50, max 500 — silently clamped).
186+
187+
Response:
188+
```json
189+
{
190+
"relations": [
191+
{
192+
"id": 42,
193+
"sync_id": "rel-abc123",
194+
"source_id": 10,
195+
"target_id": 20,
196+
"relation": "conflicts_with",
197+
"judgment_status": "pending",
198+
"created_at": "2026-01-15T12:00:00Z"
199+
}
200+
],
201+
"total": 80
202+
}
203+
```
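
As a rough client-side illustration of the query parameters above, the following sketch builds a `GET /conflicts` URL. The host `127.0.0.1` and the helper name `conflicts_url` are assumptions; the port is the documented `ENGRAM_PORT` default. The clamp only mirrors the documented server-side maximum and is not required, since the server clamps silently.

```python
from urllib.parse import urlencode

MAX_LIMIT = 500  # documented maximum; the server clamps larger values silently


def conflicts_url(base_url="http://127.0.0.1:7437", project=None,
                  status=None, since=None, limit=50):
    """Build the GET /conflicts URL (hypothetical helper, no network I/O)."""
    if status is not None and status not in ("pending", "judged"):
        raise ValueError("status must be 'pending' or 'judged'")
    params = {}
    if project:
        params["project"] = project
    if status:
        params["status"] = status
    if since:
        params["since"] = since  # expected to be an RFC3339 timestamp
    params["limit"] = min(limit, MAX_LIMIT)
    return f"{base_url}/conflicts?{urlencode(params)}"
```

The resulting string can be passed to any HTTP client, for example `urllib.request.urlopen`, when `engram serve` is running locally.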
204+
205+
#### GET /conflicts/{relation_id}
206+
207+
Get full detail for one relation row, including source and target observation snippets.
208+
209+
- `200` with full relation + `source_snippet` + `target_snippet`
210+
- `404` with JSON `{"error": "not found"}` when `relation_id` does not exist
211+
- `400` with JSON error body when `relation_id` is not a valid integer
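
A caller might map the three documented status codes to outcomes along these lines. This is only a sketch: the function name is hypothetical, and the `error` key in the `400` body is an assumption (the reference documents that key only for `404`).

```python
import json


def interpret_relation_response(status, body):
    """Map a GET /conflicts/{relation_id} response to a Python value.

    Returns the decoded relation on 200 and None on 404; raises otherwise.
    """
    payload = json.loads(body) if body else {}
    if status == 200:
        return payload  # full relation plus source_snippet / target_snippet
    if status == 404:
        return None  # relation_id does not exist
    if status == 400:
        # Assumed shape: {"error": "..."}; only the 404 body is documented.
        raise ValueError(payload.get("error", "bad request"))
    raise RuntimeError(f"unexpected status {status}")
```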
212+
213+
#### GET /conflicts/stats
214+
215+
Aggregate counts for the project (or global when `project` query param is omitted).
216+
217+
Response:
218+
```json
219+
{
220+
"pending": 3,
221+
"accepted": 1,
222+
"rejected": 0,
223+
"deferred": 4,
224+
"dead": 1
225+
}
226+
```
227+
228+
#### POST /conflicts/scan
229+
230+
Run conflict candidate scan for a project. Synchronous.
231+
232+
Request body:
233+
```json
234+
{
235+
"project": "my-project",
236+
"apply": false,
237+
"max_insert": 100,
238+
"semantic": false,
239+
"concurrency": 5,
240+
"timeout_per_call_seconds": 60,
241+
"max_semantic": 100
242+
}
243+
```
244+
245+
- `apply: false` (default) — dry-run; reports candidates without inserting rows
246+
- `apply: true` — inserts new pending relation rows up to `max_insert` cap (default 100)
247+
- `semantic: true` — after FTS5 lexical scan, run LLM-judge semantic detection on candidate pairs. Requires `ENGRAM_AGENT_CLI` to be set on the server to `claude` or `opencode`.
248+
- `concurrency` — worker pool size for parallel LLM calls (default 5, range 1–20)
249+
- `timeout_per_call_seconds` — per-LLM-call timeout in seconds (default 60, range 1–600)
250+
- `max_semantic` — hard cap on LLM calls per scan (default 100); scan stops collecting new pairs once reached
251+
- Missing `project` field returns `400`
252+
- `concurrency` outside [1, 20] or `timeout_per_call_seconds` outside [1, 600] returns `400`
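
The validation rules above can be mirrored on the client before a request is sent. The sketch below is an assumption about how a caller might do that, not the server's implementation; `build_scan_request` is a hypothetical name, while the defaults and ranges are the documented ones.

```python
def build_scan_request(project, apply=False, max_insert=100, semantic=False,
                       concurrency=5, timeout_per_call_seconds=60,
                       max_semantic=100):
    """Return a POST /conflicts/scan body, rejecting inputs the server would answer with 400."""
    if not project:
        raise ValueError("project is required")
    if not 1 <= concurrency <= 20:
        raise ValueError("concurrency must be within [1, 20]")
    if not 1 <= timeout_per_call_seconds <= 600:
        raise ValueError("timeout_per_call_seconds must be within [1, 600]")
    return {
        "project": project,
        "apply": apply,  # False means dry-run: nothing is inserted
        "max_insert": max_insert,
        "semantic": semantic,  # needs ENGRAM_AGENT_CLI set on the server
        "concurrency": concurrency,
        "timeout_per_call_seconds": timeout_per_call_seconds,
        "max_semantic": max_semantic,
    }
```

Serializing the returned dict with `json.dumps` yields a body of the same shape as the request example above.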
253+
254+
Response:
255+
```json
256+
{
257+
"candidates_found": 5,
258+
"inserted": 0,
259+
"semantic_judged": 0,
260+
"semantic_skipped": 0,
261+
"semantic_errors": 0
262+
}
263+
```
264+
265+
`semantic_judged`, `semantic_skipped`, and `semantic_errors` are always present (zero when `semantic: false`).
266+
267+
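When `apply: true` and the `max_insert` cap is reached, the response also includes a `warning` field. The example below assumes `max_insert: 50` and omits the `semantic_*` fields for brevity: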
268+
```json
269+
{
270+
"candidates_found": 150,
271+
"inserted": 50,
272+
"warning": "cap reached: stopped after 50 inserts"
273+
}
274+
```
275+
276+
#### GET /conflicts/deferred
277+
278+
List rows from `sync_apply_deferred`. Query params: `status` (string — `deferred` | `dead` | `applied`), `limit` (int, default 50, max 500).
279+
280+
Response:
281+
```json
282+
{
283+
"rows": [
284+
{
285+
"sync_id": "obs_xyz",
286+
"entity": "relation",
287+
"apply_status": "deferred",
288+
"retry_count": 2,
289+
"last_error": "source FK not found",
290+
"created_at": "2026-01-15T12:00:00Z"
291+
}
292+
],
293+
"total": 3
294+
}
295+
```
296+
297+
#### POST /conflicts/deferred/replay
298+
299+
Calls `ReplayDeferred()` synchronously and returns counts of the rows processed.
300+
301+
Response:
302+
```json
303+
{
304+
"retried": 4,
305+
"succeeded": 3,
306+
"dead": 1
307+
}
308+
```
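
To make the three counters concrete, here is an in-memory sketch of a replay pass. It is not the real `ReplayDeferred()`: the `DeferredRow` type and the `try_apply` callback are hypothetical, and treating `dead` as "rows that became dead during this pass" is an assumption about the response semantics.

```python
from dataclasses import dataclass


@dataclass
class DeferredRow:
    sync_id: str
    apply_status: str = "deferred"
    retry_count: int = 0


def replay_deferred(rows, try_apply, retry_cap=5):
    """Retry every deferred row once and return retried/succeeded/dead counts."""
    counts = {"retried": 0, "succeeded": 0, "dead": 0}
    for row in rows:
        if row.apply_status != "deferred":
            continue  # applied and dead rows are not retried
        counts["retried"] += 1
        if try_apply(row):
            row.apply_status = "applied"
            counts["succeeded"] += 1
        else:
            row.retry_count += 1
            if row.retry_count >= retry_cap:
                row.apply_status = "dead"  # cap reached: no further retries
                counts["dead"] += 1
    return counts
```

With four deferred rows, three of which now apply cleanly and one that fails for the fifth time, the sketch returns the same counts as the response example above.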
309+
175310
### Sync Status (local runtime only)
176311

177312
- `GET /sync/status` — Runtime sync-state status for the local node (`engram serve` only).
@@ -185,6 +320,8 @@ Engram is local-first: local SQLite is authoritative; cloud features are optiona
185320
- `last_sync_at`
186321
- `reason_code`
187322
- `reason_message`
323+
- `deferred_count` — number of pulled mutations awaiting retry (FK dependency not yet local)
324+
- `dead_count` — number of pulled mutations that exhausted retries (5 failures) and will not be retried
188325
- `upgrade` (nested object)
189326
- `stage`
190327
- `reason_code`
@@ -202,6 +339,49 @@ Engram is local-first: local SQLite is authoritative; cloud features are optiona
202339
| `ENGRAM_PORT` | Override HTTP server port | `7437` |
203340
| `ENGRAM_PROJECT` | Override project name for MCP server | auto-detected via git |
204341

342+
### Conflict Audit CLI (admin)
343+
344+
The `engram conflicts` sub-command provides admin/maintainer access to the conflict layer. It is NOT for end users — end users interact with conflicts via the normal agent conversation flow.
345+
346+
When `--project` is omitted, the cwd-detected project is used.
347+
348+
```
349+
engram conflicts list [--project <name>] [--status <pending|judged>] [--since <RFC3339>] [--limit <N>]
350+
```
351+
List `memory_relations` rows. Output: label-colon aligned columns (relation_id, relation_type, judgment_status, created_at).
352+
353+
```
354+
engram conflicts show <relation_id>
355+
```
356+
Show full detail for one relation: relation_id, relation_type, judgment_status, sync_id, created_at, source observation snippet, target observation snippet. Exits non-zero when relation_id does not exist.
357+
358+
```
359+
engram conflicts stats [--project <name>]
360+
```
361+
Print aggregate counts: pending, accepted, rejected relation rows plus deferred and dead queue sizes.
362+
363+
```
364+
engram conflicts scan [--project <name>] [--dry-run] [--apply] [--max-insert <N>]
365+
[--semantic] [--concurrency <N>] [--timeout-per-call <N>]
366+
[--max-semantic <N>] [--yes]
367+
```
368+
Walk observations for the project, run FindCandidates, and report or insert new pending relation rows.
369+
- `--dry-run` (default): reports candidates found; 0 rows inserted.
370+
- `--apply`: inserts up to `--max-insert` (default 100) new rows; prints WARNING when cap is reached.
371+
- `--semantic`: enable LLM-judge semantic detection beyond FTS5 lexical candidates. Catches vocabulary-different concepts (e.g., "Hexagonal Architecture" vs "Ports and Adapters"). Requires `ENGRAM_AGENT_CLI=claude` or `ENGRAM_AGENT_CLI=opencode`.
372+
- `--concurrency N`: worker pool size for parallel LLM calls (default 5, max 20).
373+
- `--timeout-per-call N`: per-LLM-call timeout in seconds (default 60).
374+
- `--max-semantic N`: hard cap on LLM calls per scan run (default 100).
375+
- `--yes`: skip the cost-estimate confirmation prompt before LLM calls.
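
For scripted use, the flags above can be assembled programmatically. The sketch below only builds the argument list; the helper name `scan_argv` is hypothetical, and every flag it emits is taken from the usage line above.

```python
def scan_argv(project=None, apply=False, max_insert=None, semantic=False,
              concurrency=None, timeout_per_call=None, max_semantic=None,
              yes=False):
    """Build the argv for `engram conflicts scan` without running it."""
    argv = ["engram", "conflicts", "scan"]
    if project:
        argv += ["--project", project]  # omitted: the cwd-detected project is used
    argv.append("--apply" if apply else "--dry-run")
    if max_insert is not None:
        argv += ["--max-insert", str(max_insert)]
    if semantic:
        argv.append("--semantic")
    if concurrency is not None:
        argv += ["--concurrency", str(concurrency)]
    if timeout_per_call is not None:
        argv += ["--timeout-per-call", str(timeout_per_call)]
    if max_semantic is not None:
        argv += ["--max-semantic", str(max_semantic)]
    if yes:
        argv.append("--yes")  # skips the cost-estimate confirmation prompt
    return argv
```

The list can be handed to `subprocess.run(argv, check=True)`; for `--semantic` runs, `ENGRAM_AGENT_CLI` must be set in the environment passed to the process.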
376+
377+
```
378+
engram conflicts deferred [--project <name>] [--status <deferred|dead|applied>] [--limit <N>] [--inspect <sync_id>] [--replay]
379+
```
380+
Inspect or replay the `sync_apply_deferred` queue.
381+
- Default: list rows with sync_id, status, retry_count, created_at.
382+
- `--inspect <sync_id>`: print full decoded payload for one row; exits non-zero when not found.
383+
- `--replay`: call `ReplayDeferred()` and print retried/succeeded/dead counts.
384+
205385
### Cloud CLI (opt-in)
206386

207387
- `engram cloud status` — show current cloud config state plus auth/sync readiness without mutating local state
@@ -398,12 +578,25 @@ Returns success even when cwd is ambiguous — empty `project` + non-empty `avai
398578

399579
---
400580

401-
## MCP Tools (17 tools)
581+
## MCP Tools (18 tools)
402582

403583
### mem_search
404584

405585
Search persistent memory across all sessions. Supports FTS5 full-text search with type/project/scope/limit filters.
406586

587+
When an observation has judged relations in `memory_relations`, the result entry includes annotation lines immediately after the title/content block:
588+
589+
```
590+
supersedes: #<id> (<title>) — this memory supersedes another
591+
superseded_by: #<id> (<title>) — another memory supersedes this one
592+
conflicts: #<id> (<title>) — judged conflict with another memory
593+
conflict: contested by #<id> (pending) — pending (not yet judged)
594+
```
595+
596+
Multiple annotation lines appear when multiple relations apply — one per related observation. Titles are retrieved via JOIN (no N+1 queries). When the related observation has been deleted, `(deleted)` replaces the title. Agent parsers should match by prefix — these prefixes are stable across versions (REQ-012).
597+
598+
Pending relations (from `mem_save` conflict surfacing, before `mem_judge` is called) produce the `conflict: contested by #<id> (pending)` form. Judged relations produce the enriched form with title.
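
Since agent parsers are expected to match by prefix, a parser could look roughly like the sketch below. The `Annotation` type and function name are hypothetical, and the regular expressions assume the line shapes shown above; trailing text after the title is tolerated rather than parsed.

```python
import re
from typing import NamedTuple, Optional


class Annotation(NamedTuple):
    kind: str            # supersedes | superseded_by | conflicts | conflict
    related_id: int
    title: Optional[str]  # None for pending lines; "deleted" if the target is gone
    pending: bool


# Prefix-anchored patterns; re.match only matches at the start of the line.
_PENDING = re.compile(r"conflict: contested by #(\d+)")
_JUDGED = re.compile(r"(supersedes|superseded_by|conflicts): #(\d+)(?:\s+\((.*)\))?")


def parse_annotation(line):
    """Return an Annotation for a relation line, or None for any other line."""
    line = line.strip()
    m = _PENDING.match(line)
    if m:
        return Annotation("conflict", int(m.group(1)), None, True)
    m = _JUDGED.match(line)
    if m:
        return Annotation(m.group(1), int(m.group(2)), m.group(3), False)
    return None
```

Lines that do not start with one of the four prefixes return `None`, so the function can be applied to every line of a result entry.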
599+
407600
### mem_save
408601

409602
Save structured observations. The tool description teaches agents the format:
@@ -494,7 +687,27 @@ Parameters:
494687

495688
Re-judging an existing relation overwrites it (deliberate revision). Two agents judging the same pair persist as separate rows — Phase 1 surfaces both; cross-actor reconciliation is Phase 2.
496689

497-
Search results subsequently expose annotation lines like `supersedes: #<id>` and `superseded_by: #<id>` so the recalling agent sees relevant verdicts at-a-glance. The structured `supersedes[]`, `superseded_by[]`, and `conflicts[]` fields are also attached per result.
690+
Search results subsequently expose annotation lines like `supersedes: #<id> (<title>)`, `superseded_by: #<id> (<title>)`, and `conflicts: #<id> (<title>)` so the recalling agent sees relevant verdicts at-a-glance. For enrolled projects with autosync enabled, judgments propagate to other machines via the cloud mutation pipeline — the annotation appears in `mem_search` results on any machine that has pulled the relevant mutations.
691+
692+
### mem_compare
693+
694+
Records a verdict on a semantic comparison between two memories. The agent reads both memories, judges the relationship using its LLM reasoning, and calls `mem_compare` to persist the verdict. Unlike `mem_judge` (which resolves a pre-existing `pending` candidate surfaced by `mem_save`), `mem_compare` creates a new relation row directly — useful for proactive semantic analysis that goes beyond FTS5 lexical matching.
695+
696+
Available in the `agent` profile (`engram mcp --tools=agent`).
697+
698+
Parameters:
699+
- **memory_id_a** (required): int — observation ID of the first memory
700+
- **memory_id_b** (required): int — observation ID of the second memory
701+
- **relation** (required): string — one of `conflicts_with` | `supersedes` | `scoped` | `related` | `compatible` | `not_conflict`
702+
- **confidence** (required): float 0.0..1.0
703+
- **reasoning** (required): string — explanation of the verdict (max 200 chars)
704+
- **model** (optional): string — model name for provenance (e.g. `"claude-haiku-4-5"`)
705+
706+
Behavior:
707+
- Persists a relation row via `JudgeBySemantic` with system provenance (`marked_by_kind="system"`, `marked_by_actor="engram"`)
708+
- Idempotent: the same `(source_id, target_id)` pair updates the existing row rather than inserting a duplicate
709+
- `not_conflict` verdicts are no-ops — acknowledged but not persisted, matching the scan flow contract
710+
- Cross-project relations are rejected with an error
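
The parameter constraints and the idempotency rule can be sketched with an in-memory store. This is illustrative only and is not `JudgeBySemantic`: the dict-based store, the function name, and the mapping of `memory_id_a`/`memory_id_b` to `source_id`/`target_id` are assumptions, and the cross-project check is left out because it needs observation data.

```python
VALID_RELATIONS = {
    "conflicts_with", "supersedes", "scoped",
    "related", "compatible", "not_conflict",
}


def record_comparison(store, memory_id_a, memory_id_b, relation,
                      confidence, reasoning, model=None):
    """Validate a mem_compare call and upsert it into `store` (a plain dict)."""
    if relation not in VALID_RELATIONS:
        raise ValueError(f"unknown relation: {relation}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within 0.0..1.0")
    if len(reasoning) > 200:
        raise ValueError("reasoning is limited to 200 characters")
    if relation == "not_conflict":
        return None  # acknowledged but not persisted
    row = {
        "source_id": memory_id_a,  # assumed mapping: a is the source, b the target
        "target_id": memory_id_b,
        "relation": relation,
        "confidence": confidence,
        "reason": reasoning,
        "marked_by_kind": "system",
        "marked_by_actor": "engram",
        "marked_by_model": model,
    }
    store[(memory_id_a, memory_id_b)] = row  # same pair overwrites: no duplicates
    return row
```

Keying the store on the `(source_id, target_id)` pair is what makes a repeated call an update instead of a second row.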
498711

499712
---
500713
