Commit 7708622

✨ feat(app): add ingest-imports CLI command for batch JSONL ingestion
- Add `codex-mem ingest-imports` command to import watcher/relay note events through newline-delimited JSONL input
- Add operator-facing import-ingestion.md guide documenting JSONL schema, example invocations, and output shape
- Update normative spec with `memory_save_import` and `memory_save_imported_note` tool contracts plus example payloads
- Add app-level test coverage for text and JSON output formats
- Update README and docs index links to reference new ingestion guide
1 parent c93a2c5 commit 7708622

11 files changed

Lines changed: 861 additions & 0 deletions

README.md

Lines changed: 6 additions & 0 deletions
@@ -10,6 +10,7 @@ It stores structured notes and handoffs in SQLite, restores continuity across re
 - `serve-http` runs a native MCP HTTP server for remote or private deployment.
 - `doctor` reports config, database readiness, migration status, provenance coverage, and MCP tool availability.
 - AGENTS template installation is implemented for global and project workflows.
+- one-shot watcher/relay batch ingestion is available through `ingest-imports`.

 Normative product docs live in [docs/spec/README.md](docs/spec/README.md).
 Go implementation docs now live under [docs/go/README.md](docs/go/README.md), grouped into user, operator, and maintainer directories.
@@ -22,6 +23,8 @@ Use the docs by audience:
   How memory works, what gets saved, and prompt patterns for normal Codex usage.
 - [Operator docs](docs/go/operator/README.md)
   Client registration, deployment/readiness, packaging, and troubleshooting.
+- [Import ingestion guide](docs/go/operator/import-ingestion.md)
+  JSONL batch ingestion for watcher and relay artifacts through `ingest-imports`.
 - [Maintainer docs](docs/go/maintainer/README.md)
   Source-tree MCP integration, implementation planning, and development tracking.

@@ -73,6 +76,8 @@ They are not MCP tools and are not the normal end-user interaction path.
   Prints effective config plus runtime readiness and audit diagnostics.
 - `codex-mem doctor --json`
   Prints the same diagnostics in machine-readable JSON for automation or CI checks.
+- `codex-mem ingest-imports --source watcher_import [--input events.jsonl] [--json]`
+  Imports newline-delimited watcher or relay note events into durable imported notes plus audit records.
 - `codex-mem migrate`
   Opens the configured SQLite database and applies embedded migrations.
 - `codex-mem serve`
@@ -113,6 +118,7 @@ The current MCP server exposes:
 Request and response examples are documented in [example-payloads.md](docs/spec/appendices/example-payloads.md).
 For concrete packaged-binary client setup examples, use [client-examples.md](docs/go/operator/client-examples.md).
 For maintainer-oriented MCP transport and smoke-test guidance from the source tree, use [mcp-integration.md](docs/go/maintainer/mcp-integration.md).
+For operator-facing JSONL batch ingestion details, use [import-ingestion.md](docs/go/operator/import-ingestion.md).
 For a quick explanation of how memory works, what gets saved, and when scope matters, use [how-memory-works.md](docs/go/user/how-memory-works.md).
 For end-user prompt templates that cause Codex to pick the memory tools automatically, use [prompt-examples.md](docs/go/user/prompt-examples.md).
 For release packaging and operator guidance, use [release-readiness.md](docs/go/operator/release-readiness.md).

docs/go/README.md

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ If you are using `codex-mem` day to day:
 If you are deploying or operating the MCP server:

 - [Client Examples](./operator/client-examples.md)
+- [Import Ingestion](./operator/import-ingestion.md)
 - [Release Readiness](./operator/release-readiness.md)
 - [Troubleshooting](./operator/troubleshooting.md)

docs/go/maintainer/development-tracker.md

Lines changed: 13 additions & 0 deletions
@@ -301,6 +301,19 @@ Current blockers:
 - In progress: none.
 - Blockers: none.
 - Next step: decide whether to keep polishing the import workflow through spec-facing docs and watcher/relay integration, or move on to a different product-facing slice.
+### 2026-03-16 Session Update
+
+- Completed: Added a real CLI-side caller for the imported-note workflow with `codex-mem ingest-imports`. The command resolves scope, starts one ingestion session, reads newline-delimited JSON import events from stdin or `--input`, and routes each event through `memory_save_imported_note` semantics so watcher/relay batches can create imported notes plus audit records without going through MCP. Added app-level coverage for text/JSON summaries and persisted note/import counts.
+- In progress: none.
+- Blockers: none.
+- Next step: decide whether to keep this CLI batch path as the main watcher/relay bridge for now, or add a more direct integration path on top of the same imported-note service.
+### 2026-03-16 Session Update
+
+- Completed: Updated the normative spec to include `memory_save_import` and `memory_save_imported_note` tool contracts plus example request/response payloads, and added an operator-facing [import-ingestion.md](../operator/import-ingestion.md) guide that documents the `ingest-imports` JSONL schema, example invocations, output shape, and current fail-fast batch semantics.
+- In progress: none.
+- Blockers: none.
+- Next step: decide whether `ingest-imports` should remain the main watcher/relay bridge for now or whether a more direct long-lived integration path is worth adding later.
+
 ## Recommended Next Step

 Recommended next implementation slice:

docs/go/operator/README.md

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@ Start here:

 - [Client Examples](./client-examples.md)
   Real MCP client registration examples for local stdio and remote HTTP.
+- [Import Ingestion](./import-ingestion.md)
+  JSONL batch ingestion for watcher or relay artifacts through `ingest-imports`.
 - [Release Readiness](./release-readiness.md)
   Packaging, readiness, and release checklist.
 - [Troubleshooting](./troubleshooting.md)

docs/go/operator/import-ingestion.md

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
+# Go Import Ingestion
+
+## Purpose
+
+This document explains how operators can use `codex-mem ingest-imports` to turn watcher or relay batches into durable imported notes plus import audit records.
+
+Audience:
+
+- operators wiring watcher or relay output into `codex-mem`
+- maintainers validating packaged-binary ingestion behavior
+
+Use this when:
+
+- you need a one-shot batch bridge into the imported-note workflow
+- your upstream process can emit newline-delimited JSON events
+
+Do not use this for:
+
+- normal day-to-day Codex prompting
+- direct MCP tool calls from a client
+
+## Command Shape
+
+Minimal stdin example:
+
+```powershell
+Get-Content .\events.jsonl | codex-mem.exe ingest-imports --source watcher_import
+```
+
+Read from a file and print JSON:
+
+```powershell
+codex-mem.exe ingest-imports --source relay_import --input .\relay-events.jsonl --json
+```
+
+Useful flags:
+
+- `--source watcher_import|relay_import`
+  Required. Declares the provenance source for every event in the batch.
+- `--input <path>`
+  Optional. Reads JSONL from a file instead of stdin.
+- `--cwd <path>`
+  Optional. Resolves scope from a specific workspace root.
+- `--branch-name <name>`
+  Optional. Carries branch metadata into the ingestion session.
+- `--repo-remote <url>`
+  Optional. Strengthens scope resolution with the repository remote.
+- `--task <text>`
+  Optional. Overrides the default ingestion session task summary.
+- `--json`
+  Optional. Prints a structured report instead of line-oriented text output.
+
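A wrapper script may want to assemble this invocation programmatically. The following Python sketch builds an argument list from the flags listed in the guide and nothing else. The helper name `build_ingest_argv` and the default binary name `codex-mem` are illustrative assumptions, not part of the tool.

```python
from typing import List, Optional

# Sources accepted by --source, per the flag list in the guide.
ALLOWED_SOURCES = ("watcher_import", "relay_import")


def build_ingest_argv(
    source: str,
    input_path: Optional[str] = None,
    cwd: Optional[str] = None,
    branch_name: Optional[str] = None,
    repo_remote: Optional[str] = None,
    task: Optional[str] = None,
    json_output: bool = False,
    binary: str = "codex-mem",  # assumption: the binary is on PATH under this name
) -> List[str]:
    """Return an argv list for `ingest-imports` using only the documented flags."""
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"--source must be one of {ALLOWED_SOURCES}, got {source!r}")
    argv = [binary, "ingest-imports", "--source", source]
    optional_flags = (
        ("--input", input_path),
        ("--cwd", cwd),
        ("--branch-name", branch_name),
        ("--repo-remote", repo_remote),
        ("--task", task),
    )
    for flag, value in optional_flags:
        # Only emit a flag when the caller supplied a value for it.
        if value is not None:
            argv.extend([flag, value])
    if json_output:
        argv.append("--json")
    return argv
```

The resulting list can be handed to `subprocess.run`, with the JSONL text supplied on stdin when `input_path` is omitted.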
+## Event Schema
+
+Each non-empty line must be one JSON object.
+
+Required fields:
+
+- `type`
+  Canonical note type: `decision`, `bugfix`, `discovery`, `constraint`, `preference`, or `todo`.
+- `title`
+  Short imported note title.
+- `content`
+  Durable imported note body.
+- `importance`
+  Integer importance from `1` to `5`.
+
+At least one of:
+
+- `external_id`
+  Stable upstream artifact id used for import dedupe.
+- `payload_hash`
+  Stable content hash used when no external id exists.
+
+Optional fields:
+
+- `tags`
+  String array of note tags.
+- `file_paths`
+  String array of touched or relevant paths.
+- `related_project_ids`
+  String array of related project ids for cross-project retrieval links.
+- `status`
+  Note lifecycle state. Defaults to `active` when omitted.
+- `privacy_intent`
+  When set to `private`, `do_not_store`, or `ephemeral_only`, the import is audited but note materialization is suppressed.
+
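An upstream producer could check each event against these rules before emitting it. The sketch below is one way to do that in Python, not the validation `codex-mem` itself performs. The function name `validate_event` is hypothetical, and treating an empty `external_id` or `payload_hash` string as absent is an assumption.

```python
from typing import Any, Dict, List

NOTE_TYPES = {"decision", "bugfix", "discovery", "constraint", "preference", "todo"}
REQUIRED_FIELDS = ("type", "title", "content", "importance")
STRING_ARRAY_FIELDS = ("tags", "file_paths", "related_project_ids")


def validate_event(event: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the event fits the documented schema."""
    problems: List[str] = []
    for field in REQUIRED_FIELDS:
        if field not in event:
            problems.append(f"missing required field: {field}")
    if "type" in event:
        note_type = event["type"]
        if not isinstance(note_type, str) or note_type not in NOTE_TYPES:
            problems.append(f"unknown note type: {note_type!r}")
    if "importance" in event:
        importance = event["importance"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if (
            isinstance(importance, bool)
            or not isinstance(importance, int)
            or not 1 <= importance <= 5
        ):
            problems.append("importance must be an integer from 1 to 5")
    # Assumption: an empty string counts as "not provided".
    if not event.get("external_id") and not event.get("payload_hash"):
        problems.append("need at least one of external_id or payload_hash")
    for field in STRING_ARRAY_FIELDS:
        value = event.get(field)
        if value is not None and not (
            isinstance(value, list) and all(isinstance(item, str) for item in value)
        ):
            problems.append(f"{field} must be an array of strings")
    return problems
```

Returning every problem at once, instead of raising on the first, lets a producer log all defects in an event before deciding whether to drop it.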
+## Example JSONL
+
+```jsonl
+{"external_id":"watcher:1","type":"discovery","title":"Imported discovery","content":"Useful watcher discovery.","importance":4,"tags":["watcher"]}
+{"external_id":"watcher:2","type":"todo","title":"Private follow-up","content":"Should stay audit-only.","importance":3,"privacy_intent":"private"}
+```
+
+Behavior to expect from this batch:
+
+- the first event creates an imported durable note plus an import audit record
+- the second event creates only a suppressed import audit record
+
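The expected behavior for this batch follows from the `privacy_intent` rule in the schema. As a rough illustration, the Python sketch below serializes events to JSONL and predicts which ones would be audit-only. Both helper names are hypothetical, and the prediction deliberately ignores deduplication, which can change the outcome of a real run.

```python
import json
from typing import Any, Dict, Iterable

# privacy_intent values that the schema documents as audit-only.
SUPPRESSING_INTENTS = ("private", "do_not_store", "ephemeral_only")


def expected_outcome(event: Dict[str, Any]) -> str:
    """Predict whether an event should materialize a note or only be audited."""
    if event.get("privacy_intent") in SUPPRESSING_INTENTS:
        return "suppressed"
    return "materialized"


def to_jsonl(events: Iterable[Dict[str, Any]]) -> str:
    """Serialize events as newline-delimited JSON, one compact object per line."""
    return "".join(json.dumps(event, separators=(",", ":")) + "\n" for event in events)
```

Applied to the two example events, `expected_outcome` yields `materialized` for the first and `suppressed` for the second, which matches the behavior described in the guide.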
+## Output Semantics
+
+Text mode prints a compact summary such as:
+
+```text
+ingest imports ok
+source=watcher_import
+input=stdin
+session_id=sess_20260316_001
+resolved_by=repo_remote
+processed=2
+materialized=1
+suppressed=1
+note_deduplicated=0
+import_deduplicated=0
+warnings=1
+```
+
+JSON mode returns the same summary plus per-line results, including the created or reused `note_id` and `import_id`.
+
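For quick scripting against text mode, the summary can be split into a status line and `key=value` pairs. This Python sketch assumes the output always looks like the example in the guide (one status line followed by `key=value` lines), which the guide does not promise. The function name `parse_text_summary` is hypothetical.

```python
from typing import Dict


def parse_text_summary(output: str) -> Dict[str, str]:
    """Split text-mode output into a status line and key=value fields."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    summary: Dict[str, str] = {"status": lines[0] if lines else ""}
    for line in lines[1:]:
        key, sep, value = line.partition("=")
        if sep:  # skip anything that is not key=value
            summary[key] = value
    return summary
```

All values stay strings, so counters such as `processed` need an explicit `int(...)` conversion. For automation, `--json` is probably the sturdier choice, but its field names are not listed in the guide, so none are assumed here.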
+## Operational Notes
+
+- `ingest-imports` starts one fresh session for the whole batch after resolving scope.
+- Each event uses the same imported-note workflow as `memory_save_imported_note`.
+- Existing explicit memory wins over weaker imported duplicates in the same project.
+- The current implementation is fail-fast: the first invalid line stops the batch and returns an error.
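
Because the batch is fail-fast, it can be worth checking a file for malformed lines before handing it to the command. The Python sketch below covers only the rule that each non-empty line must be one JSON object; it does not reproduce the field-level checks. The function name `first_invalid_line` is hypothetical, and skipping whitespace-only lines is an assumption about what counts as "non-empty".

```python
import json
from typing import Optional, Tuple


def first_invalid_line(jsonl_text: str) -> Optional[Tuple[int, str]]:
    """Return (1-based line number, reason) for the first bad line, or None."""
    for number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: whitespace-only lines are not events
        try:
            value = json.loads(line)
        except json.JSONDecodeError as exc:
            return number, f"not valid JSON: {exc.msg}"
        if not isinstance(value, dict):
            return number, "line is valid JSON but not an object"
    return None
```

Stopping at the first bad line mirrors the command's own behavior, so the reported line number should be the one an operator would need to fix first.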

docs/spec/appendices/example-payloads.md

Lines changed: 118 additions & 0 deletions
@@ -252,6 +252,124 @@ These examples illustrate intent and semantics. They are not tied to one impleme
 }
 ```

+## `memory_save_import`
+
+### Example request
+
+```json
+{
+  "scope": {
+    "system_id": "sys_order_platform",
+    "project_id": "proj_order_web",
+    "workspace_id": "ws_order_web_main"
+  },
+  "session_id": "sess_20260313_001",
+  "source": "watcher_import",
+  "external_id": "watcher:event:392",
+  "payload_hash": "sha256:4db5b7c6f2a9",
+  "privacy_intent": ""
+}
+```
+
+### Example response
+
+```json
+{
+  "ok": true,
+  "data": {
+    "import": {
+      "import_id": "import_701",
+      "scope": {
+        "system_id": "sys_order_platform",
+        "project_id": "proj_order_web",
+        "workspace_id": "ws_order_web_main"
+      },
+      "session_id": "sess_20260313_001",
+      "source": "watcher_import",
+      "external_id": "watcher:event:392",
+      "payload_hash": "sha256:4db5b7c6f2a9",
+      "suppressed": false,
+      "imported_at": "2026-03-13T11:18:00Z"
+    },
+    "stored_at": "2026-03-13T11:18:00Z",
+    "suppressed": false,
+    "deduplicated": false
+  },
+  "warnings": []
+}
+```
+
+## `memory_save_imported_note`
+
+### Example request
+
+```json
+{
+  "scope": {
+    "system_id": "sys_order_platform",
+    "project_id": "proj_order_web",
+    "workspace_id": "ws_order_web_main"
+  },
+  "session_id": "sess_20260313_001",
+  "source": "watcher_import",
+  "external_id": "watcher:event:393",
+  "payload_hash": "sha256:b7f650bfe12c",
+  "type": "discovery",
+  "title": "Watcher captured checkout retry regression",
+  "content": "A local watcher run showed the checkout retry button still posts a legacy payment alias after draft restore.",
+  "importance": 4,
+  "tags": ["watcher", "checkout", "validation"],
+  "file_paths": ["src/order/checkout.ts"],
+  "status": "active"
+}
+```
+
+### Example response
+
+```json
+{
+  "ok": true,
+  "data": {
+    "note": {
+      "note_id": "note_488",
+      "scope": {
+        "system_id": "sys_order_platform",
+        "project_id": "proj_order_web",
+        "workspace_id": "ws_order_web_main"
+      },
+      "session_id": "sess_20260313_001",
+      "type": "discovery",
+      "title": "Watcher captured checkout retry regression",
+      "content": "A local watcher run showed the checkout retry button still posts a legacy payment alias after draft restore.",
+      "importance": 4,
+      "status": "active",
+      "source": "watcher_import",
+      "created_at": "2026-03-13T11:24:00Z"
+    },
+    "import": {
+      "import_id": "import_702",
+      "scope": {
+        "system_id": "sys_order_platform",
+        "project_id": "proj_order_web",
+        "workspace_id": "ws_order_web_main"
+      },
+      "session_id": "sess_20260313_001",
+      "source": "watcher_import",
+      "external_id": "watcher:event:393",
+      "payload_hash": "sha256:b7f650bfe12c",
+      "durable_memory_id": "note_488",
+      "suppressed": false,
+      "imported_at": "2026-03-13T11:24:00Z"
+    },
+    "materialized": true,
+    "note_deduplicated": false,
+    "import_deduplicated": false,
+    "suppressed": false
+  },
+  "warnings": []
+}
+```
+
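A caller handling the `memory_save_imported_note` response usually needs only a few fields from the envelope. The Python sketch below extracts them, relying solely on the field names visible in the example response. The function name is hypothetical. Two points are assumptions, since the example shows only a successful, materialized case: that a suppressed import may omit `note`, and that a falsy `ok` signals failure.

```python
from typing import Any, Dict


def summarize_imported_note_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the commonly needed fields out of a response envelope."""
    if not response.get("ok"):
        # Assumption: any falsy `ok` means the call failed.
        raise ValueError("response is not ok")
    data = response.get("data") or {}
    note = data.get("note") or {}  # assumption: may be absent when suppressed
    imported = data.get("import") or {}
    return {
        "note_id": note.get("note_id"),
        "import_id": imported.get("import_id"),
        "materialized": bool(data.get("materialized")),
        "suppressed": bool(data.get("suppressed")),
        "deduplicated": bool(
            data.get("note_deduplicated") or data.get("import_deduplicated")
        ),
        "warnings": list(response.get("warnings") or []),
    }
```

Collapsing the two dedupe flags into one is a convenience for logging; code that needs to tell note reuse from import reuse should read `note_deduplicated` and `import_deduplicated` separately.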
 ## `memory_search`

 ### Example request
