recipes/thought-enrichment/README.md (13 additions & 4 deletions)
@@ -51,7 +51,11 @@ Classifies each thought using an LLM and writes structured metadata back to Supa
node enrich-thoughts.mjs --apply --retry-failed
```

-**Flags:** `--provider` (openrouter or anthropic), `--concurrency`, `--limit`, `--skip`, `--model`.
+**Flags:** `--provider` (openrouter or anthropic), `--concurrency`, `--limit`, `--skip`, `--model`, `--max-calls`, `--reset-state`.
+
+The `--max-calls` flag is a hard ceiling on the number of LLM calls per run. The default is `10000`; pass `--max-calls 0` to disable the cap. When the limit is hit the script aborts cleanly, prints a summary, and leaves remaining rows with `enriched=false` so you can resume later. This protects against a shell typo (e.g. dropping `--limit`) burning unbounded spend against a large un-enriched table.
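The cap described above can be pictured as a counter that is checked before each LLM call. The sketch below is a minimal illustration under stated assumptions, not the code in `enrich-thoughts.mjs`: `createCallBudget`, `runWithBudget`, and the synchronous `classify` callback are hypothetical names, and rows are assumed to be plain objects with an `enriched` flag.

```javascript
// Hypothetical helper (not from enrich-thoughts.mjs): counts LLM calls
// against a ceiling. A ceiling of 0 disables the cap, mirroring `--max-calls 0`.
function createCallBudget(maxCalls) {
  let used = 0;
  return {
    // Returns true and records the call if budget remains, false otherwise.
    tryConsume() {
      if (maxCalls !== 0 && used >= maxCalls) return false;
      used += 1;
      return true;
    },
    get used() {
      return used;
    },
  };
}

// Hypothetical run loop: stops at the ceiling and leaves the remaining rows
// with enriched=false so that a later run can pick them up.
function runWithBudget(rows, budget, classify) {
  const summary = { processed: 0, skipped: 0, aborted: false };
  for (const row of rows) {
    if (!budget.tryConsume()) {
      summary.aborted = true;
      summary.skipped = rows.length - summary.processed;
      break;
    }
    classify(row); // stand-in for the real LLM call
    row.enriched = true;
    summary.processed += 1;
  }
  return summary;
}
```

Checking the budget before the call, rather than after it, means the ceiling is never exceeded even when the last permitted call fails.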
+
+**Resume.** The script checkpoints `lastProcessedId` to `data/enrichment-state.json` after each concurrency chunk. On startup, if a checkpoint exists and neither `--skip` nor `--reset-state` was passed, the run resumes from `id > lastProcessedId`. The `enriched=false` filter is still applied as a second layer of defense. Pass `--reset-state` to ignore the checkpoint and start from scratch.
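The resume rule can be sketched as two pure functions: one that decides whether the checkpoint applies, and one that selects the rows still to be processed. This is an illustration only. The function names, the shape of the `flags` object, and the assumption that ids are numeric and processed in ascending order are not taken from the script, and file I/O is left out so that the logic stays easy to test.

```javascript
// Hypothetical: serialise the checkpoint written after each chunk.
// Assumes numeric ids processed in ascending order.
function checkpointAfterChunk(chunk) {
  const lastProcessedId = Math.max(...chunk.map((row) => row.id));
  return JSON.stringify({ lastProcessedId });
}

// Hypothetical: returns the id to resume after, or null when the checkpoint
// must be ignored (no checkpoint, `--skip` passed, or `--reset-state` passed).
function resolveResumeId(checkpoint, flags) {
  const skipPassed = flags.skip !== undefined;
  if (!checkpoint || skipPassed || flags.resetState) return null;
  return checkpoint.lastProcessedId;
}

// The enriched=false filter always applies; the id bound applies only on resume.
function selectPending(rows, resumeAfterId) {
  return rows.filter(
    (row) => row.enriched === false && (resumeAfterId === null || row.id > resumeAfterId)
  );
}
```

Keeping the `enriched=false` filter independent of the checkpoint means a stale or missing state file can cause extra scanning, but not re-enrichment of rows that are already done.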
### backfill-type.mjs -- Type canonicalization
@@ -81,9 +85,9 @@ Scans thought content for patterns matching SSNs, credit cards, API keys, passwo
2. Apply:

-```bash
-node backfill-sensitivity.mjs --apply
-```
+```bash
+node backfill-sensitivity.mjs --apply
+```

## Recommended execution order
@@ -92,6 +96,11 @@ Scans thought content for patterns matching SSNs, credit cards, API keys, passwo
3. Run `enrich-thoughts.mjs --dry-run --limit 20` to preview LLM classifications.
4. Run `enrich-thoughts.mjs --apply` to enrich all remaining thoughts.

+## Security notes
+
+- **Prompt injection:** thought content is wrapped in `<thought_content>` tags and the system prompt instructs the model to treat everything inside as untrusted data. Any literal tag occurrences in content are escaped. Output fields (`summary`, `topics`, `tags`, `people`, `action_items`) are length-capped and control-char-stripped before they are written to `metadata`. Even so, hostile content in third-party imports (shared chat exports, scraped feeds) can still influence classification labels, so review those labels before trusting them as ground truth.
+- **Bearer token on the wire:** every request carries your Supabase service-role key. Double-check that `SUPABASE_URL` points at your own Supabase project, not a proxy or debug server.
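The prompt-injection defenses listed above could look roughly like the following. This is a sketch under assumptions rather than the script's actual code: `wrapThought`, `cleanString`, `sanitizeOutput`, the `&lt;` escaping scheme, and the 200-character cap are all hypothetical choices made for illustration.

```javascript
// Hypothetical: neutralise literal wrapper tags inside the content so that
// untrusted text cannot close the <thought_content> wrapper early.
function wrapThought(content) {
  const escaped = String(content).replace(/<(\s*\/?\s*thought_content)/gi, "&lt;$1");
  return `<thought_content>\n${escaped}\n</thought_content>`;
}

const MAX_FIELD_LENGTH = 200; // assumed cap; the real limit is not stated

// Strip ASCII control characters (including newlines and tabs), then cap length.
function cleanString(value, maxLength = MAX_FIELD_LENGTH) {
  return String(value).replace(/[\u0000-\u001f\u007f]/g, "").slice(0, maxLength);
}

// Hypothetical: sanitise the model's output fields before writing to `metadata`.
// Non-array values in list fields are dropped instead of being coerced.
function sanitizeOutput(raw) {
  const list = (value) => (Array.isArray(value) ? value.map((item) => cleanString(item)) : []);
  return {
    summary: cleanString(raw.summary ?? ""),
    topics: list(raw.topics),
    tags: list(raw.tags),
    people: list(raw.people),
    action_items: list(raw.action_items),
  };
}
```

Escaping only the opening bracket of a tag-like sequence leaves the rest of the content readable for the model while still preventing a premature close of the wrapper.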
+
## Cost expectations
The default OpenRouter model is `openai/gpt-4o-mini` at roughly $0.001--0.002 per thought. For 1,000 thoughts, expect approximately $1--2. The `backfill-type` and `backfill-sensitivity` scripts are free (no LLM calls -- they use local logic only).
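The estimate above is a straight multiplication, which the sketch below spells out. The per-thought range comes from the paragraph above; `estimateCost` is a hypothetical helper, and the second figure additionally assumes one LLM call per thought, which the README does not state.

```javascript
// Per-thought cost range quoted above, in USD.
const COST_PER_THOUGHT = { low: 0.001, high: 0.002 };

// Hypothetical helper: scale the quoted range by the number of thoughts.
function estimateCost(thoughtCount) {
  return {
    low: thoughtCount * COST_PER_THOUGHT.low,
    high: thoughtCount * COST_PER_THOUGHT.high,
  };
}

function formatRange({ low, high }) {
  return `$${low.toFixed(2)} to $${high.toFixed(2)}`;
}

// 1,000 thoughts, as in the paragraph above.
console.log(formatRange(estimateCost(1000)));

// Assumption: one LLM call per thought. Under that assumption, this is the
// spend a run could reach at the default `--max-calls` ceiling of 10000.
console.log(formatRange(estimateCost(10000)));
```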