[recipes] Thought enrichment pipeline #192
Five of my open PRs have merge conflicts and may also overlap with recent upstream work. Before I spend time rebasing them, I'd like a quick keep/close signal on each. I know 25 open PRs from one contributor is a lot; I'm happy to close anything that's not useful.

Possibly reshaped by recent upstream direction (#278, #280, #283):
Possibly superseded by adjacent upstream PRs:
Possibly too specialized for core OB1:

No detailed review needed; a keep/close direction per item is enough.
LLM-based thought classification. Hardened from code review: bounded regex (closes a ReDoS vector), AbortController fetch timeouts, delimited untrusted content with capped outputs, a --max-calls spend cap, cursor pagination, and checkpoint resume.
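Two of the hardening items above, the AbortController fetch timeout and the --max-calls spend cap, can be sketched roughly as follows. This is an illustration under assumptions, not the recipe's actual code: the helper names `fetchWithTimeout` and `createCallBudget`, the 30-second default, and the injectable `fetchImpl` parameter are all hypothetical.

```javascript
// Sketch only: helper names and defaults are illustrative, not the recipe's code.

// Wrap fetch so a stalled provider cannot hang the whole run.
async function fetchWithTimeout(url, options = {}, timeoutMs = 30_000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...options, signal: controller.signal });
  } finally {
    clearTimeout(timer); // always clear so the timer cannot keep the process alive
  }
}

// Count LLM calls and refuse to exceed the cap (the role --max-calls plays).
function createCallBudget(maxCalls) {
  let used = 0;
  return {
    take() {
      if (used >= maxCalls) return false; // budget exhausted: caller should stop
      used += 1;
      return true;
    },
    get used() {
      return used;
    },
  };
}
```

A processing loop would call `budget.take()` before each request and stop cleanly once it returns `false`, so a large backlog cannot run up an unbounded bill.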
f8522bf to fe64827
Rebased onto For review context:
Summary
LLM-powered recipe that retroactively classifies existing thoughts with type, importance, quality score, sensitivity tier, and structured metadata (topics, tags, people, action items).
- backfill-type and backfill-sensitivity scripts for targeted re-runs
- .env.local

Why

Most Open Brain sources hand you raw content. Without enrichment, the type, importance, quality_score, and sensitivity columns from the enhanced-thoughts schema stay null and the thoughts remain hard to retrieve or filter. This recipe is the turn-key way to populate those columns for anyone who's already running Open Brain and has accumulated untyped thoughts.

Requires the schemas/enhanced-thoughts/ columns from #191. The README calls that out in Prerequisites.

Part 2 of 12 in the OB1 Alpha Milestone consolidation.
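To make the column population concrete, here is a minimal sketch of turning an untrusted LLM reply into values for the columns named above, with capped outputs. The allowed type labels, sensitivity tiers, numeric ranges, and list caps are placeholders I have assumed for illustration; the real values live in the enhanced-thoughts schema and the recipe itself.

```javascript
// Sketch only: allowed values and ranges below are placeholders, not the schema's.
const ALLOWED_TYPES = ["note", "task", "idea"];   // hypothetical type labels
const ALLOWED_TIERS = ["standard", "restricted"]; // hypothetical sensitivity tiers
const MAX_LIST_ITEMS = 10;                        // assumed cap on list outputs
const MAX_ITEM_LENGTH = 80;                       // assumed cap on each list entry

function clampNumber(value, min, max) {
  // Non-numeric replies stay null rather than being coerced into a guess.
  if (typeof value !== "number" || !Number.isFinite(value)) return null;
  return Math.min(max, Math.max(min, value));
}

function cleanList(value) {
  if (!Array.isArray(value)) return [];
  return value
    .filter((item) => typeof item === "string" && item.trim() !== "")
    .map((item) => item.trim().slice(0, MAX_ITEM_LENGTH))
    .slice(0, MAX_LIST_ITEMS);
}

function normalizeClassification(rawText) {
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch {
    return null; // unparseable reply: skip the thought instead of writing junk
  }
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) return null;
  return {
    type: ALLOWED_TYPES.includes(parsed.type) ? parsed.type : null,
    importance: clampNumber(parsed.importance, 1, 5),         // assumed 1-5 scale
    quality_score: clampNumber(parsed.quality_score, 0, 100), // assumed 0-100 scale
    sensitivity: ALLOWED_TIERS.includes(parsed.sensitivity) ? parsed.sensitivity : null,
    metadata: {
      topics: cleanList(parsed.topics),
      tags: cleanList(parsed.tags),
      people: cleanList(parsed.people),
      action_items: cleanList(parsed.action_items),
    },
  };
}
```

Rejecting unknown labels and clamping numbers means a malformed or adversarial reply can at worst leave a column null, which is the state the thought was already in.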
Test plan
- schemas/enhanced-thoughts/schema.sql first (dependency)
- .env.local with Supabase + one LLM provider key
- enrich-thoughts.mjs --limit 10: verify 10 thoughts get classified
- backfill-sensitivity.mjs --dry-run: verify pattern matches print but no writes
- metadata.json passes the gate
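The dry-run check in the test plan (matches print, nothing is written) could behave along these lines. This is a sketch under assumptions rather than the contents of backfill-sensitivity.mjs: the two patterns, the tier name, the row shape (`id`, `content`), and the `update` callback are all hypothetical.

```javascript
// Sketch only: patterns, tier names, and row shape are illustrative placeholders.
const SENSITIVITY_PATTERNS = [
  // Bounded quantifiers keep backtracking limited on long or hostile inputs.
  { tier: "restricted", label: "email", regex: /[\w.+-]{1,64}@[\w-]{1,63}(?:\.[\w-]{1,63}){1,4}/ },
  { tier: "restricted", label: "api-key-like", regex: /\bsk-[A-Za-z0-9]{16,64}\b/ },
];

function detectSensitivity(content) {
  const text = String(content).slice(0, 20_000); // cap input length as well
  for (const { tier, label, regex } of SENSITIVITY_PATTERNS) {
    if (regex.test(text)) return { tier, label };
  }
  return null;
}

// update is a caller-supplied async writer; it is never called when dryRun is true.
async function backfillSensitivity(rows, { dryRun, update, log = console.log }) {
  let matched = 0;
  for (const row of rows) {
    const hit = detectSensitivity(row.content);
    if (!hit) continue;
    matched += 1;
    log(`${row.id}: ${hit.label} -> ${hit.tier}`); // printed in both modes
    if (!dryRun) await update(row.id, hit.tier);
  }
  return matched;
}
```

Keeping the write behind a single `dryRun` check makes the test-plan item easy to verify: the same matches are logged either way, and only the final write step differs.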