diff --git a/recipes/adaptive-capture-classification/README.md b/recipes/adaptive-capture-classification/README.md new file mode 100644 index 00000000..90fa486f --- /dev/null +++ b/recipes/adaptive-capture-classification/README.md @@ -0,0 +1,353 @@ +# Adaptive Capture Classification + +## What This Does + +OB1's `capture` MCP tool classifies your input once and moves on — there is no mechanism to +tell it when it got the type wrong. This recipe adds a learning loop on top of the existing capture flow: the +classifier reports a confidence score, a per-type threshold gates whether to auto-classify or +ask for confirmation, and every user response nudges that threshold up or down. Over time the +system learns which capture types it can reliably auto-classify for you and becomes +progressively less intrusive. + +## Difficulty + +Intermediate + +## Prerequisites + +- Open Brain setup complete with the `thoughts` table populated +- An AI gateway configured (OpenRouter, Ollama, or any OpenAI-compatible endpoint) +- Access to run SQL migrations against your Supabase project (Supabase Studio or CLI) +- A capture interface you control — CLI script, Telegram bot, n8n workflow, or similar + (this recipe is interface-agnostic; the gating logic sits between your interface and the + OB1 `capture` MCP call) +- A trusted server-side environment for the reference implementation below + (`capture-with-gating.ts` uses `SUPABASE_SERVICE_ROLE_KEY`, not a browser-safe anon key) + +## What You'll Learn + +- How to add confidence scoring to LLM classification without changing your existing schema +- How to implement adaptive per-type thresholds that improve from user feedback +- How to run A/B model comparisons on live captures to choose between models empirically +- How to build a spell/typo correction layer that learns from user corrections + +--- + +## Setup + +### Step 1: Run the schema migration + +Run `schema.sql` against your Supabase project. This creates four tables alongside your +existing OB1 schema — nothing existing is modified. + +**Supabase Studio:** open the SQL editor, paste the contents of `schema.sql`, and run it. + +**Supabase CLI:** +```bash +supabase migration new adaptive_capture_classification +# paste the contents of schema.sql into the generated migration file +supabase db push +``` + +✅ **Done when:** the four tables appear in your Supabase Studio table list: +`correction_learnings`, `classification_outcomes`, `capture_thresholds`, `ab_comparisons`. + +--- + +### Step 2: Configure your capture types + +In `classifier_prompt.md`, edit the `{types}` list to match your Open Brain schema. OB1's +canonical types are: + +```json +["idea", "task", "person_note", "reference", "decision", "lesson", "meeting", "journal"] +``` + +Remove types you don't use, but do not add values outside this list unless you have added +corresponding type support to your `thoughts` table. The classifier and threshold system +will adapt to whatever list you set — per-type rows in `capture_thresholds` are created on +first use. + +--- + +### Step 3: Add a user context string + +The classifier prompt includes a `{user_context}` slot. Fill this in with a short description +of your active projects and any domain vocabulary the classifier might otherwise misread as +typos or generic words. Store this string wherever your capture interface loads configuration. + +Example: +``` +Maya is a product designer in San Francisco. Active projects: Onboarding v3, Design System +audit, Q2 mobile app. Domain terms: Figma, Lottie, handoff, A11y, WCAG. +``` + +--- + +### Step 4: Wrap your capture call with the gating logic + +> [!NOTE] +> This recipe provides a pseudocode description of the gating pattern plus a minimal +> TypeScript reference implementation (`capture-with-gating.ts`). You will need to adapt +> the implementation to your own capture interface — the logic is intentionally kept +> separate from any specific bot, CLI, or workflow framework. + +Replace your current direct call to the OB1 `capture` MCP tool with the following pattern. + +``` +CONSTANTS: + DEFAULT_THRESHOLD = 0.75 + THRESHOLD_MIN = 0.50 + THRESHOLD_MAX = 0.95 + THRESHOLD_NUDGE = 0.02 + CONSISTENCY_CUTOFF = 9 # re-run classifier if confidence below this + CONSISTENCY_FACTOR = 0.6 # multiply confidence if two runs disagree on type + +FUNCTION process_capture(raw_text, hint_type=null, hint_project=null): + + # --- 1. Classify --- + result = call_llm( + system_prompt = SYSTEM_PROMPT, # from classifier_prompt.md + user_prompt = USER_TEMPLATE.format( + user_context = USER_CONTEXT, + content = raw_text, + types = CAPTURE_TYPES, + type_note = "" if hint_type is null + else ' — you MUST use "{hint_type}" for this field' + ), + temperature = 0.1 + ) + classified = parse_json(result) # fields: type, title, tags, project, + # due_date, confidence (0–10) + + # --- 2. Optional consistency check --- + if hint_type is null and classified.confidence < CONSISTENCY_CUTOFF: + second = call_llm(same prompt) + if second.type != classified.type: + classified.confidence = round(classified.confidence * CONSISTENCY_FACTOR) + + # --- 3. Look up learned threshold --- + threshold = get_threshold(classified.type) # SELECT from capture_thresholds + + # --- 4. Gate: auto-classify or ask? --- + if hint_type is not null or (classified.confidence / 10) >= threshold: + auto_classify = true + else: + auto_classify = false + + # --- 5. Record outcome (user_accepted filled in later) --- + outcome_id = insert_outcome( + model = model_name, + item_type = classified.type, + confidence = classified.confidence, + auto_classified = auto_classify + ) + + return { classified, auto_classify, outcome_id } + + +FUNCTION complete_capture(outcome_id, classified, accepted, user_correction=null): + + # --- 6. Write to OB1 --- + if accepted: + call_ob1_capture( + content = classified.title, + type = classified.type, + tags = classified.tags, + project = classified.project, + due_date = classified.due_date + ) + + # --- 7. Record feedback and adjust threshold --- + update_outcome(outcome_id, user_accepted=accepted, user_correction=user_correction) + adjust_threshold(classified.type, accepted=accepted) + + +FUNCTION adjust_threshold(item_type, accepted): + current = get_threshold(item_type) + # Lower threshold when accepted (auto-classify was correct → be more aggressive) + # Raise threshold when rejected (auto-classify was wrong → be more conservative) + delta = -THRESHOLD_NUDGE if accepted else +THRESHOLD_NUDGE + new_val = clamp(current + delta, THRESHOLD_MIN, THRESHOLD_MAX) + upsert_threshold(item_type, new_val) +``` + +--- + +### Step 5: Add confirmation handling to your interface + +When `auto_classify` is `false`, prompt the user to confirm or correct. The minimum viable +confirmation loop is: + +``` +Show: "I think this is a [type]: [title]. Confidence: [n]/10. Correct? [Yes] [No] [Pick type]" + +On Yes: complete_capture(outcome_id, classified, accepted=true) +On No: complete_capture(outcome_id, classified, accepted=false) +On Pick: re-run classification with hint_type = user_chosen_type + complete_capture(outcome_id, new_classified, accepted=true, user_correction=user_chosen_type) +``` + +For CLI interfaces, a simple y/n prompt is sufficient. For Telegram or Slack, an inline +keyboard with type buttons works well. + +--- + +### Step 6 (optional): Enable A/B model comparison + +If you want to empirically compare two models before committing to one as your default, +run both classifiers in parallel on the same capture and ask the user to pick the better +result. Store the outcome in `ab_comparisons`. + +``` +model_a_result = classify(text, model="model-a") +model_b_result = classify(text, model="model-b") + +compare_id = insert_ab_compare(model_a_result, model_b_result) + +Show both results to user, ask: "Which is better? [A] [B] [Both] [Neither]" + +update_ab_compare(compare_id, winner=user_choice) +``` + +Query `ab_comparisons` to see win rates: +```sql +SELECT + model_a, + model_b, + COUNT(*) AS total, + SUM(CASE WHEN winner = 'a' THEN 1 ELSE 0 END) AS wins_a, + SUM(CASE WHEN winner = 'b' THEN 1 ELSE 0 END) AS wins_b, + AVG(time_ms_a) AS avg_ms_a, + AVG(time_ms_b) AS avg_ms_b +FROM ab_comparisons +WHERE winner IS NOT NULL +GROUP BY model_a, model_b; +``` + +--- + +## How to Use It + +Once wired up, your capture flow works like this: + +**High-confidence capture** — the classifier returns confidence 8/10 and the current +threshold for `task` is 0.75. It auto-classifies, writes to OB1, and silently nudges the +`task` threshold down by 0.02 (the system is performing well — it can be slightly more +aggressive next time). + +**Low-confidence capture** — the classifier returns confidence 5/10 for `note` but the +threshold is 0.75. You get a confirmation prompt. If you accept, the threshold nudges down. +If you correct it to `decision`, the threshold for `note` nudges up (you were right to ask) +and the correction is recorded. + +**Checking learned thresholds:** +```sql +SELECT item_type, threshold, sample_count, updated_at +FROM capture_thresholds +ORDER BY sample_count DESC; +``` + +**Checking model accuracy:** +```sql +SELECT + model, + COUNT(*) AS outcomes, + ROUND(AVG(CASE WHEN user_accepted THEN 1.0 ELSE 0.0 END), 3) AS accuracy +FROM classification_outcomes +WHERE user_accepted IS NOT NULL +GROUP BY model; +``` + +--- + +## Example + +**Capture:** "need to call alex re thursday meeting" + +**Classifier output:** +```json +{ + "type": "task", + "title": "Call Alex re Thursday meeting", + "tags": ["comms", "meeting"], + "project": null, + "due_date": null, + "confidence": 9 +} +``` + +**Current `task` threshold:** 0.75 + +**Decision:** 9/10 = 0.90 ≥ 0.75 → auto-classify. Written to OB1 immediately with no prompt. +`task` threshold nudges to 0.73. + +--- + +**Capture:** "alex thursday" + +**Classifier output:** +```json +{ + "type": "task", + "title": "Follow up with Alex on Thursday", + "tags": ["comms"], + "project": null, + "due_date": null, + "confidence": 5 +} +``` + +**Decision:** 5/10 = 0.50 < 0.75 → confirmation prompt shown. + +User taps **Yes** → written to OB1, `task` threshold nudges to 0.73. +User taps **Decision** → re-classified as `decision`, written to OB1, `task` threshold +nudges to 0.77 (auto-classify was wrong for this type — raise the bar slightly). + +--- + +## Notes + +**The threshold floor is 0.50.** Even a perfectly performing classifier will never +auto-classify captures where the LLM reports less than 50% confidence. Captures that +consistently fall below the floor for a given type are inherently ambiguous — the right +response is to always ask. + +**The threshold ceiling is 0.95.** The classifier is never treated as infallible. +This preserves occasional confirmation prompts even for types the system handles well, +which catches model drift and keeps the feedback loop alive. + +**Thresholds are per-type, not global.** `task` captures may auto-classify aggressively +(threshold 0.60) while `decision` captures — which carry more consequence — stay +conservative (threshold 0.85). The system finds this balance from your corrections +without manual tuning. + +**The spell correction tables (`correction_learnings`) are optional.** If your capture +interface already handles typos, or your users type accurately, skip this table entirely. +The rest of the recipe functions without it. If you do use it, see `classifier_prompt.md` +for guidance on domain vocabulary that should never be flagged as misspellings. + +**This recipe does not modify the OB1 `thoughts` table or any existing schema.** All four +tables are additive. Rolling back means dropping the four tables; nothing else is affected. + +## Troubleshooting + +**All captures are being asked for confirmation.** +The thresholds start at 0.75 and the classifier may report conservative confidence scores +initially. This is expected — the system needs 10–20 captures per type before thresholds +settle. If confidence scores are consistently low (< 5), check your `{user_context}` string; +a missing or vague context causes the classifier to hedge on `project` and `type` fields, +which drags confidence down. + +**The classifier keeps choosing the wrong type.** +Add an explicit type hint at capture time for captures where you know the type. A hint +locks the type field (confidence is set to 10 automatically) and the other metadata fields +are still inferred. This trains the threshold without polluting the accuracy stats with +cases where classification was genuinely ambiguous. + +**Threshold for a type is stuck near the ceiling.** +This means the auto-classifier has been wrong more often than right for that type. Check +your `classification_outcomes` rows for that type — look at `confidence` values alongside +`user_accepted`. If confidence is consistently high but `user_accepted` is false, your +capture vocabulary for that type may be ambiguous or your type definitions may overlap. A +revised `{types}` list or a clearer user context string usually resolves this. diff --git a/recipes/adaptive-capture-classification/capture-with-gating.ts b/recipes/adaptive-capture-classification/capture-with-gating.ts new file mode 100644 index 00000000..78c4ee2a --- /dev/null +++ b/recipes/adaptive-capture-classification/capture-with-gating.ts @@ -0,0 +1,296 @@ +/** + * Adaptive Capture Classification — TypeScript reference implementation + * + * Wraps OB1's capture flow with confidence gating and a per-type learning loop. + * Uses the Supabase JS client for database access and fetch() for LLM calls. + * + * Prerequisites: + * - schema.sql applied to your Supabase project + * - SUPABASE_URL, SUPABASE_SERVICE_ROLE_KEY, OPENROUTER_API_KEY set in environment + * - `npm install @supabase/supabase-js` + * + * Adapt callLLM() to point at your preferred AI gateway (OpenRouter shown here). + * Adapt writeToOB1() to call your OB1 capture MCP tool or REST endpoint. + */ + +import { createClient } from "@supabase/supabase-js"; + +// --------------------------------------------------------------------------- +// Config +// --------------------------------------------------------------------------- + +const SUPABASE_URL = process.env.SUPABASE_URL!; +const SUPABASE_SERVICE_ROLE_KEY = process.env.SUPABASE_SERVICE_ROLE_KEY!; +const OPENROUTER_API_KEY = process.env.OPENROUTER_API_KEY!; + +const DEFAULT_MODEL = "openai/gpt-4o-mini"; // swap for any OpenRouter model +const DEFAULT_THRESHOLD = 0.75; +const THRESHOLD_MIN = 0.50; +const THRESHOLD_MAX = 0.95; +const THRESHOLD_NUDGE = 0.02; +const CONSISTENCY_CUTOFF = 9; // re-run if confidence below this +const CONSISTENCY_FACTOR = 0.6; // penalty when two runs disagree on type + +// OB1 canonical capture types — edit to match your thoughts table +const OB1_TYPES = [ + "idea", "task", "person_note", "reference", + "decision", "lesson", "meeting", "journal", +] as const; + +type CaptureType = typeof OB1_TYPES[number]; + +// --------------------------------------------------------------------------- +// Types +// --------------------------------------------------------------------------- + +interface Classified { + type: CaptureType; + title: string; + tags: string[]; + project: string | null; + due_date: string | null; + confidence: number; // 0–10 + model: string; +} + +interface PipelineResult { + classified: Classified; + autoClassify: boolean; + outcomeId: string; + threshold: number; +} + +// --------------------------------------------------------------------------- +// Supabase client +// --------------------------------------------------------------------------- + +const db = createClient(SUPABASE_URL, SUPABASE_SERVICE_ROLE_KEY); + +// --------------------------------------------------------------------------- +// LLM classification +// --------------------------------------------------------------------------- + +const SYSTEM_PROMPT = `You are a capture classifier for a personal knowledge management system. +Analyse the given text and return a JSON object with classification metadata. +Return ONLY the JSON object — no markdown fences, no explanation, no extra text.`; + +function buildUserPrompt(content: string, userContext: string, hintType?: CaptureType): string { + const typeNote = hintType + ? ` — you MUST use "${hintType}" for this field` + : ""; + return `User context (use this to identify projects and domain): +${userContext || "(not provided)"} + +--- + +Classify the following capture: +"${content}" + +Return a JSON object with exactly these fields: +- "type": one of ${JSON.stringify(OB1_TYPES)}${typeNote} +- "title": short descriptive title, 3–7 words, sentence case +- "tags": array of 1–3 relevant lowercase keywords (not the type name, not the project name) +- "project": one of the user's active project names if clearly relevant, otherwise null +- "due_date": ISO 8601 date string if a specific deadline is stated, otherwise null +- "confidence": integer 0–10 — how confident you are in the type classification`; +} + +async function callLLM( + content: string, + userContext: string, + model: string, + hintType?: CaptureType, +): Promise { + const response = await fetch("https://openrouter.ai/api/v1/chat/completions", { + method: "POST", + headers: { + "Authorization": `Bearer ${OPENROUTER_API_KEY}`, + "Content-Type": "application/json", + }, + body: JSON.stringify({ + model, + temperature: 0.1, + messages: [ + { role: "system", content: SYSTEM_PROMPT }, + { role: "user", content: buildUserPrompt(content, userContext, hintType) }, + ], + }), + }); + + const data = await response.json(); + const raw = data.choices[0].message.content as string; + const parsed = JSON.parse(raw.replace(/```(?:json)?\s*|\s*```/g, "").trim()); + + const type = OB1_TYPES.includes(parsed.type) ? parsed.type as CaptureType : "idea"; + const confidence = Math.max(0, Math.min(10, Math.round(Number(parsed.confidence ?? 7)))); + + return { + type: hintType ?? type, + title: String(parsed.title || content.slice(0, 60)), + tags: (parsed.tags ?? []).map((t: unknown) => String(t).toLowerCase()), + project: parsed.project ?? null, + due_date: parsed.due_date ?? null, + confidence: hintType ? 10 : confidence, + model, + }; +} + +// --------------------------------------------------------------------------- +// Threshold management +// --------------------------------------------------------------------------- + +async function getThreshold(itemType: string): Promise { + const { data } = await db + .from("capture_thresholds") + .select("threshold") + .eq("item_type", itemType) + .maybeSingle(); + return data ? Number(data.threshold) : DEFAULT_THRESHOLD; +} + +async function adjustThreshold(itemType: string, accepted: boolean): Promise { + const current = await getThreshold(itemType); + const { data: thresholdRow } = await db + .from("capture_thresholds") + .select("sample_count") + .eq("item_type", itemType) + .maybeSingle(); + const delta = accepted ? -THRESHOLD_NUDGE : +THRESHOLD_NUDGE; + const newVal = Math.max(THRESHOLD_MIN, Math.min(THRESHOLD_MAX, current + delta)); + + await db.from("capture_thresholds").upsert({ + item_type: itemType, + threshold: newVal, + sample_count: (thresholdRow?.sample_count ?? 0) + 1, + updated_at: new Date().toISOString(), + }, { onConflict: "item_type" }); +} + +// --------------------------------------------------------------------------- +// Outcome recording +// --------------------------------------------------------------------------- + +async function recordOutcome(classified: Classified, autoClassify: boolean): Promise { + const id = crypto.randomUUID(); + await db.from("classification_outcomes").insert({ + id, + model: classified.model, + item_type: classified.type, + confidence: classified.confidence, + auto_classified: autoClassify, + created_at: new Date().toISOString(), + }); + return id; +} + +async function resolveOutcome( + outcomeId: string, + accepted: boolean, + userCorrection?: string, +): Promise { + await db.from("classification_outcomes").update({ + user_accepted: accepted, + user_correction: userCorrection ?? null, + }).eq("id", outcomeId); +} + +// --------------------------------------------------------------------------- +// OB1 capture — replace with your MCP tool call or REST endpoint +// --------------------------------------------------------------------------- + +async function writeToOB1(classified: Classified): Promise { + // Example: direct Supabase insert into the thoughts table. + // If you use the OB1 MCP tool, call it here instead. + await db.from("thoughts").insert({ + content: classified.title, + type: classified.type, + tags: classified.tags, + project: classified.project, + due_date: classified.due_date, + created_at: new Date().toISOString(), + }); +} + +// --------------------------------------------------------------------------- +// Main pipeline +// --------------------------------------------------------------------------- + +/** + * Process a capture through the full gating pipeline. + * Returns a PipelineResult — the caller decides what to do based on autoClassify. + */ +export async function processCapture( + rawText: string, + options: { + userContext?: string; + hintType?: CaptureType; + model?: string; + } = {}, +): Promise { + const { userContext = "", hintType, model = DEFAULT_MODEL } = options; + + // Step 1: classify + let classified = await callLLM(rawText, userContext, model, hintType); + + // Step 2: optional consistency check + if (!hintType && classified.confidence < CONSISTENCY_CUTOFF) { + const second = await callLLM(rawText, userContext, model); + if (second.type !== classified.type) { + classified.confidence = Math.round(classified.confidence * CONSISTENCY_FACTOR); + } + } + + // Step 3: look up learned threshold + const threshold = await getThreshold(classified.type); + + // Step 4: gate + const autoClassify = hintType !== undefined || (classified.confidence / 10) >= threshold; + + // Step 5: record outcome (user_accepted filled in later via completeCapture) + const outcomeId = await recordOutcome(classified, autoClassify); + + return { classified, autoClassify, outcomeId, threshold }; +} + +/** + * Complete a capture after the user has confirmed or corrected it. + * Call this whether the capture was auto-classified or manually confirmed. + */ +export async function completeCapture( + outcomeId: string, + classified: Classified, + accepted: boolean, + userCorrection?: CaptureType, +): Promise { + if (accepted) { + await writeToOB1(classified); + } + await resolveOutcome(outcomeId, accepted, userCorrection); + await adjustThreshold(classified.type, accepted); +} + +// --------------------------------------------------------------------------- +// Example usage +// --------------------------------------------------------------------------- + +/* +const result = await processCapture("need to follow up with Sarah re the proposal", { + userContext: "Active projects: client pitch, Q2 planning. Team: Sarah, James.", +}); + +if (result.autoClassify) { + // High confidence — write immediately and record positive feedback + await completeCapture(result.outcomeId, result.classified, true); + console.log(`Captured as ${result.classified.type}: ${result.classified.title}`); +} else { + // Low confidence — ask the user + console.log( + `I think this is a ${result.classified.type}: "${result.classified.title}"\n` + + `Confidence: ${result.classified.confidence}/10. Correct? [y/n]` + ); + + // On user confirmation: + const accepted = true; // replace with actual user input + await completeCapture(result.outcomeId, result.classified, accepted); +} +*/ diff --git a/recipes/adaptive-capture-classification/classifier_prompt.md b/recipes/adaptive-capture-classification/classifier_prompt.md new file mode 100644 index 00000000..94ff9335 --- /dev/null +++ b/recipes/adaptive-capture-classification/classifier_prompt.md @@ -0,0 +1,72 @@ +# Classification Prompt + +This is the system and user prompt used by the classifier. Adapt the type list and user context +section to match your own Open Brain schema and projects. + +--- + +## System prompt + +``` +You are a capture classifier for a personal knowledge management system. +Analyse the given text and return a JSON object with classification metadata. +Return ONLY the JSON object — no markdown fences, no explanation, no extra text. +``` + +--- + +## User prompt template + +``` +User context (use this to identify projects and domain): +{user_context} + +--- + +Classify the following capture: +"{content}" + +Return a JSON object with exactly these fields: +- "type": one of {types} +- "title": short descriptive title, 3–7 words, sentence case +- "tags": array of 1–3 relevant lowercase keywords (not the type name, not the project name) +- "project": one of the user's active project names if clearly relevant, otherwise null +- "due_date": ISO 8601 date string if a specific deadline is stated, otherwise null +- "confidence": integer 0–10 — how confident you are in the type classification + (0 = no confidence, 10 = very confident) +``` + +--- + +## Notes on the prompt + +**`{types}`** — OB1's canonical types are: +``` +["idea", "task", "person_note", "reference", "decision", "lesson", "meeting", "journal"] +``` +Use this list as-is or remove types you don't use. Do not add types that don't exist in your +`thoughts` table schema — the classifier will hallucinate values outside this list if the +prompt doesn't constrain it. + +**`{user_context}`** — inject a short description of the user's active projects and domain so +the classifier can recognise project names and domain-specific terms without being confused +by jargon. Example: + +``` +Bruce is a engineer based in Melbourne. Active projects: MCG redesign, Werribee precinct master plan study. Domain terms: PMP, PPP. +``` + +**`{content}`** — the raw (or spell-corrected) capture text from the user. + +**Confidence scoring** — the LLM reports its own confidence as an integer 0–10. Values ≥ 7 +are typically reliable for auto-classification; values < 5 almost always need user +confirmation. The per-type thresholds in `capture_thresholds` let this calibrate +automatically over time rather than requiring a fixed cutoff. + +**Consistency check (optional)** — for captures where confidence < 9, run the prompt twice +and compare the `type` field. If the two calls disagree, multiply the reported confidence +by 0.6. This catches cases where the LLM is uncertain but happens to report high confidence +on the first pass. + +**Temperature** — use 0.1. Higher temperatures increase variability in the type field and +produce confidence scores that are harder to calibrate. diff --git a/recipes/adaptive-capture-classification/metadata.json b/recipes/adaptive-capture-classification/metadata.json new file mode 100644 index 00000000..e70d52d7 --- /dev/null +++ b/recipes/adaptive-capture-classification/metadata.json @@ -0,0 +1,29 @@ +{ + "name": "Adaptive Capture Classification", + "description": "Adds confidence gating and a per-type learning loop to OB1's capture flow. The classifier reports a confidence score; a threshold gates whether to auto-classify or ask the user; every user response nudges the threshold up or down. Includes optional A/B model comparison and spell correction learning tables.", + "category": "recipes", + "author": { + "name": "Matthew Hardy", + "github": "mjh756" + }, + "version": "1.0.0", + "requires": { + "open_brain": true, + "services": ["Any OpenAI-compatible LLM gateway (OpenRouter, Ollama, etc.)"], + "tools": ["Supabase CLI or Supabase Studio"] + }, + "tags": [ + "classification", + "confidence", + "learning", + "capture", + "thresholds", + "a/b-testing", + "spell-correction", + "feedback-loop" + ], + "difficulty": "intermediate", + "estimated_time": "60–90 minutes", + "created": "2026-03-31", + "updated": "2026-03-31" +} diff --git a/recipes/adaptive-capture-classification/schema.sql b/recipes/adaptive-capture-classification/schema.sql new file mode 100644 index 00000000..1a2579c2 --- /dev/null +++ b/recipes/adaptive-capture-classification/schema.sql @@ -0,0 +1,118 @@ +-- Adaptive Capture Classification — Learning Tables +-- Adds four tables to your Open Brain database to support confidence gating, +-- per-type threshold learning, A/B model comparison, and spell correction learning. +-- +-- Run this migration once against your Supabase project. +-- No existing OB1 tables are modified. + +-- ============================================================ +-- 1. correction_learnings +-- Tracks user feedback on individual word corrections. +-- After two rejections a correction is suppressed permanently. +-- ============================================================ +CREATE TABLE IF NOT EXISTS correction_learnings ( + word TEXT NOT NULL, + correction TEXT NOT NULL, + accepted INTEGER DEFAULT 0, + rejected INTEGER DEFAULT 0, + PRIMARY KEY (word, correction) +); + +GRANT SELECT, INSERT, UPDATE ON correction_learnings TO authenticated; + +ALTER TABLE correction_learnings ENABLE ROW LEVEL SECURITY; + +CREATE POLICY "Users manage their own correction learnings" + ON correction_learnings + FOR ALL + USING (auth.role() = 'authenticated'); + +-- ============================================================ +-- 2. classification_outcomes +-- One row per capture attempt. Records the model used, +-- LLM confidence, whether it was auto-classified, and the +-- user's eventual verdict. Used to track model accuracy +-- and drive threshold adjustments. +-- ============================================================ +CREATE TABLE IF NOT EXISTS classification_outcomes ( + id TEXT PRIMARY KEY DEFAULT gen_random_uuid()::text, + item_id TEXT, + model TEXT NOT NULL, + item_type TEXT NOT NULL, + confidence REAL NOT NULL, + auto_classified BOOLEAN NOT NULL DEFAULT FALSE, + user_accepted BOOLEAN, + user_correction TEXT, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +CREATE INDEX IF NOT EXISTS idx_outcomes_model ON classification_outcomes (model); +CREATE INDEX IF NOT EXISTS idx_outcomes_type ON classification_outcomes (item_type); +CREATE INDEX IF NOT EXISTS idx_outcomes_date ON classification_outcomes (created_at); + +GRANT SELECT, INSERT, UPDATE ON classification_outcomes TO authenticated; + +ALTER TABLE classification_outcomes ENABLE ROW LEVEL SECURITY; + +CREATE POLICY "Users manage their own outcomes" + ON classification_outcomes + FOR ALL + USING (auth.role() = 'authenticated'); + +-- ============================================================ +-- 3. capture_thresholds +-- Stores the current auto-classify threshold for each +-- capture type. Starts at 0.75 for every type. Nudged +-- down when the user silently accepts an auto-classify +-- result; nudged up when they correct it. Clamped 0.50–0.95. +-- ============================================================ +CREATE TABLE IF NOT EXISTS capture_thresholds ( + item_type TEXT PRIMARY KEY, + threshold REAL NOT NULL DEFAULT 0.75, + sample_count INTEGER DEFAULT 0, + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +GRANT SELECT, INSERT, UPDATE ON capture_thresholds TO authenticated; + +ALTER TABLE capture_thresholds ENABLE ROW LEVEL SECURITY; + +CREATE POLICY "Users manage their own thresholds" + ON capture_thresholds + FOR ALL + USING (auth.role() = 'authenticated'); + +-- ============================================================ +-- 4. ab_comparisons +-- Head-to-head model comparison results. One row per +-- comparison session. Winner is 'a', 'b', 'both', or +-- 'neither', set by the user. Used to decide which model +-- to promote as the default classifier. +-- ============================================================ +CREATE TABLE IF NOT EXISTS ab_comparisons ( + id TEXT PRIMARY KEY DEFAULT gen_random_uuid()::text, + model_a TEXT NOT NULL, + model_b TEXT NOT NULL, + item_type_a TEXT NOT NULL, + item_type_b TEXT NOT NULL, + confidence_a REAL NOT NULL, + confidence_b REAL NOT NULL, + time_ms_a INTEGER NOT NULL, + time_ms_b INTEGER NOT NULL, + tokens_a INTEGER, + tokens_b INTEGER, + winner TEXT CHECK (winner IN ('a', 'b', 'both', 'neither')), + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +CREATE INDEX IF NOT EXISTS idx_ab_model_a ON ab_comparisons (model_a); +CREATE INDEX IF NOT EXISTS idx_ab_model_b ON ab_comparisons (model_b); + +GRANT SELECT, INSERT, UPDATE ON ab_comparisons TO authenticated; + +ALTER TABLE ab_comparisons ENABLE ROW LEVEL SECURITY; + +CREATE POLICY "Users manage their own A/B comparisons" + ON ab_comparisons + FOR ALL + USING (auth.role() = 'authenticated');