diff --git a/integrations/enhanced-mcp/README.md b/integrations/enhanced-mcp/README.md new file mode 100644 index 000000000..e59491fac --- /dev/null +++ b/integrations/enhanced-mcp/README.md @@ -0,0 +1,160 @@ +# Enhanced MCP Server + +> Production-grade remote MCP server expanding the Open Brain tool surface from 4 to 13 tools with enhanced search, CRUD, enrichment, sensitivity detection, and operational monitoring. + +## What It Does + +This integration deploys a second MCP server alongside the stock Open Brain server. It adds semantic and full-text search modes, content dedup via SHA-256 fingerprinting, automatic LLM-powered metadata classification, sensitivity detection (restricted content is blocked from cloud capture), and operational monitoring tools that light up when optional schemas are installed. + +The original `server/` connector remains untouched and safe to leave connected: the four tools that would otherwise collide (`capture_thought`, `search_thoughts`, `list_thoughts`, `thought_stats`) are namespaced with a `brain_` prefix in this server, so both tool sets can coexist without the model seeing duplicate names. + +## Prerequisites + +- Working Open Brain setup ([guide](../../docs/01-getting-started.md)) +- **Enhanced Thoughts schema applied** — install `schemas/enhanced-thoughts` first (adds type, importance, sensitivity columns and utility RPCs) +- OpenRouter API key (same one from the Getting Started guide) +- Supabase CLI installed for deployment +- Optional: `schemas/smart-ingest` (unlocks `ops_capture_status` tool) +- Optional: `schemas/knowledge-graph` (unlocks `graph_search`, `entity_detail`, `ops_source_monitor` tools) + +## Security + +This server authenticates every request against `MCP_ACCESS_KEY` using a constant-time comparison, and accepts the key only through the `x-brain-key` header or `Authorization: Bearer …` — never a URL query string. It runs under the Supabase `service_role`, which bypasses RLS by design; that is intentional for MCP use, but it does mean this Edge Function is the sensitivity-filter boundary. All tools that expose thought content skip `sensitivity_tier = 'restricted'` rows, and `brain_capture_thought` rejects restricted content outright (same for `update_thought`). + +**Companion schema exposure — please read before deploying publicly.** The enhanced-thoughts schema this server depends on is intended to install with `service_role`-only grants on the sensitive RPCs (`search_thoughts_text`, `brain_stats_aggregate`, `get_thought_connections`) — no `anon` GRANTs by default. That means those RPCs are reachable only via authenticated server-side code, including this MCP server. If your deployment's copy of that schema also grants `anon`, or if you later add public grants for a dashboard, be aware: `SECURITY DEFINER` + `anon` grant is an RLS bypass because the function body runs with the function owner's privileges. Combined with a publicly-reachable enhanced-mcp deployment, this would let anyone with your Supabase project URL + anon key read thought content directly via those RPCs — routing around this server's sensitivity filtering. Audit the grants on your companion schemas before exposing this MCP outside a trusted network. + +## Credential Tracker + +Copy this block into a text editor and fill it in as you go. + +```text +ENHANCED MCP SERVER -- CREDENTIAL TRACKER +------------------------------------------ + +FROM YOUR OPEN BRAIN SETUP + Project URL: ____________ + Service role key: ____________ + MCP access key: ____________ + OpenRouter API key: ____________ + +OPTIONAL (for multi-provider fallback) + OpenAI API key: ____________ + Anthropic API key: ____________ + +------------------------------------------ +``` + +## Steps + +### 1. Deploy the Edge Function + +Copy the `integrations/enhanced-mcp/` folder into your Supabase project's `supabase/functions/` directory, then deploy: + +```bash +supabase functions deploy enhanced-mcp --no-verify-jwt +``` + +### 2. Set Environment Variables + +Add your secrets to the deployed function: + +```bash +supabase secrets set \ + MCP_ACCESS_KEY="your-access-key" \ + OPENROUTER_API_KEY="your-openrouter-key" +``` + +Optional multi-provider fallback (for metadata classification resilience): + +```bash +supabase secrets set \ + OPENAI_API_KEY="your-openai-key" \ + ANTHROPIC_API_KEY="your-anthropic-key" +``` + +### 3. Add as a Remote MCP Connector + +In Claude Desktop (or any MCP-compatible client), add a new remote connector: + +- **Name:** `Open Brain Enhanced` +- **URL:** `https://.supabase.co/functions/v1/enhanced-mcp` +- **Header:** `x-brain-key: ` _(or `Authorization: Bearer `)_ + +Header-only authentication — the access key is NOT accepted as a `?key=` URL query parameter. Query strings surface in Supabase, CDN, and proxy access logs, which leaks the credential into places that don't get rotated with the secret itself. Use the header (or `Authorization: Bearer …`) exclusively. + +### 4. Test Core Tools + +Verify the enhanced server is working by testing these tools in your AI client: + +1. **`brain_capture_thought`** — Save a test thought: "Testing the enhanced MCP server setup" +2. **`brain_search_thoughts`** — Search for "testing" to find the thought you just captured +3. **`brain_thought_stats`** — View your brain's type and topic distribution +4. **`brain_list_thoughts`** — Browse recent thoughts with filters + +> The four tools that overlap with the stock server are prefixed with `brain_` in this integration (`brain_capture_thought`, `brain_search_thoughts`, `brain_list_thoughts`, `brain_thought_stats`). That way you can run both servers side by side without the model seeing two tools under the same name. The stock `capture_thought` / `search_thoughts` / `list_thoughts` / `thought_stats` remain available on the original connector; this server adds `brain_*` variants with extended filters, enriched metadata, sensitivity detection, and content-fingerprint dedup. + +### 5. Enable Schema-Backed Tools (Optional) + +If you have installed optional schemas, these tools activate automatically: + +| Tool | Required Schema | What It Does | +|------|----------------|--------------| +| `ops_capture_status` | `schemas/smart-ingest` | Ingestion job health monitoring | +| `graph_search` | `schemas/knowledge-graph` | Search entities by name or type | +| `entity_detail` | `schemas/knowledge-graph` | Full entity profile with connections | +| `ops_source_monitor` | Ops monitoring views | Per-source ingestion monitoring | + +If a required schema is not installed, the tool returns a clear message explaining which schema to install. + +## Expected Outcome + +After completing the steps above, you should have 13 tools available in your AI client under the "Open Brain Enhanced" connector. Running `brain_capture_thought` should save a thought with automatic type classification, topic extraction, and sensitivity detection. Running `brain_search_thoughts` should return results with similarity scores. Running `brain_thought_stats` should show your brain's statistics using server-side aggregation. + +If you also have the original `server/` connector active, you will see both tool sets. Thanks to the `brain_` prefix on the four overlapping tools, there are no duplicate tool names — the enhanced versions expose extended filters, sensitivity detection, and content-fingerprint dedup; the stock versions remain the minimal default. You can disable either connector at any time to reduce tool count. + +## Tool Reference + +| # | Tool | Description | Schema Required | +|---|------|-------------|-----------------| +| 1 | `brain_search_thoughts` | Semantic vector or full-text search with date and metadata filters | Enhanced Thoughts | +| 2 | `brain_list_thoughts` | Paginated browsing with type, source, date filters and sorting | Enhanced Thoughts | +| 3 | `get_thought` | Fetch a single thought by ID with full metadata | Enhanced Thoughts | +| 4 | `update_thought` | Update content with automatic re-embedding and re-classification | Enhanced Thoughts | +| 5 | `brain_capture_thought` | Capture with dedup, sensitivity detection, and LLM classification | Enhanced Thoughts | +| 6 | `brain_thought_stats` | Type and topic statistics via server-side aggregation | Enhanced Thoughts | +| 7 | `search_thoughts_text` | Direct full-text search (faster for exact phrase matching) | Enhanced Thoughts | +| 8 | `count_thoughts` | Fast filtered count without returning content | Enhanced Thoughts | +| 9 | `related_thoughts` | Find thoughts connected by shared topics or people | Enhanced Thoughts | +| 10 | `ops_capture_status` | Ingestion health: job status, error rates, recent failures | Smart Ingest | +| 11 | `graph_search` | Search knowledge graph entities with thought counts | Knowledge Graph | +| 12 | `entity_detail` | Full entity profile: aliases, linked thoughts, relationship edges | Knowledge Graph | +| 13 | `ops_source_monitor` | Per-source ingestion volume, errors, and failure samples | Ops Views | + +### Intentionally Excluded From This Release + +- **`delete_thought`** is intentionally not included in this initial PR. It requires a `deleted_at` shadow column and a restore workflow to align with the maintainer's "depreciate and version rather than delete" preference (see PR #127 closure). It will ship in a follow-up once that column lands in `schemas/enhanced-thoughts` and a sibling `restore_thought` tool can be published alongside it. + +## Known Limitations + +- **Semantic search + date filter on dense recent brains.** `brain_search_thoughts` in semantic mode calls the `match_thoughts` RPC, which returns the top-N matches by cosine similarity. Date filtering is applied client-side on top of those results. When the RPC supports pre-cutoff date filtering via its `filter` JSONB payload, the filter is pushed server-side and the behavior is precise; when it doesn't, this integration over-fetches 3× the requested limit (capped at 500) and filters client-side. On brains with very dense recent activity and a restrictive old date window, this may miss relevant old matches ranked below the over-fetch cutoff. Workaround: use `mode: "text"` (full-text search honours date filters at the SQL level) or narrow the query. + +## Troubleshooting + +**Issue: "Invalid or missing access key" error** +Solution: Ensure your `MCP_ACCESS_KEY` secret is set in Supabase and matches the key in your connector configuration. The key must be passed via the `x-brain-key` header or `Authorization: Bearer …`. Query-string auth (`?key=…`) is intentionally not supported — it would leak the credential into access logs. + +**Issue: "No embedding API key configured" error** +Solution: At least one of `OPENROUTER_API_KEY` or `OPENAI_API_KEY` must be set. OpenRouter is the default and recommended provider for OB1. + +**Issue: Schema-backed tools return "install required schema" messages** +Solution: This is expected behavior. These tools gracefully degrade when their backing tables are not present. Install the referenced schema contribution and the tools will activate automatically. + +**Issue: "match_thoughts" or "brain_stats_aggregate" RPC not found** +Solution: The Enhanced Thoughts schema (`schemas/enhanced-thoughts`) must be applied before deploying this server. It adds the required RPCs and columns. + +**Issue: Metadata classification returns fallback results** +Solution: Check that your LLM provider API key is valid and has sufficient quota. The server tries OpenRouter first, then falls back to OpenAI and Anthropic if configured. If all providers fail, it uses safe defaults. + +## Tool Surface Area + +This integration adds up to 13 tools to your AI's context. If you are managing multiple connectors, review the [MCP Tool Audit & Optimization Guide](../../docs/05-tool-audit.md) for strategies on keeping your tool count manageable as your Open Brain grows. diff --git a/integrations/enhanced-mcp/_shared/config.ts b/integrations/enhanced-mcp/_shared/config.ts new file mode 100644 index 000000000..f9e594ed0 --- /dev/null +++ b/integrations/enhanced-mcp/_shared/config.ts @@ -0,0 +1,204 @@ +/** Shared configuration constants for the Enhanced MCP integration. */ + +// ── Embedding ──────────────────────────────────────────────────────────────── + +/** OpenAI embedding model via OpenRouter (OB1 standard). */ +export const EMBEDDING_MODEL = "openai/text-embedding-3-small"; + +/** Dimensionality of the embedding vectors stored in pgvector. */ +export const EMBEDDING_DIMENSION = 1536; + +/** Maximum content length (chars) before truncation for embedding calls. */ +export const MAX_CONTENT_LENGTH = 8000; + +// ── Classifier models ──────────────────────────────────────────────────────── +// Order reversed from ExoCortex — OpenRouter is primary for OB1 deployments. + +/** OpenRouter model used as the primary classifier. */ +export const CLASSIFIER_MODEL_OPENROUTER = "anthropic/claude-haiku-4-5"; + +/** OpenAI model used as secondary classifier fallback. */ +export const CLASSIFIER_MODEL_OPENAI = "gpt-4o-mini"; + +/** Anthropic model used as tertiary classifier fallback. */ +export const CLASSIFIER_MODEL_ANTHROPIC = "claude-haiku-4-5-20251001"; + +// ── Thought defaults ───────────────────────────────────────────────────────── + +/** Default thought type when classification is unavailable. */ +export const DEFAULT_TYPE = "idea"; + +/** + * Default importance score (0-6 scale). + * + * 0 = Noise — information we don't want + * 1 = Trivial + * 2 = Low + * 3 = Normal (center of bell curve — most thoughts land here) + * 4 = Notable + * 5 = Important + * 6 = User-flagged only — never assigned automatically by LLM + */ +export const DEFAULT_IMPORTANCE = 3; + +/** Default quality score (0-100 scale). */ +export const DEFAULT_QUALITY_SCORE = 50; + +/** Default sensitivity tier. */ +export const DEFAULT_SENSITIVITY_TIER = "standard"; + +/** Default classifier confidence for unclassified thoughts. */ +export const DEFAULT_CONFIDENCE = 0.55; + +// ── Structured capture overrides ───────────────────────────────────────────── + +/** + * Confidence assigned to thoughts captured via structured input (MCP, REST, + * Telegram) where the caller supplies explicit type/topic metadata. + */ +export const STRUCTURED_CAPTURE_CONFIDENCE = 0.82; + +/** Importance assigned to structured captures (slightly elevated). */ +export const STRUCTURED_CAPTURE_IMPORTANCE = 4; + +// ── Enrichment retry ──────────────────────────────────────────────────────── + +/** Delay (ms) before retrying the primary classifier on transient failure. */ +export const ENRICHMENT_RETRY_DELAY_MS = 1500; + +// ── Sensitivity ────────────────────────────────────────────────────────────── + +/** Ordered sensitivity tiers — index 0 is least restrictive. */ +export const SENSITIVITY_TIERS = ["standard", "personal", "restricted"] as const; + +// ── Field length limits ────────────────────────────────────────────────────── + +/** Maximum character length for thought summaries. */ +export const MAX_SUMMARY_LENGTH = 160; + +/** Maximum character length for topic hint strings. */ +export const MAX_TOPIC_HINT_LENGTH = 80; + +/** Maximum character length for next-step / action-item strings. */ +export const MAX_NEXT_STEP_LENGTH = 180; + +/** Maximum number of tags that can be attached to a single thought. */ +export const MAX_TAGS_PER_THOUGHT = 12; + +// ── Allowed types ──────────────────────────────────────────────────────────── + +/** Canonical set of thought types accepted by the system. */ +export const ALLOWED_TYPES = new Set([ + "idea", "task", "person_note", "reference", "decision", "lesson", "meeting", "journal", +]); + +// ── Classifier prompt ──────────────────────────────────────────────────────── + +/** + * System prompt sent to the classifier model when extracting metadata + * (type, summary, topics, tags, people, action_items, confidence) from + * raw thought content. + */ +export const EXTRACTION_PROMPT = [ + "You classify personal notes for a second-brain.", + "Return STRICT JSON with keys: type, summary, topics, tags, people, action_items, importance, confidence.", + "", + "IMPORTANCE (0-6 scale):", + "Rate importance 0-6. 0=noise/not useful. 1=trivial. 2=low. 3=normal. 4=notable. 5=important.", + "6 is reserved for user-flagged critical items — never assign 6 automatically.", + "", + "type must be one of: idea, task, person_note, reference, decision, lesson, meeting, journal.", + "summary: max 160 chars. topics: 1-3 short lowercase tags. tags: additional freeform labels.", + "people: names mentioned. action_items: implied to-dos. confidence: 0-1.", + "", + "CONFIDENCE CALIBRATION:", + "- 0.9+: Clearly personal — user's own decision, preference, lesson, health data", + "- 0.7-0.89: Probably personal but could be generic advice", + "- 0.5-0.69: Borderline — reads more like general knowledge than personal context", + "- Below 0.5: Generic advice, encyclopedia-grade facts, or vague filler", + "", + "Examples:", + "", + 'Input: "Met with Sarah about the API redesign. She wants GraphQL instead of REST. We\'ll prototype both by Friday."', + 'Output: {"type":"meeting","summary":"API redesign meeting with Sarah — prototyping GraphQL vs REST","topics":["api-design","graphql"],"tags":["architecture"],"people":["Sarah"],"action_items":["Prototype GraphQL API","Prototype REST API","Compare by Friday"],"confidence":0.95}', + "", + 'Input: "I\'m going to use Supabase instead of Firebase. Better SQL support and the pgvector extension is critical for embeddings."', + 'Output: {"type":"decision","summary":"Chose Supabase over Firebase for SQL and pgvector support","topics":["database","infrastructure"],"tags":["architecture"],"people":[],"action_items":[],"confidence":0.92}', + "", + 'Input: "Never run database migrations during peak traffic hours. Learned this the hard way last Tuesday."', + 'Output: {"type":"lesson","summary":"Avoid running DB migrations during peak traffic","topics":["devops","database"],"tags":["best-practice"],"people":[],"action_items":[],"confidence":0.90}', + "", + 'Input: "The boiling point of water is 100\u00B0C at sea level."', + 'Output: {"type":"reference","summary":"Boiling point of water at sea level","topics":["science"],"tags":["general-knowledge"],"people":[],"action_items":[],"confidence":0.3}', +].join("\n"); + +// ── Sensitivity patterns ──────────────────────────────────────────────────── + +/** Patterns that trigger "restricted" sensitivity tier. */ +export const RESTRICTED_PATTERNS: [RegExp, string][] = [ + [/\b\d{3}-?\d{2}-?\d{4}\b/, "ssn_pattern"], + [/\b[A-Z]{1,2}\d{6,9}\b/, "passport_pattern"], + [/\b\d{8,17}\b.*\b(account|routing|iban)\b/i, "bank_account"], + [/\b(account|routing)\b.*\b\d{8,17}\b/i, "bank_account"], + [/\b(sk-|pk_live_|sk_live_|ghp_|gho_|AKIA)[A-Za-z0-9]{10,}/i, "api_key"], + [/\bpassword\s*[:=]\s*\S+/i, "password_value"], + [/\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/, "credit_card"], +]; + +/** Patterns that trigger "personal" sensitivity tier. */ +export const PERSONAL_PATTERNS: [RegExp, string][] = [ + [/\b\d+\s*mg\b(?!\s*\/\s*(dL|kg|L|ml))/i, "medication_dosage"], + [/\b(pregabalin|metoprolol|losartan|lisinopril|aspirin|atorvastatin|sertraline|metformin|gabapentin|prednisone|insulin|warfarin)\b/i, "drug_name"], + [/\b(glucose|a1c|cholesterol|blood pressure|bp|hrv|bmi)\b.*\b\d+/i, "health_measurement"], + [/\b(diagnosed|diagnosis|prediabetic|diabetic|arrhythmia|ablation)\b/i, "medical_condition"], + [/\b(salary|income|net worth|401k|ira|portfolio)\b.*\b\$?\d/i, "financial_detail"], + [/\b\$\d{3,}[,\d]*\b/i, "financial_amount"], +]; + +// ── Type definitions ──────────────────────────────────────────────────────── + +export type ThoughtMetadata = { + type: string; + summary: string; + topics: string[]; + tags: string[]; + people: string[]; + action_items: string[]; + importance: number | null; + confidence: number; +}; + +export type SensitivityResult = { + tier: "standard" | "personal" | "restricted"; + reasons: string[]; +}; + +export type PreparedPayload = { + content: string; + embedding: number[]; + metadata: Record; + type: string; + importance: number; + quality_score: number; + sensitivity_tier: string; + source_type: string; + content_fingerprint: string; + warnings: string[]; +}; + +export type PrepareThoughtOpts = { + source?: string; + source_type?: string; + metadata?: Record; + skip_embedding?: boolean; + embedding?: number[]; + skip_classification?: boolean; +}; + +export type StructuredCapture = { + matched: boolean; + normalizedText: string; + typeHint: string | null; + topicHint: string | null; + nextStep: string | null; +}; diff --git a/integrations/enhanced-mcp/_shared/helpers.ts b/integrations/enhanced-mcp/_shared/helpers.ts new file mode 100644 index 000000000..d639a4d83 --- /dev/null +++ b/integrations/enhanced-mcp/_shared/helpers.ts @@ -0,0 +1,932 @@ +/** + * Shared helper functions for the Enhanced MCP integration. + * + * Ported from ExoCortex open-brain-utils.ts with OB1 adaptations: + * - OpenRouter is the primary provider (reversed from ExoCortex). + * - All env reads use Deno.env.get(). + */ + +import { + EXTRACTION_PROMPT, + CLASSIFIER_MODEL_OPENROUTER, + CLASSIFIER_MODEL_OPENAI, + CLASSIFIER_MODEL_ANTHROPIC, + DEFAULT_TYPE, + DEFAULT_IMPORTANCE, + DEFAULT_QUALITY_SCORE, + DEFAULT_SENSITIVITY_TIER, + DEFAULT_CONFIDENCE, + STRUCTURED_CAPTURE_CONFIDENCE, + STRUCTURED_CAPTURE_IMPORTANCE, + SENSITIVITY_TIERS, + MAX_SUMMARY_LENGTH, + ENRICHMENT_RETRY_DELAY_MS, + ALLOWED_TYPES, + RESTRICTED_PATTERNS, + PERSONAL_PATTERNS, + EMBEDDING_DIMENSION, + type ThoughtMetadata, + type SensitivityResult, + type PreparedPayload, + type PrepareThoughtOpts, + type StructuredCapture, +} from "./config.ts"; + +// ── Fetch with timeout ───────────────────────────────────────────────────── + +/** + * Wrap fetch() with an AbortController-backed timeout. + * + * Defaults to FETCH_TIMEOUT_MS env (60000). Pass a specific timeoutMs for + * tighter budgets (e.g., 10s fire-and-forget, 30s embedding/DB calls). + * + * On timeout, throws an Error with "fetch timeout after {ms}ms" — callers + * that use isTransientError() will recognize this as retryable. + */ +export async function fetchWithTimeout( + url: string, + init: RequestInit = {}, + timeoutMs?: number, +): Promise { + const defaultMs = Number(Deno.env.get("FETCH_TIMEOUT_MS") ?? 60_000); + const ms = timeoutMs ?? defaultMs; + const ctrl = new AbortController(); + const timer = setTimeout(() => ctrl.abort(), ms); + try { + return await fetch(url, { ...init, signal: ctrl.signal }); + } catch (err) { + if (err instanceof Error && (err.name === "AbortError" || /aborted/i.test(err.message))) { + throw new Error(`fetch timeout after ${ms}ms`); + } + throw err; + } finally { + clearTimeout(timer); + } +} + +// ── Type coercion helpers ────────────────────────────────────────────────── + +export function asString(value: unknown, fallback: string): string { + return typeof value === "string" ? value : fallback; +} + +export function asNumber(value: unknown, fallback: number, min: number, max: number): number { + const parsed = Number(value); + if (!Number.isFinite(parsed)) return fallback; + return Math.min(max, Math.max(min, parsed)); +} + +export function asInteger(value: unknown, fallback: number, min: number, max: number): number { + return Math.round(asNumber(value, fallback, min, max)); +} + +export function asBoolean(value: unknown, fallback: boolean): boolean { + return typeof value === "boolean" ? value : fallback; +} + +export function asOptionalInteger(value: unknown, min: number, max: number): number | null { + if (value === undefined || value === null || value === "") return null; + return asInteger(value, min, min, max); +} + +export function isRecord(value: unknown): value is Record { + return typeof value === "object" && value !== null && !Array.isArray(value); +} + +// ── Array helpers ────────────────────────────────────────────────────────── + +/** Deduplicate, filter empty strings, and cap at 12 items. */ +export function normalizeStringArray(value: unknown): string[] { + if (!Array.isArray(value)) return []; + return [...new Set( + value + .map((item) => (typeof item === "string" ? item.trim() : "")) + .filter((item) => item.length > 0) + .slice(0, 12), + )]; +} + +/** Combine two string arrays with dedup via normalizeStringArray. */ +export function mergeUniqueStrings(base: unknown, extras: string[]): string[] { + return normalizeStringArray([ + ...normalizeStringArray(base), + ...normalizeStringArray(extras), + ]); +} + +// ── Embedding helpers ────────────────────────────────────────────────────── + +/** Returns the embedding only if it has the correct dimension count, otherwise undefined. */ +export function safeEmbedding(emb: number[] | null | undefined): number[] | undefined { + return Array.isArray(emb) && emb.length === EMBEDDING_DIMENSION ? emb : undefined; +} + +/** + * Generate a text embedding via OpenRouter (primary) or OpenAI (fallback). + * + * OB1 adaptation: OpenRouter is tried first (reversed from ExoCortex). + */ +export async function embedText(text: string): Promise { + const openRouterKey = Deno.env.get("OPENROUTER_API_KEY") ?? ""; + const openAiKey = Deno.env.get("OPENAI_API_KEY") ?? ""; + const openRouterModel = Deno.env.get("OPENROUTER_EMBEDDING_MODEL") ?? "openai/text-embedding-3-small"; + const openAiModel = Deno.env.get("OPENAI_EMBEDDING_MODEL") ?? "text-embedding-3-small"; + + // Primary: OpenRouter + if (openRouterKey) { + const response = await fetchWithTimeout("https://openrouter.ai/api/v1/embeddings", { + method: "POST", + headers: { + "Authorization": `Bearer ${openRouterKey}`, + "Content-Type": "application/json", + }, + body: JSON.stringify({ model: openRouterModel, input: text }), + }); + + if (!response.ok) { + throw new Error(`OpenRouter embedding failed (${response.status}): ${await response.text()}`); + } + + const payload = await response.json(); + const embedding = payload?.data?.[0]?.embedding; + if (!Array.isArray(embedding) || embedding.length === 0) { + throw new Error("OpenRouter embedding response missing vector data"); + } + return embedding as number[]; + } + + // Fallback: OpenAI direct + if (openAiKey) { + const response = await fetchWithTimeout("https://api.openai.com/v1/embeddings", { + method: "POST", + headers: { + "Authorization": `Bearer ${openAiKey}`, + "Content-Type": "application/json", + }, + body: JSON.stringify({ model: openAiModel, input: text }), + }); + + if (!response.ok) { + throw new Error(`OpenAI embedding failed (${response.status}): ${await response.text()}`); + } + + const payload = await response.json(); + const embedding = payload?.data?.[0]?.embedding; + if (!Array.isArray(embedding) || embedding.length === 0) { + throw new Error("OpenAI embedding response missing vector data"); + } + return embedding as number[]; + } + + throw new Error("No embedding API key configured. Set OPENROUTER_API_KEY or OPENAI_API_KEY."); +} + +// ── Metadata extraction ──────────────────────────────────────────────────── + +type MetadataProvider = "openrouter" | "openai" | "anthropic"; + +/** Read env and return configured providers in OB1 priority order (openrouter first). */ +function getConfiguredMetadataProviders(): MetadataProvider[] { + const providers: MetadataProvider[] = []; + if (Deno.env.get("OPENROUTER_API_KEY")) providers.push("openrouter"); + if (Deno.env.get("OPENAI_API_KEY")) providers.push("openai"); + if (Deno.env.get("ANTHROPIC_API_KEY")) providers.push("anthropic"); + return providers; +} + +/** Fetch metadata from OpenRouter chat completions endpoint. */ +async function fetchOpenRouterMetadata(text: string): Promise { + const apiKey = Deno.env.get("OPENROUTER_API_KEY") ?? ""; + if (!apiKey) throw new Error("OPENROUTER_API_KEY is not configured"); + + const model = Deno.env.get("OPENROUTER_CLASSIFIER_MODEL") ?? CLASSIFIER_MODEL_OPENROUTER; + const response = await fetchWithTimeout("https://openrouter.ai/api/v1/chat/completions", { + method: "POST", + headers: { + "Authorization": `Bearer ${apiKey}`, + "Content-Type": "application/json", + }, + body: JSON.stringify({ + model, + temperature: 0.1, + messages: [ + { role: "system", content: `${EXTRACTION_PROMPT}\nReturn only the JSON object.` }, + { role: "user", content: text }, + ], + }), + }); + + if (!response.ok) { + throw new Error(`OpenRouter classification failed (${response.status}): ${await response.text()}`); + } + + return readChatCompletionText(await response.json()); +} + +/** Fetch metadata from OpenAI chat completions endpoint. */ +async function fetchOpenAIMetadata(text: string): Promise { + const apiKey = Deno.env.get("OPENAI_API_KEY") ?? ""; + if (!apiKey) throw new Error("OPENAI_API_KEY is not configured"); + + const model = Deno.env.get("OPENAI_CLASSIFIER_MODEL") ?? CLASSIFIER_MODEL_OPENAI; + const response = await fetchWithTimeout("https://api.openai.com/v1/chat/completions", { + method: "POST", + headers: { + "Authorization": `Bearer ${apiKey}`, + "Content-Type": "application/json", + }, + body: JSON.stringify({ + model, + temperature: 0.1, + response_format: { type: "json_object" }, + messages: [ + { role: "system", content: EXTRACTION_PROMPT }, + { role: "user", content: text }, + ], + }), + }); + + if (!response.ok) { + throw new Error(`OpenAI classification failed (${response.status}): ${await response.text()}`); + } + + return readChatCompletionText(await response.json()); +} + +/** Fetch metadata from Anthropic Messages API. */ +async function fetchAnthropicMetadata(text: string): Promise { + const apiKey = Deno.env.get("ANTHROPIC_API_KEY") ?? ""; + if (!apiKey) throw new Error("ANTHROPIC_API_KEY is not configured"); + + const model = Deno.env.get("ANTHROPIC_CLASSIFIER_MODEL") ?? CLASSIFIER_MODEL_ANTHROPIC; + const response = await fetchWithTimeout("https://api.anthropic.com/v1/messages", { + method: "POST", + headers: { + "x-api-key": apiKey, + "anthropic-version": "2023-06-01", + "Content-Type": "application/json", + }, + body: JSON.stringify({ + model, + max_tokens: 1024, + temperature: 0.1, + system: EXTRACTION_PROMPT, + messages: [{ role: "user", content: text }], + }), + }); + + if (!response.ok) { + throw new Error(`Anthropic classification failed (${response.status}): ${await response.text()}`); + } + + return readAnthropicText(await response.json()); +} + +/** Extract text content from an OpenAI/OpenRouter chat completion response. */ +function readChatCompletionText(payload: unknown): string { + if (!isRecord(payload) || !Array.isArray(payload.choices) || payload.choices.length === 0) { + return ""; + } + const firstChoice = payload.choices[0]; + if (!isRecord(firstChoice) || !isRecord(firstChoice.message)) return ""; + + const content = firstChoice.message.content; + if (typeof content === "string") return content; + if (!Array.isArray(content)) return ""; + + return content + .map((part) => { + if (!isRecord(part) || asString(part.type, "") !== "text") return ""; + return asString(part.text, ""); + }) + .join(""); +} + +/** Extract text content from an Anthropic Messages response. */ +function readAnthropicText(payload: unknown): string { + if (!isRecord(payload) || !Array.isArray(payload.content) || payload.content.length === 0) { + return ""; + } + return payload.content + .map((block: unknown) => { + if (!isRecord(block) || asString(block.type, "") !== "text") return ""; + return asString(block.text, ""); + }) + .join(""); +} + +/** Strip markdown code fences (```json ... ```) that LLMs sometimes wrap around JSON output. */ +function stripCodeFences(text: string): string { + const trimmed = text.trim(); + const match = trimmed.match(/^```(?:json)?\s*\n?([\s\S]*?)\n?\s*```$/); + return match ? match[1].trim() : trimmed; +} + +/** + * True for errors worth retrying: network failures, timeouts, 429, and 5xx. + * + * 401 (Unauthorized) and 402 (Payment Required) are NOT transient — those are + * hard auth/quota failures that should fail-fast rather than cascade through + * the fallback provider chain (which would double-bill the user). + */ +function isTransientError(err: unknown): boolean { + if (!(err instanceof Error)) return false; + const msg = err.message; + if (/fetch timeout|fetch failed|network|ECONNRESET|ETIMEDOUT|UND_ERR|aborted/i.test(msg)) return true; + if (/\b(429|500|502|503|504|529)\b/.test(msg)) return true; + return false; +} + +/** + * True for errors that are hard failures (bad auth, no quota, bad request). + * + * When we see one of these on the primary provider we should NOT fall through + * to secondary/tertiary providers — those will just double-charge the user on + * what is clearly a configuration or account-state problem. + */ +function isFatalProviderError(err: unknown): boolean { + if (!(err instanceof Error)) return false; + const msg = err.message; + // 401/403 = auth, 402 = payment required, 400 = malformed request + return /\b(400|401|402|403)\b/.test(msg); +} + +// ── LLM call budget ──────────────────────────────────────────────────────── + +/** + * Process-wide LLM classification call counter. Provides a hard ceiling on + * how many chat-completion round-trips extractMetadata() can issue during + * the lifetime of this Edge Function instance. + * + * Default 10,000 — override via ENHANCED_MCP_MAX_CALLS env. Set to 0 to + * disable the classifier entirely and always return fallback metadata + * (useful for pure-text bulk imports that don't need enrichment). + */ +let _llmCallCount = 0; + +function getLlmCallCap(): number { + const raw = Deno.env.get("ENHANCED_MCP_MAX_CALLS"); + if (raw === undefined) return 10_000; + const parsed = Number(raw); + if (!Number.isFinite(parsed) || parsed < 0) return 10_000; + return Math.floor(parsed); +} + +/** + * Multi-provider metadata extraction with retry and fallback logic. + * + * OB1 adaptation: provider priority is openrouter > openai > anthropic. + * + * Cost safety: + * - Global ENHANCED_MCP_MAX_CALLS cap on classifier invocations. + * - Primary + one retry (transient only), then at most ONE fallback + * provider — never two — so a broken primary never triggers double + * billing on all three providers. + * - Fatal errors (400/401/402/403) from the primary fail-fast instead + * of cascading. Those are account-level problems, not transient. + */ +export async function extractMetadata( + text: string, +): Promise { + const fallback = fallbackMetadata(text); + const configuredProviders = getConfiguredMetadataProviders(); + const primary = configuredProviders[0]; + + if (!primary) { + console.warn("No metadata provider configured, returning fallback"); + return { ...fallback, _enrichment_status: "fallback" }; + } + + // Global call-budget guardrail: once exhausted, stop classifying. + const cap = getLlmCallCap(); + if (cap === 0) { + return { ...fallback, _enrichment_status: "fallback" }; + } + if (_llmCallCount >= cap) { + console.warn( + `Enhanced MCP LLM call budget exhausted (${_llmCallCount} / ${cap}); returning fallback metadata.`, + ); + return { ...fallback, _enrichment_status: "fallback" }; + } + + const fetchProvider = (p: MetadataProvider) => { + _llmCallCount += 1; + return p === "openrouter" + ? fetchOpenRouterMetadata(text) + : p === "openai" + ? fetchOpenAIMetadata(text) + : fetchAnthropicMetadata(text); + }; + + const parseResult = (raw: string): ThoughtMetadata | null => { + if (!raw.trim()) return null; + const parsed = JSON.parse(stripCodeFences(raw)); + return sanitizeMetadata(parsed, text); + }; + + // Attempt 1: primary provider + let lastError: unknown; + try { + const result = parseResult(await fetchProvider(primary)); + if (result) return { ...result, _enrichment_status: "complete" }; + } catch (err) { + lastError = err; + console.warn("Primary metadata classification failed (attempt 1)", primary, err); + } + + // Fail-fast on fatal provider errors (bad auth, out of quota, malformed + // request). These never become transient — cascading wastes money on + // providers that have nothing to do with the broken one. + if (isFatalProviderError(lastError)) { + console.warn( + "Primary metadata classification failed with fatal provider error; skipping fallback providers", + primary, + lastError, + ); + return { ...fallback, _enrichment_status: "fallback" }; + } + + // Attempt 2: retry primary after delay for transient failures only + if (isTransientError(lastError) && _llmCallCount < cap) { + try { + await new Promise((r) => setTimeout(r, ENRICHMENT_RETRY_DELAY_MS)); + const result = parseResult(await fetchProvider(primary)); + if (result) return { ...result, _enrichment_status: "complete" }; + } catch (err) { + console.warn("Primary metadata classification failed (attempt 2)", primary, err); + lastError = err; + if (isFatalProviderError(err)) { + return { ...fallback, _enrichment_status: "fallback" }; + } + } + } + + // Attempt 3: at most ONE fallback provider — never cascade through all + // three, that's the scenario where a single capture can rack up three + // separate LLM charges. + for (const fallbackProvider of configuredProviders.filter((p) => p !== primary)) { + if (_llmCallCount >= cap) break; + try { + const result = parseResult(await fetchProvider(fallbackProvider)); + if (result) return { ...result, _enrichment_status: "complete" }; + } catch (err) { + console.warn("Fallback metadata classification failed", fallbackProvider, err); + if (isFatalProviderError(err)) { + // If the fallback also fails fatally, don't keep trying. + break; + } + } + // Stop after a single fallback attempt regardless of outcome. + break; + } + + return { ...fallback, _enrichment_status: "fallback" }; +} + +// ── Fallback & sanitization ──────────────────────────────────────────────── + +/** Minimal metadata when all classifiers fail. */ +export function fallbackMetadata(input: string): ThoughtMetadata { + return { + type: "idea", + summary: input.slice(0, 160), + topics: [], + tags: [], + people: [], + action_items: [], + importance: null, + confidence: 0.2, + }; +} + +/** Validate and bounds-check LLM-produced metadata. */ +export function sanitizeMetadata(value: unknown, sourceText: string): ThoughtMetadata { + const fallback = fallbackMetadata(sourceText); + + if (!isRecord(value)) return fallback; + + const typeCandidate = asString(value.type, fallback.type); + const type = ALLOWED_TYPES.has(typeCandidate) ? typeCandidate : fallback.type; + + const summary = asString(value.summary, fallback.summary).trim().slice(0, 160) || fallback.summary; + const confidence = asNumber(value.confidence, fallback.confidence, 0, 1); + + // Extract LLM-assigned importance (0-5 range; 6 is user-only, never auto-assigned) + const rawImportance = + value.importance !== undefined && value.importance !== null + ? asInteger(value.importance, DEFAULT_IMPORTANCE, 0, 5) + : null; + + return { + type, + summary, + topics: normalizeStringArray(value.topics), + tags: normalizeStringArray(value.tags), + people: normalizeStringArray(value.people), + action_items: normalizeStringArray(value.action_items), + importance: rawImportance, + confidence, + }; +} + +// ── Sensitivity detection ────────────────────────────────────────────────── + +/** Test text against restricted and personal patterns. */ +export function detectSensitivity(text: string): SensitivityResult { + const reasons: string[] = []; + + for (const [pattern, reason] of RESTRICTED_PATTERNS) { + if (pattern.test(text)) { + reasons.push(reason); + return { tier: "restricted", reasons }; + } + } + + for (const [pattern, reason] of PERSONAL_PATTERNS) { + if (pattern.test(text)) { + reasons.push(reason); + } + } + + if (reasons.length > 0) return { tier: "personal", reasons }; + return { tier: "standard", reasons: [] }; +} + +// ── Content fingerprint ──────────────────────────────────────────────────── + +/** + * Compute SHA-256 fingerprint of normalized content. + * Algorithm: lowercase -> collapse whitespace -> trim -> SHA-256 hex. + * Uses Web Crypto API (available in Deno and modern browsers). + */ +export async function computeContentFingerprint(content: string): Promise { + const normalized = content.trim().replace(/\s+/g, " ").toLowerCase(); + if (!normalized) return ""; + const encoder = new TextEncoder(); + const data = encoder.encode(normalized); + const hashBuffer = await crypto.subtle.digest("SHA-256", data); + return Array.from(new Uint8Array(hashBuffer)) + .map((b) => b.toString(16).padStart(2, "0")) + .join(""); +} + +// ── Structured capture parsing ───────────────────────────────────────────── + +/** Parse `[type] [topic] body text + next step` format. */ +export function parseStructuredCapture(content: string): StructuredCapture { + const trimmed = content.trim(); + const match = /^\s*\[([^\]]+)\]\s*\[([^\]]+)\]\s*(.+?)(?:\s*\+\s*(.+))?$/i.exec(trimmed); + + if (!match) { + return { + matched: false, + normalizedText: trimmed, + typeHint: null, + topicHint: null, + nextStep: null, + }; + } + + const typeHint = normalizeTypeHint(match[1] ?? ""); + const topicHint = (match[2] ?? "").trim().slice(0, 80) || null; + const thoughtBody = (match[3] ?? "").trim(); + const nextStep = (match[4] ?? "").trim().slice(0, 180) || null; + const normalizedText = nextStep + ? `${thoughtBody} Next step: ${nextStep}` + : thoughtBody; + + return { + matched: true, + normalizedText, + typeHint, + topicHint, + nextStep, + }; +} + +/** Map common aliases to canonical thought types. */ +export function normalizeTypeHint(value: string): string | null { + const key = value.trim().toLowerCase().replace(/\s+/g, "_"); + if (!key) return null; + + const aliases: Record = { + idea: "idea", + task: "task", + person: "person_note", + person_note: "person_note", + reference: "reference", + ref: "reference", + note: "reference", + decision: "decision", + lesson: "lesson", + meeting: "meeting", + event: "meeting", + journal: "journal", + }; + + return aliases[key] ?? null; +} + +// ── Evergreen tagging ────────────────────────────────────────────────────── + +/** Add "evergreen" tag if the content contains the word. */ +export function applyEvergreenTag( + content: string, + metadata: Record, +): Record { + const result = { ...metadata }; + const tags = normalizeStringArray(result.tags); + + if (/\bevergreen\b/i.test(content)) { + const hasEvergreen = tags.some((tag) => tag.toLowerCase() === "evergreen"); + if (!hasEvergreen) tags.push("evergreen"); + } + + result.tags = tags; + return result; +} + +// ── Sensitivity tier resolution ──────────────────────────────────────────── + +/** + * Resolve sensitivity tier with escalation-only semantics. + * Can only escalate (standard -> personal -> restricted), never downgrade. + * Unrecognized values normalize to "personal" (safe default). + */ +export function resolveSensitivityTier( + detected: typeof SENSITIVITY_TIERS[number], + override?: string, +): typeof SENSITIVITY_TIERS[number] { + if (!override) return detected; + + const normalized = override.trim().toLowerCase(); + const validTiers: readonly string[] = SENSITIVITY_TIERS; + const overrideIndex = validTiers.indexOf(normalized); + const detectedIndex = validTiers.indexOf(detected); + + if (overrideIndex < 0) { + // Unrecognized value -> normalize to "personal" (safe default) + const personalIndex = validTiers.indexOf("personal"); + return SENSITIVITY_TIERS[Math.max(detectedIndex, personalIndex)]; + } + + // Only escalate, never downgrade + return SENSITIVITY_TIERS[Math.max(detectedIndex, overrideIndex)]; +} + +// ── Master ingest pipeline ───────────────────────────────────────────────── + +/** Validate type against ALLOWED_TYPES, returning DEFAULT_TYPE on mismatch. */ +function sanitizeType(value: string): string { + const normalized = value.trim().toLowerCase(); + return ALLOWED_TYPES.has(normalized) ? normalized : DEFAULT_TYPE; +} + +/** + * Canonical thought preparation pipeline. + * + * Override precedence (highest to lowest): + * 1. Structured capture hint (from parseStructuredCapture) + * 2. Explicit caller override (opts.metadata.type, opts.metadata.importance, etc.) + * 3. Extracted metadata (from LLM classification via extractMetadata) + * 4. Defaults (type: 'idea', importance: 3, quality_score: 50, sensitivity: 'standard') + * + * All ingest paths (MCP capture_thought, REST /capture, smart-ingest) call this. + */ +export async function prepareThoughtPayload( + content: string, + opts?: PrepareThoughtOpts, +): Promise { + const source = opts?.source ?? "mcp"; + const sourceType = opts?.source_type ?? source; + const extraMetadata = opts?.metadata ?? {}; + const warnings: string[] = []; + + // Step 1: Parse structured capture format + const structuredCapture = parseStructuredCapture(content); + const normalizedText = structuredCapture.normalizedText.trim(); + + if (!normalizedText) { + throw new Error("content is required"); + } + + const isOversized = normalizedText.length > 30000; + if (isOversized) { + warnings.push("oversized_content"); + console.warn( + `prepareThoughtPayload received oversized content (${normalizedText.length} chars); consider routing through smart-ingest for atomization.`, + ); + } + + // Step 2: Detect sensitivity + const sensitivity = detectSensitivity(normalizedText); + + // Step 3: Resolve type (precedence: structured > caller > extracted > default) + const callerType = asString(extraMetadata.memory_type, asString(extraMetadata.type, "")); + + // Step 4: Extract metadata via LLM (if not skipped) + let extracted: ThoughtMetadata | null = null; + let enrichmentStatus: "complete" | "fallback" | "skipped" = "skipped"; + if (!opts?.skip_classification) { + try { + const result = await extractMetadata(normalizedText); + enrichmentStatus = result._enrichment_status; + extracted = result; + if (enrichmentStatus === "fallback") { + warnings.push("metadata_fallback"); + } + } catch (err) { + console.warn("Metadata extraction failed, using defaults", err); + warnings.push("metadata_fallback"); + enrichmentStatus = "fallback"; + } + } + + // Step 5: Apply precedence rules for type + const resolvedType = sanitizeType( + structuredCapture.typeHint || callerType || extracted?.type || DEFAULT_TYPE, + ); + + // Step 6: Merge topics, tags, people, action_items + const baseTags = normalizeStringArray(extraMetadata.tags); + const baseTopics = normalizeStringArray(extraMetadata.topics); + const basePeople = normalizeStringArray(extraMetadata.people); + const baseActionItems = normalizeStringArray(extraMetadata.action_items); + + const extractedTopics = extracted ? normalizeStringArray(extracted.topics) : []; + const extractedTags = extracted ? normalizeStringArray(extracted.tags) : []; + const extractedPeople = extracted ? normalizeStringArray(extracted.people) : []; + const extractedActionItems = extracted ? normalizeStringArray(extracted.action_items) : []; + + let topics = mergeUniqueStrings(baseTopics.length > 0 ? baseTopics : extractedTopics, []); + let tags = mergeUniqueStrings(baseTags.length > 0 ? baseTags : extractedTags, []); + const people = mergeUniqueStrings(basePeople.length > 0 ? basePeople : extractedPeople, []); + let actionItems = mergeUniqueStrings( + baseActionItems.length > 0 ? baseActionItems : extractedActionItems, + [], + ); + + // Add structured capture hints + if (structuredCapture.topicHint) { + topics = mergeUniqueStrings(topics, [structuredCapture.topicHint]); + tags = mergeUniqueStrings(tags, [structuredCapture.topicHint]); + } + if (structuredCapture.nextStep) { + actionItems = mergeUniqueStrings(actionItems, [structuredCapture.nextStep]); + } + + // Step 7: Resolve importance (precedence: caller > structured > LLM-extracted > default) + const callerImportance = + extraMetadata.importance !== undefined + ? asInteger(extraMetadata.importance, DEFAULT_IMPORTANCE, 0, 6) + : null; + const structuredImportance = structuredCapture.matched ? STRUCTURED_CAPTURE_IMPORTANCE : null; + const extractedImportance = extracted?.importance ?? null; + const importance = + callerImportance ?? structuredImportance ?? extractedImportance ?? DEFAULT_IMPORTANCE; + + // Step 8: Resolve confidence + const callerConfidence = + extraMetadata.confidence !== undefined + ? asNumber(extraMetadata.confidence, DEFAULT_CONFIDENCE, 0, 1) + : null; + const structuredConfidence = structuredCapture.matched ? STRUCTURED_CAPTURE_CONFIDENCE : null; + const confidence = + callerConfidence ?? structuredConfidence ?? extracted?.confidence ?? DEFAULT_CONFIDENCE; + + // Step 9: Resolve quality score + const callerQuality = + extraMetadata.quality_score !== undefined + ? asNumber(extraMetadata.quality_score, DEFAULT_QUALITY_SCORE, 0, 100) + : null; + const quality_score = callerQuality ?? Math.round(confidence * 70 + 20); + + // Step 10: Resolve summary + const callerSummary = asString(extraMetadata.summary, ""); + const extractedSummary = extracted?.summary ?? ""; + const summary = (callerSummary || extractedSummary || normalizedText) + .trim() + .slice(0, MAX_SUMMARY_LENGTH); + + // Step 11: Resolve sensitivity tier (escalation only) + const callerSensitivity = asString( + extraMetadata.sensitivity_tier, + asString(extraMetadata.sensitivity, ""), + ); + const sensitivity_tier = resolveSensitivityTier( + sensitivity.tier, + callerSensitivity || undefined, + ); + + // Step 12: Compute embedding + let embedding: number[] = []; + if (opts?.embedding) { + embedding = opts.embedding; + } else if (!opts?.skip_embedding) { + try { + embedding = await embedText(normalizedText); + } catch (err) { + console.warn("Embedding failed, will be null", err); + warnings.push("embedding_unavailable"); + } + } + + // Step 13: Compute content fingerprint + const content_fingerprint = await computeContentFingerprint(normalizedText); + + // Step 14: Assemble metadata object with evergreen tag + const metadata = applyEvergreenTag(normalizedText, { + ...extraMetadata, + type: resolvedType, + summary, + topics, + tags, + people, + action_items: actionItems, + confidence, + source, + source_type: asString(extraMetadata.source_type, sourceType), + capture_format: structuredCapture.matched ? "structured_v1" : "freeform", + structured_capture: structuredCapture.matched + ? { + type: structuredCapture.typeHint, + topic: structuredCapture.topicHint, + next_step: structuredCapture.nextStep, + } + : null, + oversized: isOversized || extraMetadata.oversized === true, + captured_at: asString(extraMetadata.captured_at, new Date().toISOString()), + sensitivity_reasons: sensitivity.reasons, + agent_name: asString(extraMetadata.agent_name, "mcp"), + provider: asString(extraMetadata.provider, "mcp"), + enrichment_status: enrichmentStatus, + enrichment_attempted_at: enrichmentStatus !== "skipped" ? new Date().toISOString() : null, + ...(warnings.length > 0 ? { enrichment_warnings: warnings } : {}), + }); + + return { + content: normalizedText, + embedding, + metadata, + type: resolvedType, + importance, + quality_score, + sensitivity_tier, + source_type: asString(extraMetadata.source_type, sourceType), + content_fingerprint, + warnings, + }; +} + +// ── LIKE / ILIKE escaping ────────────────────────────────────────────────── + +/** + * Escape Postgres LIKE / ILIKE metacharacters so a user query can be safely + * interpolated into a `%...%` pattern. + * + * `%` and `_` are wildcards in ILIKE and `\` is the escape character; any + * of them in a user query will produce surprising match behavior. A search + * for "100%" without escaping expands to the ILIKE `%100%%` and matches + * everything that contains "100", capped only by LIMIT — effectively the + * whole table on dense graphs. + * + * Example: + * escapeLikePattern("100%") === "100\\%" + * escapeLikePattern("a_b") === "a\\_b" + */ +export function escapeLikePattern(s: string): string { + return s.replace(/[\\%_]/g, (ch) => "\\" + ch); +} + +// ── Supabase utility ─────────────────────────────────────────────────────── + +/** + * Quick existence check: returns true if the table or view can be queried + * without error. + * + * Uses `select("*", { head: true, count: "exact" }).limit(0)` so we don't + * depend on the target having an `id` column — important for views that + * have unusual column sets (e.g. `ops_source_errors_24h` has only + * `(source, error_events_24h)`, no `id`). + */ +type TableExistsQuery = PromiseLike<{ error: unknown }>; + +export async function tableExists( + supabase: { + from: ( + name: string, + ) => { + select: ( + cols: string, + opts?: { head?: boolean; count?: "exact" | "planned" | "estimated" }, + ) => { limit: (n: number) => TableExistsQuery }; + }; + }, + tableName: string, +): Promise { + const { error } = await supabase + .from(tableName) + .select("*", { head: true, count: "exact" }) + .limit(0); + return !error; +} diff --git a/integrations/enhanced-mcp/deno.json b/integrations/enhanced-mcp/deno.json new file mode 100644 index 000000000..705b50447 --- /dev/null +++ b/integrations/enhanced-mcp/deno.json @@ -0,0 +1,9 @@ +{ + "imports": { + "@hono/mcp": "npm:@hono/mcp@0.1.1", + "@modelcontextprotocol/sdk": "npm:@modelcontextprotocol/sdk@1.24.3", + "hono": "npm:hono@4.9.2", + "zod": "npm:zod@4.1.13", + "@supabase/supabase-js": "npm:@supabase/supabase-js@2.47.10" + } +} diff --git a/integrations/enhanced-mcp/index.ts b/integrations/enhanced-mcp/index.ts new file mode 100644 index 000000000..4adff492e --- /dev/null +++ b/integrations/enhanced-mcp/index.ts @@ -0,0 +1,1638 @@ +import "jsr:@supabase/functions-js/edge-runtime.d.ts"; + +import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; +import { StreamableHTTPTransport } from "@hono/mcp"; +import { Hono } from "hono"; +import { z } from "zod"; +import { createClient } from "@supabase/supabase-js"; + +import { + embedText, + extractMetadata, + detectSensitivity, + resolveSensitivityTier, + computeContentFingerprint, + prepareThoughtPayload, + applyEvergreenTag, + normalizeStringArray, + safeEmbedding, + tableExists, + escapeLikePattern, + asString, + asNumber, + asInteger, + asBoolean, + asOptionalInteger, + isRecord, +} from "./_shared/helpers.ts"; + +// ── Environment ─────────────────────────────────────────────────────────── + +const SUPABASE_URL = Deno.env.get("SUPABASE_URL")!; +const SUPABASE_SERVICE_ROLE_KEY = Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!; +const MCP_ACCESS_KEY = Deno.env.get("MCP_ACCESS_KEY")!; + +const supabase = createClient(SUPABASE_URL, SUPABASE_SERVICE_ROLE_KEY); + +// ── Types ───────────────────────────────────────────────────────────────── + +type ThoughtRow = { + id: number; + content: string; + content_fingerprint?: string | null; + type: string; + sensitivity_tier: string; + importance: number; + quality_score: number; + source_type: string; + metadata: Record; + created_at: string; + similarity?: number; + rank?: number; +}; + +type UpsertThoughtResult = { + thought_id: number; + action: string; + content_fingerprint: string; +}; + +// ── Helpers ─────────────────────────────────────────────────────────────── + +function toolSuccess(text: string, payload: Record) { + return { + content: [{ type: "text" as const, text }], + structuredContent: payload, + }; +} + +function toolFailure(message: string) { + return { + content: [{ type: "text" as const, text: `Error: ${message}` }], + isError: true, + }; +} + +function truncateContent(content: string, maxLen: number): string { + if (!content || content.length <= maxLen) return content; + return content.slice(0, maxLen) + "..."; +} + +// ── MCP Server ──────────────────────────────────────────────────────────── + +const server = new McpServer({ + name: "open-brain-enhanced", + version: "1.0.0", +}); + +// ── 1. brain_search_thoughts ──────────────────────────────────────────── + +server.registerTool( + "brain_search_thoughts", + { + title: "Search Thoughts (Enhanced)", + description: + "Search over your stored thoughts. Supports semantic (vector) and text (full-text) modes. Namespaced with brain_ prefix to avoid collision with the stock search_thoughts tool when both MCP servers are connected.", + inputSchema: z.object({ + query: z.string().min(2).describe("Search query"), + mode: z + .enum(["semantic", "text"]) + .default("semantic") + .optional() + .describe("Search mode: semantic (vector similarity) or text (full-text search)"), + limit: z.number().int().min(1).max(50).default(8).optional(), + offset: z + .number() + .int() + .min(0) + .default(0) + .optional() + .describe("Pagination offset (text mode only)"), + min_similarity: z + .number() + .min(0) + .max(1) + .default(0.3) + .optional() + .describe("Minimum similarity threshold (semantic mode only)"), + start_date: z + .string() + .optional() + .describe("ISO 8601 start date filter on created_at"), + end_date: z + .string() + .optional() + .describe("ISO 8601 end date filter on created_at"), + metadata_filter: z.record(z.string(), z.unknown()).optional(), + }), + }, + async (params) => { + try { + const raw = params as Record; + const query = asString(raw.query, "").trim(); + const mode = asString(raw.mode, "semantic"); + const limit = asInteger(raw.limit, 8, 1, 50); + const offset = asInteger(raw.offset, 0, 0, Number.MAX_SAFE_INTEGER); + const minSimilarity = asNumber(raw.min_similarity, 0.3, 0, 1); + const startDate = raw.start_date + ? asString(raw.start_date, "").trim() + : null; + const endDate = raw.end_date + ? asString(raw.end_date, "").trim() + : null; + const metadataFilter = isRecord(raw.metadata_filter) + ? raw.metadata_filter + : {}; + + if (query.length < 2) { + return toolFailure("query must be at least 2 characters"); + } + + if (mode === "text") { + const filter: Record = { + ...(metadataFilter as Record), + }; + filter.exclude_restricted = true; + if (startDate) filter.start_date = startDate; + if (endDate) filter.end_date = endDate; + + const { data, error } = await supabase.rpc("search_thoughts_text", { + p_query: query, + p_limit: limit, + p_filter: filter, + p_offset: offset, + }); + + if (error) { + throw new Error(`search_thoughts_text failed: ${error.message}`); + } + + const rows = (data ?? []) as ThoughtRow[]; + const totalCount = + rows.length > 0 + ? Number( + (rows[0] as Record).total_count ?? rows.length, + ) + : 0; + + if (rows.length === 0) { + return toolSuccess("No matches found.", { + results: [], + pagination: { + total: 0, + offset, + limit, + has_more: false, + }, + }); + } + + const lines = rows.map((row, index) => { + const score = Number(row.rank ?? 0).toFixed(3); + return `${offset + index + 1}. [${score}] (${row.type}) #${row.id} ${truncateContent(row.content, 500)}`; + }); + + return toolSuccess(lines.join("\n"), { + results: rows, + pagination: { + total: totalCount, + offset, + limit, + has_more: offset + rows.length < totalCount, + }, + }); + } + + // Semantic search (default) + // + // NOTE: `match_thoughts` returns the top-N by similarity and then we + // date-filter client-side. When the RPC supports date/tier filters in + // its `filter` JSONB payload they'll be honored pre-cutoff and the + // behavior is server-side correct; when it doesn't, we rely on an + // over-fetch slack to avoid silently returning zero results on active + // brains with old date windows. See `known limitations` in the README. + const dateFilterActive = !!(startDate || endDate); + // Forward filters into the RPC payload — ignored by older RPC versions + // but used by versions that support them, at which point the + // post-filter becomes a no-op. + const semanticFilter: Record = { + ...(metadataFilter as Record), + exclude_restricted: true, + }; + if (startDate) semanticFilter.start_date = startDate; + if (endDate) semanticFilter.end_date = endDate; + + // Over-fetch when date filter is active so client-side post-filter + // has headroom. 3x the requested limit is a reasonable compromise + // between cost and correctness for dense recent brains. + const fetchCount = dateFilterActive + ? Math.min(Math.max(limit * 3, 50), 500) + : Math.min(limit + 20, 200); + + const queryEmbedding = await embedText(query); + const { data, error } = await supabase.rpc("match_thoughts", { + query_embedding: queryEmbedding, + match_count: fetchCount, + match_threshold: minSimilarity, + filter: semanticFilter, + }); + + if (error) { + throw new Error(`match_thoughts failed: ${error.message}`); + } + + const allRows = (data ?? []) as ThoughtRow[]; + const rows = allRows + .filter((row) => row.sensitivity_tier !== "restricted") + .filter((row) => !startDate || row.created_at >= startDate) + .filter((row) => !endDate || row.created_at <= endDate) + .slice(0, limit); + + if (rows.length === 0) { + return toolSuccess("No matches found.", { results: [] }); + } + + const lines = rows.map((row, index) => { + const score = Number(row.similarity ?? 0).toFixed(3); + const type = asString(row.metadata?.type, row.type ?? "unknown"); + return `${index + 1}. [${score}] (${type}) #${row.id} ${truncateContent(row.content, 500)}`; + }); + + return toolSuccess(lines.join("\n"), { results: rows }); + } catch (error) { + console.error("brain_search_thoughts failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 2. brain_list_thoughts ────────────────────────────────────────────── + +server.registerTool( + "brain_list_thoughts", + { + title: "List Thoughts (Enhanced)", + description: + "Enhanced listing of thoughts with filters, sorting, and pagination. Namespaced with brain_ prefix to avoid collision with the stock list_thoughts tool when both MCP servers are connected.", + inputSchema: z.object({ + limit: z.number().int().min(1).max(100).default(20).optional(), + offset: z.number().int().min(0).default(0).optional(), + type: z + .string() + .optional() + .describe( + "Filter by thought type (e.g. idea, decision, lesson, task)", + ), + source_type: z + .string() + .optional() + .describe("Filter by source type (e.g. chatgpt_import, mcp)"), + start_date: z + .string() + .optional() + .describe("ISO 8601 start date filter on created_at"), + end_date: z + .string() + .optional() + .describe("ISO 8601 end date filter on created_at"), + sort: z + .enum(["created_at", "importance"]) + .default("created_at") + .optional(), + order: z.enum(["asc", "desc"]).default("desc").optional(), + }), + }, + async (params) => { + try { + const raw = params as Record; + const limit = asInteger(raw.limit, 20, 1, 100); + const offset = asInteger(raw.offset, 0, 0, Number.MAX_SAFE_INTEGER); + const type = raw.type ? asString(raw.type, "").trim() : null; + const sourceType = raw.source_type + ? asString(raw.source_type, "").trim() + : null; + const startDate = raw.start_date + ? asString(raw.start_date, "").trim() + : null; + const endDate = raw.end_date + ? asString(raw.end_date, "").trim() + : null; + const sort = asString(raw.sort, "created_at"); + const order = asString(raw.order, "desc"); + + // Count query (parallel with data query) + let countQuery = supabase + .from("thoughts") + .select("id", { count: "exact", head: true }) + .neq("sensitivity_tier", "restricted"); + if (type) countQuery = countQuery.eq("type", type); + if (sourceType) countQuery = countQuery.eq("source_type", sourceType); + if (startDate) countQuery = countQuery.gte("created_at", startDate); + if (endDate) countQuery = countQuery.lte("created_at", endDate); + + // Data query + let dataQuery = supabase + .from("thoughts") + .select( + "id, content, type, source_type, importance, quality_score, sensitivity_tier, metadata, created_at, updated_at", + ) + .neq("sensitivity_tier", "restricted") + .order(sort, { ascending: order === "asc" }) + .range(offset, offset + limit - 1); + + if (type) dataQuery = dataQuery.eq("type", type); + if (sourceType) dataQuery = dataQuery.eq("source_type", sourceType); + if (startDate) dataQuery = dataQuery.gte("created_at", startDate); + if (endDate) dataQuery = dataQuery.lte("created_at", endDate); + + const [countRes, dataRes] = await Promise.all([countQuery, dataQuery]); + + if (dataRes.error) { + throw new Error( + `brain_list_thoughts query failed: ${dataRes.error.message}`, + ); + } + + const rows = (dataRes.data ?? []) as ThoughtRow[]; + const total = countRes.count ?? 0; + const hasMore = offset + rows.length < total; + + const text = + rows.length === 0 + ? "No thoughts found matching filters." + : rows + .map( + (row, i) => + `${offset + i + 1}. (${row.type}) #${row.id} ${truncateContent(row.content, 500)}`, + ) + .join("\n"); + + return toolSuccess(text, { + results: rows, + pagination: { total, offset, limit, has_more: hasMore }, + }); + } catch (error) { + console.error("brain_list_thoughts failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 3. get_thought ────────────────────────────────────────────────────── + +server.registerTool( + "get_thought", + { + title: "Get Thought", + description: + "Fetch a thought by ID with its full metadata and provenance.", + inputSchema: z.object({ + id: z.number().int().min(1).describe("Thought ID"), + }), + }, + async (params) => { + try { + const id = asInteger( + (params as Record).id, + 0, + 1, + Number.MAX_SAFE_INTEGER, + ); + + if (!id) { + return toolFailure("id is required"); + } + + const { data, error } = await supabase + .from("thoughts") + .select( + "id, content, content_fingerprint, type, sensitivity_tier, importance, quality_score, source_type, metadata, created_at, updated_at", + ) + .eq("id", id) + .single(); + + if (error || !data) { + return toolFailure(`Thought #${id} not found`); + } + + const row = data as ThoughtRow; + + if (row.sensitivity_tier === "restricted") { + return toolFailure("This thought is restricted."); + } + + const lines = [ + `(${row.type}) #${row.id}`, + row.content, + `Importance: ${row.importance} | Quality: ${row.quality_score} | Sensitivity: ${row.sensitivity_tier}`, + `Source: ${row.source_type || "unknown"} | Created: ${row.created_at}`, + ]; + + // Show provenance from metadata if available + const sources = row.metadata?.sources_seen; + const agents = row.metadata?.agents_seen; + if (Array.isArray(sources) && sources.length > 0) { + lines.push(`Sources seen: ${sources.join(", ")}`); + } + if (Array.isArray(agents) && agents.length > 0) { + lines.push(`Agents seen: ${agents.join(", ")}`); + } + + return toolSuccess(lines.join("\n"), { thought: row }); + } catch (error) { + console.error("get_thought failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 4. update_thought ─────────────────────────────────────────────────── + +server.registerTool( + "update_thought", + { + title: "Update Thought", + description: + "Update the content of an existing thought. Re-generates embedding and metadata.", + inputSchema: z.object({ + id: z.number().int().min(1).describe("Thought ID to update"), + content: z + .string() + .min(1) + .describe("New content for the thought"), + }), + }, + async (params) => { + try { + const id = asInteger( + (params as Record).id, + 0, + 1, + Number.MAX_SAFE_INTEGER, + ); + const content = asString( + (params as Record).content, + "", + ).trim(); + + if (!id) { + return toolFailure("id is required"); + } + if (!content) { + return toolFailure("content is required"); + } + + const { data: existing, error: fetchError } = await supabase + .from("thoughts") + .select("id, content, type, sensitivity_tier, importance, metadata") + .eq("id", id) + .single(); + + if (fetchError || !existing) { + return toolFailure(`Thought #${id} not found`); + } + + if (existing.sensitivity_tier === "restricted") { + return toolFailure("Cannot update restricted thought"); + } + + const oldType = + existing.type ?? + asString( + (existing.metadata as Record)?.type, + "unknown", + ); + + // Detect sensitivity on the NEW content first so we can reject + // restricted updates before paying for embedding + classification. + const sensitivity = detectSensitivity(content); + if (sensitivity.tier === "restricted") { + const reasons = + sensitivity.reasons.length > 0 + ? ` Reasons: ${sensitivity.reasons.join(", ")}.` + : ""; + return toolFailure( + "Updated content contains restricted patterns (SSN, credit card, " + + "API key, etc). Restricted content is local-only and cannot be " + + "stored in cloud MCP." + + reasons, + ); + } + + const [embedding, extracted] = await Promise.all([ + embedText(content), + extractMetadata(content), + ]); + + const oldMetadata = isRecord(existing.metadata) + ? existing.metadata + : {}; + const fingerprint = await computeContentFingerprint(content); + + // Escalation-only tier resolution — never downgrade the stored tier. + // If an existing `personal` thought is edited to remove the sensitive + // phrasing, the row stays `personal` rather than silently becoming + // `standard` and leaking into broad list/search responses. This + // matches the invariant enforced in brain_capture_thought's pipeline + // via resolveSensitivityTier (existing tier acts as the floor). + const resolvedTier = resolveSensitivityTier( + sensitivity.tier, + existing.sensitivity_tier ?? undefined, + ); + + const metadata = { + ...oldMetadata, + type: extracted.type, + summary: extracted.summary, + topics: extracted.topics, + tags: extracted.tags, + people: extracted.people, + action_items: extracted.action_items, + confidence: extracted.confidence, + sensitivity_reasons: sensitivity.reasons, + }; + + const finalizedMetadata = applyEvergreenTag(content, metadata); + + const { error: updateError } = await supabase + .from("thoughts") + .update({ + content, + content_fingerprint: fingerprint, + embedding, + type: extracted.type, + sensitivity_tier: resolvedTier, + importance: existing.importance ?? 3, + metadata: finalizedMetadata, + updated_at: new Date().toISOString(), + }) + .eq("id", id); + + if (updateError) { + throw new Error(`update_thought failed: ${updateError.message}`); + } + + const newType = asString( + (finalizedMetadata as Record).type, + "unknown", + ); + return toolSuccess( + `Updated thought #${id}. Type: ${oldType} \u2192 ${newType}.`, + { id, old_type: oldType, new_type: newType }, + ); + } catch (error) { + console.error("update_thought failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 5. brain_capture_thought ──────────────────────────────────────────── +// +// NOTE: `delete_thought` is intentionally not shipped in this initial PR. +// Hard `DELETE FROM thoughts WHERE id = ?` is irreversible and the +// companion schema (`schemas/enhanced-thoughts`) has no `deleted_at` +// tombstone column, so there is no safe soft-delete path today. +// +// The upstream maintainer's guidance on PR #127 was "depreciate and +// version rather than delete" — we will honour that in a follow-up +// once `deleted_at` + a `restore_thought` flow lands in the schema. +// See the README "Intentionally excluded" section for user-facing text. + + +server.registerTool( + "brain_capture_thought", + { + title: "Capture Thought (Enhanced)", + description: + "Capture a new thought with automatic dedup by content fingerprint. Runs full enrichment pipeline including sensitivity detection, LLM-powered classification, and structured-capture parsing. Namespaced with brain_ prefix to avoid collision with the stock capture_thought tool when both MCP servers are connected.", + inputSchema: z.object({ + content: z.string().min(1), + source: z.string().default("mcp").optional(), + metadata: z.record(z.string(), z.unknown()).optional(), + }), + }, + async (params) => { + try { + const raw = params as Record; + const content = asString(raw.content, "").trim(); + const source = asString(raw.source, "mcp").trim() || "mcp"; + const extraMetadata = isRecord(raw.metadata) ? raw.metadata : {}; + + if (!content) { + return toolFailure("content is required"); + } + + // Pre-flight sensitivity check — restricted content blocked from cloud + const sensitivity = detectSensitivity(content); + if (sensitivity.tier === "restricted") { + return toolFailure( + "Restricted thoughts are local-only and cannot be captured through cloud MCP.", + ); + } + + // Fingerprint-first dedup: if the exact content was already captured, + // short-circuit BEFORE paying for LLM classification + embedding. + // The upsert_thought RPC also dedups, but it runs after we've already + // spent a full enrichment cycle — this saves the cost entirely. + const preFingerprint = await computeContentFingerprint(content); + if (preFingerprint) { + const { data: existing, error: existingError } = await supabase + .from("thoughts") + .select( + "id, type, sensitivity_tier, importance, quality_score, source_type, metadata, content_fingerprint", + ) + .eq("content_fingerprint", preFingerprint) + .maybeSingle(); + + if (!existingError && existing) { + return toolSuccess( + `Duplicate of thought #${existing.id} (${existing.type}). No new capture.`, + { + thought_id: existing.id, + action: "deduplicated", + content_fingerprint: existing.content_fingerprint, + type: existing.type, + sensitivity_tier: existing.sensitivity_tier, + metadata: existing.metadata, + }, + ); + } + } + + // Use canonical pipeline with live LLM classification + const prepared = await prepareThoughtPayload(content, { + source, + source_type: asString(extraMetadata.source_type, source), + metadata: extraMetadata, + }); + + const { data, error } = await supabase.rpc("upsert_thought", { + p_content: prepared.content, + p_payload: { + type: prepared.type, + sensitivity_tier: prepared.sensitivity_tier, + importance: prepared.importance, + quality_score: prepared.quality_score, + source_type: prepared.source_type, + metadata: prepared.metadata, + created_at: new Date().toISOString(), + ...(safeEmbedding(prepared.embedding) && { + embedding: prepared.embedding, + }), + }, + }); + + if (error) { + throw new Error(`upsert_thought failed: ${error.message}`); + } + + const result = data as UpsertThoughtResult | null; + if (!result?.thought_id) { + throw new Error("upsert_thought returned no result"); + } + + return toolSuccess( + `${result.action === "inserted" ? "Captured new" : "Updated"} thought #${result.thought_id} as ${prepared.type}.`, + { + thought_id: result.thought_id, + action: result.action, + content_fingerprint: result.content_fingerprint, + type: prepared.type, + sensitivity_tier: prepared.sensitivity_tier, + metadata: prepared.metadata, + }, + ); + } catch (error) { + console.error("brain_capture_thought failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 6. brain_thought_stats ────────────────────────────────────────────── + +server.registerTool( + "brain_thought_stats", + { + title: "Thought Statistics (Enhanced)", + description: + "Summaries of thought type/topic activity. Uses server-side aggregation for accurate counts across entire brain. Namespaced with brain_ prefix to avoid collision with the stock thought_stats tool when both MCP servers are connected.", + inputSchema: z.object({ + since_days: z + .number() + .int() + .min(0) + .max(3650) + .default(0) + .optional(), + }), + }, + async (params) => { + try { + const sinceDays = asInteger( + (params as Record).since_days, + 0, + 0, + 3650, + ); + + const { data, error } = await supabase.rpc("brain_stats_aggregate", { + p_since_days: sinceDays, + }); + + if (error) { + throw new Error(`brain_stats query failed: ${error.message}`); + } + + const aggregate = (data ?? {}) as Record; + const total = + typeof aggregate.total === "number" ? aggregate.total : 0; + const topTypes = Array.isArray(aggregate.top_types) + ? (aggregate.top_types as Array<{ type: string; count: number }>) + : []; + const topTopics = Array.isArray(aggregate.top_topics) + ? (aggregate.top_topics as Array<{ topic: string; count: number }>) + : []; + + const windowLabel = + sinceDays === 0 ? "all time" : `last ${sinceDays} day(s)`; + const summary = [ + `Window: ${windowLabel}`, + `Total thoughts: ${total}`, + `Top types: ${topTypes.map((t) => `${t.type}=${t.count}`).join(", ") || "none"}`, + `Top topics: ${topTopics.map((t) => `${t.topic}=${t.count}`).join(", ") || "none"}`, + ].join("\n"); + + return toolSuccess(summary, { + total, + top_types: topTypes, + top_topics: topTopics, + }); + } catch (error) { + console.error("brain_thought_stats failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 7. search_thoughts_text ───────────────────────────────────────────── + +server.registerTool( + "search_thoughts_text", + { + title: "Full-Text Search", + description: + "Direct full-text search over thoughts. Simpler than search_thoughts for text-only queries.", + inputSchema: z.object({ + query: z.string().min(2).describe("Search query"), + limit: z.number().int().min(1).max(50).default(8).optional(), + offset: z.number().int().min(0).default(0).optional(), + }), + }, + async (params) => { + try { + const raw = params as Record; + const query = asString(raw.query, "").trim(); + const limit = asInteger(raw.limit, 8, 1, 50); + const offset = asInteger(raw.offset, 0, 0, Number.MAX_SAFE_INTEGER); + + if (query.length < 2) { + return toolFailure("query must be at least 2 characters"); + } + + const { data, error } = await supabase.rpc("search_thoughts_text", { + p_query: query, + p_limit: limit, + p_filter: { exclude_restricted: true }, + p_offset: offset, + }); + + if (error) { + throw new Error(`search_thoughts_text failed: ${error.message}`); + } + + const rows = (data ?? []) as ThoughtRow[]; + + if (rows.length === 0) { + return toolSuccess("No matches found.", { results: [] }); + } + + const lines = rows.map((row, index) => { + const score = Number(row.rank ?? 0).toFixed(3); + return `${offset + index + 1}. [${score}] (${row.type}) #${row.id} ${truncateContent(row.content, 500)}`; + }); + + return toolSuccess(lines.join("\n"), { results: rows }); + } catch (error) { + console.error("search_thoughts_text failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 8. count_thoughts ─────────────────────────────────────────────────── + +server.registerTool( + "count_thoughts", + { + title: "Count Thoughts", + description: + "Count thoughts matching optional filters. Fast metadata query without returning content.", + inputSchema: z.object({ + type: z.string().optional().describe("Filter by thought type"), + source_type: z + .string() + .optional() + .describe("Filter by source type"), + start_date: z + .string() + .optional() + .describe("ISO 8601 start date filter on created_at"), + end_date: z + .string() + .optional() + .describe("ISO 8601 end date filter on created_at"), + }), + }, + async (params) => { + try { + const raw = params as Record; + const type = raw.type ? asString(raw.type, "").trim() : null; + const sourceType = raw.source_type + ? asString(raw.source_type, "").trim() + : null; + const startDate = raw.start_date + ? asString(raw.start_date, "").trim() + : null; + const endDate = raw.end_date + ? asString(raw.end_date, "").trim() + : null; + + let query = supabase + .from("thoughts") + .select("id", { count: "exact", head: true }) + .neq("sensitivity_tier", "restricted"); + if (type) query = query.eq("type", type); + if (sourceType) query = query.eq("source_type", sourceType); + if (startDate) query = query.gte("created_at", startDate); + if (endDate) query = query.lte("created_at", endDate); + + const { count, error } = await query; + if (error) { + throw new Error(`count_thoughts query failed: ${error.message}`); + } + + const filters: Record = {}; + if (type) filters.type = type; + if (sourceType) filters.source_type = sourceType; + if (startDate) filters.start_date = startDate; + if (endDate) filters.end_date = endDate; + + const filterDesc = + Object.keys(filters).length > 0 + ? ` (filters: ${Object.entries(filters).map(([k, v]) => `${k}=${v}`).join(", ")})` + : ""; + + return toolSuccess(`Count: ${count ?? 0}${filterDesc}`, { + count: count ?? 0, + filters, + }); + } catch (error) { + console.error("count_thoughts failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 9. related_thoughts ───────────────────────────────────────────────── + +server.registerTool( + "related_thoughts", + { + title: "Related Thoughts", + description: + "Find thoughts related to a given thought via the knowledge graph connections.", + inputSchema: z.object({ + thought_id: z + .number() + .int() + .min(1) + .describe("Thought ID to find connections for"), + limit: z.number().int().min(1).max(20).default(10).optional(), + }), + }, + async (params) => { + try { + const raw = params as Record; + const thoughtId = asInteger( + raw.thought_id, + 0, + 1, + Number.MAX_SAFE_INTEGER, + ); + const limit = asInteger(raw.limit, 10, 1, 20); + + if (!thoughtId) { + return toolFailure("thought_id is required"); + } + + const { data, error } = await supabase.rpc( + "get_thought_connections", + { + p_thought_id: thoughtId, + p_limit: limit, + }, + ); + + if (error) { + // Graceful degradation if the RPC doesn't exist + if ( + error.message.includes("function") && + error.message.includes("does not exist") + ) { + return toolSuccess( + "The get_thought_connections RPC is not available. " + + "Install schemas/knowledge-graph to enable related thought discovery.", + { available: false }, + ); + } + throw new Error( + `get_thought_connections failed: ${error.message}`, + ); + } + + const rows = (data ?? []) as Record[]; + + if (rows.length === 0) { + return toolSuccess( + `No related thoughts found for #${thoughtId}.`, + { results: [], thought_id: thoughtId }, + ); + } + + const lines = rows.map( + (row, index) => + `${index + 1}. #${row.id} (${row.type}) ${truncateContent(asString(row.content, ""), 300)}`, + ); + + return toolSuccess( + `Found ${rows.length} related thought(s) for #${thoughtId}:\n${lines.join("\n")}`, + { results: rows, thought_id: thoughtId }, + ); + } catch (error) { + console.error("related_thoughts failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 10. ops_capture_status (schema-backed: needs Smart Ingest Pipeline) ─ + +server.registerTool( + "ops_capture_status", + { + title: "Ops Capture Status", + description: + "Operational health checks for ingestion jobs. Requires the Smart Ingest Pipeline schema.", + inputSchema: z.object({ + sample_limit: z + .number() + .int() + .min(1) + .max(100) + .default(20) + .optional(), + include_samples: z.boolean().default(true).optional(), + }), + }, + async (params) => { + try { + const raw = params as Record; + const sampleLimit = asInteger(raw.sample_limit, 20, 1, 100); + const includeSamples = asBoolean(raw.include_samples, true); + + // Schema guard: check if ingestion_jobs table exists + const hasTable = await tableExists(supabase, "ingestion_jobs"); + if (!hasTable) { + return toolSuccess( + "This tool requires the Smart Ingest Pipeline schema. " + + "Install schemas/smart-ingest to enable operational monitoring of ingestion jobs.", + { available: false }, + ); + } + + // Parallel queries: recent jobs + count by status + const [recentRes, totalCountRes, completedCountRes, errorCountRes] = + await Promise.all([ + supabase + .from("ingestion_jobs") + .select( + "id, source_label, status, extracted_count, added_count, skipped_count, created_at, completed_at", + ) + .order("created_at", { ascending: false }) + .limit(sampleLimit), + supabase + .from("ingestion_jobs") + .select("id", { count: "exact", head: true }), + supabase + .from("ingestion_jobs") + .select("id", { count: "exact", head: true }) + .eq("status", "complete"), + supabase + .from("ingestion_jobs") + .select("id", { count: "exact", head: true }) + .eq("status", "error"), + ]); + + if (recentRes.error) { + throw new Error( + `ingestion_jobs query failed: ${recentRes.error.message}`, + ); + } + + const jobs = (recentRes.data ?? []) as Record[]; + const totalJobs = totalCountRes.count ?? 0; + const completedJobs = completedCountRes.count ?? 0; + const errorJobs = errorCountRes.count ?? 0; + + const statusSummary = [ + `Ingestion Job Status`, + `Total jobs: ${totalJobs}`, + `Completed: ${completedJobs}`, + `Errors: ${errorJobs}`, + `Recent samples: ${jobs.length}`, + ]; + + const payload: Record = { + available: true, + total_jobs: totalJobs, + completed_jobs: completedJobs, + error_jobs: errorJobs, + }; + + if (includeSamples) { + payload.recent_jobs = jobs; + } + + return toolSuccess(statusSummary.join("\n"), payload); + } catch (error) { + console.error("ops_capture_status failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 11. graph_search (schema-backed: needs Knowledge Graph) ───────────── + +server.registerTool( + "graph_search", + { + title: "Graph Search", + description: + "Search entities by name or type. Returns entities from the knowledge graph with their thought counts.", + inputSchema: z.object({ + query: z + .string() + .min(1) + .describe("Search term for entity name"), + entity_type: z + .string() + .optional() + .describe( + "Filter: person, project, topic, tool, organization, place", + ), + limit: z.number().int().min(1).max(50).default(20).optional(), + }), + }, + async (params) => { + try { + const raw = params as Record; + const query = asString(raw.query, "").trim(); + const entityType = raw.entity_type + ? asString(raw.entity_type, "").trim() + : null; + const limit = asInteger(raw.limit, 20, 1, 50); + + if (!query) { + return toolFailure("query is required"); + } + + // Schema guard: check if entities table exists + const hasTable = await tableExists(supabase, "entities"); + if (!hasTable) { + return toolSuccess( + "This tool requires the Knowledge Graph schema. " + + "Install schemas/knowledge-graph to enable entity search and graph exploration.", + { available: false }, + ); + } + + // Escape LIKE wildcards in the user query — unescaped `%` or `_` turns + // a search for "100%" into the ILIKE pattern `%100%%`, matching every + // entity whose name contains "100" instead of the literal substring. + const safeQuery = escapeLikePattern(query); + let q = supabase + .from("entities") + .select( + "id, entity_type, canonical_name, aliases, metadata, first_seen_at, last_seen_at", + ) + .ilike("canonical_name", `%${safeQuery}%`) + .order("last_seen_at", { ascending: false }) + .limit(limit); + + if (entityType) { + q = q.eq("entity_type", entityType); + } + + const { data: entities, error } = await q; + if (error) { + throw new Error(`graph_search failed: ${error.message}`); + } + + if (!entities || entities.length === 0) { + return toolSuccess("No entities found.", { + results: [], + total: 0, + }); + } + + // Get thought counts for each entity, excluding restricted thoughts + const entityIds = entities.map( + (e: Record) => e.id as number, + ); + const { data: countRows, error: countError } = await supabase + .from("thought_entities") + .select("entity_id, thoughts!inner(sensitivity_tier)") + .in("entity_id", entityIds) + .neq("thoughts.sensitivity_tier", "restricted"); + + if (countError) { + console.error("thought count query failed", countError); + } + + const countMap = new Map(); + if (countRows) { + for (const row of countRows) { + const eid = (row as Record).entity_id as number; + countMap.set(eid, (countMap.get(eid) ?? 0) + 1); + } + } + + const results = entities.map((e: Record) => ({ + ...e, + thought_count: countMap.get(e.id as number) ?? 0, + })); + + const lines = results.map( + (e: Record) => + `#${e.id} [${e.entity_type}] ${e.canonical_name} (${e.thought_count} thoughts, last seen ${e.last_seen_at})`, + ); + + return toolSuccess( + `Found ${results.length} entities:\n${lines.join("\n")}`, + { results, total: results.length }, + ); + } catch (error) { + console.error("graph_search failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 12. entity_detail (schema-backed: needs Knowledge Graph) ──────────── + +server.registerTool( + "entity_detail", + { + title: "Entity Detail", + description: + "Get full entity info with connected thoughts and edges from the knowledge graph.", + inputSchema: z.object({ + entity_id: z.number().int().min(1).describe("Entity ID"), + }), + }, + async (params) => { + try { + const raw = params as Record; + const entityId = asInteger( + raw.entity_id, + 0, + 1, + Number.MAX_SAFE_INTEGER, + ); + + if (!entityId) { + return toolFailure("entity_id is required"); + } + + // Schema guard + const hasTable = await tableExists(supabase, "entities"); + if (!hasTable) { + return toolSuccess( + "This tool requires the Knowledge Graph schema. " + + "Install schemas/knowledge-graph to enable entity detail views.", + { available: false }, + ); + } + + // Fetch entity + const { data: entity, error: entityError } = await supabase + .from("entities") + .select("*") + .eq("id", entityId) + .maybeSingle(); + + if (entityError) { + throw new Error(`entity fetch failed: ${entityError.message}`); + } + if (!entity) { + return toolFailure(`Entity #${entityId} not found`); + } + + // Fetch linked thoughts (excluding restricted), limit 20 most recent + const { data: thoughtLinks, error: tlError } = await supabase + .from("thought_entities") + .select("thought_id, mention_role, confidence") + .eq("entity_id", entityId) + .limit(100); + + if (tlError) { + throw new Error( + `thought_entities fetch failed: ${tlError.message}`, + ); + } + + let thoughts: Record[] = []; + if (thoughtLinks && thoughtLinks.length > 0) { + const thoughtIds = ( + thoughtLinks as Record[] + ).map((tl) => tl.thought_id as number); + const { data: thoughtRows, error: tError } = await supabase + .from("thoughts") + .select("id, content, type, created_at, sensitivity_tier") + .in("id", thoughtIds) + .neq("sensitivity_tier", "restricted") + .order("created_at", { ascending: false }) + .limit(20); + + if (tError) { + console.error("thoughts fetch failed", tError); + } else if (thoughtRows) { + const roleMap = new Map(); + for (const tl of thoughtLinks as Record[]) { + roleMap.set( + tl.thought_id as number, + tl.mention_role as string, + ); + } + thoughts = (thoughtRows as Record[]).map( + (t) => ({ + id: t.id, + content: truncateContent(asString(t.content, ""), 300), + type: t.type, + created_at: t.created_at, + mention_role: + roleMap.get(t.id as number) ?? "mentioned", + }), + ); + } + } + + // Fetch edges (both directions) + const { data: edgesFrom, error: efError } = await supabase + .from("edges") + .select("id, to_entity_id, relation, support_count, confidence") + .eq("from_entity_id", entityId); + + const { data: edgesTo, error: etError } = await supabase + .from("edges") + .select( + "id, from_entity_id, relation, support_count, confidence", + ) + .eq("to_entity_id", entityId); + + if (efError) console.error("edges from fetch failed", efError); + if (etError) console.error("edges to fetch failed", etError); + + // Collect all connected entity IDs to resolve names + const connectedIds = new Set(); + for (const e of (edgesFrom ?? []) as Record[]) { + connectedIds.add(e.to_entity_id as number); + } + for (const e of (edgesTo ?? []) as Record[]) { + connectedIds.add(e.from_entity_id as number); + } + + const nameMap = new Map(); + if (connectedIds.size > 0) { + const { data: connEntities } = await supabase + .from("entities") + .select("id, canonical_name, entity_type") + .in("id", Array.from(connectedIds)); + if (connEntities) { + for (const ce of connEntities as Record[]) { + nameMap.set(ce.id as number, { + name: ce.canonical_name as string, + type: ce.entity_type as string, + }); + } + } + } + + const edges = [ + ...((edgesFrom ?? []) as Record[]).map((e) => ({ + edge_id: e.id, + direction: "outgoing", + relation: e.relation, + other_entity_id: e.to_entity_id, + other_entity_name: + nameMap.get(e.to_entity_id as number)?.name ?? "unknown", + other_entity_type: + nameMap.get(e.to_entity_id as number)?.type ?? "unknown", + support_count: e.support_count, + confidence: e.confidence, + })), + ...((edgesTo ?? []) as Record[]).map((e) => ({ + edge_id: e.id, + direction: "incoming", + relation: e.relation, + other_entity_id: e.from_entity_id, + other_entity_name: + nameMap.get(e.from_entity_id as number)?.name ?? "unknown", + other_entity_type: + nameMap.get(e.from_entity_id as number)?.type ?? "unknown", + support_count: e.support_count, + confidence: e.confidence, + })), + ]; + + const entityData = entity as Record; + const summary = [ + `Entity #${entityData.id}: ${entityData.canonical_name} [${entityData.entity_type}]`, + `Aliases: ${JSON.stringify(entityData.aliases)}`, + `First seen: ${entityData.first_seen_at}, Last seen: ${entityData.last_seen_at}`, + `Connected thoughts: ${thoughts.length}`, + `Edges: ${edges.length}`, + ]; + + if (edges.length > 0) { + summary.push("Connections:"); + for (const edge of edges) { + summary.push( + ` ${edge.direction === "outgoing" ? "\u2192" : "\u2190"} ${edge.relation} \u2192 ${edge.other_entity_name} [${edge.other_entity_type}] (support: ${edge.support_count})`, + ); + } + } + + return toolSuccess(summary.join("\n"), { + entity: entityData, + thoughts, + edges, + }); + } catch (error) { + console.error("entity_detail failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── 13. ops_source_monitor (schema-backed: needs ops views) ──────────── + +server.registerTool( + "ops_source_monitor", + { + title: "Ops Source Monitor", + description: + "Per-source ingestion counts, errors, and recent failures. Requires operational monitoring views.", + inputSchema: z.object({ + sample_limit: z + .number() + .int() + .min(1) + .max(100) + .default(25) + .optional(), + }), + }, + async (params) => { + try { + const raw = params as Record; + const sampleLimit = asInteger(raw.sample_limit, 25, 1, 100); + + // Schema guard: check that one of the views this tool actually reads + // exists. The previous guard checked `ops_source_volume_24h`, a view + // name that exists in neither this repo nor the brain-health-monitoring + // recipe — so once the recipe WAS installed, this tool still returned + // "install required views". Use the real view name. + const hasView = await tableExists( + supabase, + "ops_source_ingestion_24h", + ); + if (!hasView) { + return toolSuccess( + "This tool requires operational monitoring views. " + + "Install the brain-health-monitoring recipe to enable per-source monitoring.", + { available: false }, + ); + } + + const [ + sourceIngestionResponse, + sourceErrorsResponse, + sourceFailuresResponse, + ] = await Promise.all([ + supabase + .from("ops_source_ingestion_24h") + .select("source, status, events_24h") + .order("source", { ascending: true }) + .limit(250), + supabase + .from("ops_source_errors_24h") + .select("source, error_events_24h") + .order("source", { ascending: true }) + .limit(100), + supabase + .from("ops_source_recent_failures") + .select( + "id, source, status, reason, source_event_id, metadata, created_at", + ) + .order("created_at", { ascending: false }) + .limit(sampleLimit), + ]); + + // If one of the individual views is missing (partial install), fall + // back to a graceful "not fully installed" response rather than + // raising — the tool should light up in best-effort mode. + const viewMissing = (err: { message?: string } | null | undefined) => + !!err?.message && + /(does not exist|not found|relation .* does not exist)/i.test(err.message); + + if ( + viewMissing(sourceIngestionResponse.error) || + viewMissing(sourceErrorsResponse.error) || + viewMissing(sourceFailuresResponse.error) + ) { + return toolSuccess( + "Ops monitoring views are only partially installed. " + + "Verify the brain-health-monitoring recipe has been applied in full.", + { + available: false, + ingestion_ok: !viewMissing(sourceIngestionResponse.error), + errors_ok: !viewMissing(sourceErrorsResponse.error), + failures_ok: !viewMissing(sourceFailuresResponse.error), + }, + ); + } + + if (sourceIngestionResponse.error) { + throw new Error( + `ops_source_ingestion_24h query failed: ${sourceIngestionResponse.error.message}`, + ); + } + if (sourceErrorsResponse.error) { + throw new Error( + `ops_source_errors_24h query failed: ${sourceErrorsResponse.error.message}`, + ); + } + if (sourceFailuresResponse.error) { + throw new Error( + `ops_source_recent_failures query failed: ${sourceFailuresResponse.error.message}`, + ); + } + + type SourceIngestionRow = { + source: string; + status: string; + events_24h: number; + }; + type SourceErrorRow = { + source: string; + error_events_24h: number; + }; + + const sourceIngestionRows = (sourceIngestionResponse.data ?? + []) as SourceIngestionRow[]; + const sourceErrorRows = (sourceErrorsResponse.data ?? + []) as SourceErrorRow[]; + const sourceFailureRows = (sourceFailuresResponse.data ?? + []) as Record[]; + + const statusBySource = new Map(); + for (const row of sourceIngestionRows) { + if (!statusBySource.has(row.source)) { + statusBySource.set(row.source, "PASS"); + } + } + for (const row of sourceErrorRows) { + if (Number(row.error_events_24h) > 0) { + statusBySource.set(row.source, "ATTN"); + } + } + + const sourceStatuses = [...statusBySource.entries()] + .map(([source, status]) => ({ source, status })) + .sort((a, b) => a.source.localeCompare(b.source)); + + const summaryLines = [ + "Per-Source Monitor (24h)", + ...sourceStatuses.map((row) => `${row.source}: ${row.status}`), + `Recent failure samples: ${sourceFailureRows.length}`, + ]; + + return toolSuccess(summaryLines.join("\n"), { + available: true, + source_statuses: sourceStatuses, + source_ingestion_24h: sourceIngestionRows, + source_errors_24h: sourceErrorRows, + source_recent_failures: sourceFailureRows, + }); + } catch (error) { + console.error("ops_source_monitor failed", error); + return toolFailure(String(error)); + } + }, +); + +// ── Hono App with Auth + CORS ───────────────────────────────────────────── + +const corsHeaders = { + "Access-Control-Allow-Origin": "*", + "Access-Control-Allow-Headers": + "authorization, x-client-info, apikey, content-type, x-brain-key, accept, mcp-session-id", + "Access-Control-Allow-Methods": "GET, POST, OPTIONS, DELETE", +}; + +const app = new Hono(); + +/** + * Constant-time compare for the MCP access key. Uses + * `crypto.subtle.timingSafeEqual` where available (Deno >= 1.41), falling + * back to a manual XOR loop on older runtimes. Both paths short-circuit + * on length mismatch — for fixed-length 32-char keys this is acceptable; + * a variable-length key deployment should prefer the SubtleCrypto path + * which operates on equal-length Uint8Array buffers. + */ +function timingSafeEqualStrings(a: string, b: string): boolean { + if (a.length !== b.length) return false; + const enc = new TextEncoder(); + const aBuf = enc.encode(a); + const bBuf = enc.encode(b); + const subtle = (crypto as unknown as { + subtle?: { timingSafeEqual?: (x: ArrayBufferView, y: ArrayBufferView) => boolean }; + }).subtle; + if (typeof subtle?.timingSafeEqual === "function") { + return subtle.timingSafeEqual(aBuf, bBuf); + } + // Fallback: manual XOR loop — constant-time across equal-length inputs. + let diff = 0; + for (let i = 0; i < aBuf.length; i++) diff |= aBuf[i] ^ bBuf[i]; + return diff === 0; +} + +// CORS preflight -- required for browser/Electron-based clients (Claude Desktop, claude.ai) +app.options("*", (c) => { + return c.text("ok", 200, corsHeaders); +}); + +app.all("*", async (c) => { + // Header-only auth: `x-brain-key: ` or `Authorization: Bearer `. + // We do NOT accept the key via a `?key=` query parameter — URL query + // strings end up in Supabase/CDN/proxy access logs, which leaks the + // credential into places that don't get rotated with the secret itself. + const headerKey = c.req.header("x-brain-key"); + const authHeader = c.req.header("authorization") ?? c.req.header("Authorization"); + const bearerKey = authHeader?.match(/^Bearer\s+(.+)$/i)?.[1]; + const provided = headerKey ?? bearerKey; + if (!provided || !timingSafeEqualStrings(provided, MCP_ACCESS_KEY)) { + return c.json( + { error: "Invalid or missing access key" }, + 401, + corsHeaders, + ); + } + + // Fix: Claude Desktop connectors don't send the Accept header that + // StreamableHTTPTransport requires. Build a patched request if missing. + if (!c.req.header("accept")?.includes("text/event-stream")) { + const headers = new Headers(c.req.raw.headers); + headers.set("Accept", "application/json, text/event-stream"); + const patched = new Request(c.req.raw.url, { + method: c.req.raw.method, + headers, + body: c.req.raw.body, + // @ts-ignore -- duplex required for streaming body in Deno + duplex: "half", + }); + Object.defineProperty(c.req, "raw", { + value: patched, + writable: true, + }); + } + + const transport = new StreamableHTTPTransport(); + await server.connect(transport); + return transport.handleRequest(c); +}); + +Deno.serve(app.fetch); diff --git a/integrations/enhanced-mcp/metadata.json b/integrations/enhanced-mcp/metadata.json new file mode 100644 index 000000000..d6520aa49 --- /dev/null +++ b/integrations/enhanced-mcp/metadata.json @@ -0,0 +1,20 @@ +{ + "name": "Enhanced MCP Server", + "description": "Production-grade remote MCP server expanding the tool surface from 4 to 13 tools with enhanced search, CRUD, enrichment, and operational monitoring.", + "category": "integrations", + "author": { + "name": "Alan Shurafa", + "github": "alanshurafa" + }, + "version": "1.0.0", + "requires": { + "open_brain": true, + "services": ["OpenRouter", "Supabase"], + "tools": ["Supabase CLI", "Deno"] + }, + "tags": ["mcp", "tools", "search", "capture", "enrichment", "ops"], + "difficulty": "intermediate", + "estimated_time": "30 minutes", + "created": "2026-04-06", + "updated": "2026-04-06" +}