feat(graph): require evidence on manual concept creation #358
Proposes replacing hardcoded model lists with a database-backed catalog per provider, adding OpenRouter as a fourth inference endpoint, and integrating per-model pricing into cost estimation.
…R-800)
- Add provider_model_catalog table (migration 059) with pricing, capabilities, and curation fields per model per provider
- Seed catalog with known models and pricing (migration 060)
- Add OpenRouterProvider using the OpenAI SDK with a custom base URL
- Add fetch_model_catalog() to all providers for dynamic discovery
- Add model_catalog.py with upsert/query/enable/default/pricing helpers
- Add admin API endpoints: GET/POST/PUT /admin/models/catalog/*
- Add configure.py 'models' subcommand: list, refresh, enable, disable, default, price
- Update CostEstimator to read pricing from the catalog before env vars
- Wire OpenRouter into the provider factory, rate limiter, and configure.py
- Update list_available_models() on all providers to prefer the catalog
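The catalog-before-env-vars lookup order in CostEstimator can be sketched roughly like this. This is a minimal sketch, not the PR's implementation: the `CATALOG` dict stands in for the real `provider_model_catalog` table, and the env var names and column names (`input_cost_per_mtok` etc.) are assumptions for illustration.

```python
import os

# Hypothetical in-memory stand-in for the provider_model_catalog table;
# the real code would query Postgres. Column names are assumed.
CATALOG = {
    ("openrouter", "anthropic/claude-3.5-sonnet"): {
        "input_cost_per_mtok": 3.0,
        "output_cost_per_mtok": 15.0,
    },
}


def resolve_pricing(provider, model):
    """Prefer catalog pricing; fall back to env vars (names assumed)."""
    row = CATALOG.get((provider, model))
    if row:
        return row["input_cost_per_mtok"], row["output_cost_per_mtok"]
    # Env-var fallback mirrors the pre-catalog behaviour.
    return (
        float(os.environ.get("AI_INPUT_COST_PER_MTOK", "0")),
        float(os.environ.get("AI_OUTPUT_COST_PER_MTOK", "0")),
    )


def estimate_cost(provider, model, in_tokens, out_tokens):
    in_rate, out_rate = resolve_pricing(provider, model)
    return (in_tokens * in_rate + out_tokens * out_rate) / 1_000_000
```

The point of the ordering is that per-model catalog rows win over the single global env-var rate, so mixed-provider workloads get accurate per-model costs.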
Replace the hardcoded OpenAI setup in guided-init.sh with an interactive flow:
- Step 4: Choose provider (OpenAI, Anthropic, OpenRouter)
- Step 5: Enter and validate API key
- Step 6: Refresh the model catalog, present a filtered menu, and let the user pick a model

OpenRouter shows a curated subset (GPT-4o, Claude, Gemini, Llama, etc.) with option [0] to show all 200+ models. Ollama is noted as post-init config.

Also adds --tsv, --category, and --limit flags to `configure.py models list` for the machine-parseable output used by the init script.
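The shape of the machine-parseable listing might look like the sketch below. This is an illustrative stand-in only — the actual column set and filter semantics of `configure.py models list --tsv` are assumptions here.

```python
def format_models_tsv(rows, category=None, limit=None):
    """Emit one tab-separated line per model for shell consumption.

    `rows` is a list of dicts; the provider/model_id/category fields
    are assumed column names, not the PR's actual schema.
    """
    if category:
        rows = [r for r in rows if r["category"] == category]
    if limit:
        rows = rows[:limit]
    return "\n".join(
        "\t".join([r["provider"], r["model_id"], r["category"]]) for r in rows
    )
```

TSV is a sensible choice for the init script because `cut -f` and `read -r` handle it without quoting headaches, unlike JSON.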
Needed by the OpenRouter and Ollama fetch_model_catalog() implementations, which call requests.get() directly rather than going through the OpenAI SDK.
Follow existing pattern — INSERT INTO public.schema_migrations at the end of each migration file.
058 uses AGE Cypher DDL (CREATE VLABEL/ELABEL), which can fail on cold start when AGE isn't fully initialized. Wrap each label creation in an EXCEPTION handler so failures log a notice instead of aborting — labels get created on first use anyway. Also add ON_ERROR_STOP=1 to psql in migrate-db.sh so failed migrations don't self-register via the INSERT at the end of the file. Previously, a mid-file failure would still record the migration as applied, then the runner would exit, blocking all subsequent migrations.
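The self-registration bug hinges on one psql flag. A sketch of how the runner might build its psql invocation (the function name and argument order are illustrative, not migrate-db.sh's actual code):

```python
def psql_migration_cmd(dsn, migration_file):
    """Build the psql command for one migration file.

    -v ON_ERROR_STOP=1 makes psql exit non-zero at the first failed
    statement, so the trailing INSERT INTO schema_migrations at the
    end of the file never runs for a broken migration.
    """
    return ["psql", dsn, "-v", "ON_ERROR_STOP=1", "-f", migration_file]
```

Without the flag, psql's default is to report each error and keep going, which is exactly how a half-failed migration still reached the INSERT that marked it as applied.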
Split migrations into two phases:
- schema/migrations/ — standard SQL (tables, indexes, permissions)
- schema/migrations-warm/ — AGE/Cypher DDL (needs a running graph engine)

Move 058 (precreate_graph_labels) to migrations-warm. After cold migrations, restart postgres so AGE is fully initialized, verify it's healthy, then apply warm migrations.

Also:
- Migration runner continues past failures instead of stopping
- configure.py ai-provider handles a missing catalog table gracefully
- Runner supports a --warm flag to select the migration directory
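The two-phase runner with continue-past-failures behaviour could be sketched as below. This is a hedged sketch: the directory names match the description above, but `run_migrations` and its `apply` callback are invented for illustration (the real runner shells out to psql).

```python
from pathlib import Path


def run_migrations(root, apply, warm=False):
    """Apply one phase of migrations in filename order.

    `warm=True` selects the AGE/Cypher phase (the --warm flag).
    Failures are collected and reported instead of aborting the run,
    matching the 'continue past failures' behaviour described above.
    """
    directory = root / ("schema/migrations-warm" if warm else "schema/migrations")
    failed = []
    for path in sorted(directory.glob("*.sql")):
        try:
            apply(path)  # e.g. psql with ON_ERROR_STOP=1
        except Exception:
            failed.append(path.name)  # keep going; report at the end
    return failed
```

Sorting by filename gives the usual numbered-migration ordering (058 before 059), and returning the failure list lets the caller decide whether a cold-phase failure should block the warm phase.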
…input

The ai_extraction_config table has a CHECK constraint that only allowed openai, anthropic, and ollama. Add openrouter via migration 061. Also remove the -s flag from the API key read so input is visible — easier to verify the key was pasted correctly.
NUMERIC(12,6) overflows on high-cost OpenRouter models (image generation, specialized models). Use unconstrained NUMERIC, which handles arbitrary precision.
No backwards compatibility needed — merge table creation, seed data, openrouter constraint, and NUMERIC widening into one migration.
[$] cycles through: unsorted → cheapest first → most expensive first. [0] toggles between curated and full model list. Both compose — sort applies to whichever list is currently shown.
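The composition of the two keys can be modelled as pure state. A minimal sketch (the field names `id`/`price` and the function name are assumptions, not the init script's actual code):

```python
def visible_models(all_models, curated_ids, show_all, sort_mode):
    """Return the menu contents for the current UI state.

    [0] toggles `show_all`; [$] cycles `sort_mode` through
    None -> 'asc' -> 'desc'. The sort is applied after the
    curated/full filter, so it composes with whichever list is shown.
    """
    models = all_models if show_all else [
        m for m in all_models if m["id"] in curated_ids
    ]
    if sort_mode == "asc":
        models = sorted(models, key=lambda m: m["price"])
    elif sort_mode == "desc":
        models = sorted(models, key=lambda m: m["price"], reverse=True)
    return models
```

Keeping filter and sort as independent inputs to one pure function is what makes "both compose" fall out for free — there is no stored list to re-sort when the toggle flips.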
The catalog ID is the second positional arg, which argparse maps to provider_name, not model_id. Fall back to provider_name for actions that expect a catalog ID.
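The mismatch and fallback can be reproduced with a toy parser. The positional names match the description above, but the parser layout and `catalog_id_from` helper are illustrative assumptions about configure.py:

```python
import argparse

parser = argparse.ArgumentParser(prog="configure.py")
sub = parser.add_subparsers(dest="command")
models = sub.add_parser("models")
models.add_argument("action")
models.add_argument("provider_name", nargs="?")
models.add_argument("model_id", nargs="?")


def catalog_id_from(args):
    # For e.g. `models enable <catalog-id>` the ID lands in the second
    # positional slot (provider_name), so fall back to it when the
    # third positional (model_id) is absent.
    return args.model_id or args.provider_name
```

This is the generic argparse footgun with optional positionals: arguments fill slots left to right by position, not by meaning, so a two-arg invocation puts the ID where the provider was expected.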
4096 truncates extraction responses for verbose models served via OpenRouter. Bump to 16384 as an interim fix — the proper solution is catalog-driven token limits per model.
- Add --max-tokens flag to configure.py ai-provider - Store max_tokens in ai_extraction_config table (column already existed) - OpenRouterProvider reads max_tokens from config (default 16384) - Init flow prompts for max tokens with press-enter-to-accept default - Factory passes max_tokens from database config to provider
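The resolution order described above (stored config value, else the new default) reduces to a few lines. A sketch, assuming the config row arrives as a dict — the real factory reads the `max_tokens` column from ai_extraction_config:

```python
DEFAULT_MAX_TOKENS = 16384  # interim bump from 4096


def resolve_max_tokens(db_config):
    """Pick the provider's max_tokens from the stored config row.

    Treats missing/NULL values as unset so a fresh install and a
    pre-migration row both get the default (key name assumed).
    """
    value = db_config.get("max_tokens")
    return int(value) if value else DEFAULT_MAX_TOKENS
```

The press-enter-to-accept prompt in the init flow then just writes either the user's number or nothing into the config row; this function absorbs both cases.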
…e endpoint
Manually created concepts were born with zero evidence instances, making them
ungrounded. Now evidence_text is required when creating concepts via API/CLI/MCP
(except match_only mode and LLM extraction). Evidence is stored as an Instance
node linked to the concept's synthetic source.
Also adds:
- POST /concepts/{id}/evidence endpoint for adding evidence to existing concepts
- add_evidence action on the MCP concept tool
- evidence_text parameter on the MCP graph tool (create concept + queue)
- Missing @types dev deps that were causing pre-existing TS build failures
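The enforcement rule — evidence required except for match_only mode and LLM extraction — can be sketched as a plain validator. This is a hedged sketch in framework-free Python: field names, the `creation_method` sentinel value, and the helper name are assumptions about the service layer, and the 10-character minimum comes from the PR summary.

```python
MIN_EVIDENCE_LEN = 10  # minimum length per the PR summary


def validate_concept_creation(payload):
    """Return an error message, or None when the payload is acceptable.

    match_only mode and LLM extraction are exempt from the evidence
    requirement (exemption checks are assumed field names).
    """
    if payload.get("match_only") or payload.get("creation_method") == "llm_extraction":
        return None
    text = (payload.get("evidence_text") or "").strip()
    if len(text) < MIN_EVIDENCE_LEN:
        return "evidence_text is required (min 10 characters)"
    return None
```

Stripping before measuring matters: without it, ten spaces would satisfy the minimum while still producing an ungrounded concept.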
Code Review: Evidence-Required for Manual Concept Creation

Scope: Evidence feature changes only (models, routes, service, CLI types, MCP client, MCP server, graph-operations). Not reviewing the ADR-800 model catalog changes bundled in this PR.

Overall Assessment: Solid feature with clean separation across the stack. The validation enforcement, Instance node creation, and add_evidence endpoint follow existing patterns well. A few issues worth addressing before merge.

Finding 1: Audit Action Mismatch (Bug)
Location: …
Problem: The …
Why it matters: Audit logs become unreliable for forensic analysis. Someone filtering audit events for concept creation will get false positives from evidence additions, and there's no way to filter for evidence additions specifically.
Suggestion: Add an …

Finding 2: Missing …
| Category | Count |
|---|---|
| Bug (audit action) | 1 |
| API contract gap (missing response_model) | 1 |
| Missing tests | 1 (blocking) |
| Provenance accuracy | 1 |
| Client-side validation opportunity | 1 |
| Query safety (pre-existing, non-blocking) | 1 |
Recommendation: Fix findings 1, 2, and 5 before merge. Finding 6 is worth addressing if feasible. Findings 3 and 4 are non-blocking improvements.
AI-assisted review via Claude
- Add ADD_EVIDENCE audit action instead of reusing CREATE_CONCEPT
- Add EvidenceResponse model and response_model on the evidence endpoint
- Accept creation_method parameter in the add_evidence service method
- Add tests for the evidence endpoint (success, validation, 404, auth)
- Add tests for the evidence_text requirement on concept creation
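The first fix (a distinct audit action, addressing Finding 1) amounts to one new enum member. A minimal sketch — the enum name, string values, and `audit_entry` helper are assumptions, not the codebase's actual audit module:

```python
from enum import Enum


class AuditAction(str, Enum):
    CREATE_CONCEPT = "create_concept"
    # Distinct action so audit filters for concept creation no longer
    # pick up evidence additions, and evidence additions are filterable.
    ADD_EVIDENCE = "add_evidence"


def audit_entry(action, concept_id):
    """Build a minimal audit record (shape assumed for illustration)."""
    return {"action": action.value, "concept_id": concept_id}
```

Subclassing `str` keeps the members JSON-serializable and comparable to their string values, which is the usual pattern when audit rows are stored as text.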
Summary
- Creating a concept now requires `evidence_text` (min 10 chars) — stored as an Instance node linked to the concept's synthetic source
- `POST /concepts/{id}/evidence` endpoint to add evidence to existing concepts
- `add_evidence` action on the MCP `concept` tool
- `evidence_text` parameter added to the MCP `graph` tool (create concept + queue operations)
- Missing `@types/*` dev dependencies

Context
Concepts created via the `graph` MCP tool were born with zero evidence instances — completely ungrounded (e.g. c_341dd030bb48 had "Unexplored [0% conf]"). The synthetic source tracked who created the concept but not why. This closes that gap by requiring evidence at creation time and providing a way to add evidence after the fact.

Test plan