MCP Semantic Analysis - Knowledge Base Updates

Overview

The MCP Semantic Analysis system provides automated knowledge base updates through a 14-agent multi-agent system with semantic orchestration. Users trigger updates by typing "ukb" in Claude chat, which causes Claude to call the MCP semantic-analysis server.

IMPORTANT: ukb is NOT a shell command. It's a keyword that triggers Claude to execute the semantic analysis workflow via MCP tools.

Multi-Agent Architecture

The system uses a SmartOrchestrator that provides semantic routing, confidence propagation, and intelligent retry mechanisms. Unlike traditional sequential pipelines, agents communicate through a central orchestrator that makes LLM-assisted routing decisions.

How It Works

User Interaction

User types in Claude chat: "ukb"
                          ↓
Claude detects knowledge update request
                          ↓
Claude decides: incremental or full analysis
                          ↓
Claude calls MCP tool: mcp__semantic-analysis__execute_workflow
                          ↓
14-Agent workflow executes
                          ↓
Results stored to GraphDB + JSON export
                          ↓
Claude displays summary to user

Trigger Variations

Users can type any of these to trigger the workflow:

ukb (incremental analysis)
ukb full (complete analysis)
run ukb
update knowledge base

Claude intelligently detects the intent and chooses the appropriate workflow.

14-Agent Multi-Agent System

When triggered, the MCP semantic-analysis server executes a coordinated workflow with semantic routing:

Core Data Extraction Agents

1. GitHistoryAgent

Analyzes git commits since last checkpoint
Identifies architectural changes
Extracts technical decisions

2. VibeHistoryAgent

Analyzes session logs (.specstory/history/)
Extracts conversation patterns
Identifies problem-solution pairs

3. CodeGraphAgent

AST-based code analysis via Memgraph
Function call graphs and dependencies
Natural language queries about code structure
Requires Docker (Memgraph)

Analysis & Enrichment Agents

4. SemanticAnalysisAgent

Deep semantic analysis of code and conversations
Pattern extraction with significance scoring
Uses LLM fallback chain: Groq → Gemini → Custom → Anthropic → OpenAI

5. OntologyClassificationAgent

Maps entities to project-specific ontology
Validates entity types against upper/lower ontologies
Ensures semantic consistency

6. WebSearchAgent

Researches technical patterns (DuckDuckGo)
Finds best practices
Validates approaches

Knowledge Generation Agents

7. InsightGenerationAgent

Creates structured insights with PlantUML diagrams
Generates documentation artifacts
Formats knowledge entities

8. ObservationGenerationAgent

Adds observations to entities
Links related concepts
Enriches knowledge graph

9. DocumentationLinkerAgent

Links entities to documentation files
Cross-references code and docs
Maintains documentation graph

Quality & Persistence Agents

10. QualityAssuranceAgent (Semantic Router)

Central role in multi-agent coordination
Validates insight quality with confidence scoring
Generates routing decisions: proceed, retry, skip, escalate
Provides semantic feedback for retry guidance
Reports confidence breakdown per step

11. DeduplicationAgent

Prevents duplicate entities using embeddings
Merges similar patterns
Uses OpenAI embeddings for similarity scoring

12. ContentValidationAgent

Final content validation
Refreshes stale entities
Ensures persistence-ready data

13. PersistenceAgent

Stores entities to GraphDB
Creates relations
Updates knowledge graph

SmartOrchestrator (Coordination Layer)

The SmartOrchestrator manages all agents with:

Semantic Retry Guidance: Instead of mechanical threshold tightening, provides specific feedback about what went wrong and how to fix it
Confidence Propagation: Each step reports confidence; downstream agents receive upstream confidence context
Dynamic Routing: QA results drive routing decisions (proceed/retry/skip/escalate)
LLM-Assisted Decisions: Uses AI to interpret complex failures and suggest remediation
Workflow Modification: Can skip steps or adjust parameters based on runtime conditions

Workflow Visualization

The System Health Dashboard provides real-time visualization of workflow execution:

Dashboard Features:

Visual DAG Workflow Graph - Shows agent execution flow with QA as central coordinator
Confidence Bars - Per-agent confidence levels (green ≥80%, amber ≥50%, red <50%)
Routing Decision Badges - Shows proceed (✓), retry (↻), skip (⊘), or escalate (!) decisions
Retry Count Badges - Displays semantic retry iterations per agent
Pipeline Statistics - Commits processed, sessions analyzed, deduplication metrics
Entity Breakdown - Final counts by type (GraphDatabase, MCPAgent, Pattern, etc.)
Multi-Agent Data - stepConfidences, routingHistory, workflowModifications

Access via http://localhost:3032 → UKB Workflow Monitor.

Storage Architecture

Three-Layer Synchronization

Graphology (in-memory)
        ↕ (1s auto-persist)
    LevelDB (persistent)
        ↕ (5s debounced export)
JSON Files (git-tracked)

GraphDB:

In-memory: Graphology graph structure
Persistent: LevelDB at .data/knowledge-graph/
Auto-persist interval: 1 second

JSON Export:

Path: .data/knowledge-export/coding.json
Auto-export: 5 seconds (debounced)
Git-tracked for team synchronization

Checkpoint:

Path: .data/ukb-last-run.json
Git-tracked for team-wide incremental processing
Records: lastSuccessfulRun, lastCommit, lastSession, stats

Incremental vs Full Analysis

Incremental Analysis (Default)

When user types ukb:

Load checkpoint from .data/ukb-last-run.json
Calculate gap since last run:
- New commits since lastCommit
- New session logs since lastSession
Process only the gap (efficient)
Update checkpoint with new timestamp

Use when: Regular updates, daily/weekly knowledge capture

Full Analysis

When user types ukb full:

Ignore checkpoint (analyze everything)
Process entire history:
- All git commits
- All session logs
Update checkpoint with completion timestamp

Use when: First-time setup, major refactoring, comprehensive review

Team Synchronization

How Team Sync Works

Developer A:
  1. Types "ukb" in Claude
  2. Workflow executes
  3. Updates: .data/knowledge-export/coding.json
  4. Updates: .data/ukb-last-run.json
  5. Git commits both files
  6. Git pushes to remote

Developer B:
  1. Git pulls from remote
  2. Gets updated coding.json (knowledge)
  3. Gets updated ukb-last-run.json (checkpoint)
  4. Types "ukb" in Claude
  5. Only processes commits/sessions since checkpoint
     (avoiding duplicate work)

Bi-Directional Sync

Forward Flow (Local → Team):

Graphology → LevelDB (1s) → JSON export (5s) → Git commit → Git push

Backward Flow (Team → Local):

Git pull → JSON import (on startup) → LevelDB → Graphology

Conflict Resolution: Newest-wins strategy on startup import

MCP Tool Reference

execute_workflow

Primary tool for running semantic analysis workflows.

mcp__semantic-analysis__execute_workflow({
  workflow_name: "incremental-analysis",  // or "complete-analysis"
  parameters: {
    // optional parameters
  }
})

Workflows Available:

incremental-analysis: Gap-based processing (default)
complete-analysis: Full history processing

create_ukb_entity_with_insight

Create a single entity with insight document.

mcp__semantic-analysis__create_ukb_entity_with_insight({
  entity_name: "CachingPattern",
  entity_type: "TechnicalPattern",
  insights: "Detailed insight content...",
  significance: 8,
  tags: ["caching", "performance"]
})

Use when: Creating specific entities programmatically

Integration with Claude Code

Constraint System

The constraint monitor enforces that ukb is never executed as a shell command:

Pattern: ^\s*ukb\s+(?:--help|-h|status|entity|...)

Blocks:

ukb --help ❌
ukb status ❌
ukb entity list ❌

Allows:

plantuml ukb-cli-data-flow.puml ✅ (filename)
grep ukb docs/*.md ✅ (search term)
User typing "ukb" in chat ✅ (triggers MCP)

Claude's Decision Process

// When user types "ukb", Claude's internal logic:
if (message.includes("ukb full")) {
  workflow = "complete-analysis";
} else if (message.includes("ukb")) {
  workflow = "incremental-analysis";
}

// Claude calls MCP tool:
await mcp__semantic_analysis__execute_workflow({
  workflow_name: workflow
});

Visualization

VKB Server Integration

The VKB (Visualize Knowledge Base) server provides a web UI:

# Start VKB server
vkb server start

# Open browser to http://localhost:8080

Features:

Interactive graph visualization
Entity browsing and search
Relation exploration
Real-time updates via WebSocket

Knowledge Architecture Diagram

Shows the complete flow from user typing "ukb" through MCP integration to storage.

Common Workflows

Daily Knowledge Update

# In Claude chat:
User: "ukb"

# Claude executes incremental workflow
# Shows summary:
✅ Knowledge base updated!
📊 Total entities: 234
🔗 Total relations: 567
💾 Exports: .data/knowledge-export/coding.json
📁 Checkpoint: .data/ukb-last-run.json

Entities created: 5
Relations created: 8
Insights generated: 3
Commits analyzed: 12
Sessions analyzed: 3
Duration: 45s

First-Time Setup

# In Claude chat:
User: "ukb full"

# Claude executes complete analysis
# Processes entire git history and all sessions
# May take several minutes for large projects

Check What Will Be Analyzed

# In Claude chat:
User: "What would ukb analyze?"

# Claude can check checkpoint and report:
Since last run (2024-11-22T10:00:00Z):
- 12 new commits
- 3 new session logs

Troubleshooting

No New Knowledge Created

Possible causes:

No new commits since last checkpoint
No new session logs since last checkpoint
Semantic analysis didn't identify significant patterns

Solution: Run ukb full to force complete analysis

Checkpoint Issues

Reset checkpoint:

# Manually delete checkpoint to force full analysis next run
rm .data/ukb-last-run.json

# Next "ukb" will process everything

GraphDB Corruption

Recovery:

Stop all processes accessing GraphDB
Backup: cp -r .data/knowledge-graph .data/knowledge-graph.backup
Delete corrupted DB: rm -rf .data/knowledge-graph
Restart and let auto-import rebuild from JSON

MCP Server Not Responding

Check MCP server status:

# In terminal (NOT Claude chat):
./bin/mcp-status

# If semantic-analysis not running:
# Restart Claude Code via: ./bin/coding --claude

Advanced Usage

Programmatic Access

For automation or scripting, use MCP tools directly:

// Example: Node.js script using MCP client
const result = await mcpClient.call_tool({
  name: "mcp__semantic-analysis__execute_workflow",
  arguments: {
    workflow_name: "incremental-analysis"
  }
});

console.log(`Created ${result.stats.entitiesCreated} entities`);

Custom Workflows

The semantic-analysis server supports custom workflow definitions. See integrations/mcp-server-semantic-analysis/workflows/ for examples.

Historical Context

Previous System (Deprecated): The original ukb CLI tool existed as a shell command in earlier versions. This was removed and replaced with the MCP integration to:

Provide AI-driven semantic analysis
Enable Claude to make intelligent decisions about analysis scope
Integrate seamlessly with Claude Code workflow
Eliminate manual command-line operations

FilesExpand file tree

mcp-semantic-analysis.md

Latest commit

History