This repository was archived by the owner on Apr 2, 2026. It is now read-only.

Beacon Proxy: Technical Design Document

Overview

A transparent proxy that wraps any STDIO-based MCP server, intercepts every JSON-RPC message, and builds an audit trail with intent-to-action tracing.


Architecture

┌─────────────┐     stdin/stdout      ┌──────────────┐     stdin/stdout      ┌─────────────┐
│  MCP Client  │ ◄──────────────────► │ beacon-proxy  │ ◄──────────────────► │  MCP Server  │
│ (Claude, etc)│                      │               │                      │ (GitHub, etc)│
└─────────────┘                      │  ┌─────────┐  │                      └─────────────┘
                                     │  │ Audit DB │  │
                                     │  └─────────┘  │
                                     │  ┌─────────┐  │
                                     │  │ Policy   │  │
                                     │  │ Engine   │  │
                                     │  └─────────┘  │
                                     │  ┌─────────┐  │
                                     │  │ Web UI   │  │
                                     │  │ :8080    │  │
                                     │  └─────────┘  │
                                     └──────────────┘

How It Works

  1. Client config points command at beacon-proxy instead of the real server binary
  2. beacon-proxy spawns the real MCP server as a child process
  3. Every JSON-RPC message (both directions) passes through the proxy
  4. Proxy logs, classifies, risk-scores, and optionally pauses for approval
  5. Web UI shows real-time audit trail at localhost:8080

Data Model

Core Tables

-- A session is one proxy lifetime (client connects, does stuff, disconnects)
CREATE TABLE sessions (
    id              TEXT PRIMARY KEY,   -- uuid
    server_name     TEXT NOT NULL,      -- e.g. "github", "postgres", from config
    server_command  TEXT NOT NULL,      -- the actual command being wrapped
    started_at      TIMESTAMP NOT NULL,
    ended_at        TIMESTAMP,
    client_info     JSON                -- clientInfo from the MCP initialize request (client name, version)
);

-- Every JSON-RPC message in both directions
CREATE TABLE messages (
    id              TEXT PRIMARY KEY,   -- uuid
    session_id      TEXT NOT NULL REFERENCES sessions(id),
    direction       TEXT NOT NULL,      -- 'client_to_server' | 'server_to_client'
    timestamp       TIMESTAMP NOT NULL,
    
    -- JSON-RPC fields
    jsonrpc_id      TEXT,               -- request/response correlation (can be null for notifications)
    method          TEXT,               -- e.g. 'tools/call', 'tools/list', 'initialize'
    params          JSON,               -- request params
    result          JSON,               -- response result
    error           JSON,               -- response error if any
    
    -- Raw payload for forensics
    raw             TEXT NOT NULL
);

-- Higher-level: a tool invocation (request + response paired)
CREATE TABLE tool_calls (
    id              TEXT PRIMARY KEY,   -- uuid
    session_id      TEXT NOT NULL REFERENCES sessions(id),
    request_msg_id  TEXT NOT NULL REFERENCES messages(id),
    response_msg_id TEXT REFERENCES messages(id),  -- null until response arrives
    
    -- Tool details
    tool_name       TEXT NOT NULL,      -- e.g. 'read_file', 'create_issue', 'send_message'
    arguments       JSON NOT NULL,      -- what was passed to the tool
    result          JSON,               -- what came back
    error           JSON,               -- error if failed
    
    -- Classification
    operation_type  TEXT NOT NULL,      -- 'read' | 'write' | 'delete' | 'execute' | 'unknown'
    risk_score      INTEGER NOT NULL DEFAULT 0,  -- 0-100
    risk_reasons    JSON,               -- ["bulk_write", "destructive", "sensitive_data"]
    
    -- Timing
    requested_at    TIMESTAMP NOT NULL,
    responded_at    TIMESTAMP,
    duration_ms     INTEGER,
    
    -- Policy
    policy_action   TEXT NOT NULL DEFAULT 'pass',  -- 'pass' | 'flag' | 'pause' | 'block'
    approved_by     TEXT,               -- if paused, who approved
    approved_at     TIMESTAMP
);

-- Intent context: what human instruction triggered this chain of tool calls
-- This is the novel part — nobody else captures this
CREATE TABLE intent_contexts (
    id              TEXT PRIMARY KEY,   -- uuid
    session_id      TEXT NOT NULL REFERENCES sessions(id),
    
    -- The human's instruction (if capturable)
    human_prompt    TEXT,               -- the original instruction
    prompt_summary  TEXT,               -- AI-generated summary of intent
    
    -- Tool calls are linked to this intent through intent_tool_calls
    created_at      TIMESTAMP NOT NULL
);

-- Many-to-many: which tool calls belong to which intent
CREATE TABLE intent_tool_calls (
    intent_id       TEXT NOT NULL REFERENCES intent_contexts(id),
    tool_call_id    TEXT NOT NULL REFERENCES tool_calls(id),
    sequence_order  INTEGER NOT NULL,   -- order within the intent chain
    PRIMARY KEY (intent_id, tool_call_id)
);

-- Policy rules
CREATE TABLE policy_rules (
    id              TEXT PRIMARY KEY,
    name            TEXT NOT NULL,       -- "block_bulk_deletes"
    description     TEXT,
    
    -- Matching
    tool_pattern    TEXT,                -- glob: "delete_*", "*", "sql_query"
    server_pattern  TEXT,                -- glob: "postgres", "*"
    
    -- Conditions
    min_risk_score  INTEGER DEFAULT 0,
    operation_types TEXT,                -- comma-separated: "delete,write"
    
    -- Action
    action          TEXT NOT NULL,       -- 'flag' | 'pause' | 'block'
    
    enabled         BOOLEAN NOT NULL DEFAULT TRUE,
    created_at      TIMESTAMP NOT NULL
);

Indexes

CREATE INDEX idx_messages_session ON messages(session_id, timestamp);
CREATE INDEX idx_tool_calls_session ON tool_calls(session_id, requested_at);
CREATE INDEX idx_tool_calls_risk ON tool_calls(risk_score DESC);
CREATE INDEX idx_tool_calls_policy ON tool_calls(policy_action) WHERE policy_action != 'pass';
CREATE INDEX idx_intent_tool_calls_intent ON intent_tool_calls(intent_id);

Risk Classification

Operation Type Detection

Classify by tool name pattern matching. Start simple, refine with real data:

READ patterns:
  - get_*, read_*, list_*, search_*, describe_*, show_*
  - sql queries containing only SELECT

WRITE patterns:
  - create_*, update_*, set_*, add_*, put_*, edit_*, modify_*
  - sql queries containing INSERT, UPDATE

DELETE patterns:
  - delete_*, remove_*, drop_*, destroy_*, purge_*
  - sql queries containing DELETE, DROP, TRUNCATE

EXECUTE patterns:
  - run_*, exec_*, execute_*, invoke_*, call_*, trigger_*
  - anything involving shell/command execution
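
The name-based part of these patterns could be sketched as a prefix lookup. This is a minimal illustration, not the project's classifier: the `ClassifyTool` name and the rule order (delete and execute checked before write and read) are assumptions, and SQL inspection is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixRules mirrors the pattern lists above. Rules are checked in order and
// the first matching prefix wins (the order is an assumption of this sketch).
var prefixRules = []struct {
	opType   string
	prefixes []string
}{
	{"delete", []string{"delete_", "remove_", "drop_", "destroy_", "purge_"}},
	{"execute", []string{"run_", "exec_", "invoke_", "call_", "trigger_"}},
	{"write", []string{"create_", "update_", "set_", "add_", "put_", "edit_", "modify_"}},
	{"read", []string{"get_", "read_", "list_", "search_", "describe_", "show_"}},
}

// ClassifyTool returns the operation type for a tool name, or "unknown"
// when no prefix matches.
func ClassifyTool(toolName string) string {
	name := strings.ToLower(toolName)
	for _, rule := range prefixRules {
		for _, prefix := range rule.prefixes {
			if strings.HasPrefix(name, prefix) {
				return rule.opType
			}
		}
	}
	return "unknown"
}

func main() {
	for _, tool := range []string{"read_file", "create_issue", "delete_branch", "run_script", "send_message"} {
		fmt.Printf("%-14s -> %s\n", tool, ClassifyTool(tool))
	}
}
```

Tools that match no prefix, such as `send_message`, fall through to `unknown`, which matches the `operation_type` values allowed in the `tool_calls` table.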

Risk Scoring (0-100)

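Start with additive scoring. The factors can sum past 100 (a delete that hits every factor reaches 165), so clamp the total to 100:

| Factor | Points |
| --- | --- |
| Base: read operation | 0 |
| Base: write operation | 20 |
| Base: delete operation | 40 |
| Base: execute operation | 30 |
| Bulk operation (>10 records) | +20 |
| Touches auth/credentials | +30 |
| SQL with no WHERE clause | +30 |
| Modifies config/settings | +20 |
| Sends external message | +15 |
| First time seeing this tool | +10 |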
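
As a rough sketch, the additive scoring could look like the following. The `Signals` struct, the reason strings, and treating an unknown operation type as a base of 0 are assumptions made for illustration. How each signal is detected (counting records, spotting credentials) is out of scope here.

```go
package main

import "fmt"

// Signals holds the boolean factors from the table. Detecting them is not
// shown; this sketch only covers the arithmetic.
type Signals struct {
	Bulk, TouchesAuth, SQLNoWhere, ModifiesConfig, ExternalMessage, FirstSeen bool
}

// basePoints holds the base score per operation type. Types missing from the
// map (such as "unknown") score 0, which is an assumption of this sketch.
var basePoints = map[string]int{"read": 0, "write": 20, "delete": 40, "execute": 30}

// Score returns the additive risk score clamped to 100, plus the reasons
// that contributed to it.
func Score(opType string, s Signals) (int, []string) {
	score := basePoints[opType]
	var reasons []string
	add := func(on bool, points int, reason string) {
		if on {
			score += points
			reasons = append(reasons, reason)
		}
	}
	add(s.Bulk, 20, "bulk")
	add(s.TouchesAuth, 30, "auth")
	add(s.SQLNoWhere, 30, "sql_no_where")
	add(s.ModifiesConfig, 20, "config")
	add(s.ExternalMessage, 15, "external_message")
	add(s.FirstSeen, 10, "first_seen")
	if score > 100 {
		score = 100 // keep the result inside the 0-100 range
	}
	return score, reasons
}

func main() {
	score, reasons := Score("delete", Signals{Bulk: true, SQLNoWhere: true})
	fmt.Println(score, reasons) // 40 + 20 + 30 = 90
}
```

Returning the reasons alongside the score lets the caller fill `risk_score` and `risk_reasons` in one step.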

Policy Actions

| Risk Score | Default Action |
| --- | --- |
| 0-30 | pass (log only) |
| 31-60 | flag (log + highlight in UI) |
| 61-80 | pause (hold for human approval, 60s timeout, default deny) |
| 81-100 | block (reject, return error to client) |
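
The score bands translate directly into a lookup. A minimal sketch, where the function name `DefaultAction` is an assumption:

```go
package main

import "fmt"

// DefaultAction maps a 0-100 risk score to the default policy action,
// using the band boundaries listed above.
func DefaultAction(score int) string {
	switch {
	case score <= 30:
		return "pass"
	case score <= 60:
		return "flag"
	case score <= 80:
		return "pause"
	default:
		return "block"
	}
}

func main() {
	for _, score := range []int{0, 30, 31, 60, 61, 80, 81, 100} {
		fmt.Printf("%3d -> %s\n", score, DefaultAction(score))
	}
}
```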

Intent Capture

This is the hard/novel part. How do you know what the human asked for?

Approach 1: Client-Side Hook (Ideal, Harder)

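Some MCP clients may expose the conversation context. A standard tools/call request carries only the tool name and arguments, so this depends on client-specific behavior (for example, extra metadata attached to requests) and has to be verified per client. If the context is available, extract the most recent human message as the intent.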

Approach 2: Temporal Grouping (Practical, Start Here)

Group tool calls by time proximity. If an agent makes 5 tool calls within 2 seconds, they're likely from the same intent. A gap of more than 5 seconds between consecutive calls starts a new intent. Create an intent_context for each burst:

Tool calls at T+0.0s, T+0.3s, T+0.8s, T+1.2s  → Intent A
[gap > 5 seconds]
Tool calls at T+8.0s, T+8.5s                     → Intent B

The human_prompt field stays null, but you still get the chain of actions grouped together.
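
A minimal sketch of the burst grouping, assuming timestamps arrive sorted and using the 5-second gap from the example. `GroupByGap` and `demoTimes` are illustrative names, not part of the codebase.

```go
package main

import (
	"fmt"
	"time"
)

// GroupByGap splits sorted timestamps into bursts. A gap larger than maxGap
// between consecutive calls starts a new burst. Each burst is returned as a
// list of indexes into the input slice.
func GroupByGap(times []time.Time, maxGap time.Duration) [][]int {
	var groups [][]int
	for i, t := range times {
		if i == 0 || t.Sub(times[i-1]) > maxGap {
			groups = append(groups, nil) // open a new burst
		}
		last := len(groups) - 1
		groups[last] = append(groups[last], i)
	}
	return groups
}

// demoTimes reproduces the example timeline: four calls in quick succession,
// then two more after a long pause.
func demoTimes() []time.Time {
	base := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	var ts []time.Time
	for _, ms := range []int{0, 300, 800, 1200, 8000, 8500} {
		ts = append(ts, base.Add(time.Duration(ms)*time.Millisecond))
	}
	return ts
}

func main() {
	fmt.Println(GroupByGap(demoTimes(), 5*time.Second)) // [[0 1 2 3] [4 5]]
}
```

Each index doubles as the `sequence_order` offset within its burst when the rows are written to `intent_tool_calls`.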

Approach 3: MCP Sampling (Best of Both)

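MCP spec includes a sampling/createMessage flow where the server can ask the client to generate text. The request travels server → client through the proxy and carries the messages the server wants sampled. The server can also ask the client to include conversation context, but the client adds that context on its own side, so it does not necessarily pass through the proxy. Intercept these requests and extract intent where the messages happen to contain it; treat coverage as dependent on both server and client.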

Start with Approach 2. It works today with zero client changes.


Project Structure

beacon-proxy/
├── cmd/
│   └── beacon-proxy/
│       └── main.go              # CLI entry point
├── internal/
│   ├── proxy/
│   │   ├── proxy.go             # Core: spawn child, pipe stdin/stdout
│   │   └── jsonrpc.go           # Parse JSON-RPC messages from stream
│   ├── audit/
│   │   ├── store.go             # SQLite audit log storage
│   │   ├── classifier.go        # Operation type + risk scoring
│   │   └── intent.go            # Temporal grouping for intent chains
│   ├── policy/
│   │   ├── engine.go            # Rule evaluation
│   │   └── rules.go             # Default rules + custom rule loading
│   └── web/
│       ├── server.go            # HTTP server for dashboard
│       ├── handlers.go          # API endpoints
│       └── static/              # Dashboard HTML/JS (embed)
├── configs/
│   ├── default_rules.yaml       # Sensible default policies
│   └── example_claude.json      # Example claude_desktop_config.json
├── go.mod
├── go.sum
└── README.md

Key Implementation Notes

JSON-RPC Stream Parsing

MCP over STDIO uses newline-delimited JSON-RPC. Each message is a single line of JSON. The proxy reads lines from client stdin, parses, logs, optionally transforms, and writes to server stdin. Reverse for server stdout → client stdout.

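// Simplified core loop (imports: bufio, io, os, os/exec, sync, time)
func (p *Proxy) Run() error {
    // Spawn real MCP server
    cmd := exec.Command(p.serverCommand, p.serverArgs...)
    cmd.Stderr = os.Stderr // stdout carries protocol traffic only; server logs go to stderr
    serverIn, err := cmd.StdinPipe()
    if err != nil {
        return err
    }
    serverOut, err := cmd.StdoutPipe()
    if err != nil {
        return err
    }
    if err := cmd.Start(); err != nil {
        return err
    }

    // Both goroutines write to the client (responses and policy errors),
    // so writes to stdout go through a mutex.
    client := &lockedWriter{w: os.Stdout}

    // Client → Server (with interception)
    go func() {
        defer serverIn.Close() // client EOF: let the server see EOF too
        p.pipe(os.Stdin, serverIn, client, "client_to_server")
    }()

    // Server → Client (with interception)
    done := make(chan struct{})
    go func() {
        defer close(done)
        p.pipe(serverOut, client, client, "server_to_client")
    }()

    <-done // finish reading stdout before Wait closes the pipe
    return cmd.Wait()
}

// pipe forwards newline-delimited messages from src to dst. Policy errors
// always go to client, whichever direction is being piped.
func (p *Proxy) pipe(src io.Reader, dst, client io.Writer, direction string) {
    scanner := bufio.NewScanner(src)
    // The default 64 KiB token limit is too small for large tool results;
    // 16 MiB is an arbitrary ceiling.
    scanner.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)
    for scanner.Scan() {
        line := scanner.Bytes()

        // Parse and log
        msg := p.parseMessage(line)
        p.audit.Log(direction, msg)

        // Only client requests are policy-gated
        if direction == "client_to_server" {
            action := p.policy.Evaluate(msg)
            if action == "block" {
                // Send error response back to client, not to the server
                p.sendError(client, msg)
                continue
            }
            // Hold and wait for approval (webhook/UI); timeout means deny.
            // This stalls every later client message until resolved.
            if action == "pause" && !p.waitForApproval(msg, 60*time.Second) {
                p.sendError(client, msg)
                continue
            }
        }

        // Pass through as one write so messages never interleave
        out := append(append(make([]byte, 0, len(line)+1), line...), '\n')
        if _, err := dst.Write(out); err != nil {
            return
        }
    }
}

// lockedWriter serializes concurrent writes to the client.
type lockedWriter struct {
    mu sync.Mutex
    w  io.Writer
}

func (l *lockedWriter) Write(b []byte) (int, error) {
    l.mu.Lock()
    defer l.mu.Unlock()
    return l.w.Write(b)
}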
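
The approval wait used for paused calls could be built on a channel per pending tool call. The sketch below is one possible shape, not the project's implementation: the `Approvals` type and its method names are hypothetical. A timeout counts as a denial, matching the default-deny rule.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Approvals tracks paused tool calls that are waiting for a human decision.
type Approvals struct {
	mu      sync.Mutex
	pending map[string]chan bool
}

func NewApprovals() *Approvals {
	return &Approvals{pending: make(map[string]chan bool)}
}

// Wait blocks until Resolve is called for id or the timeout elapses.
// A timeout is treated as a denial.
func (a *Approvals) Wait(id string, timeout time.Duration) bool {
	ch := make(chan bool, 1) // buffered so Resolve never blocks
	a.mu.Lock()
	a.pending[id] = ch
	a.mu.Unlock()

	defer func() {
		a.mu.Lock()
		delete(a.pending, id)
		a.mu.Unlock()
	}()

	select {
	case approved := <-ch:
		return approved
	case <-time.After(timeout):
		return false
	}
}

// Resolve records a decision for a paused call. It returns false when the
// id is not pending (unknown, already decided, or timed out).
func (a *Approvals) Resolve(id string, approved bool) bool {
	a.mu.Lock()
	ch, ok := a.pending[id]
	delete(a.pending, id)
	a.mu.Unlock()
	if ok {
		ch <- approved
	}
	return ok
}

func main() {
	a := NewApprovals()
	go func() {
		// Stand-in for the dashboard: approve as soon as the call is pending.
		for !a.Resolve("tc-1", true) {
			time.Sleep(time.Millisecond)
		}
	}()
	fmt.Println("tc-1 approved:", a.Wait("tc-1", time.Second))
	fmt.Println("tc-2 approved:", a.Wait("tc-2", 50*time.Millisecond))
}
```

The approve and deny HTTP handlers would call `Resolve`, while the pipe goroutine blocks in `Wait`.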

Pairing Requests and Responses

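JSON-RPC uses id to correlate requests with responses. When you see a tools/call request with id: 5, store it. When the server sends a response with id: 5, pair them into a tool_call record. Ids are only unique per sender, so key the lookup by direction as well: server-initiated requests (such as sampling/createMessage) draw from their own id space. Notifications carry no id and are never paired.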
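
A minimal sketch of the pairing logic, assuming one `Pairer` per session. The type names are illustrative. The id is kept as raw JSON because JSON-RPC ids may be strings or numbers, and `5` and `"5"` are different ids.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Envelope holds the JSON-RPC fields needed for pairing.
type Envelope struct {
	ID     json.RawMessage `json:"id"`
	Method string          `json:"method"`
	Params json.RawMessage `json:"params"`
	Result json.RawMessage `json:"result"`
	Error  json.RawMessage `json:"error"`
}

// Pairer remembers in-flight tools/call requests by id. Both pipe goroutines
// feed it, so access is guarded by a mutex.
type Pairer struct {
	mu       sync.Mutex
	inflight map[string]Envelope
}

func NewPairer() *Pairer {
	return &Pairer{inflight: make(map[string]Envelope)}
}

// Observe feeds one raw message. When a server response matches a stored
// tools/call request, it returns the request, the response, and true.
func (p *Pairer) Observe(direction string, raw []byte) (Envelope, Envelope, bool) {
	var msg Envelope
	if err := json.Unmarshal(raw, &msg); err != nil || len(msg.ID) == 0 {
		return Envelope{}, Envelope{}, false // malformed message or notification
	}
	key := string(msg.ID)

	p.mu.Lock()
	defer p.mu.Unlock()
	switch {
	case direction == "client_to_server" && msg.Method == "tools/call":
		p.inflight[key] = msg
	case direction == "server_to_client" && msg.Method == "":
		if req, ok := p.inflight[key]; ok {
			delete(p.inflight, key)
			return req, msg, true
		}
	}
	return Envelope{}, Envelope{}, false
}

func main() {
	p := NewPairer()
	p.Observe("client_to_server", []byte(`{"jsonrpc":"2.0","id":5,"method":"tools/call","params":{"name":"read_file","arguments":{"path":"a.txt"}}}`))
	req, resp, ok := p.Observe("server_to_client", []byte(`{"jsonrpc":"2.0","id":5,"result":{"content":[]}}`))
	fmt.Println(ok, string(req.Params), string(resp.Result))
}
```

Only client-originated `tools/call` requests are stored, so a server-initiated request that reuses the same id cannot be mistaken for a response.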

Dashboard API Endpoints

GET  /api/sessions                    # List sessions
GET  /api/sessions/:id/tool-calls     # Tool calls for a session
GET  /api/sessions/:id/intents        # Intent chains for a session
GET  /api/tool-calls/flagged          # All flagged/paused tool calls
GET  /api/tool-calls/:id              # Single tool call detail
POST /api/tool-calls/:id/approve      # Approve a paused tool call
POST /api/tool-calls/:id/deny         # Deny a paused tool call
GET  /api/rules                       # List policy rules
POST /api/rules                       # Create rule
WS   /ws/live                         # WebSocket for real-time stream

Config Example

claude_desktop_config.json (Before)

{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_xxx" }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/otto/projects"]
    }
  }
}

claude_desktop_config.json (After — with beacon-proxy)

{
  "mcpServers": {
    "github": {
      "command": "beacon-proxy",
      "args": [
        "--server-name", "github",
        "--db", "~/.beacon/audit.db",
        "--port", "8080",
        "--", "npx", "-y", "@modelcontextprotocol/server-github"
      ],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_xxx" }
    },
    "filesystem": {
      "command": "beacon-proxy",
      "args": [
        "--server-name", "filesystem",
        "--db", "~/.beacon/audit.db",
        "--port", "8080",
        "--", "npx", "-y", "@modelcontextprotocol/server-filesystem", "/Users/otto/projects"
      ]
    }
  }
}

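All proxy instances share the same SQLite DB. Each entry launches its own beacon-proxy process, though, and only one process can listen on port 8080: the first instance to bind the port should serve the web UI, and the others should treat a bind failure as non-fatal and keep proxying. The proxy also has to expand "~" in --db itself, since the path may reach it without passing through a shell.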
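
Parsing the arguments shown in the config could be done with the standard `flag` package, which stops at `--` and leaves the wrapped command untouched. A minimal sketch: the flag names come from the config above, while the `Config` struct, the defaults, and the `~` handling are assumptions.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Config is the parsed command line (field names are illustrative).
type Config struct {
	ServerName string
	DBPath     string
	Port       int
	Command    []string // the wrapped MCP server command and its arguments
}

// ParseArgs parses beacon-proxy flags. Everything after "--" is kept as the
// command to spawn. home is used to expand a leading "~/" in --db.
func ParseArgs(args []string, home string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("beacon-proxy", flag.ContinueOnError)
	fs.StringVar(&cfg.ServerName, "server-name", "", "label stored in sessions.server_name")
	fs.StringVar(&cfg.DBPath, "db", "~/.beacon/audit.db", "path to the SQLite audit database")
	fs.IntVar(&cfg.Port, "port", 8080, "web UI port")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}

	// flag stops at "--", so flags such as npx's "-y" are not consumed here.
	cfg.Command = fs.Args()
	if len(cfg.Command) == 0 {
		return Config{}, errors.New("no server command given after --")
	}

	// The path may arrive without shell expansion, so handle "~/" here.
	if strings.HasPrefix(cfg.DBPath, "~/") {
		cfg.DBPath = filepath.Join(home, cfg.DBPath[2:])
	}
	return cfg, nil
}

func main() {
	home, _ := os.UserHomeDir()
	cfg, err := ParseArgs(os.Args[1:], home)
	if err != nil {
		fmt.Fprintln(os.Stderr, "beacon-proxy:", err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Errors go to stderr rather than stdout, because stdout is reserved for JSON-RPC traffic to the client.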


Default Policy Rules (YAML)

rules:
  - name: flag_all_writes
    description: Flag any write operation for visibility
    operation_types: [write]
    action: flag

  - name: pause_deletes
    description: Require approval for delete operations
    operation_types: [delete]
    action: pause

  - name: block_bulk_destructive
    description: Block delete operations with a risk score of 80 or higher (for example, bulk deletes that also hit another risk factor)
    operation_types: [delete]
    min_risk_score: 80
    action: block

  - name: pause_shell_execution
    description: Require approval for shell/command execution
    tool_pattern: "run_*,exec_*,execute_*"
    action: pause

  - name: flag_external_messages
    description: Flag outgoing messages (Slack, email, etc.)
    tool_pattern: "send_*,post_*"
    action: flag

  - name: pause_sql_mutations
    description: Require approval for SQL writes
    server_pattern: "postgres,mysql,sqlite"
    operation_types: [write, delete, execute]
    action: pause
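
Several of these rules can match the same call (a delete matches both pause_deletes and block_bulk_destructive), so the engine needs a tie-break. The sketch below assumes the most restrictive action wins and that patterns are comma-separated globs. Both are assumptions, as are the type and function names.

```go
package main

import (
	"fmt"
	"path"
	"slices"
	"strings"
)

// Rule mirrors the YAML fields above. Empty patterns and an empty
// OperationTypes list match everything.
type Rule struct {
	Name           string
	ToolPattern    string // comma-separated globs
	ServerPattern  string // comma-separated globs
	OperationTypes []string
	MinRiskScore   int
	Action         string
}

// Call is the subset of a tool call that rules are matched against.
type Call struct {
	Server, Tool, OpType string
	RiskScore            int
}

// severity orders actions from least to most restrictive.
var severity = map[string]int{"pass": 0, "flag": 1, "pause": 2, "block": 3}

// matchAny reports whether value matches any glob in a comma-separated list.
func matchAny(patterns, value string) bool {
	if patterns == "" {
		return true
	}
	for _, p := range strings.Split(patterns, ",") {
		if ok, _ := path.Match(strings.TrimSpace(p), value); ok {
			return true
		}
	}
	return false
}

// Evaluate returns the most restrictive action among all matching rules,
// or "pass" when nothing matches.
func Evaluate(rules []Rule, c Call) string {
	action := "pass"
	for _, r := range rules {
		if c.RiskScore < r.MinRiskScore || !matchAny(r.ToolPattern, c.Tool) || !matchAny(r.ServerPattern, c.Server) {
			continue
		}
		if len(r.OperationTypes) > 0 && !slices.Contains(r.OperationTypes, c.OpType) {
			continue
		}
		if severity[r.Action] > severity[action] {
			action = r.Action
		}
	}
	return action
}

// defaultRules transcribes the YAML rules above.
var defaultRules = []Rule{
	{Name: "flag_all_writes", OperationTypes: []string{"write"}, Action: "flag"},
	{Name: "pause_deletes", OperationTypes: []string{"delete"}, Action: "pause"},
	{Name: "block_bulk_destructive", OperationTypes: []string{"delete"}, MinRiskScore: 80, Action: "block"},
	{Name: "pause_shell_execution", ToolPattern: "run_*,exec_*,execute_*", Action: "pause"},
	{Name: "flag_external_messages", ToolPattern: "send_*,post_*", Action: "flag"},
	{Name: "pause_sql_mutations", ServerPattern: "postgres,mysql,sqlite", OperationTypes: []string{"write", "delete", "execute"}, Action: "pause"},
}

func main() {
	fmt.Println(Evaluate(defaultRules, Call{Server: "github", Tool: "delete_issue", OpType: "delete", RiskScore: 40}))
	fmt.Println(Evaluate(defaultRules, Call{Server: "github", Tool: "delete_issue", OpType: "delete", RiskScore: 90}))
}
```

With first-match-wins instead, block_bulk_destructive would never fire, because pause_deletes precedes it and matches every delete.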

MVP Milestones

Evening 1: The Interceptor

  • beacon-proxy CLI that spawns child process
  • Pipes stdin/stdout bidirectionally
  • Parses JSON-RPC, logs to SQLite
  • Verify it works with one real MCP server (filesystem is easiest)

Evening 2: Classification + Pairing

  • Pair request/response by JSON-RPC id
  • Classify operation type from tool name
  • Risk scoring
  • tool_calls table populated correctly

Evening 3: Policy Engine + Intent Grouping

  • Load rules from YAML
  • Evaluate rules against tool calls
  • Implement pause (hold message, wait for approval via HTTP endpoint)
  • Temporal intent grouping

Evening 4: Dashboard

  • Embed a simple web UI (single HTML file with htmx or vanilla JS)
  • Real-time tool call stream via WebSocket
  • Approve/deny paused actions
  • Filter by session, risk level, server

Evening 5: Polish + Demo

  • Multi-server config (GitHub + filesystem + Slack)
  • Record a demo: complex task → multiple MCP calls → risk flag → approval
  • README with clear setup instructions
  • Capture metrics: total calls, flagged, blocked, avg risk score

What Makes This Different From Runlayer et al.

| Capability | Runlayer | MintMCP | Beacon Proxy |
| --- | --- | --- | --- |
| MCP gateway/catalog | | | ❌ (not the goal) |
| IdP integration (Okta/Entra) | | | ❌ (prototype) |
| Threat detection (prompt injection) | | | ❌ (not the goal) |
| Tool-level RBAC | | | Basic (policy rules) |
| Audit trail | | | |
| Intent-to-action chain | | | |
| Cross-system action grouping | | | |
| Outcome verification | | | ✅ (planned) |
| Human-readable action narrative | | | ✅ (planned) |

The pitch: "They tell you WHO did WHAT. We tell you WHY, and whether it matched what the human actually intended."


Future: Where This Goes

  1. LLM-powered intent matching: Feed the human prompt + tool call chain to an LLM, ask "did these actions match the stated intent?" Flag mismatches.

  2. Cross-server intent chains: When intent A triggers calls across GitHub + Slack + Jira, show the full chain as one narrative.

  3. Rollback suggestions: For each write/delete, capture enough state to suggest a rollback. "Agent deleted 3 GitHub issues. Here's the API call to recreate them."

  4. Aimable integration: Sits downstream of Aimable's Layer 1. Traffic flows: Human → Aimable (PII redacted, policy checked) → Agent → Beacon Proxy (action governance, intent verification) → MCP Servers.

  5. Export to SIEM: Stream audit events to Datadog, Splunk, etc. for SOC teams.