A transparent proxy that wraps any STDIO-based MCP server, intercepts every JSON-RPC message, and builds an audit trail with intent-to-action tracing.
┌─────────────┐     stdin/stdout     ┌──────────────┐     stdin/stdout     ┌─────────────┐
│ MCP Client  │ ◄──────────────────► │ beacon-proxy │ ◄──────────────────► │ MCP Server  │
│(Claude, etc)│                      │              │                      │(GitHub, etc)│
└─────────────┘                      │ ┌──────────┐ │                      └─────────────┘
                                     │ │ Audit DB │ │
                                     │ └──────────┘ │
                                     │ ┌──────────┐ │
                                     │ │  Policy  │ │
                                     │ │  Engine  │ │
                                     │ └──────────┘ │
                                     │ ┌──────────┐ │
                                     │ │  Web UI  │ │
                                     │ │  :8080   │ │
                                     │ └──────────┘ │
                                     └──────────────┘
- Client config points `command` at `beacon-proxy` instead of the real server binary
- `beacon-proxy` spawns the real MCP server as a child process
- Every JSON-RPC message (both directions) passes through the proxy
- Proxy logs, classifies, risk-scores, and optionally pauses for approval
- Web UI shows real-time audit trail at localhost:8080
-- A session is one proxy lifetime (client connects, does stuff, disconnects)
CREATE TABLE sessions (
id TEXT PRIMARY KEY, -- uuid
server_name TEXT NOT NULL, -- e.g. "github", "postgres", from config
server_command TEXT NOT NULL, -- the actual command being wrapped
started_at TIMESTAMP NOT NULL,
ended_at TIMESTAMP,
client_info JSON -- MCP initialize result (client name, version)
);
-- Every JSON-RPC message in both directions
CREATE TABLE messages (
id TEXT PRIMARY KEY, -- uuid
session_id TEXT NOT NULL REFERENCES sessions(id),
direction TEXT NOT NULL, -- 'client_to_server' | 'server_to_client'
timestamp TIMESTAMP NOT NULL,
-- JSON-RPC fields
jsonrpc_id TEXT, -- request/response correlation (can be null for notifications)
method TEXT, -- e.g. 'tools/call', 'tools/list', 'initialize'
params JSON, -- request params
result JSON, -- response result
error JSON, -- response error if any
-- Raw payload for forensics
raw TEXT NOT NULL
);
-- Higher-level: a tool invocation (request + response paired)
CREATE TABLE tool_calls (
id TEXT PRIMARY KEY, -- uuid
session_id TEXT NOT NULL REFERENCES sessions(id),
request_msg_id TEXT NOT NULL REFERENCES messages(id),
response_msg_id TEXT REFERENCES messages(id), -- null until response arrives
-- Tool details
tool_name TEXT NOT NULL, -- e.g. 'read_file', 'create_issue', 'send_message'
arguments JSON NOT NULL, -- what was passed to the tool
result JSON, -- what came back
error JSON, -- error if failed
-- Classification
operation_type TEXT NOT NULL, -- 'read' | 'write' | 'delete' | 'execute' | 'unknown'
risk_score INTEGER NOT NULL DEFAULT 0, -- 0-100
risk_reasons JSON, -- ["bulk_write", "destructive", "sensitive_data"]
-- Timing
requested_at TIMESTAMP NOT NULL,
responded_at TIMESTAMP,
duration_ms INTEGER,
-- Policy
policy_action TEXT NOT NULL DEFAULT 'pass', -- 'pass' | 'flag' | 'pause' | 'block'
approved_by TEXT, -- if paused, who approved
approved_at TIMESTAMP
);
-- Intent context: what human instruction triggered this chain of tool calls
-- This is the novel part: neither Runlayer nor MintMCP captures this
CREATE TABLE intent_contexts (
id TEXT PRIMARY KEY, -- uuid
session_id TEXT NOT NULL REFERENCES sessions(id),
-- The human's instruction (if capturable)
human_prompt TEXT, -- the original instruction
prompt_summary TEXT, -- AI-generated summary of intent
-- The tool calls that resulted from this intent are linked via intent_tool_calls
created_at TIMESTAMP NOT NULL
);
-- Many-to-many: which tool calls belong to which intent
CREATE TABLE intent_tool_calls (
intent_id TEXT NOT NULL REFERENCES intent_contexts(id),
tool_call_id TEXT NOT NULL REFERENCES tool_calls(id),
sequence_order INTEGER NOT NULL, -- order within the intent chain
PRIMARY KEY (intent_id, tool_call_id)
);
-- Policy rules
CREATE TABLE policy_rules (
id TEXT PRIMARY KEY,
name TEXT NOT NULL, -- "block_bulk_deletes"
description TEXT,
-- Matching
tool_pattern TEXT, -- glob: "delete_*", "*", "sql_query"
server_pattern TEXT, -- glob: "postgres", "*"
-- Conditions
min_risk_score INTEGER DEFAULT 0,
operation_types TEXT, -- comma-separated: "delete,write"
-- Action
action TEXT NOT NULL, -- 'flag' | 'pause' | 'block'
enabled BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMP NOT NULL
);

CREATE INDEX idx_messages_session ON messages(session_id, timestamp);
CREATE INDEX idx_tool_calls_session ON tool_calls(session_id, requested_at);
CREATE INDEX idx_tool_calls_risk ON tool_calls(risk_score DESC);
CREATE INDEX idx_tool_calls_policy ON tool_calls(policy_action) WHERE policy_action != 'pass';
CREATE INDEX idx_intent_tool_calls_intent ON intent_tool_calls(intent_id);

Classify by tool name pattern matching. Start simple, refine with real data:
READ patterns:
- get_*, read_*, list_*, search_*, describe_*, show_*
- sql queries containing only SELECT
WRITE patterns:
- create_*, update_*, set_*, add_*, put_*, edit_*, modify_*
- sql queries containing INSERT, UPDATE
DELETE patterns:
- delete_*, remove_*, drop_*, destroy_*, purge_*
- sql queries containing DELETE, DROP, TRUNCATE
EXECUTE patterns:
- run_*, exec_*, invoke_*, call_*, trigger_*
- anything involving shell/command execution
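
The pattern lists above can be sketched as a small prefix classifier. This is a minimal sketch, not the project's actual `classifier.go`: the names `ClassifyTool` and `ClassifySQL` are hypothetical, and checking delete and execute before write and read is an assumption so that the more dangerous category wins if the lists ever overlap. The SQL check is keyword-based and will misfire on keywords inside string literals.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixRules is checked in order; the first matching prefix wins.
// The ordering (most dangerous first) is an assumption of this sketch.
var prefixRules = []struct {
	opType   string
	prefixes []string
}{
	{"delete", []string{"delete_", "remove_", "drop_", "destroy_", "purge_"}},
	{"execute", []string{"run_", "exec_", "invoke_", "call_", "trigger_"}},
	{"write", []string{"create_", "update_", "set_", "add_", "put_", "edit_", "modify_"}},
	{"read", []string{"get_", "read_", "list_", "search_", "describe_", "show_"}},
}

// ClassifyTool maps a tool name to an operation type using the prefix lists.
func ClassifyTool(toolName string) string {
	name := strings.ToLower(toolName)
	for _, rule := range prefixRules {
		for _, p := range rule.prefixes {
			if strings.HasPrefix(name, p) {
				return rule.opType
			}
		}
	}
	return "unknown"
}

// ClassifySQL classifies a SQL statement by the keywords it contains.
// It splits the query into bare words, so an identifier such as
// update_log does not count as the UPDATE keyword.
func ClassifySQL(query string) string {
	words := strings.FieldsFunc(strings.ToUpper(query), func(r rune) bool {
		return !(r >= 'A' && r <= 'Z') && r != '_'
	})
	has := map[string]bool{}
	for _, w := range words {
		has[w] = true
	}
	switch {
	case has["DELETE"] || has["DROP"] || has["TRUNCATE"]:
		return "delete"
	case has["INSERT"] || has["UPDATE"]:
		return "write"
	case has["SELECT"]:
		return "read"
	}
	return "unknown"
}

func main() {
	for _, name := range []string{"read_file", "create_issue", "delete_branch", "run_script", "send_message"} {
		fmt.Printf("%-14s %s\n", name, ClassifyTool(name))
	}
	fmt.Println(ClassifySQL("SELECT * FROM users"))
	fmt.Println(ClassifySQL("DELETE FROM users"))
}
```

Tool names that match no prefix, such as `send_message`, fall through to `unknown`, which is where refinement with real data would start.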
Start with additive scoring:
| Factor | Points |
|---|---|
| Base: read operation | 0 |
| Base: write operation | 20 |
| Base: delete operation | 40 |
| Base: execute operation | 30 |
| Bulk operation (>10 records) | +20 |
| Touches auth/credentials | +30 |
| SQL with no WHERE clause | +30 |
| Modifies config/settings | +20 |
| Sends external message | +15 |
| First time seeing this tool | +10 |

Map the total score (0-100) to a default action:

| Risk Score | Default Action |
|---|---|
| 0-30 | pass (log only) |
| 31-60 | flag (log + highlight in UI) |
| 61-80 | pause (hold for human approval, 60s timeout, default deny) |
| 81-100 | block (reject, return error to client) |
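
The two tables above can be sketched as one scoring function plus a threshold lookup. This is an illustrative sketch under stated assumptions: the `Factors` struct, the reason strings, and the clamp to 100 are hypothetical (the tables give the points and ranges, but additive factors can exceed 100 and the source does not say how that is handled). How each factor is detected is out of scope here.

```go
package main

import "fmt"

// Factors holds the signals from the scoring table (hypothetical type).
type Factors struct {
	OperationType   string // "read" | "write" | "delete" | "execute" | "unknown"
	Bulk            bool   // more than 10 records
	TouchesAuth     bool
	SQLNoWhere      bool
	ModifiesConfig  bool
	ExternalMessage bool
	FirstTimeTool   bool
}

var basePoints = map[string]int{"read": 0, "write": 20, "delete": 40, "execute": 30}

// Score adds the base points and every factor that applies.
func Score(f Factors) (int, []string) {
	score := basePoints[f.OperationType] // assumption: unknown types start at 0
	var reasons []string
	add := func(applies bool, points int, reason string) {
		if applies {
			score += points
			reasons = append(reasons, reason)
		}
	}
	add(f.Bulk, 20, "bulk_operation")
	add(f.TouchesAuth, 30, "touches_credentials")
	add(f.SQLNoWhere, 30, "sql_no_where")
	add(f.ModifiesConfig, 20, "modifies_config")
	add(f.ExternalMessage, 15, "external_message")
	add(f.FirstTimeTool, 10, "first_seen_tool")
	if score > 100 {
		score = 100 // assumption: clamp to the 0-100 range of risk_score
	}
	return score, reasons
}

// DefaultAction maps a score to the default action from the second table.
func DefaultAction(score int) string {
	switch {
	case score <= 30:
		return "pass"
	case score <= 60:
		return "flag"
	case score <= 80:
		return "pause"
	default:
		return "block"
	}
}

func main() {
	score, reasons := Score(Factors{OperationType: "delete", Bulk: true, SQLNoWhere: true})
	fmt.Println(score, reasons, DefaultAction(score))
}
```

In this sketch a bulk SQL delete with no WHERE clause scores 40 + 20 + 30 = 90 and is blocked, while a plain write scores 20 and passes.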
This is the hard/novel part. How do you know what the human asked for?
Approach 1 (client-supplied context): Some MCP clients may expose the conversation context; Claude Desktop's MCP calls may include sampling context. If available, extract the most recent human message as the intent.
Approach 2 (temporal grouping): Group tool calls by time proximity. If an agent makes 5 tool calls within 2 seconds, they likely stem from the same intent. Create an intent_contexts row for each burst:
Tool calls at T+0.0s, T+0.3s, T+0.8s, T+1.2s → Intent A
[gap > 5 seconds]
Tool calls at T+8.0s, T+8.5s → Intent B
The human_prompt field stays null, but you still get the chain of actions grouped together.
Approach 3 (sampling interception): The MCP spec includes a sampling/createMessage flow in which the server asks the client to generate text. If the client supports it, the sampling request may carry conversation context. Intercept these requests to extract intent.
Start with Approach 2. It works today with zero client changes.
beacon-proxy/
├── cmd/
│   └── beacon-proxy/
│       └── main.go               # CLI entry point
├── internal/
│   ├── proxy/
│   │   ├── proxy.go              # Core: spawn child, pipe stdin/stdout
│   │   └── jsonrpc.go            # Parse JSON-RPC messages from stream
│   ├── audit/
│   │   ├── store.go              # SQLite audit log storage
│   │   ├── classifier.go         # Operation type + risk scoring
│   │   └── intent.go             # Temporal grouping for intent chains
│   ├── policy/
│   │   ├── engine.go             # Rule evaluation
│   │   └── rules.go              # Default rules + custom rule loading
│   └── web/
│       ├── server.go             # HTTP server for dashboard
│       ├── handlers.go           # API endpoints
│       └── static/               # Dashboard HTML/JS (embed)
├── configs/
│   ├── default_rules.yaml        # Sensible default policies
│   └── example_claude.json       # Example claude_desktop_config.json
├── go.mod
├── go.sum
└── README.md
MCP over STDIO uses newline-delimited JSON-RPC. Each message is a single line of JSON. The proxy reads lines from client stdin, parses, logs, optionally transforms, and writes to server stdin. Reverse for server stdout → client stdout.
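// Simplified core loop
import (
	"bufio"
	"io"
	"os"
	"os/exec"
	"time"
)

func (p *Proxy) Run() error {
	// Spawn the real MCP server as a child process
	cmd := exec.Command(p.serverCommand, p.serverArgs...)
	cmd.Stderr = os.Stderr // pass server logs through instead of discarding them
	serverIn, err := cmd.StdinPipe()
	if err != nil {
		return err
	}
	serverOut, err := cmd.StdoutPipe()
	if err != nil {
		return err
	}
	if err := cmd.Start(); err != nil {
		return err
	}

	// Client → Server (with interception)
	go func() {
		p.pipe(os.Stdin, serverIn, "client_to_server")
		serverIn.Close() // client disconnected: send EOF so the server can exit
	}()

	// Server → Client (with interception). Finish reading before calling
	// Wait, because Wait closes the stdout pipe once the child exits.
	p.pipe(serverOut, os.Stdout, "server_to_client")
	return cmd.Wait()
}

func (p *Proxy) pipe(src io.Reader, dst io.Writer, direction string) {
	scanner := bufio.NewScanner(src)
	// Tool results can exceed Scanner's default 64 KiB line limit
	scanner.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)
	for scanner.Scan() {
		line := scanner.Bytes()

		// Parse, log, classify, check policy
		msg := p.parseMessage(line)
		p.audit.Log(direction, msg)
		action := p.policy.Evaluate(msg)

		if action == "block" {
			// Send the error response to the client (os.Stdout), never to dst:
			// for client_to_server traffic, dst is the server's stdin.
			// sendError must serialize its writes with the server_to_client pipe.
			p.sendError(os.Stdout, msg)
			continue
		}
		if action == "pause" {
			// Hold and wait for approval (webhook/UI); a timeout counts as a denial
			if !p.waitForApproval(msg, 60*time.Second) {
				p.sendError(os.Stdout, msg)
				continue
			}
		}

		// Pass through
		dst.Write(line)
		dst.Write([]byte("\n"))
	}
}

JSON-RPC uses id to correlate requests with responses. When you see a tools/call request with id: 5, store it. When the response with id: 5 arrives, pair the two into a tool_calls record.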
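
The request/response pairing can be sketched as a small in-memory correlator. This is a sketch under assumptions, not the project's `jsonrpc.go`: the `Correlator`, `ToolCall`, and `Observe` names are hypothetical, pending requests are keyed by the raw JSON text of `id` (so the number `5` and the string `"5"` stay distinct), and a mutex is used because the two pipe goroutines would call it concurrently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Message is the subset of JSON-RPC 2.0 fields the proxy needs.
type Message struct {
	ID     json.RawMessage `json:"id,omitempty"`
	Method string          `json:"method,omitempty"`
	Params json.RawMessage `json:"params,omitempty"`
	Result json.RawMessage `json:"result,omitempty"`
	Error  json.RawMessage `json:"error,omitempty"`
}

// ToolCall is a paired request and response (hypothetical type).
type ToolCall struct {
	ToolName  string
	Arguments json.RawMessage
	Result    json.RawMessage
	Error     json.RawMessage
}

// Correlator remembers tools/call requests until their response arrives.
type Correlator struct {
	mu      sync.Mutex
	pending map[string]*ToolCall // keyed by the raw JSON id
}

func NewCorrelator() *Correlator {
	return &Correlator{pending: map[string]*ToolCall{}}
}

// Observe inspects one message. It returns a completed ToolCall when a
// response matches a stored tools/call request, and nil otherwise.
func (c *Correlator) Observe(direction string, line []byte) (*ToolCall, error) {
	var msg Message
	if err := json.Unmarshal(line, &msg); err != nil {
		return nil, err
	}
	key := string(msg.ID)
	if key == "" || key == "null" {
		return nil, nil // notification: nothing to correlate
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	switch {
	case direction == "client_to_server" && msg.Method == "tools/call":
		var p struct {
			Name      string          `json:"name"`
			Arguments json.RawMessage `json:"arguments"`
		}
		if err := json.Unmarshal(msg.Params, &p); err != nil {
			return nil, err
		}
		c.pending[key] = &ToolCall{ToolName: p.Name, Arguments: p.Arguments}
	case direction == "server_to_client" && msg.Method == "":
		call, ok := c.pending[key]
		if !ok {
			return nil, nil // response to something other than tools/call
		}
		delete(c.pending, key)
		call.Result, call.Error = msg.Result, msg.Error
		return call, nil
	}
	return nil, nil
}

func main() {
	c := NewCorrelator()
	c.Observe("client_to_server", []byte(`{"jsonrpc":"2.0","id":5,"method":"tools/call","params":{"name":"read_file","arguments":{"path":"/tmp/a"}}}`))
	call, _ := c.Observe("server_to_client", []byte(`{"jsonrpc":"2.0","id":5,"result":{"content":[]}}`))
	fmt.Printf("%s(%s) -> %s\n", call.ToolName, call.Arguments, call.Result)
}
```

A real implementation would also need to expire pending entries whose response never arrives, for example when the server exits mid-call.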
GET /api/sessions # List sessions
GET /api/sessions/:id/tool-calls # Tool calls for a session
GET /api/sessions/:id/intents # Intent chains for a session
GET /api/tool-calls/flagged # All flagged/paused tool calls
GET /api/tool-calls/:id # Single tool call detail
POST /api/tool-calls/:id/approve # Approve a paused tool call
POST /api/tool-calls/:id/deny # Deny a paused tool call
GET /api/rules # List policy rules
POST /api/rules # Create rule
WS /ws/live # WebSocket for real-time stream
{
"mcpServers": {
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": { "GITHUB_TOKEN": "ghp_xxx" }
},
"filesystem": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/otto/projects"]
}
}
}

{
"mcpServers": {
"github": {
"command": "beacon-proxy",
"args": [
"--server-name", "github",
"--db", "~/.beacon/audit.db",
"--port", "8080",
"--", "npx", "-y", "@modelcontextprotocol/server-github"
],
"env": { "GITHUB_TOKEN": "ghp_xxx" }
},
"filesystem": {
"command": "beacon-proxy",
"args": [
"--server-name", "filesystem",
"--db", "~/.beacon/audit.db",
"--port", "8080",
"--", "npx", "-y", "@modelcontextprotocol/server-filesystem", "/Users/otto/projects"
]
}
}
}

All proxy instances share the same SQLite DB and web UI port.
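
The argument layout in the wrapped config (proxy flags, then `--`, then the real server command) maps directly onto Go's standard `flag` package, which stops parsing at `--` and returns everything after it. The sketch below is illustrative: `Config` and `ParseArgs` are hypothetical names, and the default port is an assumption taken from the example. Note that if the client launches the command without a shell, the `~` in `--db` is passed literally, so the proxy would have to expand it itself.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// Config is the parsed command line (hypothetical type).
type Config struct {
	ServerName string
	DBPath     string
	Port       int
	Command    string   // the real MCP server binary
	Args       []string // arguments for the real MCP server
}

// ParseArgs parses proxy flags and treats everything after "--" as the
// wrapped server command.
func ParseArgs(args []string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("beacon-proxy", flag.ContinueOnError)
	fs.StringVar(&cfg.ServerName, "server-name", "", "logical name of the wrapped server")
	fs.StringVar(&cfg.DBPath, "db", "", "path to the SQLite audit database")
	fs.IntVar(&cfg.Port, "port", 8080, "web UI port")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	rest := fs.Args() // everything after the "--" terminator
	if len(rest) == 0 {
		return Config{}, errors.New("missing server command after --")
	}
	cfg.Command, cfg.Args = rest[0], rest[1:]
	return cfg, nil
}

func main() {
	cfg, err := ParseArgs([]string{
		"--server-name", "github",
		"--db", "~/.beacon/audit.db",
		"--port", "8080",
		"--", "npx", "-y", "@modelcontextprotocol/server-github",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s wraps %s %v\n", cfg.ServerName, cfg.Command, cfg.Args)
}
```

Because parsing stops at `--`, flags meant for the wrapped server (such as `-y` for `npx`) are never mistaken for proxy flags.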
rules:
  - name: flag_all_writes
    description: Flag any write operation for visibility
    operation_types: [write]
    action: flag
  - name: pause_deletes
    description: Require approval for delete operations
    operation_types: [delete]
    action: pause
  - name: block_bulk_destructive
    description: Block bulk delete operations
    operation_types: [delete]
    min_risk_score: 80
    action: block
  - name: pause_shell_execution
    description: Require approval for shell/command execution
    tool_pattern: "run_*,exec_*,execute_*"
    action: pause
  - name: flag_external_messages
    description: Flag outgoing messages (Slack, email, etc.)
    tool_pattern: "send_*,post_*"
    action: flag
  - name: pause_sql_mutations
    description: Require approval for SQL writes
    server_pattern: "postgres,mysql,sqlite"
    operation_types: [write, delete, execute]
    action: pause
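
Rule evaluation against these fields can be sketched as follows. This is a minimal sketch, not the project's `engine.go`: the `Rule`, `Call`, and `Evaluate` names are hypothetical, comma-separated patterns are matched with the standard library's `path.Match`, an empty field is assumed to match anything, and when several rules match the most severe action is assumed to win (the source does not specify precedence).

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Rule mirrors the fields used in the YAML rules (hypothetical type).
type Rule struct {
	Name           string
	ToolPattern    string   // comma-separated globs; empty matches any tool
	ServerPattern  string   // comma-separated globs; empty matches any server
	OperationTypes []string // empty matches any operation type
	MinRiskScore   int
	Action         string // "flag" | "pause" | "block"
}

// Call is the classified tool call being evaluated (hypothetical type).
type Call struct {
	Tool, Server, OperationType string
	RiskScore                   int
}

// matchAny reports whether name matches any glob in a comma-separated list.
func matchAny(patterns, name string) bool {
	if patterns == "" {
		return true
	}
	for _, p := range strings.Split(patterns, ",") {
		if ok, err := path.Match(strings.TrimSpace(p), name); err == nil && ok {
			return true
		}
	}
	return false
}

// Matches reports whether every condition on the rule holds for the call.
func (r Rule) Matches(c Call) bool {
	if !matchAny(r.ToolPattern, c.Tool) || !matchAny(r.ServerPattern, c.Server) {
		return false
	}
	if c.RiskScore < r.MinRiskScore {
		return false
	}
	if len(r.OperationTypes) == 0 {
		return true
	}
	for _, t := range r.OperationTypes {
		if t == c.OperationType {
			return true
		}
	}
	return false
}

var severity = map[string]int{"pass": 0, "flag": 1, "pause": 2, "block": 3}

// Evaluate returns the most severe action among all matching rules,
// or "pass" when no rule matches (precedence is an assumption).
func Evaluate(rules []Rule, c Call) string {
	action := "pass"
	for _, r := range rules {
		if r.Matches(c) && severity[r.Action] > severity[action] {
			action = r.Action
		}
	}
	return action
}

func main() {
	rules := []Rule{
		{Name: "flag_all_writes", OperationTypes: []string{"write"}, Action: "flag"},
		{Name: "pause_deletes", OperationTypes: []string{"delete"}, Action: "pause"},
		{Name: "block_bulk_destructive", OperationTypes: []string{"delete"}, MinRiskScore: 80, Action: "block"},
		{Name: "pause_shell_execution", ToolPattern: "run_*,exec_*,execute_*", Action: "pause"},
	}
	fmt.Println(Evaluate(rules, Call{Tool: "delete_issue", Server: "github", OperationType: "delete", RiskScore: 90}))
	fmt.Println(Evaluate(rules, Call{Tool: "read_file", Server: "filesystem", OperationType: "read"}))
}
```

With "most severe wins", a delete scoring 90 matches both pause_deletes and block_bulk_destructive and is blocked, while a delete scoring 40 is only paused.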
- `beacon-proxy` CLI that spawns child process
- Pipes stdin/stdout bidirectionally
- Parses JSON-RPC, logs to SQLite
- Verify it works with one real MCP server (filesystem is easiest)
- Pair request/response by JSON-RPC id
- Classify operation type from tool name
- Risk scoring
- `tool_calls` table populated correctly
- Load rules from YAML
- Evaluate rules against tool calls
- Implement pause (hold message, wait for approval via HTTP endpoint)
- Temporal intent grouping
- Embed a simple web UI (single HTML file with htmx or vanilla JS)
- Real-time tool call stream via WebSocket
- Approve/deny paused actions
- Filter by session, risk level, server
- Multi-server config (GitHub + filesystem + Slack)
- Record a demo: complex task → multiple MCP calls → risk flag → approval
- README with clear setup instructions
- Capture metrics: total calls, flagged, blocked, avg risk score
| Capability | Runlayer | MintMCP | Beacon Proxy |
|---|---|---|---|
| MCP gateway/catalog | ✅ | ✅ | ❌ (not the goal) |
| IdP integration (Okta/Entra) | ✅ | ✅ | ❌ (prototype) |
| Threat detection (prompt injection) | ✅ | ❌ | ❌ (not the goal) |
| Tool-level RBAC | ✅ | ✅ | Basic (policy rules) |
| Audit trail | ✅ | ✅ | ✅ |
| Intent-to-action chain | ❌ | ❌ | ✅ |
| Cross-system action grouping | ❌ | ❌ | ✅ |
| Outcome verification | ❌ | ❌ | ✅ (planned) |
| Human-readable action narrative | ❌ | ❌ | ✅ (planned) |
The pitch: "They tell you WHO did WHAT. We tell you WHY, and whether it matched what the human actually intended."
- LLM-powered intent matching: Feed the human prompt + tool call chain to an LLM, ask "did these actions match the stated intent?" Flag mismatches.
- Cross-server intent chains: When intent A triggers calls across GitHub + Slack + Jira, show the full chain as one narrative.
- Rollback suggestions: For each write/delete, capture enough state to suggest a rollback. "Agent deleted 3 GitHub issues. Here's the API call to recreate them."
- Aimable integration: Sits downstream of Aimable's Layer 1. Traffic flows: Human → Aimable (PII redacted, policy checked) → Agent → Beacon Proxy (action governance, intent verification) → MCP Servers.
- Export to SIEM: Stream audit events to Datadog, Splunk, etc. for SOC teams.