Architecture
The MATLAB MCP Server is a Python-based bridge that connects AI agents (Claude, Cursor, etc.) to MATLAB via the Model Context Protocol. It provides elastic engine pooling, session isolation, security validation, and hybrid sync/async execution.
graph TB
Agent["AI Agent<br/>(Claude, Cursor, etc.)"]
Agent -->|MCP Protocol<br/>stdio or HTTP| Server["MCP Server<br/>(FastMCP 3.2.0)"]
Server -->|Tool Calls| Tools["20+ Built-in Tools<br/>+ Custom Tools"]
Tools -->|Job Creation| JobExec["Job Executor<br/>(Sync/Async Promotion)"]
JobExec -->|Engine Acquire| PoolMgr["Engine Pool Manager<br/>(Elastic Scaling)"]
PoolMgr -->|Engine Lifecycle| Engines["MATLAB Engine Pool<br/>(min_engines → max_engines)"]
Tools -->|Workspace Queries| EngineAPI["MATLAB Engine API"]
Tools -->|Code Security| SecVal["Security Validator<br/>(Blocked Functions,<br/>Path Traversal)"]
JobExec -->|Session Mgmt| Sessions["Session Manager<br/>(Per-user Isolation)"]
Tools -->|File I/O| FileOps["File Operations<br/>(upload, delete, read)"]
FileOps -->|Temp Dir| Sessions
Tools -->|Plotting| Converter["Plotly Converter<br/>(MATLAB → Interactive JSON)"]
Tools -->|Metrics| Monitor["Monitoring Dashboard<br/>(HTTP UI)"]
Monitor -->|Query| Collector["Metrics Collector<br/>(Events, Percentiles)"]
Collector -->|Persist| Store["SQLite Store"]
style Server fill:#4A90E2
style Tools fill:#7CB342
style JobExec fill:#FB8C00
style PoolMgr fill:#E53935
style Engines fill:#8E24AA
FastMCP-based server that:
- Registers 20 built-in tools + custom MATLAB functions as MCP tools
- Manages tool lifespan (startup, shutdown, graceful drain)
- Routes incoming requests to tool implementations
- Handles session allocation and cleanup
- Supports three transports:
  - stdio (single-user, default)
  - SSE (multi-user, deprecated)
  - streamable HTTP (multi-user, recommended)
Key design decision: Bearer token auth via ASGI middleware (Phase 2) validates all HTTP requests before they reach the MCP layer.
Manages MATLAB engine instances with elastic scaling:
graph LR
Req["Tool Request"]
Req --> Acquire["Acquire Engine<br/>from Pool"]
Acquire --> Available{"Engine<br/>Available?"}
Available -->|Yes| Run["Execute<br/>Immediately"]
Available -->|No| Check{"Count<br/>< max?"}
Check -->|Yes| Scale["Scale Up:<br/>Start New Engine"]
Check -->|No| Queue["Enqueue Request<br/>Wait for Release"]
Scale --> Run
Queue --> Run
Run --> Release["Release to Pool"]
Release --> ScaleDown{"Idle >15min<br/>& Count > min?"}
ScaleDown -->|Yes| Stop["Stop Engine"]
ScaleDown -->|No| Ready["Return to Available"]
style Acquire fill:#FB8C00
style Scale fill:#E53935
style Queue fill:#FFA726
style Stop fill:#C62828
- Minimum engines: Always running (kept warm for quick response)
- Proactive warmup: When utilization > 80%, starts the next engine before it's needed
- On-demand scaling: Creates engines up to `max_engines` when all are busy
- Scale-down: Stops engines that have been idle > 15 minutes, down to the minimum
- Health checks: `1+1` eval every 60 seconds; unhealthy engines are replaced
- Queueing: When full, requests wait in an async queue
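The scaling rules above can be read as three small decisions. The following is a minimal sketch of that logic only, not the server's pool manager; `PoolState` and the function names are hypothetical, while the 80% and 15-minute thresholds come from the list above.

```python
from dataclasses import dataclass


@dataclass
class PoolState:
    """Hypothetical snapshot of the pool; not the server's actual class."""
    idle: int          # engines started and not running a job
    busy: int          # engines currently running a job
    min_engines: int
    max_engines: int

    @property
    def total(self) -> int:
        return self.idle + self.busy


def decide_on_acquire(state: PoolState) -> str:
    """Return 'run', 'scale_up', or 'queue' for an incoming request."""
    if state.idle > 0:
        return "run"
    if state.total < state.max_engines:
        return "scale_up"
    return "queue"


def should_warm_up(state: PoolState, threshold: float = 0.8) -> bool:
    """Start the next engine early once utilization passes the threshold."""
    if state.total == 0 or state.total >= state.max_engines:
        return False
    return state.busy / state.total > threshold


def should_stop(idle_seconds: float, state: PoolState,
                idle_limit: float = 15 * 60) -> bool:
    """Stop an engine idle past the limit, but never go below the minimum."""
    return idle_seconds > idle_limit and state.total > state.min_engines
```

Keeping the decisions as pure functions of a snapshot makes the thresholds easy to unit-test without starting a MATLAB engine.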
Orchestrates the complete execution lifecycle with hybrid sync/async promotion:
sequenceDiagram
participant Agent
participant Server
participant Executor
participant Pool
participant Engine
participant Store
Agent->>Server: execute_code("x = magic(3)")
Server->>Executor: create_and_run_job()
Executor->>Executor: security_check() ✓
Executor->>Pool: acquire_engine()
Pool->>Engine: Engine available?
Engine-->>Pool: Yes
Pool-->>Executor: engine handle
Executor->>Engine: inject_job_context(__mcp_job_id__)
Executor->>Engine: eval(code, sync=True, timeout=30s)
Engine->>Engine: code executes
alt Completes < 30s (Sync Path)
Engine-->>Executor: result, output, vars
Executor->>Store: mark_completed(job_id)
Executor->>Pool: release_engine()
Executor-->>Server: return result immediately
Server-->>Agent: {output, variables, figures}
else Exceeds 30s (Async Path)
Engine->>Engine: (background execution)
Executor-->>Executor: mark_running()
Executor-->>Server: return {job_id, status: "running"}
Server-->>Agent: {job_id: "abc123"}
Agent->>Server: get_job_status("abc123")
Server-->>Agent: {status: "running", progress: 45%}
Engine->>Engine: (completes in background)
Executor->>Store: mark_completed(job_id)
Executor->>Pool: release_engine()
Agent->>Server: get_job_result("abc123")
Server-->>Agent: {output, variables, figures}
end
Key design decisions:
- Sync timeout: 30 seconds for user-facing code (configurable)
- Promotion trigger: If execution exceeds timeout, move to background job
- Context injection: Every job gets unique ID and temp directory in MATLAB workspace
- Progress reporting: Jobs can call the `mcp_progress()` MATLAB helper to report percentage/message
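One way to picture sync-to-async promotion is with `asyncio.wait`, which, unlike `asyncio.wait_for`, leaves the task running when the timeout expires. This is a sketch under that assumption, not the executor's actual code; `run_with_promotion` and the in-memory `JOBS` registry are hypothetical names.

```python
import asyncio
import uuid

# Hypothetical in-memory registry of promoted jobs (job_id -> task).
JOBS: dict = {}


async def run_with_promotion(work, sync_timeout: float = 30.0) -> dict:
    """Await `work` for up to sync_timeout seconds, then hand back a job id."""
    job_id = uuid.uuid4().hex[:8]
    task = asyncio.create_task(work)
    # asyncio.wait does not cancel the task on timeout, so a slow job
    # simply keeps running in the background.
    done, _pending = await asyncio.wait({task}, timeout=sync_timeout)
    if task in done:
        return {"status": "completed", "job_id": job_id, "result": task.result()}
    JOBS[job_id] = task  # promoted: the caller polls for the result later
    return {"status": "running", "job_id": job_id}


async def get_job_result(job_id: str):
    """Wait for a promoted job and return its result."""
    return await JOBS.pop(job_id)
```

A real executor would also persist job state and release the engine once the background task completes; those steps are left out here.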
Isolates workspace state between agents:
| Aspect | stdio | streamable HTTP |
|---|---|---|
| Sessions | Single "default" | Per-client (via `ctx.session_id`) |
| Temp Dir | `/tmp/matlab_mcp/` | `/tmp/matlab_mcp/s_<session_id>/` |
| Workspace | Reused across tools | Cleared between jobs (configurable) |
| Duration | Server lifetime | `session_timeout` (15 min idle) |
Cleanup: Expired sessions are deleted automatically. Files in a session's temp directory are accessible via the read_script, read_data, and read_image tools.
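The table's temp-directory layout and idle timeout could be modeled roughly as follows. This is an illustrative sketch: `SessionRegistry` and its methods are hypothetical, the injectable clock exists only to make the expiry rule testable, and deleting the files of an expired session is left to the caller.

```python
import tempfile
import time
from pathlib import Path


class SessionRegistry:
    """Hypothetical tracker for per-session temp dirs and idle expiry."""

    def __init__(self, session_timeout: float = 15 * 60, clock=time.monotonic):
        self.session_timeout = session_timeout
        self._clock = clock
        self._last_seen = {}  # session_id -> last activity timestamp

    def temp_dir(self, session_id: str) -> Path:
        """Return <tmp>/matlab_mcp for stdio, <tmp>/matlab_mcp/s_<id> otherwise."""
        base = Path(tempfile.gettempdir()) / "matlab_mcp"
        return base if session_id == "default" else base / f"s_{session_id}"

    def touch(self, session_id: str) -> None:
        """Record activity so the session is not treated as idle."""
        self._last_seen[session_id] = self._clock()

    def expire_idle(self) -> list:
        """Forget and return the sessions idle longer than session_timeout."""
        now = self._clock()
        expired = [sid for sid, seen in self._last_seen.items()
                   if now - seen > self.session_timeout]
        for sid in expired:
            del self._last_seen[sid]
        return expired
```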
Pre-execution code validation:
graph TD
Code["User Code"] -->|Input| Strip["Strip Comments<br/>& Strings"]
Strip -->|Cleaned| Check["Check for<br/>Blocked Functions"]
Check -->|Found| Block["Raise<br/>BlockedFunctionError"]
Check -->|Not Found| OK["✓ Allow Execution"]
Block --> Agent["Return Error<br/>to Agent"]
OK --> Engine["Send to MATLAB"]
style Check fill:#E53935
style Block fill:#C62828
style OK fill:#7CB342
Blocked by default: system, unix, dos, !, eval, feval, evalc, evalin, assignin, perl, python, shell escapes
Smart detection: Strips string literals and comments before scanning to avoid false positives (e.g., code = "system('ls')" is safe)
Custom blocklist: Can be extended via config.yaml or disabled entirely (with explicit acknowledgment)
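To make the strip-then-scan idea concrete, here is a simplified sketch. It is not the server's validator: the quote handling is a heuristic (a `'` directly after an identifier, closing bracket, or `.` is treated as transpose), block comments (`%{ ... %}`) are not handled, and `check_code` and `strip_strings_and_comments` are hypothetical names.

```python
import re

BLOCKED = {"system", "unix", "dos", "eval", "feval", "evalc",
           "evalin", "assignin", "perl", "python"}


class BlockedFunctionError(Exception):
    pass


def strip_strings_and_comments(code: str) -> str:
    """Drop MATLAB string literals and % comments, line by line."""
    out = []
    for line in code.splitlines():
        kept, quote, i = [], None, 0
        while i < len(line):
            ch = line[i]
            if quote:
                if ch == quote:
                    if i + 1 < len(line) and line[i + 1] == quote:
                        i += 2  # doubled quote is an escaped quote
                        continue
                    quote = None
            elif ch == "%":
                break  # rest of the line is a comment
            elif ch == '"':
                quote = ch
            elif ch == "'":
                prev = kept[-1] if kept else ""
                if prev and (prev.isalnum() or prev in "_)]}.'"):
                    kept.append(ch)  # transpose operator, not a string
                else:
                    quote = ch
            else:
                kept.append(ch)
            i += 1
        out.append("".join(kept))
    return "\n".join(out)


def check_code(code: str) -> None:
    """Raise BlockedFunctionError if cleaned code uses a blocked name."""
    cleaned = strip_strings_and_comments(code)
    for line in cleaned.splitlines():
        if line.lstrip().startswith("!"):
            raise BlockedFunctionError("!")  # shell escape
    for name in re.findall(r"[A-Za-z_]\w*", cleaned):
        if name in BLOCKED:
            raise BlockedFunctionError(name)
```

With this sketch, `check_code('code = "system(\'ls\')"')` passes because the literal is removed before scanning, while `check_code("system('ls')")` raises `BlockedFunctionError`.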
Optional approval gates for sensitive operations (Phase 4):
graph TD
Tool["Tool Call<br/>(execute_code)"]
Tool -->|Check Config| GateEnabled{"Gate<br/>Enabled?"}
GateEnabled -->|No| Execute["✓ Execute<br/>Immediately"]
GateEnabled -->|Yes| Check2{"Protected<br/>Function?"}
Check2 -->|No| Execute
Check2 -->|Yes| Elicit["Elicit User<br/>Approval via<br/>MCP Protocol"]
Elicit -->|Approved| Execute
Elicit -->|Declined| Reject["✗ Reject<br/>with Error"]
Elicit -->|Cancelled| Reject
style GateEnabled fill:#FB8C00
style Check2 fill:#FB8C00
style Execute fill:#7CB342
style Reject fill:#E53935
Configuration:
- `enabled` (default: false): Master switch
- `protected_functions`: Function names requiring approval
- `all_execute`: Require approval for all code, not just protected functions
- `protect_file_ops`: Require approval for uploads/deletes
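The gate in the flowchart reduces to a single predicate over these four settings. The sketch below is illustrative: `HitlConfig`, `needs_approval`, and the file-operation tool names are hypothetical, and only the four option names come from the configuration list above.

```python
from dataclasses import dataclass, field


@dataclass
class HitlConfig:
    """Hypothetical container mirroring the four options listed above."""
    enabled: bool = False
    protected_functions: set = field(default_factory=set)
    all_execute: bool = False
    protect_file_ops: bool = False


# Hypothetical tool names for uploads and deletes.
FILE_OP_TOOLS = {"upload_file", "delete_file"}


def needs_approval(cfg: HitlConfig, tool: str, called_functions=()) -> bool:
    """Return True when the call should trigger an approval request."""
    if not cfg.enabled:
        return False  # master switch off: nothing is gated
    if tool == "execute_code":
        hits = cfg.protected_functions & set(called_functions)
        return cfg.all_execute or bool(hits)
    if tool in FILE_OP_TOOLS:
        return cfg.protect_file_ops
    return False
```

The approval request itself (elicitation over MCP, with approved, declined, and cancelled outcomes) would sit behind this predicate and is not shown.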
Structures tool responses into MCP format:
graph TD
Result["MATLAB Result<br/>(output, vars, figures)"]
Result -->|Text| TruncText["Truncate to<br/>max_text_length"]
Result -->|Variables| FormatVars["Format with<br/>Type & Size Info"]
Result -->|Figures| Convert["Convert to<br/>Plotly JSON"]
TruncText -->|Long| SaveFile["Save to File<br/>& Return Path"]
TruncText -->|Short| Return["Return Inline"]
Convert -->|Success| JSON["Plotly JSON<br/>+ PNG + Thumbnail"]
Convert -->|Fail| PNG["Static PNG<br/>Only"]
SaveFile --> Final["Assemble Response"]
Return --> Final
JSON --> Final
PNG --> Final
Final --> Agent["Return to Agent"]
style Convert fill:#4A90E2
Output limits (configurable):
- Text: 10,000 characters (save excess to disk)
- Variables: Show type/size, exclude large objects (>10MB)
- Figures: Plotly JSON + 400px-wide PNG thumbnail
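As an illustration of the text and variable limits, the sketch below truncates long output, writes the full text to disk, and summarizes variables by type and size. The function names, the returned keys, and the `full_output.txt` file name are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def format_text(output: str, max_text_length: int = 10_000, out_dir=None) -> dict:
    """Return text inline, or truncate it and save the full text to a file."""
    if len(output) <= max_text_length:
        return {"text": output, "truncated": False}
    path = Path(out_dir or tempfile.gettempdir()) / "full_output.txt"  # hypothetical name
    path.write_text(output, encoding="utf-8")
    return {"text": output[:max_text_length], "truncated": True,
            "saved_to": str(path)}


def summarize_variable(type_name: str, size: tuple, nbytes: int,
                       max_bytes: int = 10 * 1024 * 1024) -> dict:
    """Describe a variable by type and size; flag values too large to inline."""
    return {
        "type": type_name,
        "size": "x".join(str(d) for d in size),
        "inline_value": nbytes <= max_bytes,
    }
```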
MATLAB figures → interactive Plotly JSON for agent visualization:
graph LR
MATLAB["MATLAB<br/>figure, plot()<br/>bar(), etc."]
MATLAB -->|mcp_extract_props.m| Extract["Extract Properties<br/>(JSON)"]
Extract -->|Save to Temp| File["matlab_figure_props.json"]
File -->|Load| Load["load_plotly_json()"]
Load -->|Parse| Props["Figure Properties<br/>Dict"]
Props -->|Convert| Style["plotly_style_mapper.py<br/>(colors, lines,<br/>markers, fonts)"]
Style -->|Build Plotly| Plotly["Plotly Figure Dict<br/>{traces, layout}"]
Plotly -->|Serialize| JSON["JSON Response<br/>to Agent"]
MATLAB -->|saveas(...png)| PNG["Static PNG<br/>(fallback)"]
PNG -->|Thumbnail| Thumb["Base64 PNG<br/>400px wide"]
JSON -->|Agent UI| Render["Interactive<br/>Plot"]
Thumb -->|Agent UI| Display["Embedded<br/>Thumbnail"]
style Extract fill:#8E24AA
style Style fill:#4A90E2
style Plotly fill:#7CB342
Supported trace types: line, scatter, bar, histogram, surface, heatmap, image, patch
Performance optimization: Uses WebGL rendering for datasets > 10,000 points
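The WebGL switch can be expressed as a choice between Plotly's `scatter` and `scattergl` trace types. The sketch below converts one extracted line object into a trace dict; the input keys (`x`, `y`, `color`, `line_width`) are assumptions about what the extracted properties might contain, not the actual schema written by mcp_extract_props.m.

```python
WEBGL_THRESHOLD = 10_000  # point count above which WebGL rendering is used


def line_to_trace(props: dict) -> dict:
    """Build a Plotly trace dict from hypothetical MATLAB line properties."""
    x, y = props["x"], props["y"]
    trace_type = "scattergl" if len(x) > WEBGL_THRESHOLD else "scatter"
    # MATLAB colors are RGB triplets in [0, 1]; Plotly accepts "rgb(r,g,b)".
    r, g, b = (round(c * 255) for c in props.get("color", (0.0, 0.447, 0.741)))
    return {
        "type": trace_type,
        "mode": "lines",
        "x": list(x),
        "y": list(y),
        "line": {"color": f"rgb({r},{g},{b})",
                 "width": props.get("line_width", 0.5)},
    }
```

The resulting dict can be placed in the `data` list of a Plotly figure alongside a `layout` dict.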
Real-time HTTP dashboard with metrics collection:
graph TB
Tools["Tool Calls<br/>(code exec, file ops)"]
Events["Events<br/>(completed, failed,<br/>sessions, scaling)"]
Tools -->|Fire & Forget| Collector["MetricsCollector<br/>(in-memory)"]
Events -->|Fire & Forget| Collector
Collector -->|Sample (every 10s)| Store["MetricsStore<br/>(SQLite)"]
Collector -->|Query Current| Current["/metrics<br/>endpoint"]
Store -->|Query History| History["/dashboard/api/history<br/>endpoint"]
Store -->|Query Events| EventLog["/dashboard/api/events<br/>endpoint"]
Current -->|HTTP JSON| UI["Dashboard UI<br/>(Plotly charts)"]
History -->|HTTP JSON| UI
EventLog -->|HTTP JSON| UI
UI -->|1-second refresh| Display["Live Metrics<br/>(pool, jobs, errors)"]
style Collector fill:#FB8C00
style Store fill:#E53935
style UI fill:#4A90E2
Available metrics:
- Pool utilization (busy/total engines)
- Job throughput (completed, failed, cancelled per minute)
- Execution time (avg, p95, p99)
- Active sessions
- System memory usage
- Error rates by type
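For the execution-time metrics, a nearest-rank percentile over recent durations, followed by a periodic write to SQLite, might look like the sketch below. The `samples` table and the function names are hypothetical, and the collector's real percentile method may differ.

```python
import sqlite3


def percentile(values, pct: int):
    """Nearest-rank percentile (integer pct) of a non-empty sequence."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(durations_ms) -> dict:
    """Compute the avg/p95/p99 figures shown on the dashboard."""
    return {
        "avg": sum(durations_ms) / len(durations_ms),
        "p95": percentile(durations_ms, 95),
        "p99": percentile(durations_ms, 99),
    }


def persist(conn: sqlite3.Connection, ts: float, summary: dict) -> None:
    """Append one sample row; the schema here is hypothetical."""
    conn.execute("CREATE TABLE IF NOT EXISTS samples "
                 "(ts REAL, avg REAL, p95 REAL, p99 REAL)")
    conn.execute("INSERT INTO samples VALUES (?, ?, ?, ?)",
                 (ts, summary["avg"], summary["p95"], summary["p99"]))
    conn.commit()
```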
1. AGENT sends:
tool: execute_code
code: "x = randn(1, 100000); Y = fft(x); plot(abs(Y))"
2. SERVER receives, dispatches to execute_code_impl():
3. SECURITY validates code
- Scans for blocked functions: ✓ (fft, plot are safe)
- Checks for file I/O, system calls: ✓
4. HITL checks (if enabled):
- Is all_execute gate on? No
- Code calls protected functions? No
- Proceed without approval
5. JOB EXECUTOR:
- Creates job in tracker (status: PENDING)
- Acquires engine from pool (waits if all busy)
- Injects context: __mcp_job_id__ = "job_abc123"
- Executes synchronously with 30s timeout
6. ENGINE starts code (background=False initially):
- x = randn(1, 100000) [200ms]
- Y = fft(x) [still running at the 30s cutoff in this scenario]
- At 30s mark: timeout triggered
7. EXECUTOR auto-promotes to async:
- Moves to background (background=True)
- Returns immediately with job_id
- Future awaited by background task
8. AGENT receives:
response: {
status: "running",
job_id: "job_abc123",
message: "Long-running job promoted to async"
}
9. AGENT polls get_job_status("job_abc123"):
- Job tracker returns: {status: "running", progress: 0}
- No progress file yet (code hasn't called mcp_progress())
- Agent waits 2 seconds, polls again
10. ENGINE finishes in the background:
- plot(abs(Y)) [figure generated]
- Execution complete
11. EXECUTOR:
- Extracts figure via mcp_extract_props.m (JSON props)
- Captures output and variables
- Formats result
- Marks job COMPLETED
- Releases engine back to pool
12. AGENT calls get_job_result("job_abc123"):
response: {
status: "completed",
output: "Y = [5.2 3.1 2.8 ...]",
variables: {
x: {type: "double", size: "1x100000"},
Y: {type: "double", size: "1x100000"}
},
figures: [{
type: "plotly",
data: {traces: [...], layout: {...}},
png_thumbnail: "data:image/png;base64,..."
}]
}
13. AGENT displays interactive Plotly figure in UI
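From the agent's side, steps 9 through 12 amount to a polling loop. The sketch below assumes a `get_status` callable that wraps the get_job_status tool and returns a dict with a `status` key; the helper name and the set of terminal states are assumptions, and the 2-second interval matches the walkthrough.

```python
import time


def poll_until_done(get_status, job_id: str, interval: float = 2.0,
                    max_polls: int = 100, sleep=time.sleep) -> dict:
    """Poll a job until it reaches a terminal state or the poll budget runs out."""
    for _ in range(max_polls):
        status = get_status(job_id)
        if status["status"] in ("completed", "failed", "cancelled"):
            return status
        sleep(interval)  # wait before asking again
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```

Once the loop returns with `completed`, the agent would call get_job_result to fetch output, variables, and figures.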
- HTTP transports (SSE, streamable HTTP): `BearerAuthMiddleware` validates the `Authorization: Bearer <token>` header
- Token source: `MATLAB_MCP_AUTH_TOKEN` environment variable (static, no rotation mid-session)
- Health endpoint: `/health` bypasses auth (allows agents to check readiness without a token)
- Stdio transport: No auth (single-user, assumed trusted)
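A bearer check of this kind can be written as plain ASGI middleware with no framework dependency. The class below is a sketch of the pattern and is not the project's `BearerAuthMiddleware`; the class name, the 401 response body, and the `open_paths` parameter are assumptions.

```python
import hmac
import os


class BearerAuthSketch:
    """Hypothetical ASGI middleware: require a static bearer token on HTTP requests."""

    def __init__(self, app, token: str, open_paths=("/health",)):
        self.app = app
        self.token = token
        self.open_paths = open_paths

    async def __call__(self, scope, receive, send):
        # Non-HTTP scopes (e.g. lifespan) and open paths pass straight through.
        if scope["type"] != "http" or scope["path"] in self.open_paths:
            await self.app(scope, receive, send)
            return
        headers = dict(scope.get("headers", []))
        supplied = headers.get(b"authorization", b"")
        expected = f"Bearer {self.token}".encode()
        # Constant-time comparison avoids leaking token contents via timing.
        if hmac.compare_digest(supplied, expected):
            await self.app(scope, receive, send)
            return
        await send({"type": "http.response.start", "status": 401,
                    "headers": [(b"www-authenticate", b"Bearer")]})
        await send({"type": "http.response.body", "body": b"Unauthorized"})


def wrap_with_auth(app):
    """Wrap an ASGI app using the token from MATLAB_MCP_AUTH_TOKEN."""
    return BearerAuthSketch(app, os.environ["MATLAB_MCP_AUTH_TOKEN"])
```

Because the token is read once at wrap time, changing it requires a restart, which matches the no-rotation trade-off noted in the list above.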
- Endpoint: `http://127.0.0.1:8765/mcp` (default)
- Per-session routing: `ctx.session_id` for multi-agent isolation
- Fallback: `ctx.client_id` for stateless mode (no persistent workspace)
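The routing and fallback rule can be summarized as a key-resolution function. This sketch assumes only that the context object may expose `session_id` and `client_id` attributes, as named above; the function name and the final fallback to the `"default"` session are hypothetical.

```python
from types import SimpleNamespace


def resolve_session_key(ctx, transport: str) -> str:
    """Pick the key used to look up a session for this request."""
    if transport == "stdio":
        return "default"  # stdio has a single shared session
    session_id = getattr(ctx, "session_id", None)
    if session_id:
        return str(session_id)
    client_id = getattr(ctx, "client_id", None)
    if client_id:
        return str(client_id)  # stateless fallback, weaker isolation
    return "default"  # assumption: last-resort shared session


# Example with a stand-in context object.
example_ctx = SimpleNamespace(session_id=None, client_id="cursor-1")
example_key = resolve_session_key(example_ctx, "streamable-http")
```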
- Default host: `127.0.0.1` (loopback, no Firewall UAC prompt)
- Temp dir: Uses `tempfile.gettempdir()` (platform-aware, not hardcoded `/tmp`)
- Deployment: Single-machine scenarios avoid admin requirement entirely
| Decision | Rationale | Trade-off |
|---|---|---|
| Elastic pooling | Handles variable load without pre-allocating expensive engines | Scale-up latency (~5s per engine) on load spikes |
| Sync→async promotion | Responsive UX for quick queries; async for long jobs | Dual code paths, more testing |
| MATLAB workspace isolation | Security + correctness (side effects don't cross sessions) | Startup cost per session (~200ms) |
| Bearer tokens (not OAuth) | Simplicity for CLI agents, no external dependencies | No token rotation, revocation requires restart |
| SQLite metrics store | Lightweight, no external DB, point-in-time queries | Limited to single-machine deployments |
| Streamable HTTP (not SSE) | Single transport for all clients, simpler reverse proxy setup | New in FastMCP 3.x, less battle-tested than SSE |
| Plotly figures (not static PNG) | Interactive visualization in agents (zoom, pan, tooltips) | Larger JSON payloads, WebGL fallback needed for huge datasets |
| Security blocklist | Pragmatic: blocks known-dangerous functions instead of allowing everything | Functions newly added to MATLAB can bypass the blocklist (mitigated by monitoring) |
- `ctx.session_id` stability under streamable HTTP (Phase 3 blocker)
  - Some agents may not provide a consistent `ctx.session_id` across requests
  - Fallback to `ctx.client_id` implemented; may reduce workspace isolation
- Memory leak in failed engine startup (Phase 1 issue)
  - Engines that crash during `start()` are not fully cleaned up
  - Mitigated by health checks (bad engines replaced within 60s)
- Windows 10 CI environment (Phase 5 blocker)
  - GitHub Actions Windows runners may not support MATLAB installation
  - Workaround: `--inspect` mode (mock engines) for CI; live Windows testing deferred to post-v2.0
- Large figure Plotly JSON (Known limitation)
  - Figures with 100k+ traces can exceed result size limits
  - Mitigation: PNG fallback; agents should request PNG for large datasets
- Unit tests (732 tests, 185 test classes): Components tested in isolation with mocks
- Integration tests (CI): Server starts in `--inspect` mode (mock engines), real MCP client connects via streamable HTTP with bearer auth, tools execute
- Live tests (manual, deferred): Windows no-admin deployment, multi-agent session isolation, agent UI rendering of Plotly figures