Skip to content

Latest commit

 

History

History
408 lines (299 loc) · 12.7 KB

File metadata and controls

408 lines (299 loc) · 12.7 KB

Binary Ninja Integration for smartdiff

Overview

smartdiff includes Binary Ninja integration for inspecting binaries through the MCP (Model Context Protocol).

Key Features:

  • ✅ Discover binaries loaded in Binary Ninja
  • ✅ List functions in binaries
  • ✅ Decompile functions to high-level C code
  • ✅ Works with Binary Ninja Personal License
  • Comparison tools are intentionally disabled by default and require the binary-comparison Cargo feature.

Architecture

┌─────────────────────────────────────────────────────────┐
│  AI Agent (Claude Desktop)                              │
└──────────────┬──────────────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────────────┐
│  smartdiff MCP Server                                   │
│  - Source code comparison tools                         │
│  - Binary inspection tools                              │
└──────────────┬──────────────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────────────┐
│  Binary Ninja Client Library                           │
│  - HTTP client (no Binary Ninja dependencies)          │
│  - Server discovery                                     │
│  - Function listing and decompilation                   │
└──────────────┬──────────────────────────────────────────┘
               │ (HTTP)
               ▼
┌─────────────────────────────────────────────────────────┐
│  Binary Ninja MCP Server                                │
│  - Multi-binary support (ports 9009+)                   │
│  - Decompilation API                                    │
│  - Function analysis                                    │
└──────────────┬──────────────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────────────┐
│  Binary Ninja (GUI with Personal License)              │
└─────────────────────────────────────────────────────────┘

Components

1. Binary Ninja Client Library

Location: crates/binary-ninja-client/

Purpose: HTTP client for communicating with Binary Ninja MCP servers

Features:

  • No Binary Ninja dependencies (just HTTP/JSON)
  • Server discovery (scans ports 9009-9018)
  • Function listing and searching
  • Decompilation fetching
  • Async/await with tokio

Example:

use smart_diff_binary_ninja_client::BinaryNinjaClient;

let client = BinaryNinjaClient::new();
let servers = client.discover_servers().await?;
let functions = client.list_functions("port_9009").await?;
let code = client.decompile_function("port_9009", "main").await?;

2. Binary Function Matcher (Feature-Gated)

Location: crates/diff-engine/src/binary_matcher.rs

Purpose: Match functions between two binaries using multiple strategies. This is compiled only when binary-comparison is explicitly enabled.

Matching Strategies:

  1. Exact Name Matching - O(n) HashMap lookup for identical names
  2. Fuzzy Name Matching - Levenshtein distance for renamed functions
  3. Code Similarity - Framework ready for tree-sitter C parser integration

Example:

use smart_diff_engine::{BinaryFunctionMatcher, BinaryFunctionInfo};

let matcher = BinaryFunctionMatcher::new();
let matches = matcher.match_functions(&functions_a, &functions_b)?;

3. Binary Comparison Manager (Feature-Gated)

Location: crates/mcp-server/src/comparison/binary_comparison.rs

Purpose: Store and manage binary comparison results. This is not exposed to agents in the default build.

Features:

  • UUID-based comparison IDs
  • Match statistics and summaries
  • Added/deleted function tracking
  • Sorted match retrieval

4. MCP Tools

Location: crates/mcp-server/src/tools/binary_tools.rs

Purpose: Expose binary analysis capabilities to AI agents

Default Tools:

  1. list_binja_servers - Discover available binaries
  2. list_binary_functions - List functions in a binary
  3. decompile_binary_function - Get decompiled code

Comparison tools are hidden unless smartdiff is built with --features binary-comparison.

Quick Start

Prerequisites

  1. Binary Ninja (Personal License or higher)
  2. Binary Ninja MCP Plugin (from /home/matteius/smartdiff/binary_ninja_mcp/)
  3. smartdiff (this repository)

Setup

1. Install Binary Ninja MCP Plugin

cd /home/matteius/smartdiff/binary_ninja_mcp
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Copy plugin to Binary Ninja plugins folder
# Restart Binary Ninja

2. Build smartdiff

cd /home/matteius/smartdiff
cargo build --release -p smart-diff-mcp-server
cargo build --release -p smart-diff-cli

3. Load Binaries in Binary Ninja

  1. Open Binary Ninja
  2. Load first binary (e.g., malware_v1.exe)
  3. Plugins > MCP Server > Start Server for This Binary (port 9009)
  4. Open another Binary Ninja window
  5. Load second binary (e.g., malware_v2.exe)
  6. Plugins > MCP Server > Start Server for This Binary (port 9010)

This is the supported Personal-license path: Binary Ninja does the analysis in the GUI process, and smartdiff talks to the plugin over HTTP. Do not use headless Binary Ninja export scripts for this workflow.

4. Verify With The Rust CLI

cd /home/matteius/smartdiff
cargo run --release -p smart-diff-cli -- binja servers
cargo run --release -p smart-diff-cli -- binja functions port_9009 --limit 25
cargo run --release -p smart-diff-cli -- binja decompile port_9009 main

5. Configure Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "smartdiff": {
      "command": "/home/matteius/smartdiff/target/release/smart-diff-mcp",
      "args": []
    }
  }
}

6. Restart Claude Desktop

Usage Examples

Example 1: Explore a Binary

User: "I have a binary loaded in Binary Ninja. What functions does it have?"

Agent:
1. Calls list_binja_servers() → Finds binary on port_9009
2. Calls list_binary_functions(binary_id="port_9009")
3. Shows list of 150 functions
4. User can ask to decompile specific functions

Example 2: Analyze Specific Function

User: "Show me the decompiled code for the main function"

Agent:
1. Calls decompile_binary_function(binary_id="port_9009", function_name="main")
2. Returns decompiled C code:
   ```c
   int64_t main(int64_t argc, char** argv)
   {
       initialize();
       process_data();
       cleanup();
       return 0;
   }

## MCP Tool Reference

### list_binja_servers

**Description**: List available Binary Ninja MCP servers

**Input**: None

**Output**: List of servers with binary IDs, filenames, ports

### list_binary_functions

**Description**: List all functions in a binary

**Input**:
- `binary_id` (required): Binary server ID
- `search` (optional): Search term to filter functions

**Output**: List of function names

### decompile_binary_function

**Description**: Decompile a specific function

**Input**:
- `binary_id` (required): Binary server ID
- `function_name` (required): Function name

**Output**: Decompiled C code

### Feature-Gated Comparison Tools

The following tools are not visible in the default MCP tool list. Build with
`--features binary-comparison` only when binary-to-binary matching is desired.

### compare_binaries

**Description**: Compare two binaries and identify matching functions

**Input**:
- `binary_a_id` (required): First binary server ID
- `binary_b_id` (required): Second binary server ID
- `use_decompiled_code` (optional): Use code similarity (default: false)
- `similarity_threshold` (optional): Minimum similarity (default: 0.7)

**Output**: Comparison ID and summary statistics

### list_binary_matches

**Description**: List matched functions from a comparison

**Input**:
- `comparison_id` (required): Comparison ID from compare_binaries
- `limit` (optional): Maximum matches to return (default: 100)
- `min_similarity` (optional): Minimum similarity filter

**Output**: List of matches sorted by similarity (most changed first)

### get_binary_function_diff

**Description**: Get detailed diff for a specific function match

**Input**:
- `comparison_id` (required): Comparison ID
- `function_name` (required): Function name

**Output**: Decompiled code from both binaries with similarity metrics

## Implementation Status

### ✅ Phase 1: Binary Ninja MCP Client Library (Complete)
- HTTP client with server discovery
- Function listing and searching
- Decompilation fetching
- No Binary Ninja dependencies
- All unit tests passing

### ✅ Phase 2: Binary Comparison MCP Tools (Complete)
- Binary tool handler
- Integration with MCP server
- Three basic tools (list servers, list functions, decompile)
- All integration tests passing

### ✅ Phase 3: Binary Function Matching Engine (Complete)
- Exact name matching
- Fuzzy name matching (Levenshtein distance)
- Code similarity framework (ready for tree-sitter)
- Comparison storage and management
- Three comparison tools (compare, list matches, get diff)
- All unit tests passing

### 🚧 Phase 4: Testing & Documentation (In Progress)
- End-to-end testing with real binaries
- Performance benchmarking
- User documentation
- API documentation

## Testing

### Unit Tests

Run all tests:
```bash
cargo test

Run specific component tests:

cargo test -p smart-diff-binary-ninja-client
cargo test -p smart-diff-engine binary_matcher
cargo test -p smart-diff-mcp-server

Integration Testing

Requires running Binary Ninja with MCP server:

  1. Load test binaries in Binary Ninja
  2. Start MCP servers
  3. Run smartdiff MCP server
  4. Test with Claude Desktop

Performance

  • Server Discovery: < 1 second (scans 10 ports)
  • Function Listing: < 1 second (typical binary with 100-200 functions)
  • Decompilation: 1-2 seconds per function
  • Binary Comparison: 2-5 seconds (100-200 functions, name matching only)

Known Limitations

  1. Code Similarity: Framework is ready but tree-sitter C parser integration is pending
  2. Parallel Processing: Not yet implemented (can be added for large binaries)
  3. Advanced Metrics: Basic similarity only (no CFG or basic block analysis)

Future Enhancements

  1. Code Similarity Matching

    • Integrate tree-sitter C parser
    • Parse decompiled code as AST
    • Apply tree edit distance algorithms
    • Reuse existing smartdiff infrastructure
  2. Advanced Binary Analysis

    • CFG (Control Flow Graph) similarity
    • Basic block analysis
    • Instruction-level comparison
    • Call graph analysis
  3. Performance Optimization

    • Parallel processing with rayon
    • Caching of decompiled code
    • Incremental comparison
  4. Visualization

    • Function match visualization
    • Diff highlighting
    • Call graph visualization

Documentation

License

AGPL-3.0-only

Contributing

Contributions are welcome! Please see the main smartdiff README for contribution guidelines.

Support

For issues or questions:

  1. Check the documentation
  2. Review existing issues
  3. Create a new issue with details

Acknowledgments

  • Binary Ninja team for the excellent reverse engineering platform
  • MCP (Model Context Protocol) for enabling AI agent integration
  • rust_diff project for inspiration on binary diffing algorithms