[FEATURE]: Virtual meta-server - Comprehensive tool discovery and execution layer #2230

@crivetimihai

🎯 Feature: Virtual Meta-Server - Comprehensive Tool Discovery & Execution Layer

Goal

Create a Virtual Meta-Server that exposes a complete set of meta-tools (search_tools, list_tools, describe_tool, execute_tool, get_tool_categories, get_similar_tools) enabling AI agents to discover and invoke thousands of underlying tools through a unified, simple interface—while completely hiding the real tools from the agent's tool list.

Why Now?

  1. Context Overflow: LLMs can't handle 1000+ tool definitions; meta-tools solve this by exposing only 4-6 tools
  2. MCP Limitations: Current MCP implementations struggle with large tool catalogs; this provides a scalable abstraction
  3. Abstraction Need: AI agents shouldn't manage individual server connections or know about backend topology
  4. Unified Interface: One server, few meta-tools, access to everything
  5. Security & Scoping: Hide real tools, expose only what's permitted per scope/team
  6. Existing Foundation: Virtual servers, tags, visibility, team scoping already exist
  7. AI Agent UX: Dramatically simplifies agent development and reduces prompt complexity

📖 User Stories

US-1: AI Agent - Semantic Search for Tools

As an AI Agent
I want to search for tools using natural language with multiple filter options
So that I find relevant tools without knowing exact names or browsing large catalogs

Acceptance Criteria:

Given I'm connected to the meta-server:
When I call:
  search_tools(
    query="tools that can analyze CSV data and create visualizations",
    filters={
      "tags": ["data", "analytics"],
      "servers": ["analytics-server"],
      "integration_type": "MCP",
      "visibility": ["public", "team"],
      "min_score": 0.7
    },
    limit=10,
    include_schemas=false
  )
Then I receive:
  {
    "tools": [
      {
        "id": "abc123",
        "name": "csv_analyzer",
        "qualified_name": "analytics-server/csv_analyzer",
        "description": "Analyze CSV files and generate insights",
        "score": 0.94,
        "tags": ["data", "csv", "analytics"],
        "server": "analytics-server",
        "category": "Data Processing"
      },
      ...
    ],
    "total_matches": 47,
    "search_metadata": {
      "query_embedding_model": "text-embedding-3-small",
      "search_type": "semantic"
    }
  }
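
The ranking step behind `search_tools` can be sketched as follows. This is a minimal illustration, assuming tool embeddings are already computed and held in memory; the proposal itself names pgvector as the index and `text-embedding-3-small` as the model. The function names, the toy three-dimensional vectors, and the `mail-server/send_email` tool are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_tools(query_vec, tool_vecs, min_score=0.0, limit=10):
    """Score each tool against the query, drop those below min_score,
    and return the top `limit` as (qualified_name, score), best first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in tool_vecs.items()]
    kept = [(name, score) for name, score in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]

# Toy vectors standing in for real embeddings (hypothetical data).
tool_vecs = {
    "analytics-server/csv_analyzer": [0.9, 0.1, 0.0],
    "data-viz/chart_generator": [0.6, 0.8, 0.0],
    "mail-server/send_email": [0.0, 0.0, 1.0],
}
results = rank_tools([1.0, 0.0, 0.0], tool_vecs, min_score=0.5, limit=10)
```

Applying `min_score` before `limit` means a strict threshold can return fewer than `limit` tools, which seems to match the intent of the filter; a real implementation would likely push both into the vector query.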

US-2: AI Agent - List Tools with Pagination

As an AI Agent
I want to browse all available tools with pagination and filtering
So that I can explore the tool catalog systematically

Acceptance Criteria:

Given I'm connected to the meta-server:
When I call:
  list_tools(
    page=1,
    page_size=20,
    sort_by="name",
    sort_order="asc",
    filters={
      "tags": ["production"],
      "categories": ["File Operations"]
    }
  )
Then I receive:
  {
    "tools": [...],
    "pagination": {
      "page": 1,
      "page_size": 20,
      "total_items": 156,
      "total_pages": 8,
      "has_next": true,
      "has_previous": false
    }
  }
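
The pagination envelope above can be derived from the item count alone. The sketch below assumes the scope-filtered, sorted tool list is already in memory; `paginate` is a hypothetical helper name, and a database-backed version would use LIMIT/OFFSET or keyset pagination instead of slicing.

```python
import math

def paginate(items, page=1, page_size=20):
    """Slice `items` for one page and build the pagination envelope."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    total_items = len(items)
    total_pages = math.ceil(total_items / page_size)
    start = (page - 1) * page_size
    return {
        "tools": items[start:start + page_size],
        "pagination": {
            "page": page,
            "page_size": page_size,
            "total_items": total_items,
            "total_pages": total_pages,
            "has_next": page < total_pages,
            "has_previous": page > 1,
        },
    }

# 156 placeholder tools, matching the total in the example response.
catalog = [f"tool_{i:03d}" for i in range(156)]
first = paginate(catalog, page=1, page_size=20)
last = paginate(catalog, page=8, page_size=20)
```

With 156 items and a page size of 20 this yields 8 pages, consistent with the example response; the last page holds the remaining 16 tools.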

US-3: AI Agent - Get Detailed Tool Description

As an AI Agent
I want to get complete details about a specific tool including its full schema
So that I can understand exactly how to use it before execution

Acceptance Criteria:

Given I found tool "analytics-server/csv_analyzer":
When I call:
  describe_tool(
    tool_name="analytics-server/csv_analyzer",
    include_examples=true,
    include_metrics=true
  )
Then I receive:
  {
    "id": "abc123",
    "name": "csv_analyzer",
    "qualified_name": "analytics-server/csv_analyzer",
    "description": "Analyze CSV files and generate statistical insights",
    "input_schema": {
      "type": "object",
      "properties": {
        "file_url": {"type": "string", "description": "URL to the CSV file"},
        "columns": {"type": "array", "items": {"type": "string"}, "description": "Columns to analyze"},
        "output_format": {"type": "string", "enum": ["json", "markdown", "html"]}
      },
      "required": ["file_url"]
    },
    "output_schema": {...},
    "annotations": {
      "title": "CSV Analyzer",
      "readOnlyHint": true,
      "destructiveHint": false,
      "idempotentHint": true,
      "openWorldHint": false
    },
    "tags": ["data", "csv", "analytics"],
    "category": "Data Processing",
    "server": {
      "name": "analytics-server",
      "description": "Analytics and data processing tools"
    },
    "examples": [
      {
        "description": "Analyze sales data",
        "arguments": {"file_url": "https://example.com/sales.csv", "output_format": "json"}
      }
    ],
    "metrics": {
      "total_executions": 1523,
      "success_rate": 0.97,
      "avg_response_time_ms": 450
    },
    "similar_tools": ["data-viz/chart_generator", "reporting/table_builder"]
  }

US-4: AI Agent - Execute Tool

As an AI Agent
I want to execute a discovered tool through the meta-server
So that I can complete my task without managing backend connections

Acceptance Criteria:

Given I understand tool "analytics-server/csv_analyzer":
When I call:
  execute_tool(
    tool_name="analytics-server/csv_analyzer",
    arguments={
      "file_url": "https://example.com/data.csv",
      "columns": ["revenue", "date"],
      "output_format": "json"
    }
  )
Then the meta-server:
  - Validates arguments against input_schema
  - Routes to the correct backend server
  - Handles authentication transparently
  - Applies rate limiting and timeouts
  - Returns the tool result with execution metadata
And I receive:
  {
    "result": {
      "analysis": {...},
      "summary": "..."
    },
    "execution_metadata": {
      "tool_id": "abc123",
      "execution_time_ms": 423,
      "server": "analytics-server"
    }
  }
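
The validate-then-route steps above might look roughly like this. It is a sketch under stated assumptions: the validator checks only `required`, `type`, and `enum` (a real implementation would more likely use a full JSON Schema validator), the in-memory `registry`, the `fake_backend` handler, and the `validation_error` response shape are all hypothetical, and authentication and rate limiting are left out.

```python
import time

# Subset of JSON Schema type names mapped to Python types (sketch only;
# note that Python's bool also passes the "integer" check).
_TYPES = {"string": str, "array": list, "object": dict, "boolean": bool,
          "integer": int, "number": (int, float)}

def validate_arguments(schema, arguments):
    """Return a list of error strings; an empty list means the arguments pass."""
    errors = [f"missing required argument: {name}"
              for name in schema.get("required", []) if name not in arguments]
    for name, value in arguments.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            continue  # unknown arguments are tolerated in this sketch
        expected = _TYPES.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}")
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: must be one of {spec['enum']}")
    return errors

def execute_tool(registry, tool_name, arguments):
    """Validate, invoke the backend handler, and attach execution metadata."""
    tool = registry[tool_name]
    errors = validate_arguments(tool["input_schema"], arguments)
    if errors:
        return {"error": {"type": "validation_error", "details": errors}}
    started = time.perf_counter()
    result = tool["handler"](**arguments)
    elapsed_ms = int((time.perf_counter() - started) * 1000)
    return {"result": result,
            "execution_metadata": {"tool_id": tool["id"],
                                   "execution_time_ms": elapsed_ms,
                                   "server": tool_name.split("/", 1)[0]}}

def fake_backend(file_url, columns=None, output_format="json"):
    """Stand-in for the routed backend call."""
    return {"summary": f"analyzed {file_url}"}

registry = {
    "analytics-server/csv_analyzer": {
        "id": "abc123",
        "input_schema": {
            "type": "object",
            "properties": {
                "file_url": {"type": "string"},
                "columns": {"type": "array"},
                "output_format": {"type": "string", "enum": ["json", "markdown", "html"]},
            },
            "required": ["file_url"],
        },
        "handler": fake_backend,
    }
}
ok = execute_tool(registry, "analytics-server/csv_analyzer",
                  {"file_url": "https://example.com/data.csv"})
bad = execute_tool(registry, "analytics-server/csv_analyzer",
                   {"output_format": "xml"})
```

Rejecting invalid arguments before routing keeps malformed calls away from backend servers and gives the agent an error it can act on without a round trip.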

US-5: AI Agent - Get Tool Categories

As an AI Agent
I want to browse available tool categories and their counts
So that I can understand what kinds of tools are available

Acceptance Criteria:

When I call:
  get_tool_categories()
Then I receive:
  {
    "categories": [
      {"name": "Data Processing", "count": 45, "description": "Tools for data manipulation and analysis"},
      {"name": "File Operations", "count": 32, "description": "Tools for file management"},
      {"name": "Communication", "count": 28, "description": "Email, messaging, notifications"},
      {"name": "Code Generation", "count": 24, "description": "Code writing and refactoring tools"}
    ],
    "total_tools": 234,
    "tags": [
      {"name": "production", "count": 180},
      {"name": "experimental", "count": 54}
    ]
  }
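
One way to produce these counts is a single pass over the scope-filtered tool list. The sketch below assumes each tool record carries a `category` string and a `tags` list, as in the example responses; category descriptions are omitted because the issue does not say where they are stored, and the sample records are hypothetical.

```python
from collections import Counter

def get_tool_categories(tools, include_tags=True):
    """Aggregate category and tag counts over an already scope-filtered tool list."""
    category_counts = Counter(t["category"] for t in tools)
    response = {
        "categories": [{"name": name, "count": count}
                       for name, count in category_counts.most_common()],
        "total_tools": len(tools),
    }
    if include_tags:
        tag_counts = Counter(tag for t in tools for tag in t.get("tags", []))
        response["tags"] = [{"name": name, "count": count}
                            for name, count in tag_counts.most_common()]
    return response

# Hypothetical sample records.
tools = [
    {"name": "csv_analyzer", "category": "Data Processing", "tags": ["production", "data"]},
    {"name": "excel_parser", "category": "Data Processing", "tags": ["production"]},
    {"name": "send_email", "category": "Communication", "tags": ["experimental"]},
]
summary = get_tool_categories(tools)
```

Counting after scope filtering matters: counts computed over the full catalog would leak the existence of out-of-scope tools.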

US-6: AI Agent - Find Similar Tools

As an AI Agent
I want to find tools similar to one I already know
So that I can discover alternatives or complementary tools

Acceptance Criteria:

When I call:
  get_similar_tools(
    tool_name="analytics-server/csv_analyzer",
    limit=5
  )
Then I receive:
  {
    "reference_tool": "analytics-server/csv_analyzer",
    "similar_tools": [
      {"name": "data-viz/excel_parser", "similarity": 0.92, "reason": "Similar input/output patterns"},
      {"name": "reporting/data_transformer", "similarity": 0.87, "reason": "Related data processing"},
      ...
    ]
  }

US-7: Platform Admin - Create Scoped Meta-Server with Hidden Tools

As a Platform Administrator
I want to create a meta-server that completely hides underlying tools
So that AI agents only see meta-tools and can't bypass discovery

Acceptance Criteria:

Given I create a meta-server:
  POST /servers
  {
    "name": "finance-ai-tools",
    "type": "meta",
    "hide_underlying_tools": true,
    "scope": {
      "include_tags": ["finance", "accounting", "production"],
      "include_servers": ["quickbooks-server", "reporting-server"],
      "exclude_tags": ["experimental", "deprecated"],
      "include_visibility": ["public", "team"],
      "include_teams": ["finance-team"],
      "name_patterns": ["^finance_.*", "^report_.*"]
    },
    "meta_config": {
      "enable_semantic_search": true,
      "enable_categories": true,
      "enable_similar_tools": true,
      "default_search_limit": 10,
      "max_search_limit": 50,
      "include_metrics_in_search": false
    }
  }
Then:
  - AI agents connecting see ONLY: search_tools, list_tools, describe_tool, execute_tool, get_tool_categories, get_similar_tools
  - The 500+ underlying finance tools are completely hidden from tools/list
  - Agents MUST use search_tools or list_tools to discover tools
  - execute_tool is the ONLY way to invoke underlying tools
  - Out-of-scope tools cannot be discovered or executed
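
The issue does not spell out how the scope rules combine, so the sketch below makes an explicit assumption: every rule kind present in the scope must pass (AND across kinds), a rule passes if any of its values matches (ANY within a kind), and `exclude_tags` takes precedence. `in_scope` and the sample tools are hypothetical, and `include_teams` is left out because the team field on a tool record is not defined here.

```python
import re

def in_scope(tool, scope):
    """Return True if `tool` satisfies every rule present in `scope`."""
    tags = set(tool.get("tags", []))
    if tags & set(scope.get("exclude_tags", [])):
        return False  # exclusion wins in this sketch
    if "include_tags" in scope and not tags & set(scope["include_tags"]):
        return False
    if "include_servers" in scope and tool["server"] not in scope["include_servers"]:
        return False
    if "include_visibility" in scope and tool.get("visibility") not in scope["include_visibility"]:
        return False
    patterns = scope.get("name_patterns")
    if patterns and not any(re.search(p, tool["name"]) for p in patterns):
        return False
    return True

scope = {
    "include_tags": ["finance", "accounting", "production"],
    "include_servers": ["quickbooks-server", "reporting-server"],
    "exclude_tags": ["experimental", "deprecated"],
    "name_patterns": ["^finance_.*", "^report_.*"],
}
tools = [
    {"name": "finance_ledger_export", "server": "quickbooks-server", "tags": ["finance"]},
    {"name": "report_beta", "server": "reporting-server", "tags": ["finance", "experimental"]},
    {"name": "csv_analyzer", "server": "analytics-server", "tags": ["production"]},
]
visible = [t["name"] for t in tools if in_scope(t, scope)]
```

The same predicate would need to run on every meta-tool path (search, list, describe, execute, categories, similar) for the last acceptance criterion to hold.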

🏗 Architecture

Meta-Tool Definitions

{
  "tools": [
    {
      "name": "search_tools",
      "description": "Search for tools using natural language or filters. Use this to find tools by describing what you need. Supports semantic search, tag filtering, server filtering, and more.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "query": {
            "type": "string",
            "description": "Natural language search query (e.g., 'tools that can convert PDF to text')"
          },
          "filters": {
            "type": "object",
            "description": "Optional filters to narrow results",
            "properties": {
              "tags": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Filter by tags (e.g., ['production', 'data'])"
              },
              "exclude_tags": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Exclude tools with these tags"
              },
              "servers": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Filter by server names"
              },
              "categories": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Filter by category names"
              },
              "integration_type": {
                "type": "string",
                "enum": ["REST", "MCP", "any"],
                "description": "Filter by integration type"
              },
              "name_pattern": {
                "type": "string",
                "description": "Regex pattern to match tool names"
              },
              "description_contains": {
                "type": "string",
                "description": "Full-text search in descriptions"
              },
              "min_score": {
                "type": "number",
                "minimum": 0,
                "maximum": 1,
                "description": "Minimum semantic similarity score (0-1)"
              },
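              "has_examples": {
                "type": "boolean",
                "description": "Only return tools with usage examples"
              },
              "visibility": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Filter by visibility level (e.g., ['public', 'team'])"
              },
              "teams": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Filter by team ownership"
              }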
            }
          },
          "limit": {
            "type": "integer",
            "default": 10,
            "minimum": 1,
            "maximum": 50,
            "description": "Maximum number of results"
          },
          "include_schemas": {
            "type": "boolean",
            "default": false,
            "description": "Include full input/output schemas in results"
          },
          "search_type": {
            "type": "string",
            "enum": ["semantic", "keyword", "hybrid"],
            "default": "semantic",
            "description": "Type of search to perform"
          }
        }
      }
    },
    {
      "name": "list_tools",
      "description": "List all available tools with pagination. Use this to browse the complete tool catalog or filter by criteria.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "page": {"type": "integer", "default": 1, "minimum": 1},
          "page_size": {"type": "integer", "default": 20, "minimum": 1, "maximum": 100},
          "sort_by": {
            "type": "string",
            "enum": ["name", "created_at", "execution_count", "success_rate", "category"],
            "default": "name"
          },
          "sort_order": {"type": "string", "enum": ["asc", "desc"], "default": "asc"},
          "filters": {
            "type": "object",
            "description": "Same filter options as search_tools"
          },
          "include_schemas": {"type": "boolean", "default": false}
        }
      }
    },
    {
      "name": "describe_tool",
      "description": "Get complete details about a specific tool including its full schema, examples, and usage metrics. Always call this before executing an unfamiliar tool.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "tool_name": {
            "type": "string",
            "description": "Full tool name (server/tool_name) or tool ID"
          },
          "include_examples": {"type": "boolean", "default": true},
          "include_metrics": {"type": "boolean", "default": false},
          "include_similar": {"type": "boolean", "default": false}
        },
        "required": ["tool_name"]
      }
    },
    {
      "name": "execute_tool",
      "description": "Execute a tool by its full name. The meta-server handles routing, authentication, and error handling.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "tool_name": {
            "type": "string",
            "description": "Full tool name (server/tool_name) or tool ID"
          },
          "arguments": {
            "type": "object",
            "description": "Tool-specific arguments matching the tool's input schema"
          },
          "options": {
            "type": "object",
            "properties": {
              "timeout_ms": {"type": "integer", "default": 30000, "description": "Execution timeout"},
              "include_metadata": {"type": "boolean", "default": true}
            }
          }
        },
        "required": ["tool_name", "arguments"]
      }
    },
    {
      "name": "get_tool_categories",
      "description": "Get a list of all tool categories and their counts. Use this to understand what kinds of tools are available.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "include_tags": {"type": "boolean", "default": true},
          "include_servers": {"type": "boolean", "default": false}
        }
      }
    },
    {
      "name": "get_similar_tools",
      "description": "Find tools similar to a given tool. Useful for discovering alternatives or complementary tools.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "tool_name": {
            "type": "string",
            "description": "Reference tool name to find similar tools for"
          },
          "limit": {"type": "integer", "default": 5, "minimum": 1, "maximum": 20}
        },
        "required": ["tool_name"]
      }
    }
  ]
}

Complete Execution Flow

sequenceDiagram
    participant AI as AI Agent
    participant Meta as Meta-Server
    participant Scope as Scope Filter
    participant Embed as Embedding Service
    participant Index as Tool Index (pgvector)
    participant Router as Tool Router
    participant Backend as Backend Servers

    Note over AI,Meta: Discovery Phase
    AI->>Meta: get_tool_categories()
    Meta->>Scope: Apply scope filters
    Meta-->>AI: Categories, tags, counts

    AI->>Meta: search_tools("analyze spreadsheet data")
    Meta->>Scope: Apply scope filters
    Meta->>Embed: Generate query embedding
    Embed-->>Meta: Query vector
    Meta->>Index: Semantic similarity search
    Index-->>Meta: Top-K tool IDs + scores
    Meta->>Scope: Filter results by scope
    Meta-->>AI: Matching tools with scores

    Note over AI,Meta: Understanding Phase
    AI->>Meta: describe_tool("analytics/spreadsheet_analyzer")
    Meta->>Scope: Verify tool in scope
    Meta-->>AI: Full schema, examples, metrics

    Note over AI,Meta: Execution Phase
    AI->>Meta: execute_tool("analytics/spreadsheet_analyzer", {...})
    Meta->>Scope: Verify tool in scope
    Meta->>Meta: Validate arguments vs schema
    Meta->>Router: Route to backend
    Router->>Backend: Invoke tool (with auth)
    Backend-->>Router: Result
    Router-->>Meta: Result
    Meta-->>AI: Formatted result + metadata

🔒 Tool Hiding Mechanism

When hide_underlying_tools: true:

  1. tools/list endpoint: Returns ONLY the 6 meta-tools, not the underlying tools
  2. MCP tools/list: Same—only meta-tools visible to AI agents
  3. Direct tool execution: Blocked—must go through execute_tool
  4. Tool discovery: Only possible via search_tools and list_tools meta-tools
  5. Scope enforcement: Even discovered tools are filtered by scope rules

┌─────────────────────────────────────────────────────────────────────┐
│                     AI Agent View                                    │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │  Available Tools (6 only):                                   │   │
│  │  • search_tools    • list_tools    • describe_tool          │   │
│  │  • execute_tool    • get_tool_categories  • get_similar_tools│   │
│  └─────────────────────────────────────────────────────────────┘   │
│                              │                                      │
│                              ▼                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │              Meta-Server Abstraction Layer                   │   │
│  │  • Semantic search via embeddings                            │   │
│  │  • Scope enforcement                                         │   │
│  │  • Request routing                                           │   │
│  │  • Auth handling                                             │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                              │                                      │
│                              ▼                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │              Hidden Tool Layer (1000s of tools)              │   │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐       │   │
│  │  │Server A  │ │Server B  │ │Server C  │ │Server D  │ ...   │   │
│  │  │150 tools │ │200 tools │ │180 tools │ │300 tools │       │   │
│  │  └──────────┘ └──────────┘ └──────────┘ └──────────┘       │   │
│  └─────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘
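
The hiding rules listed above can be modelled in a few lines. This is a toy sketch, not the gateway's actual classes: `MetaServer`, `tools_list`, `tools_call`, and the in-memory `audit_log` are hypothetical names, only `execute_tool` is wired up among the meta-tools, and scope checks are omitted.

```python
META_TOOLS = ["search_tools", "list_tools", "describe_tool",
              "execute_tool", "get_tool_categories", "get_similar_tools"]

class MetaServer:
    """Toy model of the tool-hiding rules."""

    def __init__(self, underlying, hide_underlying_tools=True):
        self.underlying = underlying  # qualified name -> callable
        self.hide = hide_underlying_tools
        self.audit_log = []

    def tools_list(self):
        """What an agent sees when it lists tools."""
        if self.hide:
            return list(META_TOOLS)
        return META_TOOLS + sorted(self.underlying)

    def tools_call(self, name, arguments):
        """Entry point for tool invocation."""
        if name == "execute_tool":
            target = self.underlying.get(arguments["tool_name"])
            if target is None:
                raise LookupError("tool not found or out of scope")
            return target(**arguments.get("arguments", {}))
        if name in self.underlying:
            if self.hide:
                self.audit_log.append(name)  # record the bypass attempt
                raise PermissionError("direct invocation is blocked; use execute_tool")
            return self.underlying[name](**arguments)
        raise LookupError(f"tool not available in this sketch: {name}")

server = MetaServer({"analytics-server/csv_analyzer": lambda file_url: {"summary": file_url}})
visible = server.tools_list()
via_meta = server.tools_call("execute_tool", {
    "tool_name": "analytics-server/csv_analyzer",
    "arguments": {"file_url": "https://example.com/data.csv"},
})
try:
    server.tools_call("analytics-server/csv_analyzer", {"file_url": "https://example.com/data.csv"})
    blocked = False
except PermissionError:
    blocked = True
```

Logging the blocked call corresponds to the "audit logging for hidden tool access attempts" task in the implementation list.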

🔍 Search Filter Types

| Filter | Type | Description | Example |
|---|---|---|---|
| query | string | Semantic/natural language search | "convert PDF to text" |
| tags | array | Include tools with ANY of these tags | ["production", "stable"] |
| exclude_tags | array | Exclude tools with ANY of these tags | ["deprecated", "experimental"] |
| servers | array | Include tools from these servers only | ["analytics-server"] |
| categories | array | Include tools in these categories | ["Data Processing"] |
| integration_type | enum | Filter by REST, MCP, or any | "MCP" |
| name_pattern | regex | Match tool names by pattern | "^report_.*" |
| description_contains | string | Full-text search in descriptions | "spreadsheet" |
| min_score | float | Minimum semantic similarity (0-1) | 0.75 |
| has_examples | bool | Only tools with usage examples | true |
| visibility | array | Filter by visibility level | ["public", "team"] |
| teams | array | Filter by team ownership | ["finance-team"] |

📋 Implementation Tasks

  • Meta-Server Type

    • Add "meta" server type to virtual server framework
    • Implement hide_underlying_tools flag
    • Define meta-tool schemas (all 6 tools)
    • Implement scope configuration model
    • Add meta_config options
  • search_tools Implementation

  • list_tools Implementation

    • Paginated tool listing
    • All sort options (name, created_at, execution_count, etc.)
    • Scope-aware filtering
    • Optional schema inclusion
  • describe_tool Implementation

    • Full tool details retrieval
    • Example retrieval from tool metadata
    • Metrics aggregation
    • Similar tools via embedding similarity
    • Scope verification
  • execute_tool Implementation

    • Tool name/ID resolution
    • Scope verification
    • Argument validation against input_schema
    • Request routing to correct backend
    • Auth header forwarding
    • Response formatting
    • Execution metadata inclusion
    • Error handling and retries
  • get_tool_categories Implementation

    • Category aggregation from tool metadata
    • Tag frequency counting
    • Server listing (optional)
    • Scope-aware filtering
  • get_similar_tools Implementation

    • Embedding-based similarity search
    • Similarity scoring and explanation
    • Scope filtering on results
  • Tool Hiding Logic

    • Modify tools/list to respect hide_underlying_tools
    • Block direct tool execution when hidden
    • Audit logging for hidden tool access attempts
  • Admin UI

    • Meta-server creation wizard
    • Scope configuration UI (visual rule builder)
    • Meta-tool testing interface
    • Tool visibility preview
  • Testing

    • Unit tests for each meta-tool
    • Integration tests with backend servers
    • Scope filtering tests (all filter types)
    • Tool hiding enforcement tests
    • Performance tests with 1000+ tools

⚙️ Configuration Example

meta_server:
  # Default meta-server for all users
  default:
    enabled: true
    name: "tool-discovery"
    hide_underlying_tools: true

  # Scope templates (reusable)
  scope_templates:
    production:
      include_tags: ["production", "stable"]
      exclude_tags: ["experimental", "deprecated", "internal"]
    internal:
      include_visibility: ["internal", "team"]
      include_teams: ["platform-team"]
    finance:
      include_tags: ["finance", "accounting"]
      include_servers: ["quickbooks", "reporting", "erp"]

  # Meta-tool configuration
  meta_config:
    enable_semantic_search: true
    enable_categories: true
    enable_similar_tools: true
    default_search_limit: 10
    max_search_limit: 50
    include_metrics_in_search: false
    embedding_model: "text-embedding-3-small"

  # Execution settings
  execution:
    timeout_seconds: 30
    retry_attempts: 2
    log_all_executions: true
    validate_arguments: true
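
The `timeout_seconds` and `retry_attempts` settings could be applied with a wrapper along these lines. This is a sketch under assumptions the issue does not confirm: `retry_attempts` is read as the number of retries after the first try, only timeouts and connection errors are retried, and the backoff delays are arbitrary. `call_with_retry` and `flaky_backend` are hypothetical names.

```python
import asyncio

async def call_with_retry(invoke, timeout_seconds=30, retry_attempts=2):
    """Run `invoke()` with a per-attempt timeout, retrying on timeout or
    connection errors. `retry_attempts` counts retries after the first try."""
    last_error = None
    for attempt in range(retry_attempts + 1):
        try:
            return await asyncio.wait_for(invoke(), timeout=timeout_seconds)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < retry_attempts:
                await asyncio.sleep(0.01 * 2 ** attempt)  # short exponential backoff
    raise last_error

# Hypothetical flaky backend: fails once, then succeeds.
calls = {"count": 0}

async def flaky_backend():
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("backend unavailable")
    return {"summary": "ok"}

outcome = asyncio.run(call_with_retry(flaky_backend, timeout_seconds=1, retry_attempts=2))
```

Automatic retries are only safe for tools whose effects can be repeated; one option would be to retry only when the tool's `idempotentHint` annotation is true.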

✅ Success Criteria

  • Meta-server type fully functional
  • All 6 meta-tools implemented and working
  • search_tools returns accurate semantic results with <100ms latency
  • list_tools provides efficient paginated browsing
  • describe_tool returns complete tool information
  • execute_tool routes correctly to any backend
  • get_tool_categories provides accurate taxonomy
  • get_similar_tools finds relevant alternatives
  • All filter types work correctly
  • Tool hiding completely prevents direct access
  • Scope filtering enforced on all operations
  • Admin UI for meta-server management
  • Works with 1000+ tools without performance degradation
  • 80%+ test coverage

🔗 Related Issues

Metadata

Labels

  • MUST: P1: Non-negotiable, critical requirements without which the product is non-functional or unsafe
  • enhancement: New feature or request
  • python: Python / backend development (FastAPI)
  • sweng-group-19: Group 19 - AI-Powered Conversational Gateway & Semantic Discovery
  • tcd: SwEng Projects