title Composite tools and workflows
description Create multi-step workflows that span multiple backend MCP servers.

Composite tools let you define multi-step workflows that execute across multiple backend MCP servers with parallel execution, conditional logic, approval gates, and error handling.

Overview

A composite tool combines multiple backend tool calls into a single workflow. When a client calls a composite tool, vMCP orchestrates the execution across backend MCP servers, handling dependencies and collecting results.

Key capabilities

  • Parallel execution: Independent steps run concurrently; dependent steps wait for their prerequisites
  • Template expansion: Dynamic arguments using step outputs
  • Elicitation: Request user input mid-workflow (approval gates, choices)
  • Iteration: Loop over collections with forEach steps
  • Error handling: Configurable abort, continue, or retry behavior
  • Timeouts: Workflow and per-step timeout configuration

:::info

Elicitation (user prompts during workflow execution) is defined in the CRD but has not been extensively tested. Test thoroughly in non-production environments first.

:::

Configuration location

Composite tools are defined in the VirtualMCPServer resource under spec.config.compositeTools:

apiVersion: toolhive.stacklok.dev/v1beta1
kind: VirtualMCPServer
metadata:
  name: my-vmcp
spec:
  incomingAuth:
    type: anonymous
  groupRef:
    name: my-tools
  config:
    # ... other configuration ...
    compositeTools:
      - name: my_workflow
        description: A multi-step workflow
        parameters:
          # Input parameters (JSON Schema)
        steps:
          # Workflow steps

For complex, reusable workflows, you can also reference external VirtualMCPCompositeToolDefinition resources using spec.config.compositeToolRefs.

Simple example

Here's a composite tool that searches arXiv for papers on a topic and reads the top result. This example assumes you have an MCPServer resource named arxiv in a group that your vMCP server references, and that you're using the default conflict resolution strategy and prefix format (<SERVER_NAME>_<TOOL_NAME>):

spec:
  config:
    compositeTools:
      - name: research_topic
        description: Search arXiv for papers and read the top result
        parameters:
          type: object
          properties:
            query:
              type: string
              description: Research topic to search for
          required:
            - query
        steps:
          # Step 1: Search arXiv for papers matching the query
          - id: search
            tool: arxiv_search_papers
            arguments:
              query: '{{.params.query}}'
              max_results: 1
          # Step 2: Download the paper (required before reading)
          # Note: fromJson is needed when the MCP server returns JSON as text
          # rather than structured content. This is common for servers that
          # don't fully support MCP's structuredContent field.
          - id: download
            tool: arxiv_download_paper
            arguments:
              paper_id:
                '{{(index (fromJson .steps.search.output.text).papers 0).id}}'
            dependsOn: [search]
          # Step 3: Read the downloaded paper content
          - id: read
            tool: arxiv_read_paper
            arguments:
              paper_id:
                '{{(index (fromJson .steps.search.output.text).papers 0).id}}'
            dependsOn: [download]

What's happening:

  1. Parameters: Define the workflow inputs (query for the research topic)
  2. Step 1 (search): Calls arxiv_search_papers with the query from parameters using template syntax {{.params.query}}
  3. Step 2 (download): Waits for search (dependsOn: [search]), then downloads the paper. The fromJson function parses the JSON text returned by the server, and index accesses the first paper's ID.
  4. Step 3 (read): Waits for download, then reads the paper content.

When a client calls this composite tool, vMCP executes all three steps in sequence and returns the paper content.

Structured content vs JSON text

MCP servers can return data in two ways:

  • Structured content: Data is in structuredContent and can be accessed directly: {{.steps.stepid.output.field}}
  • JSON text: Data is returned as a JSON string in the text field and requires parsing: {{(fromJson .steps.stepid.output.text).field}}

The arxiv-mcp-server in this example uses JSON text, so we use fromJson. Check your backend's response format to determine which approach to use.

Use cases

Incident investigation

Gather data from multiple monitoring systems in parallel:

spec:
  config:
    compositeTools:
      - name: investigate_incident
        description: Gather incident data from multiple sources in parallel
        parameters:
          type: object
          properties:
            incident_id:
              type: string
          required:
            - incident_id
        steps:
          # These steps run in parallel (no dependencies)
          - id: get_logs
            tool: logging_search_logs
            arguments:
              query: 'incident_id={{.params.incident_id}}'
              timerange: '1h'
          - id: get_metrics
            tool: monitoring_get_metrics
            arguments:
              filter: 'error_rate'
              timerange: '1h'
          - id: get_alerts
            tool: pagerduty_list_alerts
            arguments:
              incident: '{{.params.incident_id}}'
          # This step waits for all parallel steps to complete
          - id: create_summary
            tool: docs_create_document
            arguments:
              title: 'Incident {{.params.incident_id}} Summary'
              content: 'Logs: {{.steps.get_logs.output.results}}'
            dependsOn: [get_logs, get_metrics, get_alerts]
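
The summary step above only includes the log results, even though it waits for all three parallel steps. As a sketch, the same step could fold in the other two outputs by serializing each step's output map with the json template function. Whether the (hypothetical) docs_create_document tool accepts a multi-line content value is an assumption here:

```yaml
          # Variant of the create_summary step that uses all three outputs
          - id: create_summary
            tool: docs_create_document
            arguments:
              title: 'Incident {{.params.incident_id}} Summary'
              content: |
                Logs: {{json .steps.get_logs.output.results}}
                Metrics: {{json .steps.get_metrics.output}}
                Alerts: {{json .steps.get_alerts.output}}
            dependsOn: [get_logs, get_metrics, get_alerts]
```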

Deployment with approval

Human-in-the-loop workflow for production deployments:

spec:
  config:
    compositeTools:
      - name: deploy_with_approval
        description: Deploy to production with human approval gate
        parameters:
          type: object
          properties:
            pr_number:
              type: string
            environment:
              type: string
              default: production
          required:
            - pr_number
        steps:
          - id: get_pr_details
            tool: github_get_pull_request
            arguments:
              pr: '{{.params.pr_number}}'
          - id: approval
            type: elicitation
            message:
              'Deploy PR #{{.params.pr_number}} to {{.params.environment}}?'
            schema:
              type: object
              properties:
                approved:
                  type: boolean
            timeout: '10m'
            dependsOn: [get_pr_details]
          - id: deploy
            tool: deploy_trigger_deployment
            arguments:
              ref: '{{.steps.get_pr_details.output.head_sha}}'
              environment: '{{.params.environment}}'
            condition: '{{.steps.approval.content.approved}}'
            dependsOn: [approval]

Cross-system data aggregation

Collect and correlate data from multiple backend MCP servers:

spec:
  config:
    compositeTools:
      - name: security_scan_report
        description: Run security scans and create consolidated report
        parameters:
          type: object
          properties:
            package_name:
              type: string
            ecosystem:
              type: string
            repo:
              type: string
          required:
            - package_name
            - ecosystem
            - repo
        steps:
          - id: vulnerability_scan
            tool: osv_query_vulnerability
            arguments:
              package_name: '{{.params.package_name}}'
              ecosystem: '{{.params.ecosystem}}'
          - id: secret_scan
            tool: gitleaks_scan_repo
            arguments:
              repository: '{{.params.repo}}'
          - id: create_issue
            tool: github_create_issue
            arguments:
              repo: '{{.params.repo}}'
              title: 'Security Scan Results'
              body: 'Vulnerability scan completed for {{.params.package_name}}'
            dependsOn: [vulnerability_scan, secret_scan]
            onError:
              action: continue

Workflow definition

Parameters

Define input parameters using JSON Schema format:

spec:
  config:
    compositeTools:
      - name: <TOOL_NAME>
        parameters:
          type: object
          properties:
            required_param:
              type: string
            optional_param:
              type: integer
              default: 10
          required:
            - required_param

Steps

Each step can be a tool call, an elicitation, or a forEach loop:

spec:
  config:
    compositeTools:
      - name: <TOOL_NAME>
        steps:
          - id: step_name # Unique identifier
            tool: backend_tool # Tool to call
            arguments: # Arguments with template expansion
              arg1: '{{.params.input}}'
            dependsOn: [other_step] # Dependencies (this step waits for other_step)
            condition: '{{.steps.check.output.approved}}' # Optional condition
            timeout: '30s' # Step timeout
            onError:
              action: abort # abort | continue | retry

The tool field specifies which MCP server tool to call. This depends on your conflict resolution strategy and prefix format. For example, if you have a tool named search in an MCP server named arxiv, and you're using the default prefix format, you would reference it as arxiv_search.

:::tip

When using the condition field, downstream steps that reference the conditional step's output may require default step outputs to handle cases where the condition evaluates to false.

:::

Elicitation (user prompts)

Request input from users during workflow execution:

spec:
  config:
    compositeTools:
      - name: <TOOL_NAME>
        steps:
          - id: approval
            type: elicitation
            message: 'Proceed with deployment?'
            schema:
              type: object
              properties:
                confirm: { type: boolean }
            timeout: '5m'

forEach steps

Iterate over a collection from a previous step's output and execute a tool call for each item:

spec:
  config:
    compositeTools:
      - name: scan_repositories
        description: Check each repository for security advisories
        parameters:
          type: object
          properties:
            org:
              type: string
          required:
            - org
        steps:
          - id: list_repos
            tool: github_list_repos
            arguments:
              org: '{{.params.org}}'
          # highlight-start
          - id: check_advisories
            type: forEach
            collection: '{{json .steps.list_repos.output.repositories}}'
            itemVar: repo
            maxParallel: 5
            step:
              type: tool
              tool: github_list_security_advisories
              arguments:
                repo: '{{.forEach.repo.name}}'
            onError:
              action: continue
            dependsOn: [list_repos]
          # highlight-end

forEach fields:

| Field | Description | Default |
| --- | --- | --- |
| collection | Template expression that resolves to a JSON array | - |
| itemVar | Variable name for the current item | item |
| maxParallel | Maximum concurrent iterations (max 50) | 10 |
| maxIterations | Maximum total iterations (max 1000) | 100 |
| step | Inner step definition (tool call to execute per item) | - |
| onError | Error handling: abort (stop) or continue (skip) | abort |

:::note

forEach does not support onError.action: retry. Use retry on regular tool steps. The maxParallel cap of 50 is enforced at runtime regardless of the configured value.

:::

Access the current item inside the inner step using {{.forEach.<itemVar>.<field>}}. In the example above, {{.forEach.repo.name}} accesses the name field of the current repository. You can also use {{.forEach.index}} to access the zero-based iteration index.
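
As a small sketch of the iteration index in use, the inner step below passes a label built from {{.forEach.index}} alongside the item field. The request_label argument is hypothetical and only illustrates the template; the real tool may not accept it:

```yaml
          - id: check_advisories
            type: forEach
            collection: '{{json .steps.list_repos.output.repositories}}'
            itemVar: repo
            step:
              type: tool
              tool: github_list_security_advisories
              arguments:
                repo: '{{.forEach.repo.name}}'
                # Hypothetical argument: tags each call with its position (repo-0, repo-1, ...)
                request_label: 'repo-{{.forEach.index}}'
            dependsOn: [list_repos]
```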

maxParallel controls how many iterations run concurrently on the pod that received the composite tool request. Iterations are not distributed across vMCP replicas; all parallel backend calls originate from a single pod regardless of spec.replicas. When sizing your deployment, account for the per-pod fan-out: a forEach step with maxParallel: 50 can open up to 50 simultaneous connections to backend MCP servers from one pod. Ensure that both the vMCP pod resources and the backend MCP servers can handle that per-pod concurrency.

:::tip[Plan your workflow timeouts]

With maxIterations: 1000 (the maximum) and maxParallel: 10 (the default), a forEach loop runs up to 100 serial batches. If each backend call takes a few seconds, the total duration can easily exceed the workflow-level timeout. Set the workflow timeout to at least ceil(maxIterations / maxParallel) × expected step duration to avoid silent truncation.

:::
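
As a worked instance of this sizing rule, assume maxIterations: 200, maxParallel: 20, and roughly 5 seconds per backend call (all illustrative values). That gives ceil(200 / 20) = 10 serial batches, or about 50 seconds for the loop, so a 2-minute workflow timeout leaves headroom for the listing step as well. A sketch of the corresponding configuration, reusing the scan_repositories example:

```yaml
spec:
  config:
    compositeTools:
      - name: scan_repositories
        # Workflow-level timeout: 10 batches x ~5s = ~50s, plus headroom
        timeout: '2m'
        steps:
          - id: list_repos
            tool: github_list_repos
            arguments:
              org: '{{.params.org}}'
            timeout: '30s' # Step-level timeout for the listing call
          - id: check_advisories
            type: forEach
            collection: '{{json .steps.list_repos.output.repositories}}'
            itemVar: repo
            maxIterations: 200 # Illustrative value (limit is 1000)
            maxParallel: 20 # Illustrative value (limit is 50)
            step:
              type: tool
              tool: github_list_security_advisories
              arguments:
                repo: '{{.forEach.repo.name}}'
            dependsOn: [list_repos]
```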

Error handling

Configure behavior when steps fail:

| Action | Description |
| --- | --- |
| abort | Stop workflow immediately |
| continue | Log error, proceed to next step |
| retry | Retry with exponential backoff |

spec:
  config:
    compositeTools:
      - name: <TOOL_NAME>
        steps:
          - id: <STEP_ID>
            # ... other step config (tool, arguments, etc.)
            onError:
              action: retry
              retryCount: 3

:::tip

When using onError.action: continue, downstream steps that reference this step's output may require default step outputs to handle cases where the step fails.

:::
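
Different steps in one workflow can use different actions. The sketch below retries a required fetch and lets an optional fetch fail without stopping the workflow, supplying defaultResults (described in the next section) as the fallback. The tool names follow the placeholder style used elsewhere on this page, and the secondary source value is an assumption:

```yaml
spec:
  config:
    compositeTools:
      - name: <TOOL_NAME>
        steps:
          # Required step: retry failures up to three times
          - id: fetch_required
            tool: api_get_data
            arguments:
              source: 'primary'
            onError:
              action: retry
              retryCount: 3
          # Optional step: on failure, the error is logged and the fallback is used
          - id: fetch_optional
            tool: api_get_data
            arguments:
              source: 'secondary'
            onError:
              action: continue
            defaultResults:
              data: null
          # Waits for both fetches, then combines whatever is available
          - id: combine
            tool: processing_combine_data
            arguments:
              primary: '{{.steps.fetch_required.output.data}}'
              secondary: '{{.steps.fetch_optional.output.data}}'
            dependsOn: [fetch_required, fetch_optional]
```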

Default step outputs

When steps can be skipped (due to condition being false or onError.action: continue), downstream steps that reference their outputs need fallback values. Use defaultResults to provide these values.

When defaultResults are required

You must provide defaultResults when both of these conditions are true:

  1. A step can be skipped (has a condition field or onError.action: continue)
  2. A downstream step references the skipped step's output in its arguments

Configuration

Define default values that match the expected output structure:

spec:
  config:
    compositeTools:
      - name: optional_security_check
        description: Run security scan with optional vulnerability check
        parameters:
          type: object
          properties:
            package_name:
              type: string
            ecosystem:
              type: string
            run_vuln_scan:
              type: boolean
              default: false
          required:
            - package_name
            - ecosystem
        steps:
          # Step 1: Optional vulnerability scan
          - id: vuln_scan
            tool: osv_query_vulnerability
            arguments:
              package_name: '{{.params.package_name}}'
              ecosystem: '{{.params.ecosystem}}'
            condition: '{{.params.run_vuln_scan}}'
            # highlight-start
            defaultResults:
              vulns: []
            # highlight-end
          # Step 2: Create report using scan results
          - id: create_report
            tool: docs_create_document
            arguments:
              title: 'Security Report'
              # This references vuln_scan output, so defaultResults are needed
              body:
                'Found {{len .steps.vuln_scan.output.vulns}} vulnerabilities'
            dependsOn: [vuln_scan]

Continue on error example

When using onError.action: continue, provide defaults for potential failures:

spec:
  config:
    compositeTools:
      - name: multi_source_data
        description: Gather data from multiple sources, continue on failures
        steps:
          # Step 1: Fetch from primary source (may fail)
          - id: fetch_primary
            tool: api_get_data
            arguments:
              source: 'primary'
            onError:
              action: continue
            # highlight-start
            defaultResults:
              status: 'unavailable'
              data: null
            # highlight-end
          # Step 2: Aggregate results
          - id: aggregate
            tool: processing_combine_data
            arguments:
              # Uses fetch_primary output even if it failed
              primary: '{{.steps.fetch_primary.output.data}}'
            dependsOn: [fetch_primary]

Validation

vMCP validates defaultResults at configuration time:

  • Missing defaults: If a step can be skipped and downstream steps reference its output, but defaultResults is not provided, vMCP returns a validation error.
  • Structure: The defaultResults value can be any valid JSON type (object, array, string, number, boolean, null).
  • No type checking: vMCP does not verify that defaultResults match the actual output structure. You must ensure they match the format your downstream steps expect.

Example validation error

# This will fail validation
steps:
  - id: conditional_step
    tool: backend_fetch
    condition: '{{.params.enabled}}'
    # Missing defaultResults!
  - id: use_result
    tool: backend_process
    arguments:
      # References conditional_step output
      data: '{{.steps.conditional_step.output.value}}'
    dependsOn: [conditional_step]

Error message:

step 'conditional_step' can be skipped but is referenced by downstream steps
without defaultResults defined
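
A version of the same steps that should pass validation adds defaultResults to the skippable step. The empty-string fallback is an illustrative assumption; use whatever shape the downstream tool expects:

```yaml
steps:
  - id: conditional_step
    tool: backend_fetch
    condition: '{{.params.enabled}}'
    # Fallback output used when the condition is false and the step is skipped
    defaultResults:
      value: ''
  - id: use_result
    tool: backend_process
    arguments:
      # Resolves to the real value, or to '' when conditional_step is skipped
      data: '{{.steps.conditional_step.output.value}}'
    dependsOn: [conditional_step]
```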

Template syntax

Access workflow context in arguments:

| Template | Description |
| --- | --- |
| {{.params.name}} | Input parameter |
| {{.steps.id.output}} | Step output (map) |
| {{.steps.id.output.text}} | Text content from step output |
| {{.steps.id.content}} | Elicitation response content |
| {{.steps.id.action}} | Elicitation action (accept/decline/cancel) |
| {{.forEach.<itemVar>}} | Current forEach item |
| {{.forEach.<itemVar>.<field>}} | Field on current forEach item |
| {{.forEach.index}} | Zero-based iteration index |

Template functions

The following functions are available for use in templates:

| Function | Description | Example |
| --- | --- | --- |
| fromJson | Parse a JSON string into a value | {{(fromJson .steps.s1.output.text).field}} |
| json | Encode a value as a JSON string | {{json .steps.s1.output}} |
| quote | Quote a string value | {{quote .params.name}} |
| index | Access array elements by index | {{index .steps.s1.output.items 0}} |

All Go template built-in functions are also supported (e.g., len, eq, and, or, printf).
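
To show these functions side by side, here is a sketch of a single step that uses each of them against the search step from the arXiv example. The argument names (first_id, count, raw, quoted_query, title) are hypothetical and exist only to carry the template results:

```yaml
steps:
  - id: summarize
    tool: docs_create_document
    arguments:
      # fromJson + index: ID of the first paper in a JSON text response
      first_id: '{{(index (fromJson .steps.search.output.text).papers 0).id}}'
      # len (built-in): number of papers in the parsed response
      count: '{{len (fromJson .steps.search.output.text).papers}}'
      # json: re-encode the whole step output as a JSON string
      raw: '{{json .steps.search.output}}'
      # quote: wrap the input parameter in quotes
      quoted_query: '{{quote .params.query}}'
      # printf (built-in): format a string from a parameter
      title: '{{printf "Results for %s" .params.query}}'
    dependsOn: [search]
```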

Accessing step outputs

When an MCP server returns structured content, you can access output fields directly:

# Direct access when server supports structuredContent
result: '{{.steps.fetch.output.data}}'
items: '{{index .steps.search.output.results 0}}'

This is the simplest approach and works when the backend MCP server populates the structuredContent field in its response.

Working with JSON text responses

Some MCP servers return structured data as JSON text rather than using MCP's structuredContent field. When this happens, use fromJson to parse it:

# Parse JSON text and access a nested field
paper_id: '{{(index (fromJson .steps.search.output.text).papers 0).id}}'

This pattern:

  1. Gets the text output: .steps.search.output.text
  2. Parses it as JSON: fromJson ...
  3. Accesses the papers array and gets the first element: index ... 0
  4. Gets the id field: .id

How to tell which approach to use: Call the backend tool directly and inspect the response. If structuredContent contains your data fields, use direct access. If structuredContent only has a text field containing JSON, use fromJson.

Complete example

A VirtualMCPServer with an inline composite tool using the arxiv-mcp-server:

apiVersion: toolhive.stacklok.dev/v1beta1
kind: VirtualMCPServer
metadata:
  name: research-vmcp
  namespace: toolhive-system
spec:
  incomingAuth:
    type: anonymous
  groupRef:
    name: research-tools
  config:
    aggregation:
      conflictResolution: prefix
      conflictResolutionConfig:
        prefixFormat: '{workload}_'
    compositeTools:
      - name: research_topic
        description: Search arXiv for papers and read the top result
        parameters:
          type: object
          properties:
            query:
              type: string
              description: Research topic to search for
          required:
            - query
        steps:
          - id: search
            tool: arxiv_search_papers
            arguments:
              query: '{{.params.query}}'
              max_results: 1
          - id: download
            tool: arxiv_download_paper
            arguments:
              paper_id:
                '{{(index (fromJson .steps.search.output.text).papers 0).id}}'
            dependsOn: [search]
          - id: read
            tool: arxiv_read_paper
            arguments:
              paper_id:
                '{{(index (fromJson .steps.search.output.text).papers 0).id}}'
            dependsOn: [download]
        timeout: '5m'

Note: The example above assumes you have:

  • An MCPGroup named research-tools.
  • An arxiv-mcp-server deployed as an MCPServer or MCPRemoteProxy resource that references the research-tools group.

For a complete example of configuring MCP groups and backend servers, see the quickstart and tool aggregation guides.

For complex, reusable workflows, create VirtualMCPCompositeToolDefinition resources and reference them with spec.config.compositeToolRefs:

spec:
  groupRef:
    name: my-tools
  config:
    compositeToolRefs:
      - name: my-reusable-workflow
      - name: another-workflow

Next steps

Related information