MCP Tools & Permissions

The handler filters tools dynamically based on the Sysdig user's permissions. Each tool declares mandatory permissions via WithRequiredPermissions.
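The filtering described above could look roughly like the sketch below. It assumes that a tool is exposed only when the user holds every permission the tool declares, and that a tool with no declared permissions is always exposed. The `Tool` type and the `filterTools` and `hasAll` functions are hypothetical names used for illustration, not the handler's actual code.

```go
package main

import "fmt"

// Tool is a hypothetical stand-in for a registered MCP tool; the real type
// in the repository may look different.
type Tool struct {
	Name                string
	RequiredPermissions []string
}

// filterTools returns the tools whose required permissions are all present
// in granted. A tool with no required permissions is always kept.
func filterTools(tools []Tool, granted []string) []Tool {
	have := make(map[string]struct{}, len(granted))
	for _, p := range granted {
		have[p] = struct{}{}
	}
	var visible []Tool
	for _, t := range tools {
		if hasAll(have, t.RequiredPermissions) {
			visible = append(visible, t)
		}
	}
	return visible
}

// hasAll reports whether every permission in required is in have.
func hasAll(have map[string]struct{}, required []string) bool {
	for _, p := range required {
		if _, ok := have[p]; !ok {
			return false
		}
	}
	return true
}

func main() {
	tools := []Tool{
		{Name: "k8s_list_clusters", RequiredPermissions: []string{"metrics-data.read"}},
		{Name: "run_sysql", RequiredPermissions: []string{"sage.exec", "risks.read"}},
	}
	// A user holding only metrics-data.read would not see run_sysql.
	for _, t := range filterTools(tools, []string{"metrics-data.read"}) {
		fmt.Println(t.Name)
	}
}
```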

Sysdig Monitor

| Tool | File | Capability | Required Permissions | Useful Prompts |
|---|---|---|---|---|
| k8s_list_clusters | tool_k8s_list_clusters.go | Lists Kubernetes cluster information. | metrics-data.read | "List all Kubernetes clusters" |
| k8s_list_nodes | tool_k8s_list_nodes.go | Lists Kubernetes node information. | metrics-data.read | "List all Kubernetes nodes in the cluster 'production-gke'" |
| k8s_list_workloads | tool_k8s_list_workloads.go | Lists Kubernetes workload information. | metrics-data.read | "List all desired workloads in the cluster 'production-gke' and namespace 'default'" |
| k8s_list_pod_containers | tool_k8s_list_pod_containers.go | Retrieves information from a particular pod and container. | metrics-data.read | "Show me info for pod 'my-pod' in cluster 'production-gke'" |
| k8s_list_cronjobs | tool_k8s_list_cronjobs.go | Retrieves information from the cronjobs in the cluster. | metrics-data.read | "List all cronjobs in cluster 'prod' and namespace 'default'" |
| k8s_list_count_pods_per_cluster | tool_k8s_list_count_pods_per_cluster.go | List the count of running Kubernetes Pods grouped by cluster and namespace. | metrics-data.read | "List the count of running Kubernetes Pods in cluster 'production'" |
| k8s_list_top_unavailable_pods | tool_k8s_list_top_unavailable_pods.go | Shows the top N pods with the highest number of unavailable or unready replicas. | metrics-data.read | "Show the top 20 unavailable pods in cluster 'production'" |
| k8s_list_top_restarted_pods | tool_k8s_list_top_restarted_pods.go | Lists the pods with the highest number of container restarts. | metrics-data.read | "Show the top 10 pods with the most container restarts in cluster 'production'" |
| k8s_list_top_http_errors_in_pods | tool_k8s_list_top_http_errors_in_pods.go | Lists the pods with the highest rate of HTTP 4xx and 5xx errors over a specified time interval. | metrics-data.read | "Show the top 20 pods with the most HTTP errors in cluster 'production'" |
| k8s_list_top_network_errors_in_pods | tool_k8s_list_top_network_errors_in_pods.go | Shows the top network errors by pod over a given interval. | metrics-data.read | "Show the top 10 pods with the most network errors in cluster 'production'" |
| k8s_list_top_cpu_consumed_workload | tool_k8s_list_top_cpu_consumed_workload.go | Identifies the Kubernetes workloads (all containers) consuming the most CPU (in cores). | metrics-data.read | "Show the top 10 workloads consuming the most CPU in cluster 'production'" |
| k8s_list_top_cpu_consumed_container | tool_k8s_list_top_cpu_consumed_container.go | Identifies the Kubernetes containers consuming the most CPU (in cores). | metrics-data.read | "Show the top 10 containers consuming the most CPU in cluster 'production'" |
| k8s_list_top_memory_consumed_workload | tool_k8s_list_top_memory_consumed_workload.go | Lists memory-intensive workloads (all containers). | metrics-data.read | "Show the top 10 workloads consuming the most memory in cluster 'production'" |
| k8s_list_top_memory_consumed_container | tool_k8s_list_top_memory_consumed_container.go | Lists memory-intensive containers. | metrics-data.read | "Show the top 10 containers consuming the most memory in cluster 'production'" |
| k8s_list_underutilized_pods_cpu_quota | tool_k8s_list_underutilized_pods_cpu_quota.go | List Kubernetes pods with CPU usage below 25% of the quota limit. | metrics-data.read | "Show the top 10 underutilized pods by CPU quota in cluster 'production'" |
| k8s_list_underutilized_pods_memory_quota | tool_k8s_list_underutilized_pods_memory_quota.go | List Kubernetes pods with memory usage below 25% of the limit. | metrics-data.read | "Show the top 10 underutilized pods by memory quota in cluster 'production'" |

Sysdig Monitor & Sysdig Secure

| Tool | File | Capability | Required Permissions | Useful Prompts |
|---|---|---|---|---|
| generate_sysql | tool_generate_sysql.go | Convert natural language to SysQL via Sysdig Sage. | sage.exec (does not work with Service Accounts) | "Create a SysQL to list S3 buckets." |
| run_sysql | tool_run_sysql.go | Execute caller-supplied Sysdig SysQL queries safely. | sage.exec, risks.read | "Run the following SysQL…" |

Dedicated Sysdig Secure tools (runtime events, event details, process trees) live in the separate @sysdig/secure-mcp-server package.

Historical range (start / end)

All Sysdig Monitor k8s_list_* tools accept two optional parameters:

  • start — RFC3339 timestamp, e.g. 2026-04-16T00:00:00Z
  • end — RFC3339 timestamp, e.g. 2026-04-16T01:00:00Z

When omitted, tools return an instant snapshot (current behaviour). When provided, the underlying PromQL is wrapped in the aggregation appropriate for each tool and evaluated at end:

| Tool group | Wrapping applied when windowed |
|---|---|
| CPU / memory usage, underutilized quota, pod count | avg_over_time(metric[Ns]) |
| Top restarted pods | increase(kube_pod_container_status_restarts_total[Ns]) |
| Top unavailable pods | min_over_time(kube_workload_status_unavailable[Ns]) >= 1 (Sysdig-canonical pattern; requires continuous unavailability for the entire window) |
| HTTP / network errors | sum_over_time(metric[Ns]) / N (rate per second) |
| Inventory tools (clusters, nodes, workloads, pod_containers, cronjobs) | max_over_time(metric[Ns]) > 0 (workloads with status=ready/desired/running drop the > 0 guard) |
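As an illustration of the wrapping patterns above, the following minimal sketch builds two of the windowed expressions from a pair of RFC3339 timestamps. It assumes that N is the window length in whole seconds (end minus start). The function names `windowSeconds`, `wrapAvg`, and `wrapErrorRate` and the metric name `example_metric` are hypothetical; the repository's own helpers may be organized differently.

```go
package main

import (
	"fmt"
	"time"
)

// windowSeconds parses two RFC3339 timestamps and returns the window length
// in whole seconds. Assumption: N in the wrapping patterns equals end - start.
func windowSeconds(start, end string) (int64, error) {
	s, err := time.Parse(time.RFC3339, start)
	if err != nil {
		return 0, fmt.Errorf("invalid start: %w", err)
	}
	e, err := time.Parse(time.RFC3339, end)
	if err != nil {
		return 0, fmt.Errorf("invalid end: %w", err)
	}
	n := int64(e.Sub(s) / time.Second)
	if n <= 0 {
		return 0, fmt.Errorf("end must be after start")
	}
	return n, nil
}

// wrapAvg builds the pattern used for the CPU / memory usage group.
func wrapAvg(metric string, n int64) string {
	return fmt.Sprintf("avg_over_time(%s[%ds])", metric, n)
}

// wrapErrorRate builds the per-second rate pattern used for the
// HTTP / network error group.
func wrapErrorRate(metric string, n int64) string {
	return fmt.Sprintf("sum_over_time(%s[%ds]) / %d", metric, n, n)
}

func main() {
	n, err := windowSeconds("2026-04-16T00:00:00Z", "2026-04-16T01:00:00Z")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// example_metric is a placeholder, not a real Sysdig metric name.
	fmt.Println(wrapAvg("example_metric", n))
	fmt.Println(wrapErrorRate("example_metric", n))
}
```

The resulting expression would then be evaluated as an instant query at `end`, as described above.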

Validation rules (helper: utils.go):

  • end without start → error.
  • start without end → end defaults to now.
  • end in the future → clamped to now.
  • end <= start → error.
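The validation rules above can be sketched as a single function. This is an illustration under stated assumptions, not the actual helper in utils.go: the names `window`, `resolveWindow`, and `mustParse` are hypothetical, and the order of the checks (clamp a future end first, then compare against start) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// window is a hypothetical result type: Windowed is false for an instant
// snapshot (neither start nor end supplied).
type window struct {
	Start, End time.Time
	Windowed   bool
}

// mustParse parses an RFC3339 timestamp and panics on failure; it is only
// meant for fixed literals in examples and tests.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

// resolveWindow applies the four rules; empty strings mean "not supplied".
func resolveWindow(startStr, endStr string, now time.Time) (window, error) {
	if startStr == "" && endStr == "" {
		return window{}, nil // instant snapshot
	}
	if startStr == "" {
		return window{}, errors.New("end requires start")
	}
	start, err := time.Parse(time.RFC3339, startStr)
	if err != nil {
		return window{}, fmt.Errorf("invalid start: %w", err)
	}
	end := now // start without end: end defaults to now
	if endStr != "" {
		end, err = time.Parse(time.RFC3339, endStr)
		if err != nil {
			return window{}, fmt.Errorf("invalid end: %w", err)
		}
		if end.After(now) {
			end = now // end in the future is clamped to now
		}
	}
	if !end.After(start) {
		return window{}, errors.New("end must be after start")
	}
	return window{Start: start, End: end, Windowed: true}, nil
}

func main() {
	now := mustParse("2026-04-16T02:00:00Z")
	w, err := resolveWindow("2026-04-16T00:00:00Z", "", now)
	fmt.Println(w.Start, w.End, err)
}
```

Passing `now` as a parameter rather than calling `time.Now()` inside the function keeps the rules easy to exercise in tests with fixed timestamps.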

Windowed queries carry a 60 s client-side PromQL Timeout to fail fast before the Sysdig edge proxy's own 80–90 s cut-off.

Adding a New Tool

  1. See other tools: Review how existing tools are implemented to understand the structure a new tool is expected to follow.

  2. Create Files: Add tool_<name>.go and tool_<name>_test.go in internal/infra/mcp/tools/.

  3. Implement the Tool:

    • Define a struct that holds the Sysdig client, or any required collaborator.
    • Implement the handle method, which contains the tool's core logic.
    • Implement the RegisterInServer method to define the tool's MCP schema, including its name, description, parameters, and required permissions. Use helpers from utils.go.
    • If a tool does not have any required permission, just specify WithRequiredPermissions(). If the tool requires one or multiple permissions, specify them like WithRequiredPermissions("a.permission", "another.permission").
  4. Write Tests: Use Ginkgo/Gomega to write BDD-style tests. Mock the Sysdig client to cover:

    • Parameter validation
    • Permission metadata
    • Sysdig API client interactions (mocked)
    • Error handling
  5. Register the Tool: Add the new tool to setupHandler() in cmd/server/main.go.

  6. Document: Add the new tool to the README.md and the table in this document.
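The overall shape of steps 3 and 4 might look like the self-contained sketch below. Everything here is hypothetical and simplified so that it runs without external dependencies: `clusterLister` stands in for the Sysdig client, `toolSpec` and `registry` stand in for the MCP schema and server types, and the `RequiredPermissions` field stands in for what WithRequiredPermissions declares. Real tools should follow the existing files in internal/infra/mcp/tools/ and the helpers in utils.go.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
)

// clusterLister is a hypothetical collaborator standing in for the Sysdig client.
type clusterLister interface {
	ListClusters(ctx context.Context) ([]string, error)
}

// toolSpec and registry are hypothetical stand-ins for the MCP tool schema
// and the server the tool registers itself in.
type toolSpec struct {
	Name                string
	Description         string
	RequiredPermissions []string
	Handler             func(ctx context.Context) (string, error)
}

type registry struct{ tools map[string]toolSpec }

func (r *registry) add(t toolSpec) {
	if r.tools == nil {
		r.tools = make(map[string]toolSpec)
	}
	r.tools[t.Name] = t
}

// ToolExample holds the collaborator the tool needs.
type ToolExample struct{ client clusterLister }

func NewToolExample(c clusterLister) *ToolExample { return &ToolExample{client: c} }

// handle contains the tool's core logic.
func (t *ToolExample) handle(ctx context.Context) (string, error) {
	clusters, err := t.client.ListClusters(ctx)
	if err != nil {
		return "", fmt.Errorf("listing clusters: %w", err)
	}
	return strings.Join(clusters, ","), nil
}

// RegisterInServer declares the name, description, and required permissions.
func (t *ToolExample) RegisterInServer(r *registry) {
	r.add(toolSpec{
		Name:                "example_tool",
		Description:         "Hypothetical tool used only for illustration.",
		RequiredPermissions: []string{"metrics-data.read"},
		Handler:             t.handle,
	})
}

// fakeClient is a hand-written test double for the collaborator.
type fakeClient struct {
	clusters []string
	fail     bool
}

func (f fakeClient) ListClusters(context.Context) ([]string, error) {
	if f.fail {
		return nil, errors.New("simulated API failure")
	}
	return f.clusters, nil
}

// runTool looks a tool up by name and invokes its handler.
func runTool(r *registry, name string) (string, error) {
	t, ok := r.tools[name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", name)
	}
	return t.Handler(context.Background())
}

func main() {
	r := &registry{}
	NewToolExample(fakeClient{clusters: []string{"production-gke"}}).RegisterInServer(r)
	out, err := runTool(r, "example_tool")
	fmt.Println(out, err)
}
```

Depending on an interface rather than a concrete client is what makes the mocked-client tests in step 4 possible: the test double covers both the success path and the error path without any network access.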

Testing Philosophy

  • Use BDD-style tests with Ginkgo/Gomega
  • Each tool requires comprehensive test coverage for:
    • Parameter validation (all possible combinations need to be tested)
    • Permission metadata
    • Sysdig API client interactions (mocked using go-mock)
    • Error handling
  • No focused specs (FDescribe, FIt) should be committed