How to ask GitHub Copilot Chat questions and let AI automatically choose the right Kubernetes tools
⭐ Star this repository to support this work
In this chapter, you'll learn the magic of MCP integration: how to ask GitHub Copilot Chat simple questions like "What pods are running in production?" and watch the AI automatically choose and execute the right Kubernetes tools from our MCP server.
💡 The Key Insight: You don't need to remember tool names or parameters. Just ask questions naturally, and the AI figures out what tools to use!
Example of what you'll be able to do:
You: "Are there any failing pods in the default namespace?"
GitHub Copilot: I'll check the pod status for you using the Kubernetes MCP server.
[AI automatically calls k8s_list_pods tool]
[Returns formatted results showing pod status]
Based on the cluster data, I found 2 pods with issues:
- web-app-xyz: CrashLoopBackOff
- api-server-abc: Pending (insufficient resources)
By the end of this chapter, you will understand:
- How the AI decides which MCP tools to use based on your questions
- The exact conversation flow from your question to tool execution to response
- How to ask better questions to get more accurate results
- How to set up VS Code so this "magic" works seamlessly
Let's understand exactly what happens when you ask a question:
```mermaid
graph TB
    A["You ask: What pods are failing?"] --> B[GitHub Copilot Chat]
    B --> C[AI analyzes your question]
    C --> D[AI sees available MCP tools]
    D --> E["AI chooses: k8s_list_pods"]
    E --> F[MCP Client calls our server]
    F --> G[Our server queries Kubernetes]
    G --> H[Server formats response for AI]
    H --> I[AI gets structured data]
    I --> J[AI writes human response]
    J --> K["You see: Found 3 failing pods..."]
```
Step-by-step breakdown:
- Your Question: You type a natural language question
- AI Analysis: Copilot understands what you're asking for
- Tool Selection: AI sees our MCP tools and picks the right one
- Tool Execution: The chosen tool runs against your cluster
- Data Processing: Results are formatted for AI understanding
- Human Response: AI converts technical data into readable answer
Let's get this working with minimal setup. You need three things:
From your terminal in the MCP server directory:
```bash
cd 02-mcp-for-devops/code/04
go run ./cmd/server
```

You'll see output like:

```text
INFO[2025-08-06T10:30:15Z] Starting Kubernetes MCP Server
INFO[2025-08-06T10:30:15Z] Connected to Kubernetes cluster
INFO[2025-08-06T10:30:15Z] MCP server listening on stdio
```
Keep this terminal open - this is your MCP server running.
Create or update .vscode/settings.json in your workspace:
```json
{
  "mcp.enabled": true,
  "chat.mcp.serverSampling": {
    "devops-ai-guidelines/.vscode/mcp.json: k8s-mcp-server": {
      "allowedModels": [
        "github.copilot-chat/claude-sonnet-4"
      ]
    }
  }
}
```

Next, create `.vscode/mcp.json`:

```json
{
  "servers": {
    "k8s-mcp-server": {
      "type": "stdio",
      "command": "${workspaceFolder}/02-mcp-for-devops/code/04/bin/k8s-mcp-server",
      "args": []
    }
  },
  "inputs": []
}
```

What this does: Tells VS Code how to connect to your MCP server.
In VS Code:
- Open Extensions (Ctrl+Shift+X)
- Search for "GitHub Copilot Chat"
- Install it
- Sign in with your GitHub account
That's it! Now let's understand how GitHub Copilot Chat actually finds your MCP server.
This is the crucial part that connects everything together. Here's exactly how GitHub Copilot Chat knows about your MCP server:
```mermaid
graph TB
    A[VS Code starts] --> B[Reads .vscode/settings.json]
    B --> C[Finds mcp.servers configuration]
    C --> D[Starts MCP server process]
    D --> E[MCP server registers its tools]
    E --> F[GitHub Copilot Chat connects]
    F --> G[Copilot sees available tools]
    G --> H[Ready to answer questions]
```
- VS Code Startup: When you open VS Code in your workspace
- Configuration Reading: VS Code reads `.vscode/settings.json`
- MCP Server Discovery: VS Code finds the `mcp.servers` section:

  ```json
  {
    "mcp.servers": {
      "kubernetes": {
        "command": "go",
        "args": ["run", "./cmd/server"],
        "cwd": "${workspaceFolder}/02-mcp-for-devops/code/04"
      }
    }
  }
  ```

- Server Process Start: VS Code automatically runs `go run ./cmd/server`
- Tool Registration: Your MCP server tells VS Code about its available tools:
  - `k8s_list_pods`
  - `k8s_scale_deployment`
  - `k8s_get_pod_logs`
  - etc.
- Copilot Connection: GitHub Copilot Chat connects to VS Code's MCP interface
- Tool Availability: Copilot can now see and use your Kubernetes tools
VS Code acts as the bridge between GitHub Copilot Chat and your MCP server:
GitHub Copilot Chat ←→ VS Code MCP Client ←→ Your MCP Server ←→ Kubernetes
- GitHub Copilot Chat: Doesn't directly connect to your MCP server
- VS Code MCP Client: Built into VS Code, manages MCP server connections
- Your MCP Server: Provides Kubernetes tools and handles requests
- Kubernetes: Your actual cluster that gets queried
- You type in Copilot Chat: "What pods are running?"
- Copilot analyzes: Understands you want pod information
- Copilot checks available tools: Sees `k8s_list_pods` from your MCP server
- Copilot sends request to VS Code: "Use the `k8s_list_pods` tool"
- VS Code forwards to MCP server: Calls your Go server
- Your server queries Kubernetes: Gets actual pod data
- Response flows back: Server → VS Code → Copilot → You
If Copilot Chat doesn't see your tools:
- Check VS Code Output: View → Output → Select "MCP" from dropdown
- Verify server is running: Look for "MCP server listening" message
- Check settings syntax: Ensure JSON is valid in `.vscode/settings.json`
- Restart VS Code: Sometimes needed after configuration changes
Common Issues:
❌ MCP server not starting:
```jsonc
// Wrong path
"cwd": "./wrong/path"
```

✅ Correct path:

```jsonc
// Correct path with workspace variable
"cwd": "${workspaceFolder}/02-mcp-for-devops/code/04"
```

❌ Tool not found:
- Server might not be exposing tools correctly
- Check server logs for tool registration errors
✅ Tools working:
- You'll see tool calls in the MCP output panel
- Copilot will successfully execute commands
The power of MCP is that you don't need to learn commands. Just ask natural questions in GitHub Copilot Chat.
- In VS Code, press `Ctrl+Shift+I` (or `Cmd+Shift+I` on Mac)
- This opens the Copilot Chat panel
- You'll see a text box where you can type questions
Here are examples that will automatically use your MCP server:
Question: "What pods are running in the default namespace?"
What happens behind the scenes:
- AI sees your question about pods
- AI looks at available MCP tools
- AI finds the `k8s_list_pods` tool
- AI calls the tool with `namespace="default"`
- Your MCP server queries Kubernetes
- AI formats the response for you
You'll see something like:
I'll check the pods in the default namespace for you.
[Tool execution: k8s_list_pods]
Here are the pods currently running in the default namespace:
✅ **nginx-deployment-abc123** - Running (Ready: 1/1)
✅ **redis-server-def456** - Running (Ready: 1/1)
❌ **webapp-ghi789** - CrashLoopBackOff (Ready: 0/1)
I notice one pod (webapp-ghi789) is failing. Would you like me to get its logs to investigate?
"Are there any failing deployments in production?"
- AI chooses: `k8s_list_deployments`
- Filters for the production namespace
- Analyzes status
"Scale the web-app deployment to 5 replicas"
- AI chooses: `k8s_scale_deployment`
- Asks for confirmation first
- Executes the scaling
"Show me logs from the api-server pod"
- AI chooses: `k8s_get_pod_logs`
- Finds the pod automatically
- Returns formatted logs
This is the most important concept to understand. Let's see exactly how the AI decides what to do.
When you ask "What pods are failing?", here's the AI's thought process:
```text
1. Parse question: User wants pod status information
2. Check available tools:
   - k8s_list_pods ✓ (can get pod status)
   - k8s_scale_deployment ✗ (for scaling, not listing)
   - k8s_get_pod_logs ✗ (for logs, not status)
3. Choose: k8s_list_pods
4. Determine parameters:
   - namespace: not specified, use "default"
   - filter: look for non-Running status
```
Remember our tool definitions from Chapter 4? The AI uses these descriptions:
```go
{
    Name:        "k8s_list_pods",
    Description: "List all pods in a namespace with their current status",
    // AI sees this and knows: "This tool is for getting pod information"
}

{
    Name:        "k8s_scale_deployment",
    Description: "Scale a Kubernetes deployment to specified replicas",
    // AI sees this and knows: "This tool is for changing deployment size"
}
```

The more descriptive your tool descriptions, the better the AI chooses!
Sometimes the AI needs more information:
You ask: "Scale the deployment"

AI responds: "I can help you scale a deployment. Which deployment would you like to scale, and to how many replicas?"

You ask: "Get logs"

AI responds: "I can get logs for you. Which pod would you like logs from, and from which namespace?"
This happens when:
- Required parameters are missing
- Multiple options are possible
- The action could be destructive
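The server can reinforce this behavior by validating arguments and returning a clarifying message instead of guessing. A hedged sketch, using a plain argument map rather than any specific SDK's types (`validateScaleArgs` is a hypothetical helper):

```go
package main

import "fmt"

// validateScaleArgs checks the arguments for a hypothetical scale call
// and returns a clarifying question when required parameters are missing.
func validateScaleArgs(args map[string]any) (string, bool) {
	name, hasName := args["name"].(string)
	replicas, hasReplicas := args["replicas"].(float64) // JSON numbers decode as float64
	switch {
	case !hasName || name == "":
		return "Which deployment would you like to scale?", false
	case !hasReplicas:
		return "How many replicas should it have?", false
	case replicas < 0:
		return "Replica count must be zero or greater.", false
	}
	return "", true
}

func main() {
	// Missing replicas: the server reports what it needs instead of guessing.
	msg, ok := validateScaleArgs(map[string]any{"name": "web-app"})
	fmt.Println(ok, msg)
}
```

Returning the question as data lets the AI relay it to you in natural language, which is exactly the clarifying-question behavior shown above.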
Let's walk through actual conversations showing the complete flow:
You: "I'm seeing errors in production, can you help me investigate?"
GitHub Copilot: "I'll help you investigate the production issues. Let me start by checking the overall status of your production environment.
[Executing: k8s_list_pods with namespace="production"]
I found several issues in your production namespace:
🔴 Critical Issues:
- `user-service-abc123`: CrashLoopBackOff (restarted 15 times)
- `payment-api-def456`: Error (ImagePullBackOff)
🟡 Warnings:
- `web-frontend-ghi789`: Running but high memory usage
Would you like me to:
- Get detailed logs from the failing pods?
- Check the deployment configurations?
- Look at recent events in the namespace?"
You: "Yes, get logs from the user-service pod"
GitHub Copilot: "I'll get the logs from the failing user-service pod.
[Executing: k8s_get_pod_logs with pod="user-service-abc123", namespace="production"]
Here are the recent logs showing the error:
```text
2025-08-06 10:45:23 ERROR: Failed to connect to database
2025-08-06 10:45:23 ERROR: Connection timeout after 30s
2025-08-06 10:45:23 FATAL: Application shutting down
```
The issue appears to be a database connectivity problem. The pod can't reach the database and times out after 30 seconds.
Would you like me to:
- Check if the database service is running?
- Verify the service endpoints?
- Look at network policies that might block the connection?"
You: "We're getting a traffic spike, can you scale up our web servers?"
GitHub Copilot: "I can help scale up your web servers for the traffic spike. Let me first check the current state of your deployments to recommend the right scaling approach.
[Executing: k8s_list_deployments]
Current deployment status:
📊 Web Services:
- `web-frontend`: 3 replicas (all running)
- `api-gateway`: 2 replicas (all running)
- `user-service`: 1 replica (running)
Based on typical traffic patterns, I recommend:
- Scale `web-frontend` from 3 to 8 replicas
- Scale `api-gateway` from 2 to 5 replicas
- Scale `user-service` from 1 to 3 replicas
Should I proceed with these scaling recommendations?"
You: "Yes, scale them up"
GitHub Copilot: "I'll scale up your services now to handle the traffic spike.
[Executing: k8s_scale_deployment with name="web-app", replicas=8]

[Executing: k8s_scale_deployment with name="api-gateway", replicas=5]

[Executing: k8s_scale_deployment with name="user-service", replicas=3]
✅ Scaling completed successfully:
- `web-frontend`: 3 → 8 replicas (scaling in progress)
- `api-gateway`: 2 → 5 replicas (scaling in progress)
- `user-service`: 1 → 3 replicas (scaling in progress)
The new pods are starting up. It should take about 1-2 minutes for all replicas to be ready. Would you like me to monitor the scaling progress?"
Once you understand the basics, you can ask more sophisticated questions:
"What pods are failing and why?"
- AI uses `k8s_list_pods` to find failing pods
- Then uses `k8s_get_pod_logs` to get error details
- Analyzes logs to determine root cause
"Scale down all non-essential services during maintenance"
- AI uses `k8s_list_deployments` to see all services
- Asks which services are non-essential
- Uses `k8s_scale_deployment` for each one
"The user-service is using too much memory, what should I do?"
- AI understands "user-service" refers to a specific deployment
- Uses `k8s_list_pods` to check current resource usage
- Suggests scaling or resource limit adjustments
- Can execute scaling with your approval
"Our website is down, help me debug it"
- AI creates a systematic debugging approach:
  - Check ingress and load balancer status
  - Verify backend services are running
  - Check pod health and logs
  - Analyze recent deployments or changes
  - Suggest fixes based on findings
For developers who want to understand the technical details:
1. You type: "What pods are running?"
2. GitHub Copilot analyzes your message and determines:
- Intent: Get information about pods
- Required action: List pods
- Namespace: Not specified (default to "default")
3. Copilot looks at available MCP tools and sees:
- Tool name: "k8s_list_pods"
- Description: "List all pods in a namespace with their current status"
- Required parameters: namespace (optional, defaults to "default")
4. Copilot constructs the MCP call:

   ```json
   {
     "method": "tools/call",
     "params": {
       "name": "k8s_list_pods",
       "arguments": {
         "namespace": "default"
       }
     }
   }
   ```
5. Your MCP server receives the call and:
- Validates the namespace parameter
- Calls the Kubernetes API (equivalent to `kubectl get pods -n default`)
- Formats results as markdown for AI consumption
- Returns structured response
6. Copilot receives formatted data and creates human response:
"Here are the pods currently running in the default namespace..."
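Step 5's server-side handling can be sketched with stdlib JSON decoding. The payload shape matches the `tools/call` example above; `parseToolCall` and `namespaceOrDefault` are hypothetical helpers, and a real server would use an MCP library's request types:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// toolCall models the relevant fields of an MCP tools/call request.
type toolCall struct {
	Method string `json:"method"`
	Params struct {
		Name      string            `json:"name"`
		Arguments map[string]string `json:"arguments"`
	} `json:"params"`
}

// parseToolCall decodes a raw tools/call payload.
func parseToolCall(raw []byte) (toolCall, error) {
	var c toolCall
	err := json.Unmarshal(raw, &c)
	return c, err
}

// namespaceOrDefault applies the server-side defaulting described in step 5:
// if the model omitted the namespace, fall back to "default".
func namespaceOrDefault(c toolCall) string {
	if ns := c.Params.Arguments["namespace"]; ns != "" {
		return ns
	}
	return "default"
}

func main() {
	raw := []byte(`{"method":"tools/call","params":{"name":"k8s_list_pods","arguments":{"namespace":"default"}}}`)
	call, err := parseToolCall(raw)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("dispatching %s in namespace %s\n", call.Params.Name, namespaceOrDefault(call))
}
```

Validating and defaulting on the server side keeps behavior predictable even when the model's arguments are incomplete.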
The AI chooses tools based on:
- Keywords in your question:
  - "pods" → likely `k8s_list_pods` or `k8s_get_pod_logs`
  - "scale" → likely `k8s_scale_deployment`
  - "logs" → likely `k8s_get_pod_logs`
- Action verbs:
  - "show", "list", "get" → read operations
  - "scale", "restart", "delete" → write operations
- Context clues:
  - "failing pods" → list pods and filter by status
  - "scale to 5" → scale operation with specific replica count
- Tool descriptions and parameters:
  - AI reads our tool descriptions to understand what each tool does
  - Checks required vs optional parameters
  - Matches your question to the most appropriate tool
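To illustrate those heuristics, and only to illustrate — the real selection happens inside the language model, not in code like this — here's a toy keyword matcher:

```go
package main

import (
	"fmt"
	"strings"
)

// guessTool is a toy illustration of the keyword heuristics above.
// The actual choice is made by the model from the tool descriptions.
func guessTool(question string) string {
	q := strings.ToLower(question)
	switch {
	case strings.Contains(q, "scale"):
		return "k8s_scale_deployment"
	case strings.Contains(q, "logs"):
		return "k8s_get_pod_logs"
	case strings.Contains(q, "pod"):
		return "k8s_list_pods"
	default:
		return "" // no match: ask a clarifying question instead
	}
}

func main() {
	fmt.Println(guessTool("What pods are failing?"))       // k8s_list_pods
	fmt.Println(guessTool("Scale the web-app to 5"))       // k8s_scale_deployment
	fmt.Println(guessTool("Show me logs from api-server")) // k8s_get_pod_logs
}
```

Unlike this toy, the model also weighs context clues and parameter requirements, which is why descriptive tool definitions matter more than clever keywords.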
- ❌ Vague: "Check the pods"
- ✅ Better: "Check pods in the production namespace"
- ✅ Best: "Show me any failing pods in production and their error logs"

- ❌ Unclear: "Scale it up"
- ✅ Better: "Scale the web-app deployment up"
- ✅ Best: "Scale the web-app deployment to 5 replicas because we're expecting high traffic"
Good pattern:
- "What's the status of my production environment?"
- "Get logs from that failing pod"
- "What's causing the CrashLoopBackOff?"
- "Scale down that deployment temporarily while we fix it"
You don't need to memorize commands or syntax:
- ❌ Don't do this: `k8s_list_pods namespace=production`
- ✅ Do this: "Show me all pods in production"

- ❌ Don't do this: `k8s_scale_deployment name=web-app replicas=3`
- ✅ Do this: "Scale the web-app to 3 replicas"
In this chapter, we've transformed our MCP server from a standalone tool into a fully integrated development environment. The combination of VS Code, GitHub Copilot, Claude, and our Kubernetes MCP server creates a powerful AI-assisted DevOps workflow.
With our integrated development environment complete, we're ready to tackle enterprise-grade concerns.
Next: Chapter 6: Authentication & Security - Learn how to secure your MCP server for production use with enterprise authentication, authorization, and audit logging.
The integration we've built provides the foundation for secure, scalable, and maintainable AI-powered DevOps workflows. In the next chapter, we'll make it enterprise-ready with proper security controls.