diff --git a/evals/azure-cost/eval.yaml b/evals/azure-cost/eval.yaml
index 39a64fc2f..04a501e8c 100644
--- a/evals/azure-cost/eval.yaml
+++ b/evals/azure-cost/eval.yaml
@@ -371,6 +371,48 @@ stimuli:
         config:
           pattern: "(?i)fatal error|unhandled exception|stack trace"

+  # ═══════════════════════════════════════════
+  # App Service Cost Optimization routing prompts
+  # ═══════════════════════════════════════════
+
+  # ── appservice-idle-slots-prompt ──
+  # Assertions: softCheckSkill + isSkillInvoked (invocation rate ≥ 80%)
+  - name: "App Service idle deployment slots cost"
+    prompt: "I have deployment slots on my App Service that aren't being used, how much are they costing me?"
+    tags:
+      type: integration
+      tier: full
+      cost: llm
+      area: routing
+      earlyTerminate: '[{"type":"skill-call","skill":"azure-cost"},{"type":"tool-call-count","count":3}]'
+    graders:
+      - type: skill-invocation
+        config:
+          required:
+            - azure-cost
+      - type: output-not-matches
+        config:
+          pattern: "(?i)fatal error|unhandled exception|stack trace"
+
+  # ── appservice-plan-downgrade-prompt ──
+  # Assertions: softCheckSkill + isSkillInvoked (invocation rate ≥ 80%)
+  - name: "App Service plan downgrade savings"
+    prompt: "Can I save money by downgrading my Premium App Service plans in dev/test to a cheaper tier?"
+    tags:
+      type: integration
+      tier: full
+      cost: llm
+      area: routing
+      earlyTerminate: '[{"type":"skill-call","skill":"azure-cost"},{"type":"tool-call-count","count":3}]'
+    graders:
+      - type: skill-invocation
+        config:
+          required:
+            - azure-cost
+      - type: output-not-matches
+        config:
+          pattern: "(?i)fatal error|unhandled exception|stack trace"
+
   # ═══════════════════════════════════════════
   # Response quality tests
   # ═══════════════════════════════════════════
diff --git a/plugin/skills/azure-cost/SKILL.md b/plugin/skills/azure-cost/SKILL.md
index 4d287876f..f369bd3e3 100644
--- a/plugin/skills/azure-cost/SKILL.md
+++ b/plugin/skills/azure-cost/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: azure-cost
-description: "Azure cost management: query costs, forecast spending, optimize to reduce waste. WHEN: \"Azure costs\", \"Azure bill\", \"cost breakdown\", \"how much am I spending\", \"forecast spending\", \"optimize costs\", \"reduce spending\", \"orphaned resources\", \"rightsize VMs\", \"cost spike\", \"reduce storage costs\", \"AKS cost\". DO NOT USE FOR: deploying resources, provisioning, diagnostics, or security audits."
+description: "Azure cost management: query costs, forecast spending, optimize to reduce waste. WHEN: \"Azure costs\", \"Azure bill\", \"cost breakdown\", \"how much am I spending\", \"forecast spending\", \"optimize costs\", \"reduce spending\", \"orphaned resources\", \"rightsize VMs\", \"cost spike\", \"reduce storage costs\", \"App Service cost\", \"web app spending\", \"App Service plan savings\", \"deployment slots\", \"AKS cost\". DO NOT USE FOR: deploying resources, provisioning, diagnostics, or security audits."
 license: MIT
 metadata:
   author: Microsoft
@@ -34,12 +34,12 @@ Query historical costs, forecast future spending, optimize to reduce waste.
 - Management Group: `/providers/Microsoft.Management/managementGroups/`
 - Billing Account: `/providers/Microsoft.Billing/billingAccounts/`

-## Service-Specific Optimization
+## Service Optimization Guides

 - [Redis](cost-optimization/services/redis/azure-cache-for-redis.md)
 - [Storage](cost-optimization/services/storage/azure-storage.md)
+- [App Service](cost-optimization/services/appservice/azure-app-service.md)

 ## References

 - [MCP Tools, Best Practices, Safety](references/tools-and-best-practices.md)
-- [SDK: Redis .NET](cost-optimization/sdk/azure-resource-manager-redis-dotnet.md)
diff --git a/plugin/skills/azure-cost/cost-optimization/sdk/azure-resource-manager-redis-dotnet.md b/plugin/skills/azure-cost/cost-optimization/sdk/azure-resource-manager-redis-dotnet.md
deleted file mode 100644
index e5ce86e1e..000000000
--- a/plugin/skills/azure-cost/cost-optimization/sdk/azure-resource-manager-redis-dotnet.md
+++ /dev/null
@@ -1,31 +0,0 @@
-# Redis Management — .NET SDK Quick Reference
-
-> Condensed from **azure-resource-manager-redis-dotnet**. Full patterns
-> (cache creation, firewall rules, access keys, geo-replication, patching)
-> in the **azure-resource-manager-redis-dotnet** plugin skill if installed.
-
-## Install
-dotnet add package Azure.ResourceManager.Redis
-dotnet add package Azure.Identity
-
-## Quick Start
-
-> **Auth:** `DefaultAzureCredential` is for local development. See [auth-best-practices.md](../auth-best-practices.md) for production patterns.
-
-```csharp
-using Azure.ResourceManager;
-using Azure.Identity;
-var armClient = new ArmClient(new DefaultAzureCredential());
-```
-
-## Best Practices
-- Use `WaitUntil.Completed` for operations that must finish before proceeding
-- Use `WaitUntil.Started` when you want to poll manually or run operations in parallel
-- Use DefaultAzureCredential for **local development only**. In production, use ManagedIdentityCredential — see [auth-best-practices.md](../auth-best-practices.md)
-- Handle `RequestFailedException` for ARM API errors
-- Use `CreateOrUpdateAsync` for idempotent operations
-- Navigate hierarchy via `Get*` methods (e.g., `cache.GetRedisFirewallRules()`)
-- Use Premium SKU for production workloads requiring geo-replication, clustering, or persistence
-- Enable TLS 1.2 minimum — set `MinimumTlsVersion = RedisTlsVersion.Tls1_2`
-- Disable non-SSL port — set `EnableNonSslPort = false` for security
-- Rotate keys regularly — use `RegenerateKeyAsync` and update connection strings
diff --git a/plugin/skills/azure-cost/cost-optimization/services/appservice/azure-app-service.md b/plugin/skills/azure-cost/cost-optimization/services/appservice/azure-app-service.md
new file mode 100644
index 000000000..63150ca53
--- /dev/null
+++ b/plugin/skills/azure-cost/cost-optimization/services/appservice/azure-app-service.md
@@ -0,0 +1,95 @@
+## Azure App Service Cost Optimization
+
+Reduce App Service costs through plan rightsizing, idle slot cleanup, and dev/test pricing.
+
+> **Important:** Always present the total bill and cost breakdown (Step 2+ of the [cost optimization workflow](../../workflow.md)) alongside these recommendations — don't produce savings advice from this guide alone.
+
+## Subscription Input Options
+
+Accept any of these to scope the analysis: Subscription ID, Subscription Name, Resource Group, or "All my subscriptions".
+
+## Cost Optimization Rules
+
+| Priority | Rule | Detection Logic | Recommendation | Avg Savings |
+|----------|------|----------------|----------------|-------------|
+| 🔴 Critical | Stopped App on Paid Plan | Site `properties.state == 'Stopped'` joined to its plan (`serverFarmId`) where the plan's `sku.tier` is not Free/Shared/Dynamic | Delete the app or move it to a Free/Shared plan; if it’s the only app on the plan, delete/scale down the plan | $50-500/mo |
+| 🔴 Critical | Empty App Service Plan | Plan has zero apps deployed | Delete the plan | $50-400/mo |
+| 🟠 High | Premium in Non-Production | Plan `sku.tier in ['Premium','PremiumV2','PremiumV3']` AND `tags.environment in ['dev','test','staging','sandbox']` | Downgrade to Basic or Standard | $100-600/mo |
+| 🟠 High | Idle Deployment Slots | Non-production slots with zero traffic for 14+ days | Delete unused slots (slots share plan workers but increase utilization, driving scale-out) | $30-150/mo |
+| 🟠 High | Over-Provisioned Plan | CPU avg <20% AND memory avg <30% over 14 days | Scale down SKU or reduce instance count | $50-400/mo |
+| 🟡 Medium | No Auto-Scale Rules | Production plan with fixed instance count >2 | Add auto-scale rules to scale in during low traffic | $30-200/mo |
+| 🟡 Medium | Missing Dev/Test Pricing | Dev/test workloads on regular pricing | Apply Azure Dev/Test subscription offer (subscription-level, not plan-level) | 30-55% savings |
+| 🟢 Low | Untagged App Service | Missing `environment`, `owner`, or `costCenter` tags | Apply tags for cost allocation | N/A |
+| 🟢 Low | Old Deployment Slots | Slots older than 90 days not used for blue-green | Review if still needed | Variable |
+
+## Plan Tier Decision Matrix
+
+| Workload | Recommended Tier | Key Features |
+|----------|-----------------|--------------|
+| Dev/test, prototypes | Free F1 | No SLA, limited scale (60 CPU-min/day) |
+| Light workloads needing an SLA | Basic B1 | 99.95% SLA, dedicated compute, no auto-scale/slots |
+| Low-traffic production | Standard S1 | Auto-scale, slots, backups |
+| High-traffic production | Premium P1v3 | Better perf, more slots, VNET |
+| Isolated compliance | Isolated I1v2 | Private environment, max scale |
+
+## Resource Graph Queries
+
+**Find stopped apps on paid plans (review for deletion or plan downgrade):**
+
+```kql
+Resources
+| where type =~ 'microsoft.web/sites'
+| where properties.state =~ 'Stopped'
+| extend planId = tolower(tostring(properties.serverFarmId))
+| join kind=inner (
+    Resources
+    | where type =~ 'microsoft.web/serverfarms'
+    | project planId = tolower(id), planSku = tostring(sku.name), planTier = tostring(sku.tier)
+  ) on planId
+| where planTier !in~ ('Free', 'Shared', 'Dynamic')
+| project name, resourceGroup, planSku, planTier
+```
+
+**Find empty App Service Plans:**
+
+```kql
+Resources
+| where type =~ 'microsoft.web/serverfarms'
+| where properties.numberOfSites == 0
+| project name, resourceGroup, sku=sku.name, location
+```
+
+**Find Premium plans in non-production:**
+
+```kql
+Resources
+| where type =~ 'microsoft.web/serverfarms'
+| where sku.tier in~ ('PremiumV2', 'PremiumV3', 'Premium')
+| where tags.environment in~ ('dev', 'test', 'staging', 'sandbox')
+| project name, resourceGroup, sku=sku.name, tier=sku.tier, tags
+```
+
+## Tools & Commands
+
+**Discovery:** Use Azure Resource Graph or `az` CLI for listing App Service resources (the `azure__appservice` MCP tool has limited list support).
+
+**Azure CLI:**
+- `az appservice plan list --resource-group ` - List plans
+- `az appservice plan show --name --resource-group ` - Plan details
+- `az webapp list --resource-group ` - List web apps
+- `az webapp show --name --resource-group ` - App details
+- `az webapp deployment slot list --name --resource-group ` - List slots
+- `az monitor metrics list --resource --metric CpuPercentage --interval PT1H` - CPU utilization
+- `az monitor metrics list --resource --metric MemoryPercentage --interval PT1H` - Memory utilization
+
+## Pricing Quick Reference
+
+Approximate monthly costs (Linux, East US):
+- **Free F1**: $0 (60 min CPU/day, 1 GB RAM)
+- **Basic B1**: ~$13/mo (1 core, 1.75 GB)
+- **Standard S1**: ~$69/mo (1 core, 1.75 GB, auto-scale, slots)
+- **Premium P1v3**: ~$138/mo (2 cores, 8 GB, better perf)
+
+Deployment slots share the plan's compute workers (no separate per-slot charge), but extra slots increase resource utilization and can trigger scale-out. Windows plans cost ~30% more than Linux.
+
+Always validate from [official pricing](https://azure.microsoft.com/pricing/details/app-service/).
diff --git a/plugin/skills/azure-cost/cost-optimization/services/storage/azure-storage.md b/plugin/skills/azure-cost/cost-optimization/services/storage/azure-storage.md
index 552617b9f..8bd62e0fd 100644
--- a/plugin/skills/azure-cost/cost-optimization/services/storage/azure-storage.md
+++ b/plugin/skills/azure-cost/cost-optimization/services/storage/azure-storage.md
@@ -1,6 +1,6 @@
 ## Azure Storage Cost Optimization

-Reference guide for identifying cost savings opportunities in Azure Storage accounts through tier analysis, lifecycle policies, and orphaned resource detection.
+Identify cost savings in Azure Storage through tier analysis, lifecycle policies, and orphaned resource detection.
 ## Subscription Input Options

diff --git a/plugin/skills/azure-cost/cost-optimization/workflow.md b/plugin/skills/azure-cost/cost-optimization/workflow.md
index 8be64ff02..faf4897bf 100644
--- a/plugin/skills/azure-cost/cost-optimization/workflow.md
+++ b/plugin/skills/azure-cost/cost-optimization/workflow.md
@@ -57,13 +57,14 @@ azure__get_azure_bestpractices({

 Wait for user response before proceeding.

-## Step 1.7: Storage-Specific Analysis (Conditional)
+## Step 1.7: Service-Specific Analysis (Conditional)

-**If the user requests Storage cost optimization**, load: [Azure Storage Cost Optimization](./services/storage/azure-storage.md)
+For service-focused requests, load the relevant guide and follow it. For general optimization, skip to Step 2.

-**Triggers:** "storage account cost", "blob storage savings", "LRS/GRS/ZRS downgrade", "storage lifecycle savings", "reduce storage spending".
-
-For Storage-only requests, follow the Storage reference. For general optimization that includes storage, continue to Step 2.
+| Triggers | Reference |
+|----------|-----------|
+| "storage account cost", "blob savings", "LRS/GRS downgrade", "storage lifecycle savings" | [Storage](./services/storage/azure-storage.md) |
+| "app service cost", "plan savings", "web app spending", "idle slots", "overprovisioned plan" | [App Service](./services/appservice/azure-app-service.md) |

 ## Step 1.8: AKS-Specific Analysis (Conditional)

@@ -75,14 +76,14 @@ For Storage-only requests, follow the Storage reference. For general optimizatio
 - User reports a cost spike, unusual cluster utilization, or wants budget alerts

 **Tool Selection:**
-- **Prefer MCP first**: Use `azure__aks` for AKS operations (list clusters, get node pools, inspect configuration) — it provides richer metadata and is consistent with AKS skill conventions in this repo
-- **Fall back to CLI**: Use `az aks` and `kubectl` only when the specific operation cannot be performed via the MCP surface
+- **Prefer MCP first**: Use `azure__aks` for AKS operations (list clusters, get node pools, inspect configuration)
+- **Fall back to CLI**: Use `az aks` and `kubectl` only when MCP doesn't cover the operation

-**Reference files (load only what is needed for the request):**
-- [Cost Analysis Add-on](./azure-aks-cost-addon.md) — enable namespace-level cost visibility
-- [Anomaly Investigation](./azure-aks-anomalies.md) — cost spikes, scaling events, budget alerts
+**Reference files:**
+- [Cost Analysis Add-on](./azure-aks-cost-addon.md)
+- [Anomaly Investigation](./azure-aks-anomalies.md)

-> **Note**: For general subscription-wide cost optimization (including AKS resource groups), continue with Step 2. For AKS-focused analysis, follow the instructions in the relevant reference file above.
+For general optimization (including AKS resource groups), continue to Step 2.

 ## Step 1.9: Choose Analysis Scope (for AKS-specific analysis)