|
| 1 | +--- |
| 2 | +description: | |
| 3 | + This workflow performs comprehensive Terraform security and best practices scanning. |
| 4 | + It analyzes Terraform files for security vulnerabilities, hardcoded secrets, misconfigurations, |
| 5 | + and compliance issues, then generates a detailed security report as a GitHub issue. |
| 6 | + |
| 7 | +on: |
| 8 | + schedule: daily |
| 9 | + workflow_dispatch: |
| 10 | + |
| 11 | +permissions: |
| 12 | + contents: read |
| 13 | + issues: read |
| 14 | + pull-requests: read |
| 15 | + |
| 16 | +network: defaults |
| 17 | + |
| 18 | +tools: |
| 19 | + github: |
| 20 | + lockdown: false |
| 21 | + |
| 22 | +safe-outputs: |
| 23 | + create-issue: |
| 24 | + title-prefix: "[Terraform Security] " |
| 25 | + labels: [terraform-security, security-scan, infrastructure] |
| 26 | +engine: copilot |
| 27 | +--- |
| 28 | + |
| 29 | +# Terraform Security & Best Practices Agent |
| 30 | + |
| 31 | +Perform a comprehensive security analysis of all Terraform code in the repository and generate a detailed security report as a GitHub issue. |
| 32 | + |
| 33 | +## 🔍 Security Checks to Perform |
| 34 | + |
| 35 | +### 1. **Secrets & Credentials Scanning** |
| 36 | +Scan all `.tf`, `.tfvars`, and `.tf.json` files for: |
| 37 | +- Hardcoded AWS Access Keys (AKIA*, AWS_ACCESS_KEY_ID) |
| 38 | +- Azure subscription IDs, client secrets, tenant IDs |
| 39 | +- GCP service account keys |
| 40 | +- API tokens, passwords, or connection strings |
| 41 | +- Private keys or certificates |
| 42 | +- Database credentials |
| 43 | +- Any Base64-encoded secrets |
| 44 | + |
| 45 | +**Flag with HIGH severity** if found, even if marked as "dummy" or "example". |
| 46 | + |
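As a rough illustration of the secrets check above, the sketch below scans file text line by line against a few regular expressions. The names `SECRET_PATTERNS` and `scan_for_secrets` are hypothetical, and the three patterns are assumptions chosen for illustration (the `AKIA` prefix followed by 16 uppercase alphanumerics is a commonly used shape for AWS access key IDs); a real scan would need a much broader rule set.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive rule set).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # A quoted literal assigned to a password/secret/token attribute.
    # Values containing "$" are skipped so "${var.x}" interpolations are not flagged.
    "password_literal": re.compile(r'(?i)\b(password|secret|token)\w*\s*=\s*"[^"$]+"'),
}


def scan_for_secrets(path, text):
    """Return (path, line_number, rule_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((path, lineno, rule))
    return findings
```

Each finding carries the file path and line number, so it can be reported with the exact location the style guidelines ask for.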
| 47 | +### 2. **Network Security Issues** |
| 48 | +Check for: |
| 49 | +- Overly permissive CIDR blocks (`0.0.0.0/0` in security groups) |
| 50 | +- Public IP addresses on sensitive resources |
| 51 | +- Missing network ACLs or firewall rules |
| 52 | +- Unencrypted network traffic (HTTP instead of HTTPS) |
| 53 | +- VPN/VNet configurations exposing internal services |
| 54 | +- Missing private endpoints for PaaS services |
| 55 | + |
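The open-CIDR check can be sketched as a simple line scan. This is a minimal sketch under stated assumptions: `find_open_cidrs` and `OPEN_CIDRS` are hypothetical names, the IPv6 equivalent `::/0` is added here and is not part of the list above, and only `#` comments are stripped.

```python
# Ranges treated as "open to the world". "::/0" is an assumed addition
# (the IPv6 counterpart of 0.0.0.0/0).
OPEN_CIDRS = ("0.0.0.0/0", "::/0")


def find_open_cidrs(path, text):
    """Return (path, line_number, cidr) for each quoted open CIDR outside a # comment."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # drop a trailing "#" comment, if any
        for cidr in OPEN_CIDRS:
            # Match the quoted literal so "10.0.0.0/16" is not mistaken for "0.0.0.0/0".
            if f'"{cidr}"' in code:
                findings.append((path, lineno, cidr))
    return findings
```

A match is only a candidate finding: whether it is a problem still depends on the resource type, the rule direction, and the port range.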
| 56 | +### 3. **Encryption & Data Protection** |
| 57 | +Verify that: |
| 58 | +- Storage accounts have encryption at rest enabled |
| 59 | +- Databases use TLS/SSL for connections |
| 60 | +- Key vaults are properly configured |
| 61 | +- Disk encryption is enabled for VMs |
| 62 | +- Backup encryption is configured |
| 63 | +- Customer-managed keys (CMK) are used where required |
| 64 | + |
| 65 | +### 4. **Identity & Access Management** |
| 66 | +Analyze: |
| 67 | +- Overly permissive IAM/RBAC roles |
| 68 | +- Violations of the principle of least privilege |
| 69 | +- Service principals with Owner/Contributor roles |
| 70 | +- Shared access signatures (SAS) with long expiration |
| 71 | +- Missing managed identities where applicable |
| 72 | +- Wildcard permissions in policies |
| 73 | + |
| 74 | +### 5. **Compliance & Configuration** |
| 75 | +Check for: |
| 76 | +- Missing required tags (environment, owner, cost-center, compliance) |
| 77 | +- Resources without proper naming conventions |
| 78 | +- Soft delete not enabled on critical resources |
| 79 | +- Audit logging disabled |
| 80 | +- Public network access enabled unnecessarily |
| 81 | +- Missing resource locks on production resources |
| 82 | + |
| 83 | +### 6. **Terraform Best Practices** |
| 84 | +Validate: |
| 85 | +- Module versions are pinned (an explicit `version` constraint or Git `ref`, not a floating or missing version) |
| 86 | +- Backend configuration is secure (no hardcoded values) |
| 87 | +- Variables have proper descriptions and validation |
| 88 | +- Outputs don't expose sensitive values |
| 89 | +- State file encryption is configured |
| 90 | +- `.terraform.lock.hcl` is committed |
| 91 | + |
| 92 | +### 7. **Cost Analysis & Optimization** |
| 93 | +Analyze and estimate costs for serverless architectures: |
| 94 | + |
| 95 | +- **Serverless Compute Costs**: Lambda/Azure Functions, execution time & memory |
| 96 | + - Total number of invocations per month |
| 97 | + - Average execution duration and memory allocation |
| 98 | + - Cold start optimization opportunities |
| 99 | + - Overprovisioned memory configurations (128MB vs 512MB impact) |
| 100 | + - ARM vs x86 architecture cost savings (AWS Lambda) |
| 101 | + |
| 102 | +- **Message Queue Costs**: SQS, SNS, Azure Service Bus, Event Grid |
| 103 | + - Number of messages per month (requests) |
| 104 | + - Message size and data transfer implications |
| 105 | + - Standard vs FIFO queues cost difference (SQS) |
| 106 | + - Dead letter queue storage costs |
| 107 | + - SNS topic publish operations and subscriptions |
| 108 | + - Event Grid publish operations and event delivery costs |
| 109 | + |
| 110 | +- **API & Gateway Costs**: API Gateway, Azure API Management, Azure Front Door |
| 111 | + - Number of API requests per month |
| 112 | + - REST vs HTTP API cost difference (AWS) |
| 113 | + - WebSocket connections if applicable |
| 114 | + - Caching enabled/disabled impact |
| 115 | + - Request/response payload sizes |
| 116 | + |
| 117 | +- **Storage Costs**: S3, Azure Blob Storage, DynamoDB, Cosmos DB |
| 118 | + - Storage tier usage (Hot, Cool, Archive, Intelligent Tiering) |
| 119 | + - Request costs (GET, PUT, LIST operations) |
| 120 | + - Data retrieval costs from Cool/Archive tiers |
| 121 | + - DynamoDB read/write capacity units (provisioned vs on-demand) |
| 122 | + - Storage redundancy levels (LRS vs GRS) |
| 123 | + |
| 124 | +- **Database & State Management**: DynamoDB, Cosmos DB, Table Storage |
| 125 | + - Provisioned vs on-demand capacity mode |
| 126 | + - Read/Write capacity units and auto-scaling |
| 127 | + - Global tables/replication costs |
| 128 | + - Backup and restore costs |
| 129 | + |
| 130 | +- **Orchestration Costs**: Step Functions, Logic Apps, EventBridge |
| 131 | + - Number of state transitions (Step Functions) |
| 132 | + - Logic Apps action executions |
| 133 | + - EventBridge custom bus and rule evaluations |
| 134 | + |
| 135 | +- **Observability Costs**: CloudWatch, Application Insights, Log Analytics |
| 136 | + - Logs ingestion volume (GB/month) |
| 137 | + - Metrics and custom metrics count |
| 138 | + - Log retention periods |
| 139 | + - Query/analytics costs |
| 140 | + - Distributed tracing costs (X-Ray, App Insights) |
| 141 | + |
| 142 | +- **Data Transfer Costs**: Inter-service, inter-region, internet egress |
| 143 | + - Data transfer between availability zones |
| 144 | + - Cross-region data transfer |
| 145 | + - Data transfer to internet |
| 146 | + - VPC/VNet peering costs |
| 147 | + - NAT Gateway data processing |
| 148 | + |
| 149 | +- **Key Vault & Secrets Management**: Key Vault operations, Secrets Manager |
| 150 | + - Number of secret retrievals per month |
| 151 | + - Standard vs Premium tier (HSM-backed keys) |
| 152 | + - Rotation operations |
| 153 | + - Parameter Store vs Secrets Manager (AWS) |
| 154 | + |
| 155 | +**Serverless Cost Optimization Opportunities**: |
| 156 | +- Functions with excessive memory allocation (right-sizing) |
| 157 | +- Dead code or rarely invoked functions (clean up) |
| 158 | +- Missing reserved capacity for predictable workloads |
| 159 | +- Inefficient polling patterns (consider event-driven) |
| 160 | +- Long-running functions that could be split |
| 161 | +- Missing SQS batch processing (reduce invocations) |
| 162 | +- CloudWatch logs without retention policies (unlimited storage) |
| 163 | +- Missing S3 lifecycle policies |
| 164 | +- DynamoDB tables without auto-scaling |
| 165 | +- Development/staging using same tier as production |
| 166 | + |
| 167 | +**Provide Monthly Cost Estimate**: |
| 168 | +- Calculate estimated monthly cost for each service category |
| 169 | +- Show cost per million invocations/requests |
| 170 | +- Identify top 5 most expensive services |
| 171 | +- Estimate data transfer costs between services |
| 172 | +- Suggest cost savings opportunities with potential savings amount |
| 173 | +- Flag unusual usage patterns that could cause cost spikes |
| 174 | + |
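For the serverless compute category, the monthly estimate follows from invocations, average duration, and memory. The sketch below shows that arithmetic for a Lambda-style pricing model (a per-GB-second charge plus a per-request charge). `lambda_monthly_cost` is a hypothetical helper, and both default rates are placeholder assumptions that should be replaced with the current figures for the target region and architecture. Free-tier allowances are ignored.

```python
def lambda_monthly_cost(
    invocations,
    avg_duration_ms,
    memory_mb,
    price_per_gb_second=0.0000166667,  # placeholder rate (assumption)
    price_per_million_requests=0.20,   # placeholder rate (assumption)
):
    """Estimate monthly cost as compute (GB-seconds) plus request charges."""
    # GB-seconds = invocations * duration in seconds * memory in GB
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * price_per_gb_second
    request_cost = (invocations / 1_000_000) * price_per_million_requests
    return compute_cost + request_cost
```

With these placeholder rates, one million invocations at 100 ms and 512 MB come to 50,000 GB-seconds, so doubling the memory allocation doubles the compute portion of the estimate while the request portion stays the same.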
| 175 | +## 📊 Report Structure |
| 176 | + |
| 177 | +Generate a GitHub issue with the following sections: |
| 178 | + |
| 179 | +```markdown |
| 180 | +## 🛡️ Terraform Security Scan Report |
| 181 | + |
| 182 | +**Scan Date**: {current_date} |
| 183 | +**Files Scanned**: {count_of_tf_files} |
| 184 | +**Findings**: {total_issues_found} |
| 185 | + |
| 186 | +--- |
| 187 | + |
| 188 | +### 🚨 Critical Issues (P0) |
| 189 | +{List all critical security vulnerabilities that need immediate action} |
| 190 | + |
| 191 | +### ⚠️ High Priority Issues (P1) |
| 192 | +{Security misconfigurations that should be fixed soon} |
| 193 | + |
| 194 | +### 💡 Medium Priority Issues (P2) |
| 195 | +{Best practices and compliance recommendations} |
| 196 | + |
| 197 | +### ℹ️ Low Priority / Informational |
| 198 | +{Minor improvements and style suggestions} |
| 199 | + |
| 200 | +--- |
| 201 | + |
| 202 | +### 💰 Cost Analysis & Estimates |
| 203 | + |
| 204 | +**Estimated Monthly Infrastructure Cost**: ${estimated_total_cost} |
| 205 | + |
| 206 | +#### Top 5 Most Expensive Resources/Services |
| 207 | +1. **{lambda_function_name}** (Lambda/Azure Function): ~${monthly_cost}/month |
| 208 | + - Invocations: {count}M/month |
| 209 | + - Avg Duration: {ms}ms, Memory: {mb}MB |
| 210 | + - Cost Driver: {high_invocation_rate / over_provisioned_memory / long_duration} |
| 211 | + |
| 212 | +2. **{dynamodb_table}** (DynamoDB/Cosmos DB): ~${monthly_cost}/month |
| 213 | + - Capacity Mode: {provisioned/on-demand} |
| 214 | + - Read/Write Units: {rcu}/{wcu} |
| 215 | + - Cost Driver: {over_provisioned_capacity / high_storage} |
| 216 | + |
| 217 | +3. **{api_gateway}** (API Gateway/API Management): ~${monthly_cost}/month |
| 218 | + - Requests: {count}M/month |
| 219 | + - Cost Driver: {high_request_volume / missing_caching} |
| 220 | + |
| 221 | +4. **{cloudwatch_logs}** (CloudWatch/Log Analytics): ~${monthly_cost}/month |
| 222 | + - Log Ingestion: {gb}GB/month |
| 223 | + - Cost Driver: {verbose_logging / no_retention_policy} |
| 224 | + |
| 225 | +5. **{data_transfer}** (Data Transfer): ~${monthly_cost}/month |
| 226 | + - Transfer Volume: {gb}GB/month |
| 227 | + - Cost Driver: {inter_region_calls / inefficient_data_flow} |
| 228 | + |
| 229 | +#### 💡 Cost Optimization Opportunities |
| 230 | +| Resource | Current Cost | Potential Savings | Recommendation | |
| 231 | +|----------|-------------|-------------------|----------------| |
| 232 | +| {lambda_function} | ${current}/mo | ${savings}/mo | Reduce memory from 1024MB to 512MB; optimize cold starts | |
| 233 | +| {dynamodb_table} | ${current}/mo | ${savings}/mo | Switch from provisioned to on-demand mode for variable workloads | |
| 234 | +| {cloudwatch_logs} | ${current}/mo | ${savings}/mo | Set 7-day retention for debug logs, 30 days for app logs | |
| 235 | +| {sqs_queue} | ${current}/mo | ${savings}/mo | Implement batching to reduce Lambda invocations by 80% | |
| 236 | +| {step_function} | ${current}/mo | ${savings}/mo | Use Express workflows for high-volume, short-duration tasks | |
| 237 | + |
| 238 | +**Total Potential Monthly Savings**: ~${total_savings}/month (${percentage}% reduction) |
| 239 | + |
| 240 | +#### Cost Breakdown by Category |
| 241 | +- ⚡ **Serverless Compute** (Lambda/Functions): ${compute_cost}/month ({percentage}%) |
| 242 | + - Total invocations: {count}M/month |
| 243 | + - Avg duration: {ms}ms, Avg memory: {mb}MB |
| 244 | +- 📨 **Messaging** (SQS/SNS/EventGrid/ServiceBus): ${messaging_cost}/month ({percentage}%) |
| 245 | + - Total messages: {count}M/month |
| 246 | +- 🌐 **API Gateway/Management**: ${api_cost}/month ({percentage}%) |
| 247 | + - Total requests: {count}M/month |
| 248 | +- 💾 **Storage** (S3/Blob/DynamoDB): ${storage_cost}/month ({percentage}%) |
| 249 | +- 📊 **Orchestration** (Step Functions/Logic Apps): ${orchestration_cost}/month ({percentage}%) |
| 250 | +- 📈 **Observability** (CloudWatch/App Insights): ${observability_cost}/month ({percentage}%) |
| 251 | +- 🌍 **Data Transfer**: ${transfer_cost}/month ({percentage}%) |
| 252 | +- 🔐 **Security & Secrets**: ${security_cost}/month ({percentage}%) |
| 253 | + |
| 254 | +#### ⚠️ Cost Risk Flags |
| 255 | +- Functions without memory optimization (over/under-provisioned) |
| 256 | +- High cold start rates increasing duration costs |
| 257 | +- CloudWatch logs without retention policies (unlimited growth) |
| 258 | +- Missing SQS/SNS message batching |
| 259 | +- Synchronous invocations that could be async |
| 260 | +- No reserved capacity for predictable workloads |
| 261 | +- DynamoDB tables in provisioned mode with low utilization |
| 262 | +- Missing S3/Blob lifecycle policies |
| 263 | +- Excessive inter-region data transfer |
| 264 | +- Development environments without usage limits |
| 265 | + |
| 266 | +--- |
| 267 | + |
| 268 | +### 📈 Security Score |
| 269 | +**Overall Score**: {calculate_score}/100 |
| 270 | + |
| 271 | +**Score Breakdown**: |
| 272 | +- Secrets Management: {score}/20 |
| 273 | +- Network Security: {score}/20 |
| 274 | +- Encryption: {score}/20 |
| 275 | +- IAM/RBAC: {score}/20 |
| 276 | +- Compliance: {score}/20 |
| 277 | + |
| 278 | +--- |
| 279 | + |
| 280 | +### 🎯 Top 3 Recommended Actions |
| 281 | +1. {most_critical_action} |
| 282 | +2. {second_critical_action} |
| 283 | +3. {third_critical_action} |
| 284 | + |
| 285 | +--- |
| 286 | + |
| 287 | +### 🔐 Expert Security Advice |
| 288 | +{One expert Terraform security tip with emoji} |
| 289 | + |
| 290 | +--- |
| 291 | + |
| 292 | +### 💡 Expert Cost Optimization Tip |
| 293 | +{One expert serverless cost optimization tip with emoji - e.g., Lambda memory tuning, SQS batching, DynamoDB capacity modes, etc.} |
| 294 | + |
| 295 | +--- |
| 296 | + |
| 297 | +### 📚 Resources |
| 298 | +- [Azure Security Best Practices](https://docs.microsoft.com/azure/security/) |
| 299 | +- [Terraform Security Documentation](https://www.terraform.io/docs/language/values/sensitive.html) |
| 300 | +- [CIS Azure Foundations Benchmark](https://www.cisecurity.org/benchmark/azure) |
| 301 | +- [AWS Lambda Pricing](https://aws.amazon.com/lambda/pricing/) |
| 302 | +- [Azure Functions Pricing](https://azure.microsoft.com/pricing/details/functions/) |
| 303 | +- [AWS Serverless Cost Optimization](https://aws.amazon.com/blogs/compute/operating-lambda-performance-optimization-part-2/) |
| 304 | +- [Azure Cost Management Best Practices](https://docs.microsoft.com/azure/cost-management-billing/) |
| 305 | +- [Serverless Cost Calculator](https://cost-calculator.bref.sh/) |
| 306 | +- [AWS Cost Explorer](https://aws.amazon.com/aws-cost-management/aws-cost-explorer/) |
| 307 | +``` |
| 308 | + |
| 309 | +## 🎨 Style Guidelines |
| 310 | + |
| 311 | +- Use clear severity levels: 🚨 CRITICAL, ⚠️ HIGH, 💡 MEDIUM, ℹ️ LOW |
| 312 | +- Include file paths and line numbers for each finding |
| 313 | +- Provide actionable remediation steps, not just problems |
| 314 | +- Reference specific Terraform resources by name |
| 315 | +- Link to relevant documentation for fixes |
| 316 | +- Keep tone professional but helpful |
| 317 | +- Use emojis sparingly for visual hierarchy |
| 318 | + |
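One way to keep findings consistent with these guidelines is to render every finding through a single formatter. This is a minimal sketch: `format_finding` and its parameters are hypothetical, and the output layout is an assumption, not a required format.

```python
# Severity labels taken from the style guidelines above.
SEVERITY_LABELS = {
    "critical": "🚨 CRITICAL",
    "high": "⚠️ HIGH",
    "medium": "💡 MEDIUM",
    "low": "ℹ️ LOW",
}


def format_finding(severity, path, line, resource, problem, fix):
    """Render one finding as a Markdown bullet with location, resource, and remediation."""
    label = SEVERITY_LABELS[severity]
    return f"- {label} `{path}:{line}` (`{resource}`): {problem}. Fix: {fix}"
```

An unknown severity raises a `KeyError`, which makes a mislabelled finding fail loudly instead of appearing in the report without a severity.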
| 319 | +## ⚡ Process |
| 320 | + |
| 321 | +1. **Scan Repository**: Read all `.tf`, `.tfvars`, and `.tf.json` files |
| 322 | +2. **Analyze Code**: Check against all security criteria listed above |
| 323 | +3. **Calculate Costs**: Estimate monthly infrastructure costs based on resource configurations |
| 324 | +4. **Identify Savings**: Find cost optimization opportunities |
| 325 | +5. **Calculate Score**: Assign severity and compute security score |
| 326 | +6. **Generate Report**: Create detailed issue with findings and cost analysis |
| 327 | +7. **Prioritize Actions**: List top 3 most important fixes |
| 328 | +8. **Add Context**: Include expert advice and relevant resources |
| 329 | + |
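The score calculation step leaves the scoring method open. One possible scheme, sketched below, starts each of the five report categories at 20 points and subtracts a fixed amount per finding. The deduction weights and the names `DEDUCTIONS` and `security_score` are assumptions made for illustration and are not defined anywhere in this workflow.

```python
CATEGORIES = ("secrets", "network", "encryption", "iam", "compliance")
MAX_PER_CATEGORY = 20

# Hypothetical deduction weights per finding (assumption).
DEDUCTIONS = {"critical": 10, "high": 5, "medium": 2, "low": 1}


def security_score(findings):
    """Compute (overall_score, per_category_breakdown).

    `findings` is an iterable of (category, severity) pairs.
    Each category starts at 20 points and never drops below 0.
    """
    breakdown = {category: MAX_PER_CATEGORY for category in CATEGORIES}
    for category, severity in findings:
        breakdown[category] = max(0, breakdown[category] - DEDUCTIONS[severity])
    return sum(breakdown.values()), breakdown
```

Flooring each category at zero keeps one badly failing area from pushing the overall score below the sum of the other categories.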
| 330 | +## 🎯 Success Criteria |
| 331 | + |
| 332 | +- All Terraform files analyzed |
| 333 | +- Security issues categorized by severity |
| 334 | +- Specific file/line references provided |
| 335 | +- Actionable remediation steps included |
| 336 | +- Security score calculated |
| 337 | +- Monthly cost estimate provided |
| 338 | +- Cost optimization opportunities identified with potential savings |
| 339 | +- Top 5 most expensive resources highlighted |
| 340 | +- Cost breakdown by category included |
| 341 | +- Expert security and cost recommendations provided |