Commit c49c81c

Create terraform-security-agent.md
This GitHub agentic workflow scans a Terraform repo daily and generates an issue report covering critical vulnerabilities, security advice, and cost optimization recommendations. I have tried this in my personal GitHub repo; here is the link for reference: ranglanimanish90/newinfra#52
1 parent 0983909 commit c49c81c

1 file changed

Lines changed: 341 additions & 0 deletions

@@ -0,0 +1,341 @@
---
description: |
  This workflow performs comprehensive Terraform security and best practices scanning.
  It analyzes Terraform files for security vulnerabilities, hardcoded secrets, misconfigurations,
  and compliance issues, then generates a detailed security report as a GitHub issue.

on:
  schedule: daily
  workflow_dispatch:

permissions:
  contents: read
  issues: read
  pull-requests: read

network: defaults

tools:
  github:
    lockdown: false

safe-outputs:
  create-issue:
    title-prefix: "[Terraform Security] "
    labels: [terraform-security, security-scan, infrastructure]
engine: copilot
---

# Terraform Security & Best Practices Agent

Perform a comprehensive security analysis of all Terraform code in the repository and generate a detailed security report as a GitHub issue.

## 🔍 Security Checks to Perform

### 1. **Secrets & Credentials Scanning**
Scan all `.tf`, `.tfvars`, and `.tf.json` files for:
- Hardcoded AWS access keys (`AKIA*`, `AWS_ACCESS_KEY_ID`)
- Azure subscription IDs, client secrets, tenant IDs
- GCP service account keys
- API tokens, passwords, or connection strings
- Private keys or certificates
- Database credentials
- Any Base64-encoded secrets

**Flag with HIGH severity** if found, even if marked as "dummy" or "example".
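
As a rough illustration of the secrets checks above, the following Python sketch scans file text line by line with a few regular expressions. The rule names, the `SECRET_PATTERNS` table, and the `scan_text` helper are hypothetical and not part of this workflow; the AWS pattern assumes the common 20-character `AKIA...` access key ID format, and the other patterns are deliberately loose.

```python
import re

# Hypothetical rules for illustration only; a real scanner would rely on a
# maintained ruleset rather than this short list.
SECRET_PATTERNS = {
    # Assumes the common "AKIA" + 16 uppercase alphanumerics key ID format.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # Quoted literal assigned to a password/secret/token attribute.
    # Values containing "$" are skipped so "${var.x}" references do not match.
    "password_literal": re.compile(
        r'(?i)(password|secret|token)\w*\s*=\s*"[^"$]+"'
    ),
}


def scan_text(path, text):
    """Return (path, line_number, rule_name) for every pattern hit."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((path, line_number, rule_name))
    return findings
```

Each finding carries the file path and line number, which matches the requirement to report file paths and line numbers for every issue. Pattern matching like this can miss secrets and can flag harmless strings, so it is best treated as a first pass.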

### 2. **Network Security Issues**
Check for:
- Overly permissive CIDR blocks (`0.0.0.0/0` in security groups)
- Public IP addresses on sensitive resources
- Missing network ACLs or firewall rules
- Unencrypted network traffic (HTTP instead of HTTPS)
- VPN/VNet configurations exposing internal services
- Missing private endpoints for PaaS services
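
The CIDR check in the list above could be approximated as follows. This is a minimal sketch, assuming CIDR ranges appear as quoted strings in the Terraform source; the `open_cidrs` helper is hypothetical. It does not parse HCL, so it cannot tell ingress rules from egress rules, and a reviewer still has to judge each hit in context.

```python
import ipaddress
import re

# Quoted strings that look like IPv4 or IPv6 CIDR notation.
CIDR_RE = re.compile(r'"([0-9a-fA-F:.]+/\d{1,3})"')


def open_cidrs(text):
    """Return (line_number, cidr) for quoted CIDRs that cover every address."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for candidate in CIDR_RE.findall(line):
            try:
                network = ipaddress.ip_network(candidate, strict=False)
            except ValueError:
                continue  # Not a valid network; ignore it.
            # A prefix length of 0 means the range matches all addresses,
            # which covers both 0.0.0.0/0 and ::/0.
            if network.prefixlen == 0:
                hits.append((line_number, candidate))
    return hits
```

Using the standard-library `ipaddress` module rather than a plain string comparison also catches the IPv6 equivalent `::/0`.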

### 3. **Encryption & Data Protection**
Verify that:
- Storage accounts have encryption at rest enabled
- Databases use TLS/SSL for connections
- Key vaults are properly configured
- Disk encryption is enabled for VMs
- Backup encryption is configured
- Customer-managed keys (CMK) are used where required

### 4. **Identity & Access Management**
Analyze:
- Overly permissive IAM/RBAC roles
- Violations of the principle of least privilege
- Service principals with Owner/Contributor roles
- Shared access signatures (SAS) with long expiration
- Missing managed identities where applicable
- Wildcard permissions in policies

### 5. **Compliance & Configuration**
Check for:
- Missing required tags (environment, owner, cost-center, compliance)
- Resources without proper naming conventions
- Soft delete not enabled on critical resources
- Audit logging disabled
- Public network access enabled unnecessarily
- Missing resource locks on production resources

### 6. **Terraform Best Practices**
Validate:
- Module versions are pinned (not using `latest`)
- Backend configuration is secure (no hardcoded values)
- Variables have proper descriptions and validation
- Outputs don't expose sensitive values
- State file encryption is configured
- `.terraform.lock.hcl` is committed

### 7. **Cost Analysis & Optimization**
Analyze and estimate costs for serverless architectures:

- **Serverless Compute Costs**: Lambda/Azure Functions, execution time & memory
  - Total number of invocations per month
  - Average execution duration and memory allocation
  - Cold start optimization opportunities
  - Overprovisioned memory configurations (128MB vs 512MB impact)
  - ARM vs x86 architecture cost savings (AWS Lambda)

- **Message Queue Costs**: SQS, SNS, Azure Service Bus, Event Grid
  - Number of messages per month (requests)
  - Message size and data transfer implications
  - Standard vs FIFO queues cost difference (SQS)
  - Dead letter queue storage costs
  - SNS topic publish operations and subscriptions
  - Event Grid publish operations and event delivery costs

- **API & Gateway Costs**: API Gateway, Azure API Management, Azure Front Door
  - Number of API requests per month
  - REST vs HTTP API cost difference (AWS)
  - WebSocket connections if applicable
  - Caching enabled/disabled impact
  - Request/response payload sizes

- **Storage Costs**: S3, Azure Blob Storage, DynamoDB, Cosmos DB
  - Storage tier usage (Hot, Cool, Archive, Intelligent Tiering)
  - Request costs (GET, PUT, LIST operations)
  - Data retrieval costs from Cool/Archive tiers
  - DynamoDB read/write capacity units (provisioned vs on-demand)
  - Storage redundancy levels (LRS vs GRS)

- **Database & State Management**: DynamoDB, Cosmos DB, Table Storage
  - Provisioned vs on-demand capacity mode
  - Read/Write capacity units and auto-scaling
  - Global tables/replication costs
  - Backup and restore costs

- **Orchestration Costs**: Step Functions, Logic Apps, EventBridge
  - Number of state transitions (Step Functions)
  - Logic Apps action executions
  - EventBridge custom bus and rule evaluations

- **Observability Costs**: CloudWatch, Application Insights, Log Analytics
  - Logs ingestion volume (GB/month)
  - Metrics and custom metrics count
  - Log retention periods
  - Query/analytics costs
  - Distributed tracing costs (X-Ray, App Insights)

- **Data Transfer Costs**: Inter-service, inter-region, internet egress
  - Data transfer between availability zones
  - Cross-region data transfer
  - Data transfer to internet
  - VPC/VNet peering costs
  - NAT Gateway data processing

- **Key Vault & Secrets Management**: Key Vault operations, Secrets Manager
  - Number of secret retrievals per month
  - Standard vs Premium tier (HSM-backed keys)
  - Rotation operations
  - Parameter Store vs Secrets Manager (AWS)

**Serverless Cost Optimization Opportunities**:
- Functions with excessive memory allocation (right-sizing)
- Dead code or rarely invoked functions (clean up)
- Missing reserved capacity for predictable workloads
- Inefficient polling patterns (consider event-driven)
- Long-running functions that could be split
- Missing SQS batch processing (reduce invocations)
- CloudWatch logs without retention policies (unlimited storage)
- Missing S3 lifecycle policies
- DynamoDB tables without auto-scaling
- Development/staging using same tier as production

**Provide Monthly Cost Estimate**:
- Calculate estimated monthly cost for each service category
- Show cost per million invocations/requests
- Identify top 5 most expensive services
- Estimate data transfer costs between services
- Suggest cost savings opportunities with potential savings amount
- Flag unusual usage patterns that could cause cost spikes
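
To make the monthly estimate concrete for one service category, here is a minimal sketch of a Lambda cost calculation. The two price constants are assumptions based on the public AWS list prices for x86 functions in us-east-1 and may be out of date or differ by region; the free tier is ignored, and the `lambda_monthly_cost` helper is hypothetical.

```python
# Assumed list prices (x86, us-east-1); verify against the AWS Lambda pricing
# page before relying on them. The free tier is not modelled.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_MILLION_REQUESTS = 0.20


def lambda_monthly_cost(invocations, avg_duration_ms, memory_mb):
    """Estimate the monthly cost in USD for one function."""
    # Compute charge: billed duration in seconds times memory in GB.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    # Request charge: flat price per million invocations.
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return compute_cost + request_cost


# 10M invocations per month at 200 ms average duration.
at_1024_mb = lambda_monthly_cost(10_000_000, 200, 1024)
at_512_mb = lambda_monthly_cost(10_000_000, 200, 512)
potential_savings = at_1024_mb - at_512_mb
```

Under these assumed prices, halving memory halves the compute portion of the bill only if the average duration stays the same. In practice less memory often means longer execution, so a savings figure derived this way is an upper bound.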

## 📊 Report Structure

Generate a GitHub issue with the following sections:

```markdown
## 🛡️ Terraform Security Scan Report

**Scan Date**: {current_date}
**Files Scanned**: {count_of_tf_files}
**Findings**: {total_issues_found}

---

### 🚨 Critical Issues (P0)
{List all critical security vulnerabilities that need immediate action}

### ⚠️ High Priority Issues (P1)
{Security misconfigurations that should be fixed soon}

### 💡 Medium Priority Issues (P2)
{Best practices and compliance recommendations}

### ✅ Low Priority / Informational
{Minor improvements and style suggestions}

---

### 💰 Cost Analysis & Estimates

**Estimated Monthly Infrastructure Cost**: ${estimated_total_cost}

#### Top 5 Most Expensive Resources/Services
1. **{lambda_function_name}** (Lambda/Azure Function): ~${monthly_cost}/month
   - Invocations: {count}M/month
   - Avg Duration: {ms}ms, Memory: {mb}MB
   - Cost Driver: {high_invocation_rate / over_provisioned_memory / long_duration}

2. **{dynamodb_table}** (DynamoDB/Cosmos DB): ~${monthly_cost}/month
   - Capacity Mode: {provisioned/on-demand}
   - Read/Write Units: {rcu}/{wcu}
   - Cost Driver: {over_provisioned_capacity / high_storage}

3. **{api_gateway}** (API Gateway/API Management): ~${monthly_cost}/month
   - Requests: {count}M/month
   - Cost Driver: {high_request_volume / missing_caching}

4. **{cloudwatch_logs}** (CloudWatch/Log Analytics): ~${monthly_cost}/month
   - Log Ingestion: {gb}GB/month
   - Cost Driver: {verbose_logging / no_retention_policy}

5. **{data_transfer}** (Data Transfer): ~${monthly_cost}/month
   - Transfer Volume: {gb}GB/month
   - Cost Driver: {inter_region_calls / inefficient_data_flow}

#### 💡 Cost Optimization Opportunities
| Resource | Current Cost | Potential Savings | Recommendation |
|----------|-------------|-------------------|----------------|
| {lambda_function} | ${current}/mo | ${savings}/mo | Reduce memory from 1024MB to 512MB; optimize cold starts |
| {dynamodb_table} | ${current}/mo | ${savings}/mo | Switch from provisioned to on-demand mode for variable workloads |
| {cloudwatch_logs} | ${current}/mo | ${savings}/mo | Set 7-day retention for debug logs, 30 days for app logs |
| {sqs_queue} | ${current}/mo | ${savings}/mo | Implement batching to reduce Lambda invocations by 80% |
| {step_function} | ${current}/mo | ${savings}/mo | Use Express workflows for high-volume, short-duration tasks |

**Total Potential Monthly Savings**: ~${total_savings}/month (${percentage}% reduction)

#### Cost Breakdown by Category
- **Serverless Compute** (Lambda/Functions): ${compute_cost}/month ({percentage}%)
  - Total invocations: {count}M/month
  - Avg duration: {ms}ms, Avg memory: {mb}MB
- 📨 **Messaging** (SQS/SNS/EventGrid/ServiceBus): ${messaging_cost}/month ({percentage}%)
  - Total messages: {count}M/month
- 🌐 **API Gateway/Management**: ${api_cost}/month ({percentage}%)
  - Total requests: {count}M/month
- 💾 **Storage** (S3/Blob/DynamoDB): ${storage_cost}/month ({percentage}%)
- 📊 **Orchestration** (Step Functions/Logic Apps): ${orchestration_cost}/month ({percentage}%)
- 📈 **Observability** (CloudWatch/App Insights): ${observability_cost}/month ({percentage}%)
- 🌍 **Data Transfer**: ${transfer_cost}/month ({percentage}%)
- 🔐 **Security & Secrets**: ${security_cost}/month ({percentage}%)

#### ⚠️ Cost Risk Flags
- Functions without memory optimization (over/under-provisioned)
- High cold start rates increasing duration costs
- CloudWatch logs without retention policies (unlimited growth)
- Missing SQS/SNS message batching
- Synchronous invocations that could be async
- No reserved capacity for predictable workloads
- DynamoDB tables in provisioned mode with low utilization
- Missing S3/Blob lifecycle policies
- Excessive inter-region data transfer
- Development environments without usage limits

---

### 📈 Security Score
**Overall Score**: {calculate_score}/100

**Score Breakdown**:
- Secrets Management: {score}/20
- Network Security: {score}/20
- Encryption: {score}/20
- IAM/RBAC: {score}/20
- Compliance: {score}/20

---

### 🎯 Top 3 Recommended Actions
1. {most_critical_action}
2. {second_critical_action}
3. {third_critical_action}

---

### 🔐 Expert Security Advice
{One expert Terraform security tip with emoji}

---

### 💡 Expert Cost Optimization Tip
{One expert serverless cost optimization tip with emoji - e.g., Lambda memory tuning, SQS batching, DynamoDB capacity modes, etc.}

---

### 📚 Resources
- [Azure Security Best Practices](https://docs.microsoft.com/azure/security/)
- [Terraform Security Documentation](https://www.terraform.io/docs/language/values/sensitive.html)
- [CIS Azure Foundations Benchmark](https://www.cisecurity.org/benchmark/azure)
- [AWS Lambda Pricing](https://aws.amazon.com/lambda/pricing/)
- [Azure Functions Pricing](https://azure.microsoft.com/pricing/details/functions/)
- [AWS Serverless Cost Optimization](https://aws.amazon.com/blogs/compute/operating-lambda-performance-optimization-part-2/)
- [Azure Cost Management Best Practices](https://docs.microsoft.com/azure/cost-management-billing/)
- [Serverless Cost Calculator](https://cost-calculator.bref.sh/)
- [AWS Cost Explorer](https://aws.amazon.com/aws-cost-management/aws-cost-explorer/)
```
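
The template's score breakdown (five categories worth 20 points each, 100 in total) leaves the scoring rule open. One possible rule, sketched below, starts every category at 20 and deducts points per finding. The category keys and the `PENALTY` weights are hypothetical choices for illustration, not values defined by this workflow.

```python
CATEGORIES = ("secrets", "network", "encryption", "iam", "compliance")
MAX_PER_CATEGORY = 20

# Hypothetical deduction per finding, by severity.
PENALTY = {"critical": 10, "high": 5, "medium": 2, "low": 1}


def security_score(findings):
    """Compute (overall_score, per_category_breakdown).

    `findings` is an iterable of (category, severity) pairs.
    """
    breakdown = {category: MAX_PER_CATEGORY for category in CATEGORIES}
    for category, severity in findings:
        # Clamp at zero so one noisy category cannot go negative.
        breakdown[category] = max(0, breakdown[category] - PENALTY[severity])
    return sum(breakdown.values()), breakdown
```

Clamping each category at zero keeps the overall score within 0 to 100 and prevents a single category from dragging the total down by more than its 20 points.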

## 🎨 Style Guidelines

- Use clear severity levels: 🚨 CRITICAL, ⚠️ HIGH, 💡 MEDIUM, ℹ️ LOW
- Include file paths and line numbers for each finding
- Provide actionable remediation steps, not just problems
- Reference specific Terraform resources by name
- Link to relevant documentation for fixes
- Keep tone professional but helpful
- Use emojis sparingly for visual hierarchy

## ⚡ Process

1. **Scan Repository**: Read all `.tf`, `.tfvars`, and `.tf.json` files
2. **Analyze Code**: Check against all security criteria listed above
3. **Calculate Costs**: Estimate monthly infrastructure costs based on resource configurations
4. **Identify Savings**: Find cost optimization opportunities
5. **Calculate Score**: Assign severity and compute security score
6. **Generate Report**: Create detailed issue with findings and cost analysis
7. **Prioritize Actions**: List top 3 most important fixes
8. **Add Context**: Include expert advice and relevant resources
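
The first step of the process, collecting the files to read, might look like the sketch below. The `find_terraform_files` helper is hypothetical, and skipping the `.terraform` directory (the local provider and module cache) is an assumption rather than something this workflow specifies.

```python
from pathlib import Path

# File suffixes named in the "Scan Repository" step.
TERRAFORM_SUFFIXES = (".tf", ".tfvars", ".tf.json")


def find_terraform_files(root):
    """Return sorted paths of Terraform files under `root`."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file()
        and path.name.endswith(TERRAFORM_SUFFIXES)
        # Assumption: anything under a .terraform directory is cached,
        # third-party content and is not part of the repository's own code.
        and ".terraform" not in path.relative_to(root).parts
    )
```

The length of the returned list can feed the "Files Scanned" field of the report.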

## 🎯 Success Criteria

- All Terraform files analyzed
- Security issues categorized by severity
- Specific file/line references provided
- Actionable remediation steps included
- Security score calculated
- Monthly cost estimate provided
- Cost optimization opportunities identified with potential savings
- Top 5 most expensive resources highlighted
- Cost breakdown by category included
- Expert security and cost recommendations provided
