- July 19, 2019: Capital One confirms that a former AWS employee used a misconfigured WAF rule in its infrastructure to mount an SSRF attack against the EC2 metadata service (the intrusion itself dated to March 2019)
- The IAM role attached to the WAF has wildcard `s3:Get*` and `s3:List*` permissions across 700+ buckets
- Attacker exfiltrates 106 million records: names, addresses, credit scores, 140,000 Social Security numbers
- Settlement + remediation: ~$190 million
- The vulnerable WAF, the over-privileged IAM role, and the exposed metadata endpoint were all declared in Terraform, and never scanned
- By 2020 Capital One had Checkov in their pipeline. Two years and $190M too late

Think: Lecture 5 covered SAST scanning your application code. What scans your infrastructure code before `terraform apply` lights it on fire?

| # | Outcome |
|---|---|
| 1 | Define Infrastructure-as-Code and explain why it created a new class of vulnerability |
| 2 | Recognize the top IaC misconfiguration categories (CIS / NIST mapping) |
| 3 | Run Checkov against Terraform + Pulumi and read its JSON output |
| 4 | Run KICS against an Ansible playbook and triage the Rego-based findings |
| 5 | Explain why tfsec was retired and where IaC scanning lives in Trivy today |
| 6 | Write a custom Checkov policy in YAML for a project-specific rule |

```mermaid
graph LR
    L4["L4 CI/CD<br/>Pipelines"] --> L6
    L5["L5 SAST/DAST<br/>App code"] --> L6
    L6["L6 IaC scan<br/>Infra code (here)"] --> L7["L7 Container<br/>Image scan"]
    L6 -.feeds.-> L9["L9 Policy-as-Code<br/>at admission"]
    style L6 fill:#FF9800,color:#fff
```

- Building on L4 (CI/CD): IaC scanning runs as a pipeline stage, with the same gates and new file types
- Building on L5 (SAST): same idea (analyze static text), but the language is HCL/YAML/Python-Pulumi and the bugs are misconfigurations, not memory corruption
- Setting up L9: Conftest/Rego in Lecture 9 reuses the policy-as-code idea you meet here

"Treat infrastructure the same way you treat application code: version it, test it, review it, deploy it from a pipeline." (Kief Morris, Infrastructure as Code, O'Reilly, 2nd ed., 2020)

| Tool | Language | Model | Origin |
|---|---|---|---|
| Terraform / OpenTofu | HCL | Declarative, state-driven | HashiCorp 2014; OpenTofu fork September 2023 |
| Pulumi | Python/TS/Go/.NET | Declarative via real code | Joe Duffy + team, 2017 |
| Ansible | YAML | Imperative push (SSH) | Michael DeHaan, 2012, acquired by Red Hat 2015 |
| CloudFormation | YAML/JSON | AWS-native declarative | AWS, 2011 |
| Helm | Templated YAML | K8s package manager | 2016 (Deis); CNCF Graduated 2020 |

- All five are static text files. Every one of them can be scanned before it ships.

```mermaid
flowchart LR
    Dev["Developer"] -->|git push| Repo["Git repo"]
    Repo -->|terraform apply| Cloud["Cloud provider"]
    Cloud -->|provisions| Resource["S3 bucket, IAM role, SG..."]
    DevMistake["Typo: '0.0.0.0/0'"] -.-> Repo
    Resource -.-> Internet["World-readable"]
    style DevMistake fill:#F44336,color:#fff
    style Internet fill:#F44336,color:#fff
```

- Mistakes that used to be one engineer × one resource are now one git push × N replicas
- IBM's 2024 Cost of a Data Breach report attributes ~45% of cloud breaches to misconfiguration, more than any other root cause
- The whole point of IaC scanning: catch the typo before `terraform apply` does it to 200 buckets

Think: Lecture 5's SAST checks application code. IaC scanning checks infrastructure code. Same shift-left philosophy; different file type.
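
To make "catch the typo" concrete, here is a minimal Python sketch of the kind of check a scanner applies once the IaC file has been parsed. The rule dictionaries and the `open_admin_ingress` helper are hypothetical, invented for illustration; real scanners such as Checkov work on a much richer parsed model of the HCL.

```python
from typing import Iterator

ADMIN_PORTS = {22, 3389}            # SSH and RDP
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}  # "the whole internet" in IPv4 and IPv6


def open_admin_ingress(rules: list[dict]) -> Iterator[dict]:
    """Yield ingress rules that expose an admin port to any source address."""
    for rule in rules:
        ports = range(rule["from_port"], rule["to_port"] + 1)
        exposed = OPEN_CIDRS & set(rule.get("cidr_blocks", []))
        if exposed and any(port in ports for port in ADMIN_PORTS):
            yield rule


# Hypothetical rules, shaped roughly like a parsed security group.
rules = [
    {"name": "web", "from_port": 443, "to_port": 443, "cidr_blocks": ["0.0.0.0/0"]},
    {"name": "ssh", "from_port": 22, "to_port": 22, "cidr_blocks": ["0.0.0.0/0"]},
    {"name": "ssh-office", "from_port": 22, "to_port": 22, "cidr_blocks": ["203.0.113.0/24"]},
]

flagged = [rule["name"] for rule in open_admin_ingress(rules)]
print(flagged)
```

Because the check reads the declaration rather than the live cloud, it runs before any of the N replicas exist.
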
These are the categories every scanner ships rules for. Memorize them: they make the news.

| Category | Typical mistake | Mitigation |
|---|---|---|
| Public network exposure | `cidr_blocks = ["0.0.0.0/0"]` on SSH/RDP | Restrict CIDR or use bastion/SSM |
| Hard-coded secrets | `password = "admin123"` in HCL | Vault / cloud secret manager (links to L3) |
| Public storage | S3 bucket without an `aws_s3_bucket_public_access_block` | Default-deny ACL + bucket policy |
| Over-privileged IAM | `"Action": "*"` / `"Resource": "*"` | Least privilege + permission boundaries |
| Unencrypted at rest | EBS/RDS/S3 without encryption enabled (e.g. `encrypted = true` on EBS, `storage_encrypted = true` on RDS) | Encrypt-by-default + customer-managed keys |
| No logging | CloudTrail/VPC flow logs disabled | Centralized log destination + retention SLA |
| Cross-account trust | `Principal = "*"` in resource policy | Specific account IDs only |
| Old TLS | `min_tls_version = "1.0"` | TLS 1.2+ enforced |

- These map directly to the CIS Benchmarks (Center for Internet Security) and NIST SP 800-53 controls; every scanner ships them as rule IDs like `CKV_AWS_19`
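
As an illustration of the over-privileged IAM category, the following Python sketch flags `Allow` statements that use a bare `*`, a service-wide action such as `s3:*`, or a `*` resource. The function name and the sample policy are hypothetical; this is a simplified approximation of what scanner rules do, not a reproduction of any specific Checkov rule.

```python
import json


def wildcard_statements(policy_json: str) -> list[dict]:
    """Return Allow statements with a wildcard action or a '*' resource."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM also accepts a single statement object
        statements = [statements]

    def as_list(value):
        return value if isinstance(value, list) else [value]

    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = as_list(stmt.get("Action", []))
        resources = as_list(stmt.get("Resource", []))
        broad_action = any(a == "*" or a.endswith(":*") for a in actions)
        if broad_action or "*" in resources:
            flagged.append(stmt)
    return flagged


# Hypothetical policy: the first statement mirrors the Capital One pattern.
SAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "TooBroad", "Effect": "Allow",
         "Action": ["s3:Get*", "s3:List*"], "Resource": "*"},
        {"Sid": "Scoped", "Effect": "Allow",
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})

for stmt in wildcard_statements(SAMPLE_POLICY):
    print(stmt["Sid"])
```
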

```mermaid
graph TB
    subgraph "Active in 2026"
        CK["Checkov<br/>Bridgecrew/Palo Alto<br/>3.x, 2,500+ rules"]
        KI["KICS<br/>Checkmarx<br/>2,400+ Rego queries"]
        TV["Trivy IaC mode<br/>Aqua<br/>via 'trivy config'"]
        TR["Terrascan<br/>Tenable"]
    end
    subgraph "Retired"
        TS["tfsec<br/>archived, merged into Trivy<br/>Feb 2023"]
    end
    style CK fill:#4CAF50,color:#fff
    style KI fill:#4CAF50,color:#fff
    style TV fill:#4CAF50,color:#fff
    style TS fill:#9E9E9E,color:#fff
```

- tfsec is dead. Aqua consolidated tfsec into Trivy in February 2023. Last release v1.28.14 was a dependency-CVE fix only. New scans go through `trivy config <path>`: same rule heritage, broader format coverage
- This course pins Checkov 3.x for Task 1 (Terraform + Pulumi) and KICS for Task 2 (Ansible). Both are free and open source, and together they represent the two dominant rule-language families
- Built by Bridgecrew (acquired by Palo Alto Networks, March 2021); open-sourced 2019
- Written in Python (`pip install checkov`); ships rules in YAML + Python
- Latest major: Checkov 3.x (2026), with 2,500+ built-in policies and 800+ graph-based checks
- Scans: Terraform, OpenTofu, CloudFormation, Kubernetes, Helm, Dockerfile, GitHub Actions, ARM, Bicep, OpenAPI, Pulumi, Ansible (basic)

```bash
# Quick start used in the lab
pip install checkov
# Two output formats in one run; result files land in the results/ directory
checkov -d ./terraform/ --output cli --output json --output-file-path results
```

| Flag | Meaning |
|---|---|
| `--output cli` | Human-readable, colored summary |
| `--output json` | Machine-readable, importable to DefectDojo (L10) |
| `--output sarif` | GitHub Code Scanning format |
| `--skip-check CKV_AWS_19` | Skip a rule (justify in PR description) |

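
Reading the JSON output programmatically can be sketched as follows. This assumes the report layout I understand Checkov to emit (a single object for one framework, a list of objects for several, each with `results.failed_checks`); `SAMPLE` is a hand-written stand-in for a real report, and `failed_checks` is a hypothetical helper name.

```python
import json


def failed_checks(report_text: str) -> list[dict]:
    """Collect failed checks from a Checkov JSON report (object or list of objects)."""
    report = json.loads(report_text)
    reports = report if isinstance(report, list) else [report]
    failed = []
    for framework_report in reports:
        failed.extend(framework_report.get("results", {}).get("failed_checks", []))
    return failed


# Hand-written sample; field names follow the assumed Checkov JSON layout.
SAMPLE = json.dumps({
    "check_type": "terraform",
    "results": {
        "passed_checks": [],
        "failed_checks": [{
            "check_id": "CKV_AWS_18",
            "check_name": "Ensure the S3 bucket has access logging enabled",
            "resource": "aws_s3_bucket.user_uploads",
            "file_path": "/modules/storage/main.tf",
            "file_line_range": [14, 22],
        }],
    },
    "summary": {"passed": 0, "failed": 1},
})

for finding in failed_checks(SAMPLE):
    start, end = finding["file_line_range"]
    print(f'{finding["check_id"]} {finding["resource"]} {finding["file_path"]}:{start}-{end}')
```

In the lab you would pass the contents of the JSON file Checkov wrote instead of `SAMPLE`.
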
- Each finding ships with fix guidance. Checkov is one of the few scanners that points you at a remediation line, not just a problem

```text
Check: CKV_AWS_18: "Ensure the S3 bucket has access logging enabled"
    FAILED for resource: aws_s3_bucket.user_uploads
    File: /modules/storage/main.tf:14-22
    Guide: https://docs.bridgecrew.io/docs/s3_13-enable-logging
```

| Element | Meaning |
|---|---|
| `CKV_AWS_18` | Stable rule ID: use it for suppress lists |
| FAILED | One of PASSED / FAILED / SKIPPED |
| Resource | The HCL block that triggered |
| File + line | Exact remediation location |
| Guide | Bridgecrew's narrative explanation |

- Critical reading skill: when reviewing a Checkov report, sort by rule ID frequency first. One missing default in a module replicates as 30 findings; fix the module, fix all 30
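
The "sort by rule ID frequency" habit is easy to automate. The sketch below counts findings per rule and per (rule, file) pair so that a module-level root cause stands out; the finding dictionaries are hypothetical and only borrow the `check_id`, `file_path`, and `resource` field names from Checkov's report.

```python
from collections import Counter


def triage(findings: list[dict]):
    """Return (rule, count) and ((rule, file), count) lists, most frequent first."""
    by_rule = Counter(f["check_id"] for f in findings)
    by_rule_and_file = Counter((f["check_id"], f["file_path"]) for f in findings)
    return by_rule.most_common(), by_rule_and_file.most_common()


# Hypothetical findings: one module instantiated three times, plus one ad-hoc bucket.
findings = [
    {"check_id": "CKV_AWS_18", "file_path": "/modules/storage/main.tf",
     "resource": f"module.app{i}.aws_s3_bucket.this"}
    for i in range(3)
] + [
    {"check_id": "CKV_AWS_19", "file_path": "/envs/prod/main.tf",
     "resource": "aws_s3_bucket.adhoc"},
]

by_rule, by_module = triage(findings)
print(by_rule[0])    # the rule that fires most often
print(by_module[0])  # the file where that rule concentrates
```

When the top (rule, file) pair points into a shared module, one fix there closes every replicated finding.
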
- Built by Checkmarx, open-sourced November 2020; written in Go
- Rules in Rego (same language as OPA, so directly relevant to Lecture 9)
- Latest stable: 2.x (last release March 2025), with 2,400+ Rego queries
- Scans: Terraform, K8s, Ansible, Docker/Compose, CloudFormation, OpenAPI, Helm, Bicep, Pulumi, Crossplane, GitHub Workflows, gRPC

```bash
# Used for Task 2 (Ansible) in the lab
docker run -v "$PWD:/path" checkmarx/kics:latest \
    scan -p /path/ansible/ -o /path/results --report-formats json,sarif
```

- Checkov vs KICS: when to use which?
  - Checkov has deeper Terraform-specific checks (graph relationships across resources)
  - KICS has wider language coverage (Ansible, Helm templates, OpenAPI) and a uniform Rego rule format, so once you can write a custom rule for one input type, the others follow the same pattern
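
For Task 2 triage, a small Python sketch can flatten a KICS JSON report into rows sorted by severity. The layout used here (a top-level `queries` list whose entries carry `query_name`, `severity`, and a `files` list with `file_name` and `line`) is my assumption about the report format, and the sample queries are invented; check a real `results.json` before relying on it.

```python
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO", "TRACE"]
RANK = {name: index for index, name in enumerate(SEVERITY_ORDER)}


def kics_findings(report: dict) -> list[tuple[str, str, str, int]]:
    """Flatten a KICS report into (severity, query, file, line) rows, worst first."""
    rows = []
    for query in report.get("queries", []):
        for hit in query.get("files", []):
            rows.append((query["severity"], query["query_name"],
                         hit["file_name"], hit["line"]))
    # Unknown severities sort last instead of raising.
    rows.sort(key=lambda row: RANK.get(row[0], len(RANK)))
    return rows


# Invented sample in the assumed layout.
SAMPLE_REPORT = {
    "queries": [
        {"query_name": "Example low-severity query", "severity": "LOW",
         "files": [{"file_name": "ansible/site.yml", "line": 40}]},
        {"query_name": "Example high-severity query", "severity": "HIGH",
         "files": [{"file_name": "ansible/site.yml", "line": 12},
                   {"file_name": "ansible/roles/db/tasks/main.yml", "line": 7}]},
    ],
}

for severity, query, file_name, line in kics_findings(SAMPLE_REPORT):
    print(f"{severity:8} {file_name}:{line}  {query}")
```
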

```mermaid
flowchart LR
    R["Policy in Rego/YAML/Python"] --> S["Scanner"]
    F["Your IaC file"] --> S
    S --> O["Allowed / Denied + reason"]
    style R fill:#9C27B0,color:#fff
    style S fill:#FF9800,color:#fff
```

- Policy-as-Code = your security rules live in version control, are reviewed in PRs, and execute deterministically in CI
- Both Checkov and KICS implement this; the same idea powers Conftest (Lecture 9) and Gatekeeper (admission control)
- Bonus task in Lab 6 asks you to write a Checkov custom policy, your first Policy-as-Code rule
- This lecture introduces PaC for IaC; Lecture 9 expands it to runtime admission control. Don't try to do both in your head
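
The policy, scanner, and verdict flow can be sketched in a few lines of Python. Everything here is hypothetical (the `Policy` dataclass, the `ORG_TLS_1` rule ID, the `example_load_balancer` resource type); the point is only that the rule is data under version control and the evaluation is a pure, repeatable function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    id: str
    resource_type: str
    attribute: str
    allowed: frozenset
    reason: str


def evaluate(policies: list[Policy], resource: dict) -> list[tuple[str, str]]:
    """Return (policy id, reason) for every policy the resource violates."""
    denials = []
    for policy in policies:
        if resource["type"] != policy.resource_type:
            continue
        if resource["config"].get(policy.attribute) not in policy.allowed:
            denials.append((policy.id, policy.reason))
    return denials


# Hypothetical project rule: only modern TLS versions are acceptable.
POLICIES = [
    Policy(id="ORG_TLS_1", resource_type="example_load_balancer",
           attribute="min_tls_version", allowed=frozenset({"1.2", "1.3"}),
           reason="TLS 1.2+ must be enforced"),
]

legacy = {"type": "example_load_balancer", "config": {"min_tls_version": "1.0"}}
modern = {"type": "example_load_balancer", "config": {"min_tls_version": "1.2"}}

print(evaluate(POLICIES, legacy))  # one denial with its reason
print(evaluate(POLICIES, modern))  # no denials
```

A missing attribute is treated as a violation here, which is a deliberate fail-closed choice rather than something the slides prescribe.
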

```yaml
metadata:
  id: "CKV2_CUSTOM_1"
  name: "Ensure S3 buckets have lifecycle policy"
  category: "BACKUP_AND_RECOVERY"
  severity: "MEDIUM"
definition:
  and:
    - cond_type: "filter"
      attribute: "resource_type"
      value: ["aws_s3_bucket"]
      operator: "within"
    - cond_type: "connection"
      resource_types: ["aws_s3_bucket_lifecycle_configuration"]
      connected_resource_types: ["aws_s3_bucket"]
      operator: "exists"
```

| Section | What it does |
|---|---|
| `metadata.id` | `CKV2_*` prefix for graph (cross-resource) rules; `CKV_*` for single-resource |
| `category` | Used for Checkov's default policy groups |
| `definition.and` | All conditions must hold; supports `or`, `not` |
| `cond_type: connection` | "Is this resource referenced by another?" This is the graph engine |

- This is exactly the bonus task in the lab. Read it twice; we'll write one together in office hours
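
To build intuition for the `connection` / `exists` condition, here is a plain-Python emulation of the same question: which buckets have no lifecycle configuration pointing at them? It does not use Checkov's engine. The flat `resources` dictionary and the `bucket_ref` key are hypothetical simplifications of the resource graph Checkov builds from Terraform references.

```python
LIFECYCLE_PREFIX = "aws_s3_bucket_lifecycle_configuration."
BUCKET_PREFIX = "aws_s3_bucket."  # trailing dot keeps other aws_s3_bucket_* types out


def buckets_without_lifecycle(resources: dict[str, dict]) -> list[str]:
    """Return bucket addresses that no lifecycle configuration refers to."""
    covered = {
        conf["bucket_ref"]
        for address, conf in resources.items()
        if address.startswith(LIFECYCLE_PREFIX)
    }
    return sorted(
        address for address in resources
        if address.startswith(BUCKET_PREFIX) and address not in covered
    )


# Hypothetical parsed module: two buckets, one lifecycle configuration.
resources = {
    "aws_s3_bucket.logs": {},
    "aws_s3_bucket.uploads": {},
    "aws_s3_bucket_lifecycle_configuration.logs": {"bucket_ref": "aws_s3_bucket.logs"},
}

print(buckets_without_lifecycle(resources))
```

The YAML policy expresses the same two steps declaratively: the `filter` condition selects the buckets, and the `connection` condition demands an edge to a lifecycle configuration.
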

```yaml
# .github/workflows/iac-scan.yml (extends what you built in L4)
name: IaC Scan
on: [pull_request]
permissions:
  contents: read
  security-events: write   # needed by upload-sarif
jobs:
  checkov:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@b4ffde6...
      - uses: bridgecrewio/checkov-action@v12   # pin to a commit SHA in real life
        with:
          directory: terraform/
          framework: terraform
          output_format: cli,sarif
          output_file_path: console,results.sarif
      - uses: github/codeql-action/upload-sarif@v3
        if: success() || failure()   # upload findings even when Checkov fails the job
        with:
          sarif_file: results.sarif
```

- Three layers, same pipeline:
  - PR scan (this job): fail the PR on HIGH+
  - Nightly full scan on `main`: catches new rules added to the scanner
  - Drift detection (Lecture 9 covers this): compares declared state to actual cloud
- `continue-on-error: true` is a smell. Recall from Lecture 4: if you can't fail the build, you're not gating, you're decorating
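
The "fail the PR on HIGH+" rule boils down to an exit code. Below is a minimal Python sketch of such a gate; the `gate` function and its fail-closed handling of findings without a severity are my own assumptions (open-source Checkov runs may leave `severity` empty), not behavior taken from the Checkov action.

```python
SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def gate(findings: list[dict], threshold: str = "HIGH", unknown_blocks: bool = True) -> int:
    """Return a process exit code: 1 if any finding meets the threshold, else 0."""
    limit = SEVERITY_RANK[threshold]
    for finding in findings:
        severity = finding.get("severity")
        if severity is None:
            # No severity recorded: block by default rather than wave it through.
            if unknown_blocks:
                return 1
            continue
        if SEVERITY_RANK.get(severity.upper(), 0) >= limit:
            return 1
    return 0


# Hypothetical findings for two pull requests.
pr_ok = [{"check_id": "CKV_AWS_18", "severity": "MEDIUM"}]
pr_blocked = [{"check_id": "CKV_AWS_18", "severity": "MEDIUM"},
              {"check_id": "CKV_AWS_19", "severity": "HIGH"}]

print(gate(pr_ok), gate(pr_blocked))
```

In a CI script the return value would be passed to `sys.exit`, so a blocked PR actually fails the job.
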
- Pulumi programs are real code (Python/TS/Go/.NET) that emits a declarative state graph
- Static analyzers can scan two layers:
  - The source code (looks like normal Python, so SAST tools see it as Python)
  - The rendered state (`pulumi preview --json`), which is what will actually be created
- Checkov scans the rendered state (`pulumi preview` JSON), not your TypeScript directly, which is exactly right, because IaC misconfigs live in the resource graph, not the loop that built it

"Pulumi's superpower is that you write infrastructure in your favorite language. Pulumi's superpower is also that you can write a `for` loop that provisions 500 buckets." (paraphrasing the Pulumi team at KubeCon 2023)
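
Scanning the rendered state rather than the loop can be illustrated with a short Python sketch. The document shape used here (`steps`, each with `op`, `urn`, and `newState.type` / `newState.inputs`) is my assumption about what `pulumi preview --json` emits, and `build_preview` fabricates such a document instead of calling Pulumi; verify the layout against real output before reusing this.

```python
PLANNED_OPS = {"create", "update", "replace"}
PUBLIC_ACLS = {"public-read", "public-read-write"}


def build_preview(bucket_count: int) -> dict:
    """Fake a preview document: the 'for loop' from the quote, already rendered."""
    steps = []
    for i in range(bucket_count):
        steps.append({
            "op": "create",
            "urn": f"urn:pulumi:dev::demo::aws:s3/bucket:Bucket::data-{i}",
            "newState": {
                "type": "aws:s3/bucket:Bucket",
                # One bucket in the loop is misconfigured on purpose.
                "inputs": {"acl": "public-read" if i == 2 else "private"},
            },
        })
    return {"steps": steps}


def public_buckets(preview: dict) -> list[str]:
    """Return URNs of planned S3 buckets whose ACL input is public."""
    urns = []
    for step in preview.get("steps", []):
        state = step.get("newState") or {}
        if step.get("op") not in PLANNED_OPS:
            continue
        if state.get("type") != "aws:s3/bucket:Bucket":
            continue
        if state.get("inputs", {}).get("acl") in PUBLIC_ACLS:
            urns.append(step["urn"])
    return urns


preview = build_preview(5)
print(public_buckets(preview))
```

The check never sees the loop, only the five concrete resources it produced, which is why the same logic works regardless of the Pulumi language.
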
- February 20, 2018: RedLock researchers find a Kubernetes admin console on Tesla's AWS, internet-exposed, no authentication
- Attackers had been using it to mine Monero, dialing CPU to stay under radar
- Root cause: the cluster's admin console was reachable from the internet without authentication. In today's Terraform terms, that is an EKS cluster declared with `endpoint_public_access = true` and no IAM auth configuration (EKS itself only reached general availability in June 2018)
- A Checkov scan (rule `CKV_AWS_38` or equivalent today) would have flagged the public endpoint
- Tesla's response was fast; the deeper lesson is how easy this is to ship. Every EKS module's first version since 2018 has defaulted to private; the rule exists because the default wasn't private
- October 2019: Imperva discloses a 2018 breach traced to a misconfigured snapshot
- A pre-prod database snapshot is created with an embedded AWS API key
- The snapshot's S3 bucket lacked default-deny ACL; attacker enumerates and exfiltrates customer email + hashed passwords
- Two IaC rules would have caught this:
  - `CKV_AWS_18` (S3 logging): would have shown the access
  - `CKV_AWS_56` (S3 public access block): would have prevented the access
- Imperva is a security company. No one is immune to misconfiguration. This is precisely why the scanner runs in CI, not in someone's head
The first scan of a real codebase will find hundreds of issues. A rule of thumb for rolling out the program (it matches Lecture 5's SAST triage):

| Phase | What you do | Timeline |
|---|---|---|
| 0. Baseline | Scan, count by severity, don't fix yet | Day 1 |
| 1. Triage | Sort by rule ID frequency; group by module | Week 1 |
| 2. Module fixes | Fix the top 5 modules, which typically clears 60-80% of findings | Weeks 2-3 |
| 3. Gate | Add Checkov to PR; fail on HIGH+ new findings only | Week 4 |
| 4. Burndown | Suppress existing findings with explicit expiry; track in DefectDojo (L10) | Ongoing |

- Don't try to fix everything in week one. A blocked CI on day 2 makes the security team an obstacle, not a partner
- The gate-on-new pattern (also called "diff scanning" or "delta gating") is how mature programs avoid bankruptcy. Same discipline you saw in SAST (Lecture 5)
- Module ownership pattern: platform team ships hardened modules (e.g. `s3_secure`) that wrap raw providers with safe defaults; application teams consume modules, not raw resources
- Pre-commit hook (extending L3): run `checkov -d . --quiet` before commit; same scanner, earlier
- Drift detection is the topic of L9. A scanner only checks what you declared; cloud changes can still happen out-of-band (root console, mythical Tuesday 4pm hotfix)

Think: A scanner can prove your IaC is safe. Can a scanner prove your cloud is safe? (Trick: only if it also reads live cloud state, which Trivy, Checkov, and Prowler now do, but with different trade-offs.)
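
The gate-on-new pattern from the triage plan can be sketched as a set difference between a stored baseline and the current scan. The fingerprint choice (rule ID, file path, resource address, but no line numbers) is an assumption of this sketch, made so that unrelated edits that shift lines do not turn old findings into "new" ones.

```python
def fingerprint(finding: dict) -> tuple[str, str, str]:
    """Identify a finding by rule, file, and resource; line numbers are ignored on purpose."""
    return (finding["check_id"], finding["file_path"], finding["resource"])


def new_findings(baseline: list[dict], current: list[dict]) -> list[dict]:
    """Return findings present in the current scan but absent from the baseline."""
    known = {fingerprint(f) for f in baseline}
    return [f for f in current if fingerprint(f) not in known]


# Hypothetical scans: the baseline from Day 1 and the scan of a pull request.
baseline = [
    {"check_id": "CKV_AWS_18", "file_path": "/modules/storage/main.tf",
     "resource": "aws_s3_bucket.user_uploads", "file_line_range": [14, 22]},
]
current = [
    # Same finding as the baseline, shifted down by an unrelated edit.
    {"check_id": "CKV_AWS_18", "file_path": "/modules/storage/main.tf",
     "resource": "aws_s3_bucket.user_uploads", "file_line_range": [20, 28]},
    # Introduced by this pull request.
    {"check_id": "CKV_AWS_56", "file_path": "/envs/prod/main.tf",
     "resource": "aws_s3_bucket.reports", "file_line_range": [3, 9]},
]

introduced = new_findings(baseline, current)
print([f["check_id"] for f in introduced])
```

Only the `introduced` list feeds the PR gate; the baseline findings go to the burndown backlog instead.
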
- Lab 6 (this week):
  - Task 1 (6 pts): Checkov on a Terraform + Pulumi sample with planted misconfigs
  - Task 2 (4 pts): KICS on an Ansible playbook; compare ruleset coverage to Checkov
  - Bonus (2 pts): Write a custom Checkov policy, your first PaC rule
- Lecture 7 (next week): Container & Kubernetes Security, covering Trivy on the Juice Shop image, Pod Security Standards, baseline K8s hardening. The next layer of the stack
- You're now scanning code (L5) and infra (L6); next we scan the artifact itself
Books:

| Book | Why |
|---|---|
| Terraform: Up & Running, Yevgeniy Brikman (O'Reilly, 3rd ed. 2022) | Ch. 10 "Production-Grade Terraform Code" covers module testing + policy |
| Infrastructure as Code, Kief Morris (O'Reilly, 2nd ed. 2020) | Ch. 7 "Configuration Registries" + ch. 11 "Testing Infrastructure" for broad framing |
| Securing DevOps, Julien Vehent (Manning, 2018) | Ch. 3 "Hardening AWS" maps cloud misconfigs to specific scanner rules |
| Pulumi: Continuous Deployment in the Cloud, Will Boyd (Pulumi, free e-book) | Best free intro to scanning Pulumi state graphs |

Talks & specs:

- "Securing Infrastructure as Code", Barak Schoster (Bridgecrew/Checkov), Black Hat 2020
- "From tfsec to Trivy: Consolidating IaC Scanning", Aqua team, KubeCon NA 2023
- CIS Benchmarks: the source of most rules
- Checkov rule index: every `CKV_AWS_*` with description
- KICS query catalogue: all 2,400+ Rego queries

Takeaways:

| # | Insight |
|---|---|
| 1 | IaC turned single-host typos into 200-replica disasters. Scanning is the cheapest insurance. |
| 2 | Misconfiguration is the leading cloud breach cause, and the most automatable to prevent. |
| 3 | Use Checkov for Terraform-heavy estates; KICS for multi-language. Trivy now covers the tfsec heritage. |
| 4 | Fix at the module level, not at the resource level. One bug fix can close 30 findings. |
| 5 | Day-one full scan is a learning exercise. Gate on new is the operational pattern. |
| 6 | Custom policies turn your team's tribal knowledge into a CI-enforced rule. Write the bonus-task policy seriously: it's how programs scale. |

"The cloud is just someone else's computer, and now you're declaring it as text. Read your declarations before AWS does." (paraphrased from too many KubeCon hallway tracks to count)