📌 Lecture 10 – Vulnerability Management: From 1 000 Findings to a Working Program


๐Ÿ“ Slide 1 โ€“ ๐Ÿ“Š The Capstone Lecture

  • ๐Ÿชœ Over the past 9 weeks you've produced finding files from:
    • L4 SBOM/SCA (Grype, Trivy)
    • L5 SAST/DAST (Semgrep, ZAP)
    • L6 IaC scan (Checkov, KICS)
    • L7 Container scan (Trivy image + config)
    • L8 Supply chain (Cosign verification logs)
    • L9 Runtime detection (Falco alerts) + admission (Conftest)
  • 📂 If you ran them all on Juice Shop, you'd see 400+ raw findings
  • 🧠 The hardest engineering question in DevSecOps is not "how do we scan?" but "now what?" Lecture 10 answers it

🤔 Think: Lecture 9 introduced MTTR and vuln-age as program metrics. This lecture is the workflow that produces those numbers. Without it, you have data, not a program.


๐Ÿ“ Slide 2 โ€“ ๐ŸŽฏ Learning Outcomes

# ๐ŸŽ“ Outcome
1 โœ… Walk the vulnerability management lifecycle: Discovery โ†’ Triage โ†’ Remediation โ†’ Reporting โ†’ Improvement
2 โœ… Pick a severity score: when to use CVSS v4.0, when to use EPSS โ€” and why a single score is never enough
3 โœ… Import scanner outputs into DefectDojo and dedupe across tools
4 โœ… Apply an SLA matrix and compute the program metrics that matter (MTTD/MTTR/vuln-age/backlog trend)
5 โœ… Build the interview-ready 5-minute walkthrough of your DevSecOps program

๐Ÿ“ Slide 3 โ€“ ๐Ÿ—บ๏ธ Where Lecture 10 Sits

graph LR
    L4["๐Ÿ“‹ L4 SBOM"] --> L10
    L5["๐Ÿงช L5 SAST/DAST"] --> L10
    L6["๐Ÿ—๏ธ L6 IaC"] --> L10
    L7["๐Ÿ“ฆ L7 Container"] --> L10
    L8["๐Ÿ” L8 Supply chain"] --> L10
    L9["๐Ÿ“Š L9 Runtime+Conftest"] --> L10["๐ŸŽฏ L10 Triage & Program<br/>(here โ€” capstone)"]

    style L10 fill:#FF9800,color:#fff
Loading
  • ๐Ÿชœ Building on every prior lab. L10 is the integration lab โ€” every finding from L4โ€“L9 lands in DefectDojo here
  • ๐ŸŽฏ Lab 10 alignment: Task 1 (DefectDojo setup + import all prior reports), Task 2 (build governance report + program metrics), Bonus (5-minute interview walkthrough โ€” the deliverable an employer will want to hear)

๐Ÿ“ Slide 4 โ€“ ๐Ÿ”„ The Vulnerability Management Lifecycle

flowchart LR
    Disc[๐Ÿ”Ž Discovery] --> Tri[๐Ÿท๏ธ Triage]
    Tri --> Rem[๐Ÿฉน Remediation]
    Rem --> Rep[๐Ÿ“Š Reporting]
    Rep --> Imp[๐Ÿ“ˆ Improvement]
    Imp -.feeds back.-> Disc

    style Disc fill:#2196F3,color:#fff
    style Tri fill:#FF9800,color:#fff
    style Rem fill:#4CAF50,color:#fff
    style Rep fill:#9C27B0,color:#fff
    style Imp fill:#F44336,color:#fff

| 🪜 Phase | 🎯 What happens | ⏱️ Cadence |
|---|---|---|
| Discovery | Scanners run, findings produced | Per PR + nightly |
| Triage | Dedup, severity, ownership, SLA assignment | Daily (security on-call) |
| Remediation | Fix, suppress with reason, or accept with expiry | Per-finding by SLA |
| Reporting | Metrics, audit artifacts, exec dashboard | Weekly/monthly |
| Improvement | Tune rules, expand coverage, mature next OWASP SAMM practice | Quarterly |

  • 🪜 The feedback arrow is the program. A "discovery only" pipeline is a scan farm; the lifecycle is what makes it a program

๐Ÿ“ Slide 5 โ€“ ๐ŸŽš๏ธ Severity: CVSS v4.0 in Brief

๐Ÿ’ฌ "CVSS was never meant to be a single-number priority. It's a severity vocabulary." โ€” FIRST CVSS SIG, 2023 onboarding talk

  • ๐Ÿ—“๏ธ CVSS v4.0 โ€” released November 2023; NVD publishing v4.0 alongside v3.1 since early 2026
  • ๐Ÿงฉ Four metric groups:
    • ๐Ÿ›๏ธ Base โ€” intrinsic + immutable (attack vector, complexity, privileges, impact)
    • ๐ŸŒ Threat โ€” exploit maturity, threat intelligence (replaces older "Temporal")
    • ๐Ÿข Environmental โ€” your asset value, your existing mitigations
    • ๐Ÿ”ฌ Supplemental โ€” non-scored helper context (e.g., "automatable", "safety")
  • ๐ŸŽฏ What this means in practice:
    • Base score alone is one input, not a verdict
    • Combining with environmental (is this asset critical to YOU?) personalizes the score
    • CVSS doesn't predict exploitation โ€” for that, use EPSS
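
To make the four metric groups concrete, here is a minimal Python sketch that splits a CVSS v4.0 vector string by group and maps a numeric score to the qualitative rating bands. The function names `group_metrics` and `severity_rating` are illustrative, not part of any library, and the sketch deliberately does not compute a score.

```python
# Metric abbreviations per group, following the CVSS v4.0 specification.
BASE = {"AV", "AC", "AT", "PR", "UI", "VC", "VI", "VA", "SC", "SI", "SA"}
THREAT = {"E"}
# Environmental = security requirements plus "M"-prefixed modified base metrics.
ENVIRONMENTAL = {"CR", "IR", "AR"} | {"M" + name for name in BASE}
SUPPLEMENTAL = {"S", "AU", "R", "V", "RE", "U"}

GROUPS = {
    "base": BASE,
    "threat": THREAT,
    "environmental": ENVIRONMENTAL,
    "supplemental": SUPPLEMENTAL,
}


def group_metrics(vector: str) -> dict[str, dict[str, str]]:
    """Split a CVSS v4.0 vector string into its four metric groups."""
    prefix, *pairs = vector.split("/")
    if prefix != "CVSS:4.0":
        raise ValueError(f"not a CVSS v4.0 vector: {vector!r}")
    result = {group: {} for group in GROUPS}
    for pair in pairs:
        name, value = pair.split(":", 1)
        for group, names in GROUPS.items():
            if name in names:
                result[group][name] = value
                break
        else:
            raise ValueError(f"unknown metric: {name}")
    return result


def severity_rating(score: float) -> str:
    """Map a 0.0-10.0 score to the qualitative severity rating."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```

Calling `group_metrics` on a vector that carries `E:A` and `CR:H` shows at a glance which parts are intrinsic (Base) and which you supplied yourself (Threat, Environmental).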

๐Ÿ“ Slide 6 โ€“ ๐Ÿ“ˆ EPSS: The Probability of Exploitation

  • ๐Ÿ›๏ธ EPSS = Exploit Prediction Scoring System. Maintained by FIRST (Forum of Incident Response and Security Teams)
  • ๐Ÿงฎ A daily probability score (0.0โ€“1.0) that a given CVE will be exploited in the wild in the next 30 days
  • ๐Ÿชœ Built from a machine-learning model trained on:
    • Public exploit code availability (PoC on GitHub, ExploitDB)
    • Known CVE chatter on social media
    • Real exploit telemetry from large IDS/IPS networks
  • ๐Ÿ“Š Distribution: ~95% of CVEs have EPSS < 0.10 (likely never exploited). The 5% with EPSS > 0.50 are where the action is

๐Ÿค” Think: Your scanner returned 100 CVSS-9 findings. EPSS shows 95 of them at <0.05 and 5 at >0.80. Which five do you fix this week?
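
The Think exercise can be sketched in a few lines of Python. The payload below is made up and only modeled on the general shape of a FIRST EPSS API response (scores arrive as strings); the CVE IDs are placeholders, and `epss_by_cve` and `fix_first` are hypothetical helper names.

```python
import json

# Illustrative payload, loosely shaped like an EPSS API response.
# Every value here is invented for the example.
SAMPLE = json.loads("""
{"status": "OK", "data": [
  {"cve": "CVE-0000-0001", "epss": "0.91", "percentile": "0.99"},
  {"cve": "CVE-0000-0002", "epss": "0.02", "percentile": "0.40"},
  {"cve": "CVE-0000-0003", "epss": "0.83", "percentile": "0.98"},
  {"cve": "CVE-0000-0004", "epss": "0.04", "percentile": "0.55"}
]}
""")


def epss_by_cve(payload: dict) -> dict[str, float]:
    """Index EPSS scores by CVE ID, converting the string scores to floats."""
    return {row["cve"]: float(row["epss"]) for row in payload["data"]}


def fix_first(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return CVEs at or above the threshold, most likely exploited first."""
    hot = [cve for cve, score in scores.items() if score >= threshold]
    return sorted(hot, key=lambda cve: scores[cve], reverse=True)


print(fix_first(epss_by_cve(SAMPLE)))
```

The 0.5 threshold is an assumption chosen to match the ">0.50" cut used elsewhere in this lecture; tune it to your own risk appetite.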


๐Ÿ“ Slide 7 โ€“ ๐ŸŽฏ CVSS + EPSS = Modern Prioritization

quadrantChart
    title ๐Ÿชœ The 2x2 prioritization matrix
    x-axis Low EPSS --> High EPSS
    y-axis Low CVSS --> High CVSS
    quadrant-1 Critical & Likely Now
    quadrant-2 Critical, Unlikely
    quadrant-3 Background noise
    quadrant-4 Sleeper threat
Loading
๐ŸŽฏ Quadrant ๐Ÿ“‹ Action
High CVSS + High EPSS Fix this week. SLA-overdriven
High CVSS + Low EPSS Plan + track. Most "criticals" live here
Low CVSS + High EPSS Watch closely โ€” fast exploitation can outpace severity
Low CVSS + Low EPSS Batch with normal maintenance
  • ๐Ÿชœ DefectDojo 2026 ingests both CVSS and EPSS and exposes them in the Rules Engine for auto-prioritization
  • ๐Ÿง  Two-axis triage is the 2026 best practice. Single-axis CVSS-only triage causes patch fatigue
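
The 2x2 matrix reduces to two comparisons. This is a minimal sketch; the cut-offs (CVSS 7.0, EPSS 0.5) are assumptions for illustration, since the slide does not fix exact thresholds, and `quadrant` is a hypothetical function name.

```python
def quadrant(cvss: float, epss: float,
             cvss_cut: float = 7.0, epss_cut: float = 0.5) -> str:
    """Return the recommended action for a finding on the CVSS x EPSS grid."""
    high_severity = cvss >= cvss_cut
    likely_exploited = epss >= epss_cut
    if high_severity and likely_exploited:
        return "Fix this week"
    if high_severity:
        return "Plan + track"
    if likely_exploited:
        return "Watch closely"
    return "Batch with normal maintenance"


# A CVSS 9.8 finding with negligible EPSS lands in "Plan + track",
# which is where the slide says most "criticals" live.
print(quadrant(9.8, 0.02))
```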

๐Ÿ“ Slide 8 โ€“ ๐Ÿšฆ The SLA Matrix (Recap From L9)

๐Ÿšจ Severity ๐Ÿฉน Fix SLA ๐Ÿ“‹ Owner ๐Ÿ“ฃ Escalation
๐Ÿ”ด Critical (CVSS 9โ€“10 OR EPSS > 0.50) 24h On-call + Security Lead Page on creation
๐ŸŸ  High (7โ€“8.9) 7 days Service team Slack + ticket
๐ŸŸก Medium (4โ€“6.9) 30 days Service team Backlog grooming
๐Ÿ”ต Low (0.1โ€“3.9) 90 days / accept Tech lead Quarterly review
  • ๐Ÿงญ The SLA matrix is your defense for risk acceptance โ€” accepting a Medium = explicit 30-day exposure, signed off
  • ๐Ÿชœ Without an SLA matrix, every finding becomes "P3 โ€” someday"
  • ๐Ÿชœ In Lab 10 you'll define the matrix in DefectDojo and apply it to imported findings
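
As a rough illustration of how the matrix turns into due dates, the sketch below encodes the four rows as plain Python. It is not DefectDojo configuration; `sla_severity` and `due_date` are hypothetical helpers, and the "90 days / accept" row is simplified to 90 days.

```python
from datetime import datetime, timedelta

# Fix windows taken from the SLA matrix above.
SLA = {
    "Critical": timedelta(hours=24),
    "High": timedelta(days=7),
    "Medium": timedelta(days=30),
    "Low": timedelta(days=90),
}


def sla_severity(cvss: float, epss: float) -> str:
    """Apply the matrix: a high EPSS alone is enough to make a finding Critical."""
    if cvss >= 9.0 or epss > 0.50:
        return "Critical"
    if cvss >= 7.0:
        return "High"
    if cvss >= 4.0:
        return "Medium"
    return "Low"


def due_date(detected: datetime, cvss: float, epss: float) -> datetime:
    """Return the deadline by which the finding must be fixed."""
    return detected + SLA[sla_severity(cvss, epss)]


# A CVSS 5.0 finding with EPSS 0.8 is due in 24h, not 30 days.
print(due_date(datetime(2026, 1, 1), cvss=5.0, epss=0.8))
```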

๐Ÿ“ Slide 9 โ€“ ๐Ÿ™ DefectDojo: The Triage Hub

  • ๐Ÿข OWASP project since 2015; open-source (BSD)
  • ๐Ÿ Python/Django; latest v2.58.x (May 2026)
  • ๐ŸŽฏ What it does:
    • Imports ~150 scanner formats (Trivy, Semgrep, ZAP, Grype, Checkov, KICS, Cosign verification, custom JSON)
    • Deduplicates across tools (same CVE found by Trivy and Grype = one finding)
    • Applies the SLA matrix
    • Tracks every finding's state through the lifecycle
    • Computes program metrics (MTTD, MTTR, vuln-age, backlog trend)
    • Exposes a JIRA-style API for tickets
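# Lab 10 Task 1 starts here
git clone https://github.com/DefectDojo/django-DefectDojo
cd django-DefectDojo   # the compose file lives in the repo root
docker compose up -d
# UI at http://localhost:8080 (admin password printed by initializer)
docker compose logs initializer | grep "Admin password:"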

๐Ÿ“ Slide 10 โ€“ ๐Ÿชœ The Importer Pattern

# Lab 10 uses this script to ingest every prior lab's report
curl -X POST "$DD_URL/api/v2/import-scan/" \
  -H "Authorization: Token $DD_TOKEN" \
  -F "scan_type=Trivy Scan" \
  -F "engagement=$ENG_ID" \
  -F "file=@labs/lab7/juice-shop-trivy.json"

# Same shape for Semgrep, ZAP, Grype, Checkov, KICS, Conftest, ...
  • 🪜 Every importer follows the same pattern: the scan_type and file form fields, plus the engagement context
  • 🧠 DefectDojo's killer feature is that deduplication is automatic: the same CVE reported by Trivy and Grype becomes one finding with two pieces of evidence
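
Because every import has the same shape, the curl call above can be generated in a loop. The sketch below only builds and prints the commands. Apart from the Lab 7 Trivy file, the report paths are hypothetical, and the scan-type labels are assumptions that should be checked against the list your DefectDojo instance shows.

```python
import shlex

# (scan_type, report path) pairs. Only the first path comes from the lab;
# the others are placeholders, and the labels must match your instance.
REPORTS = [
    ("Trivy Scan", "labs/lab7/juice-shop-trivy.json"),
    ("Semgrep JSON Report", "labs/lab5/semgrep.json"),
    ("ZAP Scan", "labs/lab5/zap-report.xml"),
]


def import_command(dd_url: str, token: str, engagement_id: int,
                   scan_type: str, path: str) -> list[str]:
    """Build the curl argument list for one import-scan request."""
    return [
        "curl", "-X", "POST", f"{dd_url}/api/v2/import-scan/",
        "-H", f"Authorization: Token {token}",
        "-F", f"scan_type={scan_type}",
        "-F", f"engagement={engagement_id}",
        "-F", f"file=@{path}",
    ]


for scan_type, path in REPORTS:
    cmd = import_command("http://localhost:8080", "YOUR_TOKEN", 1, scan_type, path)
    # Printed for review; pass cmd to subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```

Keeping the command as a list (rather than a formatted string) avoids quoting problems with scan types that contain spaces.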

๐Ÿ“ Slide 11 โ€“ ๐Ÿงฎ Dedup, Annotated

graph LR
    T[Trivy: CVE-2024-21626 in runc] --> Dedup[๐Ÿ”„ DefectDojo dedup<br/>by CVE+component+service]
    G[Grype: CVE-2024-21626 in runc] --> Dedup
    K[Trivy K8s: CVE-2024-21626 cluster-wide] --> Dedup
    Dedup --> One[1 finding<br/>severity=critical<br/>EPSS=0.84<br/>evidence: 3 tools]

    style Dedup fill:#FF9800,color:#fff
    style One fill:#4CAF50,color:#fff
Loading
  • ๐Ÿชœ Dedup keys (configurable in DefectDojo):
    • CVE ID
    • Vulnerability ID + affected component
    • File path + line (for SAST)
    • URL path + parameter (for DAST)
  • ๐Ÿชœ Why three tools finding the same CVE matters: โ†‘ confidence, โ†“ noise. You triage the finding, not the tool output
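
The idea behind key-based dedup can be shown with a dictionary keyed on (CVE, component). This is a simplified illustration of the concept, not DefectDojo's actual algorithm; the record fields and the `dedupe` helper are invented for the example.

```python
from collections import defaultdict

# Three raw scanner results for the same vulnerability, as in the diagram.
raw = [
    {"tool": "Trivy", "cve": "CVE-2024-21626", "component": "runc"},
    {"tool": "Grype", "cve": "CVE-2024-21626", "component": "runc"},
    {"tool": "Trivy K8s", "cve": "CVE-2024-21626", "component": "runc"},
]


def dedupe(findings: list[dict]) -> list[dict]:
    """Merge findings that share a (CVE, component) key, keeping tool evidence."""
    merged: dict[tuple[str, str], list[str]] = defaultdict(list)
    for finding in findings:
        merged[(finding["cve"], finding["component"])].append(finding["tool"])
    return [
        {"cve": cve, "component": component, "evidence": tools}
        for (cve, component), tools in merged.items()
    ]


for finding in dedupe(raw):
    print(finding["cve"], finding["component"], "seen by", len(finding["evidence"]), "tools")
```

For SAST or DAST the key would change (file path + line, or URL + parameter), but the merge step stays the same.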

๐Ÿ“ Slide 12 โ€“ ๐Ÿฉน Remediation States

๐Ÿท๏ธ State ๐ŸŽฏ Meaning ๐Ÿชœ When
Active Open, in SLA window Default for new findings
Verified A human has confirmed it's a real issue After triage
False Positive Confirmed not exploitable Suppress with reason
Risk Accepted Real but accepted; MUST have expiry Explicit risk acceptance
Mitigated Fixed via code or config change Verification re-scan passes
Inactive Out-of-scope or duplicate Cleanup
  • ๐Ÿšจ Risk Accepted with no expiry is the silent program killer โ€” DefectDojo enforces an expiry on every acceptance (configurable). Lab 10 Task 2 will show you how
  • ๐Ÿง  In code review terms: "False Positive" needs a written justification โ€” the WHY that future you will read in a year
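
A quick way to audit the expiry rule outside the tool is a check like the one below. It is a sketch over hand-made records; the field names (`state`, `expiry`) and the `needs_review` helper are assumptions, not DefectDojo's data model.

```python
from datetime import date
from typing import Optional


def needs_review(state: str, expiry: Optional[date], today: date) -> bool:
    """Flag risk acceptances that have no expiry or whose expiry has passed."""
    if state != "Risk Accepted":
        return False
    return expiry is None or expiry <= today


today = date(2026, 6, 1)
accepted = [
    ("legacy TLS on internal admin panel", "Risk Accepted", date(2026, 9, 1)),
    ("outdated base image in batch job", "Risk Accepted", None),
    ("verbose error page", "Risk Accepted", date(2026, 5, 1)),
]
for title, state, expiry in accepted:
    if needs_review(state, expiry, today):
        print("re-review:", title)
```

The finding titles are made up; the point is that both the missing expiry and the lapsed one surface for re-review.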

๐Ÿ“ Slide 13 โ€“ ๐Ÿ“Š The Metrics That Matter

graph TB
    M[๐Ÿ“Š Program metrics]
    M --> MTTD[โฑ๏ธ MTTD<br/>Mean Time To Detect]
    M --> MTTR[๐Ÿฉน MTTR<br/>Mean Time To Remediate]
    M --> Age[โŒ› Vuln age<br/>distribution]
    M --> BT[๐Ÿ“ˆ Backlog trend]
    M --> SLA[๐Ÿšฆ SLA compliance %]

    style M fill:#FF9800,color:#fff
Loading
๐Ÿ“ Metric ๐Ÿงฎ Formula ๐ŸŽฏ What it answers
MTTD avg(detected_time โˆ’ introduced_time) How fast does our pipeline find issues?
MTTR avg(closed_time โˆ’ detected_time) How fast do we fix?
Vuln age now โˆ’ first_seen, by finding What's our debt distribution?
Backlog trend open(t) โˆ’ open(tโˆ’ฮ”) Are we keeping up?
SLA compliance % closed within their severity SLA Are we triaging by risk, or by panic?
  • ๐Ÿชœ DefectDojo computes all five out of the box. You don't write SQL; you read dashboards
  • ๐Ÿง  Anti-metrics you'll be tempted to measure (don't): scans run, alerts fired, tools deployed. These reward activity, not outcomes (Lecture 9 warned about this)
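
To see what the five formulas actually compute, here is a small Python sketch over three hand-made findings. The records and field names are invented for illustration; in the lab the dashboards give you these numbers directly.

```python
from datetime import datetime, timedelta
from statistics import mean

SECONDS_PER_DAY = 86400

# Hypothetical findings; "closed" is None while the finding is still open.
findings = [
    {"introduced": datetime(2026, 1, 1), "detected": datetime(2026, 1, 2),
     "closed": datetime(2026, 1, 5), "sla_days": 7},
    {"introduced": datetime(2026, 1, 1), "detected": datetime(2026, 1, 4),
     "closed": datetime(2026, 1, 20), "sla_days": 7},
    {"introduced": datetime(2026, 1, 10), "detected": datetime(2026, 1, 12),
     "closed": None, "sla_days": 30},
]


def days(delta: timedelta) -> float:
    return delta.total_seconds() / SECONDS_PER_DAY


def mttd(fs):
    """avg(detected_time - introduced_time), in days."""
    return mean(days(f["detected"] - f["introduced"]) for f in fs)


def mttr(fs):
    """avg(closed_time - detected_time) over closed findings, in days."""
    return mean(days(f["closed"] - f["detected"]) for f in fs if f["closed"])


def vuln_age(f, now):
    """now - first_seen for one finding, in days."""
    return days(now - f["detected"])


def open_at(fs, t):
    """Number of findings detected by time t and not yet closed at t."""
    return sum(1 for f in fs
               if f["detected"] <= t and (f["closed"] is None or f["closed"] > t))


def backlog_trend(fs, t, delta):
    """open(t) - open(t - delta); negative means the backlog is shrinking."""
    return open_at(fs, t) - open_at(fs, t - delta)


def sla_compliance(fs):
    """Fraction of closed findings that were closed within their SLA."""
    closed = [f for f in fs if f["closed"]]
    within = [f for f in closed if days(f["closed"] - f["detected"]) <= f["sla_days"]]
    return len(within) / len(closed)
```

With this sample data, one of the two closed findings blew its 7-day SLA, which is exactly the kind of detail an average MTTR hides and the compliance percentage exposes.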

๐Ÿ“ Slide 14 โ€“ ๐Ÿ“‹ Governance Reporting

  • ๐Ÿชœ By Week 10 you'll need to produce a governance report that an exec or auditor could read

Required sections (Lab 10 Task 2):

๐Ÿ“‘ Section ๐ŸŽฏ Contents
Executive summary 3-sentence state of the program
Findings by severity Open Critical/High/Medium/Low counts
Findings by source Which scanner produced what; coverage gaps
MTTR + age distribution The 5 metrics above
SLA compliance % within SLA; outstanding overdue findings
Risk-accepted items Listed with expiry dates; due for re-review
Next-quarter goals One concrete SAMM ladder step (from Lecture 9)
  • ๐Ÿชœ A 1-page exec summary + 5-page detail = the standard. Don't write 30 pages; no one will read them. The exec summary is what gets cited in compliance audits
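
The "Findings by severity" and "Findings by source" sections are simple tallies. A minimal sketch, assuming each finding is a dict with hypothetical `severity`, `tool`, and `open` fields:

```python
from collections import Counter

SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


def severity_section(findings: list[dict]) -> list[str]:
    """Open-finding counts per severity, in report order (zeros included)."""
    counts = Counter(f["severity"] for f in findings if f["open"])
    return [f"{severity}: {counts[severity]}" for severity in SEVERITY_ORDER]


def source_section(findings: list[dict]) -> list[str]:
    """Open-finding counts per scanner, largest first."""
    counts = Counter(f["tool"] for f in findings if f["open"])
    return [f"{tool}: {n}" for tool, n in counts.most_common()]


sample = [
    {"severity": "High", "tool": "Trivy", "open": True},
    {"severity": "High", "tool": "Semgrep", "open": True},
    {"severity": "Low", "tool": "Trivy", "open": True},
    {"severity": "Critical", "tool": "ZAP", "open": False},
]
print("\n".join(severity_section(sample)))
```

Printing zero counts on purpose keeps the report shape stable from month to month, which makes trends easier to read.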

๐Ÿ“ Slide 15 โ€“ ๐ŸŽค The 5-Minute Interview Walkthrough

  • ๐ŸŽฏ Lab 10 Bonus: produce a 5-minute walkthrough script as if you were giving an SRE/DevSecOps interview at a real org
  • ๐Ÿชœ The canonical structure:
1. Context  (30s) โ€” "I built a DevSecOps program on OWASP Juice Shop..."
2. Layers   (90s) โ€” Show the diagram: pre-commit, CI, runtime
3. Findings (60s) โ€” "Here are the X criticals I closed; here's the one I risk-accepted, here's why"
4. Metrics  (60s) โ€” "MTTR 4 days; vuln-age median 7 days; SLA compliance 92%"
5. Next     (30s) โ€” "If I had another quarter, I'd ship reproducible builds + SLSA L3"
6. Q&A      (30s budget) โ€” Anticipate two questions
  • ๐Ÿง  This is the deliverable that gets you hired. Many DevSecOps interviews boil down to "talk me through your last program." Lab 10 produces exactly this script

๐Ÿ“ Slide 16 โ€“ ๐Ÿ”ฌ Case Study: Log4Shell Triage (December 2021)

  • ๐Ÿชœ The world's most-cited vuln-management exercise
  • ๐Ÿ—“๏ธ December 9, 2021 โ€” CVE-2021-44228 (Log4Shell) goes public. CVSS 10.0. EPSS spikes to 0.97 within 6 hours
โฑ๏ธ Time after disclosure ๐Ÿฉน Action
0h NVD entry published; PoC on GitHub
1h EPSS spikes; orgs start scanning
6h First mass-exploit campaigns observed in the wild
24h Apache patch 2.15 released
48h 2.16 fixes a regression in 2.15
1 week 2.17 fixes another bypass
2 weeks Most CISA-tracked orgs patched
3 months "Long tail" โ€” embedded Log4j in IoT, appliances, vendor products still vulnerable
  • 🪜 What separated the teams who closed it in 24h vs 1 month:
    • 🪜 An up-to-date SBOM (L4): answers "do we have it?" in seconds, not weeks
    • 🪜 A working triage workflow (L10): finding → owner → fix → verify
    • 🪜 A test deploy of the patched dep, not a blind force-push to prod
  • 🧠 If you had Lab 4 + Lab 10 already shipped, Log4Shell was a 1-day exercise. If you didn't, it was a quarter-long incident
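
The "do we have it?" question is a lookup once an SBOM exists. The sketch below searches the top-level `components` array of a CycloneDX-style JSON document. The sample SBOM is invented, `find_component` is a hypothetical helper, and nested components are ignored to keep it short.

```python
import json

# Invented, minimal CycloneDX-style SBOM for illustration only.
SBOM = json.loads("""
{
  "bomFormat": "CycloneDX",
  "components": [
    {"name": "log4j-core", "version": "2.14.1",
     "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"},
    {"name": "express", "version": "4.17.1",
     "purl": "pkg:npm/express@4.17.1"}
  ]
}
""")


def find_component(sbom: dict, name: str) -> list[tuple[str, str]]:
    """Return (name, version) for every top-level component with this name."""
    return [
        (component["name"], component.get("version", "unknown"))
        for component in sbom.get("components", [])
        if component.get("name") == name
    ]


hits = find_component(SBOM, "log4j-core")
print(hits if hits else "not present")
```

Run across every service's SBOM, this turns a week of asking teams into a single loop; deciding which versions are affected is a separate step against the advisory.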

๐Ÿ“ Slide 17 โ€“ ๐Ÿ“Š The "Where to Improve Next" Decision

After Lab 10 you'll have measured numbers โ€” open findings, MTTR, vuln-age. The question is what to improve next.

๐Ÿ“Š Symptom ๐ŸŽฏ Likely improvement
High MTTD on certain finding classes Add coverage (e.g., DAST nightly, not just PR)
High MTTR on Mediums Process problem โ€” assign owners earlier; tighten SLA
Increasing backlog Either you found more bugs (good) or you stopped fixing (bad) โ€” investigate
0 risk-accepted items Suspicious โ€” every program has some; if 0, you're not measuring
Risk-accepted with no expiry Refuse this in DefectDojo config; require expiry on every acceptance
Only one tool finds 80% Tool diversity is the resilience; add a second scanner of that class
  • ๐Ÿชœ This is OWASP SAMM in practice โ€” the maturity ladder from Lecture 9 becomes concrete data-driven decisions here
  • ๐Ÿง  One concrete next-quarter goal is the right cadence. Don't try to mature five practices simultaneously
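
The "only one tool finds 80%" symptom is easy to check from exported findings. A minimal sketch, assuming each finding records the scanner in a hypothetical `tool` field; the 0.8 cut-off simply mirrors the table row.

```python
from collections import Counter


def top_tool_share(findings: list[dict]) -> tuple[str, float]:
    """Return the scanner with the most findings and its share of the total."""
    counts = Counter(f["tool"] for f in findings)
    tool, count = counts.most_common(1)[0]
    return tool, count / len(findings)


# Made-up distribution: 8 findings from one scanner, 2 from another.
sample = [{"tool": "Trivy"}] * 8 + [{"tool": "Semgrep"}] * 2
tool, share = top_tool_share(sample)
if share >= 0.8:
    print(f"{tool} produces {share:.0%} of findings; consider a second scanner of that class")
```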

๐Ÿ“ Slide 18 โ€“ ๐Ÿชœ Common Mistakes & Fixes

๐Ÿšจ Mistake ๐Ÿ› ๏ธ Fix
Scanning but never importing DefectDojo (or its replacement) โ€” without a central hub, you have 9 dashboards and no program
CVSS-only triage Add EPSS; the 2x2 matrix is the 2026 standard
"Risk Accepted" without expiry Refuse in DefectDojo rules โ€” every accept needs an expiry date
Same finding open for 6 months at HIGH The SLA matrix is the program; if Mediums sit for months, the matrix isn't enforced
Reporting in PDFs nobody reads Live dashboard + 1-page exec summary; the PDF was a Word-era artifact
Skipping the postmortem When a finding becomes an incident, the postmortem feeds Improvement back into Discovery

๐Ÿ“ Slide 19 โ€“ โญ๏ธ What You've Built + What's Next After This Course

  • ๐Ÿงช Lab 10 (this week):
    • Task 1 (6 pts): Spin up DefectDojo locally; import every report from Labs 4โ€“9
    • Task 2 (4 pts): Build a governance report with metrics + SLA + risk accept items
    • Bonus (2 pts): 5-minute walkthrough script for an interview
  • ๐ŸŽ“ By the end of this lab you'll have:
    • Every defensive practice from L1โ€“L9 ran against Juice Shop
    • Findings centralized in DefectDojo
    • Metrics + governance report ready
    • A walkthrough script you can use in your next interview
  • ๐Ÿš€ After this course:
    • Read Securing DevOps (Vehent) cover-to-cover; you'll recognize every chapter
    • Pick a real open-source project, run the same labs on it; that's your portfolio
    • Track CVE-2024-3094 (xz) postmortem reporting through 2026 โ€” there's still more to learn

๐Ÿ“ Slide 20 โ€“ ๐Ÿ“š Resources & Takeaways

Books:

๐Ÿ“– Book โœ๏ธ Why
Application Security Program Handbook โ€” Derek Fisher (Manning, 2023) Best single book on program metrics + SLAs
Building Secure & Reliable Systems โ€” Google (O'Reilly, 2020, free PDF) Ch. 21 "Postmortems" is the canonical reference
Securing DevOps โ€” Julien Vehent (Manning, 2018) Ch. 9โ€“10 cover the metrics + program loop directly
Resilience Engineering in Practice โ€” Hollnagel et al. (CRC, 2011) Where DevSecOps program improvement borrows its theory

Talks & specs:

Takeaways:

| # | 🧠 Insight |
|---|---|
| 1 | Discovery is the easy part. Triage + Remediation + Reporting + Improvement are the program. |
| 2 | CVSS = severity. EPSS = likelihood. Use both; neither alone tells the full story. |
| 3 | The SLA matrix is the program. Without it, every finding is "someday." |
| 4 | DefectDojo dedupes across tools; you triage the finding, not the tool output. |
| 5 | "Risk Accepted" with no expiry is the silent program killer. Every accept needs a re-review date. |
| 6 | The 5-minute walkthrough script (Lab 10 Bonus) is your interview deliverable. Make it real. |

💬 "Vulnerability management is the discipline of knowing what you have, knowing what's wrong with it, and proving to someone else you fixed it on time." – Derek Fisher, Application Security Program Handbook (2023). The end of this course; the start of your career.