📌 Lecture 9 – Monitoring, Compliance & Maturity: From Findings to a Program


📍 Slide 1 – 🚨 The Alerts Nobody Read

  • 🗓️ November 30, 2013: Target's FireEye intrusion detection system fires on POS-malware traffic to Russia
  • 🛒 Same week: a second alert, same exfiltration pattern, same destination
  • 📧 Both alerts route to a SOC in Bangalore. They are escalated to HQ, where they get set aside
  • 💳 By the time anyone responds, 40 million card numbers and 70 million customer records are gone
  • 💰 Brian Krebs breaks the story two weeks later, so Target heard it from a journalist first. Final settlement: $18.5M to 47 states
  • 🔥 The defense wasn't broken. The feedback loop was.

🤔 Think: If your scanner finds 200 criticals but nobody reads the report, did you detect anything? A finding without a workflow is noise.


📍 Slide 2 – 🎯 Learning Outcomes

| # | 🎓 Outcome |
|---|---|
| 1 | ✅ Explain why runtime detection complements (does not replace) shift-left checks |
| 2 | ✅ Write a Falco rule that detects a specific runtime behavior, using eBPF |
| 3 | ✅ Express deployment-hardening rules as Rego policies executed by Conftest |
| 4 | ✅ Choose security metrics that drive behavior change (MTTD, MTTR, vuln age) |
| 5 | ✅ Place a team on the OWASP SAMM maturity ladder and propose one concrete next step |

📍 Slide 3 – 🗺️ Where Lecture 9 Sits in the Course

graph LR
    L5["🧪 L5 SAST/DAST<br/>Find pre-prod"] --> L9
    L6["🏗️ L6 IaC scan<br/>Find at apply"] --> L9
    L7["📦 L7 Container scan<br/>Find at build"] --> L9
    L8["🔐 L8 Supply chain<br/>Trust the artifact"] --> L9
    L9["📊 L9 Runtime + Program<br/>Detect, measure, mature"] --> L10["🎯 L10 Vuln mgmt<br/>Close the loop"]

    style L9 fill:#FF9800,color:#fff
    style L10 fill:#4CAF50,color:#fff

  • 🪜 Lectures 1–8 taught you to find issues at progressively earlier stages
  • 🏃 Lecture 9 covers what happens once code is running and how the program itself matures
  • 🎁 Lecture 10 will close the loop: triage every finding from Labs 4–9 in DefectDojo

📍 Slide 4 – 🛡️ Shift-Left ≠ Shift-Only-Left

💬 "You can shift left as far as you want; attackers still get to attack the running system." – Liz Rice, Container Security (O'Reilly, 2020)

| 🏷️ Stage | 🔍 Checks | 🛠️ Tools (from this course) | ❌ What it can't catch |
|---|---|---|---|
| 📝 Pre-commit | Secret scan, signed commits | gitleaks, SSH signing (L3) | Anything you don't commit |
| 🏗️ Build | SAST, SCA, image scan | Semgrep, Grype, Trivy (L4, L5, L7) | Vulns in transitive deps loaded at runtime |
| 🚀 Deploy | Policy-as-code, supply-chain verify | Conftest (this lecture), Cosign verify (L8) | A compromised registry serving a different image |
| 🏃 Runtime | Behavior detection, anomaly | Falco (this lecture) | Drift between IaC source and live cluster |

  • 🎯 The point: each stage catches a different failure class. Runtime is the last line of defense, and the only one that sees what an attacker actually does

📍 Slide 5 – 👁️ Runtime Detection: The Mental Model

Static tools answer "could this be exploited?". Runtime tools answer "is this being exploited right now?"

flowchart LR
    K[🐧 Kernel syscalls] -->|🐝 eBPF probes| F[🔎 Falco engine]
    F -->|Match rules| A[🚨 Alerts]
    A -->|JSON| S[📦 SIEM / file / stdout]
    A -->|gRPC| R[🤖 Response automation]

    style K fill:#607D8B,color:#fff
    style F fill:#FF9800,color:#fff
    style A fill:#F44336,color:#fff

  • 🐝 eBPF = "extended Berkeley Packet Filter": sandboxed programs that run in the kernel without loading a module
  • 🎯 Falco taps syscalls (process exec, file open, network connect) and matches them against a rule library
  • 📜 Rule language is YAML; conditions are a small expression DSL over syscall fields

🧠 Why eBPF won: the older kernel-module driver had to be built for the exact host kernel version. eBPF programs are portable across recent kernels (5.8+ for the modern driver), are compiled to bytecode rather than native kernel code, and are checked by the kernel verifier before they load.
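
The "Alerts → JSON → SIEM" edge in the diagram can be sketched in a few lines of Python. This is a minimal illustration, assuming Falco is configured for JSON output and emits one alert object per line; the `route` function, the `page_at` threshold, and the sample alert are hypothetical and written by hand, not captured from a real cluster.

```python
import json

# Priority names as they appear in Falco's JSON output, most to least severe.
PRIORITIES = ["Emergency", "Alert", "Critical", "Error",
              "Warning", "Notice", "Informational", "Debug"]

def route(alert_line, page_at="Critical"):
    """Return (destination, summary) for one JSON alert line.

    Hypothetical routing policy: page a human for anything at or above
    `page_at`, send everything else to the SIEM.
    """
    alert = json.loads(alert_line)
    rank = PRIORITIES.index(alert["priority"])
    destination = "pager" if rank <= PRIORITIES.index(page_at) else "siem"
    return destination, f'{alert["rule"]}: {alert["output"]}'

# Hand-written sample alert (not real Falco output).
sample = json.dumps({
    "time": "2025-01-01T00:00:00Z",
    "rule": "Write to /etc by container",
    "priority": "Warning",
    "output": "Config write in container (file=/etc/passwd)",
    "output_fields": {"fd.name": "/etc/passwd"},
})

print(route(sample))
```

The design point is the one from Slide 1: routing is decided by a rule you can read and audit, so an alert always lands in a queue that somebody owns.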


📍 Slide 6 – 🦅 Falco: A Short History

  • 🏢 Created at Sysdig by Loris Degioanni (also co-author of Wireshark) in 2016
  • 📜 Donated to the CNCF as a Sandbox project on October 10, 2018
  • 📈 Promoted to Incubating on January 8, 2020
  • 🎓 Reached CNCF Graduated status on February 29, 2024, joining only a handful of security-focused graduated projects
  • 🔢 Course pins Falco v0.43.x (January 2026), the version your lab uses
  • 🧱 Three drivers historically: legacy kernel module → eBPF probe → modern eBPF (introduced in 0.34, default since 0.38). Lab uses modern eBPF.

💬 "Falco isn't trying to be your IDS. It's trying to be the runtime equivalent of grep: fast, predictable, and composable." – Leonardo Grasso, Falco maintainer


📍 Slide 7 – 📝 Anatomy of a Falco Rule

- rule: Write to /etc by container
  desc: Container modifying system config under /etc
  condition: >
    open_write and
    container.id != host and
    fd.name startswith /etc/
  output: >
    Config write in container (user=%user.name container=%container.name
    file=%fd.name proc=%proc.cmdline)
  priority: WARNING
  tags: [container, drift, mitre_persistence]
๐Ÿท๏ธ Field ๐ŸŽฏ Purpose
rule Human-readable name (unique)
desc Why this rule exists
condition Boolean expression on syscall fields
output Alert message template, %field interpolated
priority EMERGENCY..DEBUG โ€” mostly used for routing
tags Free-form labels; common to map to MITRE ATT&CK techniques
  • ๐Ÿงช Macros (open_write, container_started, ...) ship with the default ruleset โ€” read /etc/falco/falco_rules.yaml once
  • ๐ŸŽฏ In the lab you'll add one custom rule; the default ruleset already covers ~200 conditions
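
To make the `condition` and `output` fields concrete, here is a small Python stand-in for how the rule above evaluates one event. It is only a sketch: the event dictionary is made up, the `open_write` macro is reduced to a single boolean (the real macro checks more fields), and `condition` and `render` are hypothetical helpers, not part of Falco.

```python
import re

# Toy event keyed by Falco-style field names; all values are invented.
event = {
    "evt.is_open_write": True,
    "container.id": "3ad7b26ded6d",
    "container.name": "web",
    "fd.name": "/etc/passwd",
    "user.name": "root",
    "proc.cmdline": "sh -c echo x >> /etc/passwd",
}

def condition(e):
    """Stand-in for: open_write and container.id != host and fd.name startswith /etc/"""
    return (e["evt.is_open_write"]
            and e["container.id"] != "host"
            and e["fd.name"].startswith("/etc/"))

OUTPUT = ("Config write in container (user=%user.name container=%container.name "
          "file=%fd.name proc=%proc.cmdline)")

def render(template, e):
    """Replace every %field token in the template with the event's value."""
    return re.sub(r"%([a-z_.]+)", lambda m: str(e[m.group(1)]), template)

if condition(event):
    print(render(OUTPUT, event))
```

The same write performed on the host (`container.id` equal to `host`) does not match, which is what scopes the rule to containers.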

📍 Slide 8 – 🔉 Tuning Noise: the Rule That Cried Wolf

  • 🚨 Default rules fire on legitimate behavior all the time: apt-get update writes under /var/lib/dpkg, kubelet writes to /var/lib/kubelet
  • 🤐 Tuning options, in order of preference:
    1. Refine the condition: add an exception clause (and not proc.name=apt-get)
    2. Use the exceptions: block (Falco 0.28+), which is structured and easier to audit than long "and not" chains
    3. Disable the rule as a last resort, and document why
  • 📊 Signal-to-noise is the only metric that matters for a detection program. A rule that fires 200×/day with 0 incidents will be silenced, by humans or by mute filters

🤔 Think: Why is "false positive" the wrong word for security detections? (Hint: a noisy true-positive is still useless if no one looks at it.)
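
One rough way to find tuning candidates is to compare how often each rule fires with how often it led to a confirmed incident. The sketch below assumes you can export two lists of rule names (one entry per firing, one entry per confirmed incident); the function names, the `min_fires` threshold, and the sample counts are all hypothetical.

```python
from collections import Counter

def precision_by_rule(fired_rules, incident_rules):
    """Fraction of firings per rule that were tied to a confirmed incident."""
    fired = Counter(fired_rules)
    confirmed = Counter(incident_rules)
    return {rule: confirmed[rule] / count for rule, count in fired.items()}

def tuning_candidates(fired_rules, incident_rules, min_fires=100):
    """Rules that fire often but never produced a confirmed incident."""
    fired = Counter(fired_rules)
    precision = precision_by_rule(fired_rules, incident_rules)
    return sorted(rule for rule, count in fired.items()
                  if count >= min_fires and precision[rule] == 0.0)

# Invented sample: one day of firings and the incidents they led to.
fired = ["Write below etc"] * 200 + ["Terminal shell in container"] * 4
incidents = ["Terminal shell in container"]

print(precision_by_rule(fired, incidents))
print(tuning_candidates(fired, incidents))
```

A rule on the candidate list is a prompt to refine its condition or add an exception, not an automatic reason to disable it.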


📍 Slide 9 – 📜 Policy-as-Code: Hardening Before Deploy

Falco catches behavior after it happens. Policy-as-code prevents bad config from ever being applied.

flowchart LR
    Y[📄 K8s YAML] --> C[🔍 conftest test]
    C -->|Rego eval| P[📜 policy/*.rego]
    P -->|❌ fail| Block[⛔ CI fails / Admission denies]
    P -->|✅ pass| Apply[✅ kubectl apply]

    style C fill:#2196F3,color:#fff
    style P fill:#9C27B0,color:#fff
    style Block fill:#F44336,color:#fff
    style Apply fill:#4CAF50,color:#fff

  • 🔧 Conftest (CNCF, by Gareth Rushgrove, 2019) wraps OPA (Open Policy Agent) so you can run Rego policies against any structured file: YAML, JSON, HCL, Dockerfile, INI
  • 🔢 Course pins Conftest v0.68.2 (April 2026)
  • 🆚 Conftest runs in the CLI and CI; the same Rego can run server-side as a Gatekeeper admission webhook. Kyverno fills the same admission role but uses its own policy DSL instead of Rego

📍 Slide 10 – 🧮 Rego in 60 Seconds

package main

# Rego v1 syntax (the default since OPA 1.0): `contains` and `if` are keywords.
deny contains msg if {
  input.kind == "Deployment"
  c := input.spec.template.spec.containers[_]
  # `not` also fires when securityContext or runAsNonRoot is missing;
  # `!= true` would be undefined in that case and the container would pass.
  not c.securityContext.runAsNonRoot
  msg := sprintf("container %q must set runAsNonRoot: true", [c.name])
}

| 🧩 Construct | 🎯 Meaning |
|---|---|
| package main | Default package Conftest evaluates |
| deny contains msg if { ... } | A rule that, when body is true, adds msg to the deny set |
| input | The parsed YAML/JSON document |
| [_] | "For each element": implicit iteration |
| not | True when the expression is false or undefined |
| sprintf | Built-in for formatted messages |

  • 🧠 Rego is declarative: you write conditions for failure, not procedures
  • 📚 OPA documentation has a 30-minute interactive tutorial (play.openpolicyagent.org), worth doing before Lab 9 Task 2
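
If the declarative style is unfamiliar, it can help to read the deny rule next to an imperative analogue. The Python below is a rough sketch of the rule's intent, not how OPA evaluates Rego; it assumes a missing `securityContext` should count as a violation, and the `deny` function and sample manifest are hypothetical.

```python
def deny(doc):
    """Return one message per container that does not set runAsNonRoot: true."""
    msgs = []
    if doc.get("kind") != "Deployment":
        return msgs
    pod_spec = doc.get("spec", {}).get("template", {}).get("spec", {})
    for c in pod_spec.get("containers", []):  # plays the role of containers[_]
        # Assumption: an absent securityContext is treated the same as false.
        if c.get("securityContext", {}).get("runAsNonRoot") is not True:
            msgs.append(f'container "{c["name"]}" must set runAsNonRoot: true')
    return msgs

# Hand-written manifest, already parsed from YAML into a dict.
manifest = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "app", "securityContext": {"runAsNonRoot": True}},
        {"name": "sidecar"},
    ]}}},
}

print(deny(manifest))
```

In Rego the loop, the early return, and the list building all disappear: you state the failing condition once and the engine finds every binding of `c` that satisfies it.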

📍 Slide 11 – 📊 Security Metrics That Drive Behavior

💬 "What gets measured gets managed." Often attributed to Peter Drucker (no record of him saying it). Either way it's true for security programs.

| 📏 Metric | 🧮 Formula | 🎯 What it answers |
|---|---|---|
| ⏱️ MTTD (Mean Time To Detect) | avg(detect_time − introduction_time) | How fast does our pipeline find issues? |
| 🩹 MTTR (Mean Time To Remediate) | avg(close_time − detect_time) | How fast do we fix? |
| ⌛ Vuln Age | now − first_seen, per finding | What's our debt distribution? |
| 📈 Backlog Trend | open_findings(t) − open_findings(t−Δ) | Are we keeping up with new findings? |
| 🎯 SLA Compliance | % findings closed within severity-based SLA | Are we triaging by risk? |

  • 🚫 Anti-metrics (look impressive, change nothing): number of scans run, total alerts generated, lines of policy. They reward activity, not outcomes
  • ✅ The lab and Lecture 10 will compute MTTR + vuln-age from real DefectDojo data
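
The formulas in the table translate almost directly into code. The sketch below uses three invented findings with `introduced`, `detected`, and `closed` timestamps (field names are hypothetical, and `detected` doubles as `first_seen`); it is meant to show the arithmetic, not DefectDojo's data model.

```python
from datetime import datetime, timedelta
from statistics import mean

DAY = timedelta(days=1)

def d(day):
    """Shorthand for a date in January 2025 (made-up sample data)."""
    return datetime(2025, 1, day)

findings = [
    {"introduced": d(1),  "detected": d(3),  "closed": d(10)},
    {"introduced": d(5),  "detected": d(9),  "closed": d(12)},
    {"introduced": d(10), "detected": d(16), "closed": None},  # still open
]

def mttd(fs):
    """avg(detect_time - introduction_time), in days."""
    return mean((f["detected"] - f["introduced"]) / DAY for f in fs)

def mttr(fs):
    """avg(close_time - detect_time) over closed findings, in days."""
    return mean((f["closed"] - f["detected"]) / DAY
                for f in fs if f["closed"] is not None)

def vuln_ages(fs, now):
    """now - first_seen for every open finding, in days."""
    return [(now - f["detected"]) / DAY for f in fs if f["closed"] is None]

def open_count(fs, t):
    """Number of findings that were detected but not yet closed at time t."""
    return sum(1 for f in fs
               if f["detected"] <= t and (f["closed"] is None or f["closed"] > t))

def backlog_trend(fs, now, delta):
    """open_findings(t) - open_findings(t - delta)."""
    return open_count(fs, now) - open_count(fs, now - delta)

now = d(31)
print(mttd(findings), mttr(findings), vuln_ages(findings, now),
      backlog_trend(findings, now, 30 * DAY))
```

Note that MTTR only averages closed findings, which is why vuln age matters alongside it: a team can post a good MTTR while its oldest open findings keep aging.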

📍 Slide 12 – 🚦 Severity-Based SLAs (an example matrix)

| 🚨 Severity | 🩹 Fix SLA | 📋 Owner | 📣 Escalation |
|---|---|---|---|
| 🔴 Critical (CVSS 9–10) | 24h | On-call + Security Lead | Page on creation |
| 🟠 High (7–8.9) | 7 days | Service team | Slack channel + ticket |
| 🟡 Medium (4–6.9) | 30 days | Service team | Backlog grooming |
| 🔵 Low (0.1–3.9) | 90 days / accept | Tech lead | Quarterly review |

  • 🧭 Without an SLA matrix, every finding becomes "P3, someday"
  • 🎯 The matrix is also your defense for risk acceptance: if you choose not to fix a Medium, you've explicitly accepted a 30-day exposure that the matrix says is acceptable
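
A matrix like this is easy to encode, which also makes SLA compliance computable. The following is a sketch under stated assumptions: the day limits mirror the example matrix, findings are `(cvss, days_open, closed)` tuples with invented values, and open findings still inside their window are left out of the percentage (a choice, not a standard).

```python
# Fix windows in days, taken from the example matrix (24h treated as 1 day).
SLA_DAYS = {"Critical": 1, "High": 7, "Medium": 30, "Low": 90}

def severity(cvss):
    """Map a CVSS base score to the matrix's severity bands."""
    if cvss >= 9.0:
        return "Critical"
    if cvss >= 7.0:
        return "High"
    if cvss >= 4.0:
        return "Medium"
    return "Low"  # 0.1-3.9; a score of 0.0 is not handled separately here

def sla_compliance(findings):
    """Percent of decided findings that were closed inside their SLA window."""
    met = breached = 0
    for cvss, days_open, closed in findings:
        limit = SLA_DAYS[severity(cvss)]
        if days_open > limit:
            breached += 1          # closed late, or still open past the window
        elif closed:
            met += 1               # closed in time
        # open and still inside the window: undecided, not counted
    total = met + breached
    return 100.0 * met / total if total else None

# Invented sample: (cvss, days_open, closed)
sample = [(9.8, 0.5, True), (7.5, 10, True), (5.0, 12, False), (3.1, 45, True)]
print(round(sla_compliance(sample), 1))
```

Whatever counting rule you pick for still-open findings, write it down next to the matrix so the percentage means the same thing every quarter.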

📍 Slide 13 – 🚀 DORA Meets DevSecOps

The 2018 Accelerate book (Forsgren, Humble, Kim) defined four DORA metrics for engineering performance:

| 🏷️ DORA metric | 🚀 Engineering meaning | 🔐 DevSecOps adaptation |
|---|---|---|
| 🚢 Deployment Frequency | How often you ship | How often you ship a security fix |
| ⏱️ Lead Time for Changes | Commit → prod | Vuln discovery → patch in prod |
| ❌ Change Failure Rate | % deploys causing prod incidents | % security-fix deploys causing rollback |
| 🩹 MTTR (service) | Time to restore service | Time to remediate a vuln |

  • 📚 The annual DORA State of DevOps report (published since 2014; DORA became part of Google Cloud in 2018) is the most cited engineering-performance research; the 2022 report examined software supply chain security practices as a predictor of performance
  • 🧪 Elite performers deploy >1×/day, have lead time <1 hour, change failure rate <15%, MTTR <1 hour. Security teams that match these numbers tend to ship patches in hours, not weeks
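
As a rough illustration of the DevSecOps adaptation column, the sketch below summarizes a list of security-fix deploys. The record fields (`lead_time_h`, `caused_rollback`), the function name, and the four sample deploys are hypothetical; real numbers would come from your CI/CD and ticketing data.

```python
from statistics import median

def dora_summary(deploys, window_days):
    """Summarize security-fix deploys over an observation window."""
    failures = sum(1 for dep in deploys if dep["caused_rollback"])
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time_h": median(dep["lead_time_h"] for dep in deploys),
        "change_failure_pct": 100.0 * failures / len(deploys),
    }

# Invented sample: four security-fix deploys over two days.
fix_deploys = [
    {"lead_time_h": 0.5, "caused_rollback": False},
    {"lead_time_h": 2.0, "caused_rollback": False},
    {"lead_time_h": 1.0, "caused_rollback": True},
    {"lead_time_h": 4.0, "caused_rollback": False},
]

print(dora_summary(fix_deploys, window_days=2))
```

The median is used for lead time here because a single slow patch would otherwise dominate a small sample; that is a design choice for this sketch rather than something the DORA reports prescribe.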

📍 Slide 14 – 🏛️ Compliance Frameworks: A Survival Map

You won't implement a framework in this course. You should be able to recognize what each one cares about so you can talk to a compliance officer without freezing.

| 📜 Framework | 🎯 Scope | 🔑 What it cares about | 📅 Key date |
|---|---|---|---|
| 🇪🇺 GDPR | Personal data of EU residents | Lawful basis, breach notification (72h), data subject rights | Enforced 25 May 2018 |
| 🇺🇸 NIST CSF 2.0 | US critical infra (voluntary, widely adopted) | Govern, Identify, Protect, Detect, Respond, Recover | Released February 2024 (added "Govern") |
| 🌐 ISO/IEC 27001:2022 | Information Security Management System | Risk-based ISMS + Annex A controls | Latest revision October 2022 |
| 💳 PCI DSS 4.0 | Card payment data | Network segmentation, encryption, log retention | Future-dated requirements mandatory from 31 March 2025 |

  • 🧭 Pattern: they all want the same things: a risk register, controls mapped to risks, logged evidence, and periodic review. The vocabulary differs
  • 🪜 GDPR's 72-hour breach notification is the single rule most likely to catch an engineering team unaware

📍 Slide 15 – 🪜 OWASP SAMM: Where Is Your Team?

The Software Assurance Maturity Model (OWASP project, originally by Pravir Chandra, 2009; SAMM 2.0 released 2020) gives a 4-level maturity ladder (levels 0–3) across 5 business functions containing 15 security practices.

graph LR
    L0["0️⃣ No practice"] --> L1["1️⃣ Initial<br/>Ad-hoc, person-dependent"]
    L1 --> L2["2️⃣ Defined<br/>Documented, repeatable"]
    L2 --> L3["3️⃣ Optimized<br/>Measured, continuously improved"]

    style L0 fill:#9E9E9E,color:#fff
    style L1 fill:#FF5722,color:#fff
    style L2 fill:#FFC107,color:#fff
    style L3 fill:#4CAF50,color:#fff

| 🏛️ Business function | 🧩 Practices (3 each) |
|---|---|
| Governance | Strategy & Metrics · Policy & Compliance · Education & Guidance |
| Design | Threat Assessment · Security Requirements · Security Architecture |
| Implementation | Secure Build · Secure Deployment · Defect Management |
| Verification | Architecture Assessment · Requirements-Driven Testing · Security Testing |
| Operations | Incident Management · Environment Management · Operational Management |

  • 🆚 BSIMM (Building Security In Maturity Model, Synopsys/Black Duck) covers the same ground descriptively: an annual report on what real orgs do. Latest is BSIMM16 (January 2026, 111 orgs)
  • 🧭 Use SAMM to set goals; read BSIMM to see what your industry peers actually do

📍 Slide 16 – 🔬 Case Study: Equifax (2017)

  • 🗓️ March 7, 2017: Apache Struts CVE-2017-5638 published (CVSS 10.0). Patch available same day
  • 🛡️ Equifax's security team emails the patch directive across the org on March 9
  • 🌀 The vulnerable web portal was not on the inventory the directive used, so it is missed
  • 📡 Scans run about a week later do not flag the portal, and the SSL certificate on the traffic-inspection device had been expired for 10 months, so encrypted attack traffic flows past the inspector unread
  • 💸 Attackers exfiltrate 147 million records between May and July; CEO and CISO resign; total cost > $1.4B
  • 🧠 What failed: inventory (Identify), patch process (Protect), monitoring (Detect), comms (Respond). Four NIST CSF functions failed in a row

🤔 Think: Which OWASP SAMM practice would have caught this earliest: Defect Management, Environment Management, or Incident Management? (Trick: all three; one would have been enough.)


📍 Slide 17 – 🔬 Case Study: SolarWinds (2020)

  • 🗓️ March 2020: attackers (later attributed to APT29 / Cozy Bear) inject SUNBURST into the SolarWinds Orion build pipeline
  • 📦 Backdoored update ships to ~18,000 customers including DoD, Treasury, FireEye
  • 👁️ FireEye discovers it December 8, 2020, after the attackers try to enroll a second auth device for an employee. The MFA workflow alerts. That alert was read
  • 🎯 The supply chain check missed it, but runtime + IAM monitoring caught it: about 9 months in, but caught
  • 🪜 This is why this course teaches you both L8 (signing/verification) and L9 (runtime detection): neither is sufficient alone

📍 Slide 18 – 🛠️ A Working DevSecOps Program in One Diagram

flowchart TB
    subgraph "Shift-Left (Pre-Prod)"
        SAST[🧪 SAST L5]
        SCA[📋 SCA L4]
        IaC[🏗️ IaC scan L6]
        Img[📦 Image scan L7]
        Sign[🔐 Sign L8]
    end
    subgraph "Shift-Right (Runtime)"
        Falco[🦅 Falco runtime L9]
        PaC[📜 Conftest admission L9]
    end
    subgraph "Program Layer"
        DD[🐙 DefectDojo L10]
        Met[📊 MTTD/MTTR/age]
        SLA[🚦 SLA matrix]
        SAMM[🪜 SAMM review]
    end
    SAST --> DD
    SCA --> DD
    IaC --> DD
    Img --> DD
    Sign -.verify.-> Falco
    PaC --> Falco
    Falco --> DD
    DD --> Met
    Met --> SLA
    SLA --> SAMM
    SAMM -.feeds back.-> SAST

    style DD fill:#FF9800,color:#fff
    style SAMM fill:#4CAF50,color:#fff

  • 🎯 Each box was a lab. Lecture 10 wires the program layer together
  • 🪜 The feedback arrow is the whole point: a security program that doesn't re-prioritize its scanning based on what it found is just a budget line

📍 Slide 19 – 🛣️ What's Next (Lecture 10)

Lecture 10 takes everything in the program layer above and walks the vulnerability management lifecycle:

  1. 🔎 Discovery: ingest from all the tools you've configured (Labs 4–9)
  2. 🏷️ Triage: dedup, assign severity, push to SLA queue
  3. 🩹 Remediation: fix, suppress with reason, or accept with expiry
  4. 📊 Reporting: close the loop to the program metrics on the previous slide
  5. 🪜 Improvement: feed findings into next-cycle SAMM goals

  • 🧪 Lab 10 brings up DefectDojo v2.58.x locally and imports every report you've generated in Labs 4–9
  • 📚 By the end of Lab 10 you'll have a single dashboard showing every CVE the course exposed you to, and an MTTR distribution to argue about
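
The triage step (dedup, assign severity, push to SLA queue) can be sketched as follows. This is not DefectDojo's deduplication algorithm, only a minimal illustration: findings are keyed on a hypothetical `(cve, component)` pair, the most severe rating wins, and due dates come from the example SLA matrix. The `EXAMPLE-0001` identifier and `libfoo` component are placeholders.

```python
from datetime import date, timedelta

SLA_DAYS = {"Critical": 1, "High": 7, "Medium": 30, "Low": 90}
RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

def triage(raw_findings, today):
    """Merge duplicate findings across tools and attach an SLA due date."""
    merged = {}
    for f in raw_findings:
        key = (f["cve"], f["component"])
        entry = merged.setdefault(key, {
            "cve": f["cve"], "component": f["component"],
            "severity": f["severity"], "tools": set(),
        })
        entry["tools"].add(f["tool"])
        # Keep the most severe rating any tool reported.
        if RANK[f["severity"]] < RANK[entry["severity"]]:
            entry["severity"] = f["severity"]
    for entry in merged.values():
        entry["due"] = today + timedelta(days=SLA_DAYS[entry["severity"]])
    # Most severe first, then by identifier for a stable order.
    return sorted(merged.values(), key=lambda e: (RANK[e["severity"]], e["cve"]))

# Invented scanner output from two tools.
raw = [
    {"tool": "grype", "cve": "CVE-2017-5638", "component": "struts2-core", "severity": "High"},
    {"tool": "trivy", "cve": "CVE-2017-5638", "component": "struts2-core", "severity": "Critical"},
    {"tool": "trivy", "cve": "EXAMPLE-0001", "component": "libfoo", "severity": "Low"},
]

for item in triage(raw, today=date(2025, 1, 1)):
    print(item["cve"], item["severity"], sorted(item["tools"]), item["due"])
```

Taking the most severe rating is a conservative default; a real program might instead prefer one tool's score or re-rate findings by exposure.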

📍 Slide 20 – 📚 Resources & Takeaways

Books (pick one to read this term):

| 📖 Book | ✍️ Why |
|---|---|
| Securing DevOps, Julien Vehent (Manning, 2018) | Runs through real Mozilla pipelines; ch. 5–7 on monitoring are the closest match to this lecture |
| Container Security, Liz Rice (O'Reilly, 2020) | Best single chapter (ch. 11) on runtime security; explains why eBPF won |
| Accelerate, Forsgren, Humble, Kim (IT Revolution, 2018) | The DORA metrics, with the research methodology behind them |
| Building Secure & Reliable Systems, Google (O'Reilly, 2020, free PDF) | Chapter on detection + response from people running it at planetary scale |

Talks:

  • 🎥 "What Happens When Falco Detects?", KubeCon EU 2024, Loris Degioanni
  • 🎥 "OPA: The Universal Policy Engine", Tim Hinrichs, Styra (2021)
  • 🎥 "The DevOps Handbook in 2024", Gene Kim, DevOps Enterprise Summit

Standards & specs:

Takeaways:

| # | 🧠 Insight |
|---|---|
| 1 | Shift-left finds; shift-right catches what shift-left missed. You need both. |
| 2 | A finding without an owner and an SLA is noise. The matrix is the program. |
| 3 | MTTD/MTTR/vuln-age beat scan counts. Reward outcomes, not activity. |
| 4 | SAMM tells you where to go; BSIMM tells you where your peers actually are. |
| 5 | Compliance is downstream of risk management, not the other way around. |

💬 "The goal of detection is response. The goal of response is learning. The goal of learning is preventing the next one." Adapted from Richard Bejtlich, The Practice of Network Security Monitoring (No Starch, 2013)