- April 7, 2014: researchers at Codenomicon and Google disclose CVE-2014-0160 in OpenSSL, an out-of-bounds read in the TLS heartbeat extension introduced two years earlier
- OpenSSL powered TLS for roughly 17% of internet-facing secure web servers at the time. Every one of them could leak up to 64 KB of memory per heartbeat to anyone who asked
- What attackers extracted: private keys, session cookies, passwords, API tokens, all silently, with no log entry
- A simple bounds check would have fixed it. A modern SAST scanner looking for `memcpy` with an attacker-controlled length would have caught it
- A modern fuzzer or DAST tool sending a malformed heartbeat would have seen the response leak memory
- Cost to the industry: estimated >$500M in cert rotation alone, plus uncountable downstream breaches

Think: This lecture is about the two tool classes that read the code at rest (SAST) and watch it in motion (DAST). Each one would have caught Heartbleed. The teaching question is why we need both.
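
To make the "simple bounds check" concrete, here is a minimal Python simulation of the over-read. It is a sketch under stated assumptions, not OpenSSL's code: the flat `memory` buffer, the 3-byte header layout, and the function name `heartbeat_response` are illustrative, and the real fix also accounts for the record's padding bytes.

```python
import struct

HEADER_LEN = 3  # 1-byte message type + 2-byte claimed payload length


def heartbeat_response(memory: bytes, offset: int, record_len: int,
                       check_bounds: bool) -> bytes:
    """Echo the heartbeat payload of the record at memory[offset:offset+record_len]."""
    _msg_type, claimed_len = struct.unpack_from(">BH", memory, offset)
    if check_bounds and HEADER_LEN + claimed_len > record_len:
        return b""  # the fix: drop records that claim more than they carry
    start = offset + HEADER_LEN
    # Without the check, the copy trusts claimed_len and runs past the record.
    return memory[start:start + claimed_len]


# Toy "process memory": a 5-byte heartbeat record followed by unrelated secrets.
record = struct.pack(">BH", 1, 20) + b"hi"  # claims 20 payload bytes, carries 2
memory = record + b"PRIVATE-KEY-0123456789"

leaked = heartbeat_response(memory, 0, len(record), check_bounds=False)
safe = heartbeat_response(memory, 0, len(record), check_bounds=True)
```

With the check disabled, `leaked` holds the 2 real payload bytes plus 18 bytes of neighbouring memory. With it enabled, the malformed record is dropped.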

| # | Outcome |
|---|---|
| 1 | Distinguish SAST, DAST, and IAST, and choose when each is the right tool |
| 2 | Read a Semgrep finding and explain what its pattern matched |
| 3 | Run OWASP ZAP against a target in both baseline and full-scan modes |
| 4 | Configure authenticated DAST with the ZAP Automation Framework |
| 5 | Correlate a single bug across a SAST and a DAST report: the strongest possible evidence |

```mermaid
graph LR
    L4["L4 CI/CD<br/>SBOM+SCA"] --> L5["L5 SAST/DAST<br/>(here)"]
    L5 --> L6["L6 IaC<br/>scan"]
    L5 -.feeds.-> L10["L10 Vuln<br/>management"]
    style L5 fill:#FF9800,color:#fff
```

- Building on L4: SBOM/SCA finds vulnerable dependencies. SAST finds vulnerable first-party code. DAST finds vulnerable running behavior. Together, they answer different questions about the same artifact
- Lab 5 alignment: Task 1 runs ZAP (unauth + auth) against Juice Shop; Task 2 runs Semgrep against Juice Shop's source; Bonus correlates a finding across both

```mermaid
graph TB
    SAST["SAST<br/>Static analysis<br/>Reads source code<br/>Pre-build"]
    DAST["DAST<br/>Dynamic analysis<br/>Black-box on running app<br/>Post-deploy"]
    IAST["IAST<br/>Instruments running app<br/>Agent in the JVM/JS runtime<br/>QA + Prod"]
    SAST -->|finds| C["Code-level flaws<br/>SQLi patterns, hardcoded secrets, crypto misuse"]
    DAST -->|finds| R["Runtime flaws<br/>Auth bypass, headers, server misconfig"]
    IAST -->|finds| H["Both, with low FP<br/>but needs agent + license"]
    style SAST fill:#2196F3,color:#fff
    style DAST fill:#FF9800,color:#fff
    style IAST fill:#9C27B0,color:#fff
```

| Type | What it sees | What it misses |
|---|---|---|
| SAST | Source code, every branch | Runtime config, auth state, deployed env |
| DAST | Real HTTP/protocol behavior | Code paths not exercised by the crawler |
| IAST | Tainted-data flow inside the app | Anything outside the instrumented runtime |

- This course teaches SAST + DAST (free OSS tools). IAST is mentioned for awareness: most IAST tools are commercial (Contrast Security, HCL AppScan, Veracode)
- Roots in static analysis research (lattice theory, abstract interpretation: Cousot & Cousot, 1977). Industrial SAST tools for security appeared in the early 2000s (Coverity 2002, Fortify 2003)
- What modern SAST does: parses your code into an AST + dataflow graph, then matches patterns that indicate vulnerabilities
- Examples of what SAST finds in source:
  - SQL string concatenation → SQL injection
  - `eval()` on user-controlled input → code injection
  - Hardcoded secrets (overlap with gitleaks)
  - Misuse of cryptographic primitives (MD5 for hashing, ECB mode, hardcoded keys)
  - Insecure deserialization patterns (`pickle.loads(request.body)`)
- The tradeoff: false positives. Modern SAST tools (Semgrep, CodeQL, Bandit) hover around a 40-60% FP rate. That is better than the 90% of earlier-generation tools, but the triage discipline (Lecture 10) is still essential
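
The "parse into an AST, then match patterns" step can be sketched with Python's standard `ast` module. This is a toy checker for two of the patterns listed above, assuming a hypothetical `search` function as the scan target. It does no dataflow analysis, so it cannot tell whether the concatenated value is actually user-controlled, which is one source of the false positives mentioned above.

```python
import ast

SOURCE = '''
def search(cursor, term):
    cursor.execute("SELECT * FROM items WHERE name = '" + term + "'")
    cursor.execute("SELECT * FROM items WHERE name = ?", (term,))
    return eval(term)
'''


def find_issues(source: str) -> list[tuple[int, str]]:
    """Return (line, message) pairs for two toy vulnerability patterns."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Pattern 1: a bare call to eval(...)
        if isinstance(func, ast.Name) and func.id == "eval":
            issues.append((node.lineno, "eval() call"))
        # Pattern 2: <anything>.execute(<string built with + or an f-string>)
        if (isinstance(func, ast.Attribute) and func.attr == "execute"
                and node.args
                and isinstance(node.args[0], (ast.BinOp, ast.JoinedStr))):
            issues.append((node.lineno, "execute() on a dynamically built string"))
    return sorted(issues)


issues = find_issues(SOURCE)
```

The parameterized call on line 4 of `SOURCE` is not flagged, because its first argument is a constant string.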
- Created by r2c (short for "Return to Corp", founded in 2017 by MIT alumni), which released Semgrep as open source in 2020; the company is now Semgrep, Inc.
- Implementation: Python CLI + OCaml core (uses `tree-sitter` for parsing)
- 20+ languages supported with native parsers: Python, JS/TS, Go, Java, C/C++, Ruby, Rust, PHP, Kotlin, Swift, Scala, ...
- Course pins Semgrep CE 1.x (latest stable as of April 2026)
- Free OSS edition; paid SaaS (Semgrep AppSec Platform) adds dashboards, secrets dataflow, AI triage

```bash
# Free + offline (this course)
pip install semgrep                       # course pins 1.x stable
semgrep --config=p/owasp-top-ten ./src/   # community ruleset
```

- Rule packs: `p/owasp-top-ten`, `p/security-audit`, `p/javascript`, `p/python`, `p/secrets`. Mix and match
- The 2026 benchmark for Semgrep CE: 87% true-positive rate, 42% false-positive rate. Use it. Tune it. Don't trust it blindly.

```yaml
rules:
  - id: python-sql-concat
    message: SQL string concatenation may allow SQL injection; use parameterized queries
    severity: ERROR
    languages: [python]
    pattern-either:
      - pattern: cursor.execute("..." + $X)
      - pattern: cursor.execute(f"...{$X}...")
      - pattern: cursor.execute("...{}".format($X))
    # The fix text replaces the matched call verbatim, so it must be code, not prose.
    # The "..." is a placeholder: edit the query by hand before committing.
    fix: |
      cursor.execute("... %s ...", ($X,))
```

| Section | What it does |
|---|---|
| `pattern-either` | Match any of the listed patterns |
| `$X` | Metavariable: matches an arbitrary expression |
| `fix` | Replacement code the auto-fix writes in place of the match |
| `severity` | One of INFO / WARNING / ERROR; used by CI gates |
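
To see why the rule flags concatenation, here is a small runnable sketch against an in-memory SQLite database. The `users` table and the `user_input` value are made up for illustration. Note that `sqlite3` uses `?` placeholders, whereas the `%s` style shown in the rule's `fix` belongs to drivers such as psycopg2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 0), ("root", 1)])
cursor = conn.cursor()

user_input = "0 OR 1=1"  # attacker-supplied value for the is_admin filter

# Vulnerable: has the shape cursor.execute("..." + $X), so the rule would fire.
# The input becomes part of the SQL text and rewrites the WHERE clause.
vulnerable_rows = cursor.execute(
    "SELECT name FROM users WHERE is_admin = " + user_input
).fetchall()

# Safe: the driver passes the input as a single value, never as SQL text.
safe_rows = cursor.execute(
    "SELECT name FROM users WHERE is_admin = ?", (user_input,)
).fetchall()
```

The concatenated query returns every user, while the parameterized one returns none, because no row has `is_admin` equal to the literal string `0 OR 1=1`.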
- Why Semgrep rules are revolutionary: they look like the code they're matching. A junior engineer can write a custom rule in an hour. Compare that to writing a CodeQL query (a separate query language, QL) or a Coverity checker (C++)

| Failure mode | Why | Mitigation |
|---|---|---|
| False positives drown the team | Pattern fires on safe code | Tune rules; use diff-only scanning |
| False negatives (missed bugs) | The pattern doesn't match this idiom | Add a new rule when one is found |
| Scanner doesn't speak your DSL | Custom ORM, custom logger | Either write a custom rule or accept the gap |
| Reachability blindness | A vulnerable function exists but is never called | Move to taint-based SAST (CodeQL) or accept |
| Scaling: 1M+ lines of code | Scan time blows past CI timeout | Diff-scan only; nightly full-scan |

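- Diff scanning = `semgrep --baseline-commit <commit>` (or `SEMGREP_BASELINE_REF=origin/main` with `semgrep ci`) reports only findings that are new relative to the baseline, so a PR is judged on the code it changes. This is the only way SAST is sustainable on large repos. Standard pattern since 2022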
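
The idea behind diff scanning can be sketched as a set difference between two scans. This is a conceptual illustration, not Semgrep's implementation: the simplified finding records and the rule ID `python-eval` are assumptions made for the example.

```python
def fingerprint(finding: dict) -> tuple[str, str, str]:
    """Identify a finding by rule, file, and matched code rather than line number,
    so unrelated edits that shift lines do not make old findings look new."""
    return (finding["check_id"], finding["path"], finding["code"].strip())


def new_findings(baseline: list[dict], current: list[dict]) -> list[dict]:
    """Keep only findings in the current scan that the baseline scan did not have."""
    seen = {fingerprint(f) for f in baseline}
    return [f for f in current if fingerprint(f) not in seen]


baseline_scan = [
    {"check_id": "python-sql-concat", "path": "api/search.py", "line": 42,
     "code": 'cursor.execute("SELECT ..." + q)'},
]
pr_scan = [
    # Same legacy finding, moved down by unrelated edits: not reported again.
    {"check_id": "python-sql-concat", "path": "api/search.py", "line": 57,
     "code": 'cursor.execute("SELECT ..." + q)'},
    # Introduced by the PR: this one gates the merge.
    {"check_id": "python-eval", "path": "api/admin.py", "line": 10,
     "code": "eval(expr)"},
]

introduced = new_findings(baseline_scan, pr_scan)
```

Only the `eval` finding survives the filter, which is what keeps the PR gate focused on new code while the legacy backlog is handled separately.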
- DAST = Dynamic Application Security Testing. Treat the running app as a black box: send crafted HTTP requests, observe responses
- What DAST sees that SAST can't:
  - Authentication flow bugs (session fixation, broken MFA)
  - Server misconfiguration (missing security headers, debug pages exposed)
  - Information disclosure on error (stack traces in the response)
  - Insecure redirect handling
  - Rate-limit gaps
  - TLS posture (cipher, cert)
- What DAST can't see:
  - Code paths it can't reach (anything behind a paywall it can't bypass)
  - Logic bugs that need real domain knowledge ("can a buyer mark an order as shipped?")
  - Compiled-out branches
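
A passive DAST check only inspects responses it has already seen. The sketch below shows the shape of one such check for the "missing security headers" item above. The header list is a small illustrative selection, not ZAP's actual rule set, and `observed` is a made-up response.

```python
EXPECTED_HEADERS = (
    "content-security-policy",
    "strict-transport-security",
    "x-content-type-options",
    "x-frame-options",
)


def missing_security_headers(response_headers: dict[str, str]) -> list[str]:
    """Return the expected headers that are absent (header names are case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return [header for header in EXPECTED_HEADERS if header not in present]


# Headers as a scanner might record them from one HTTP response.
observed = {
    "Content-Type": "text/html",
    "X-Frame-Options": "SAMEORIGIN",
    "Server": "Express",
}
missing = missing_security_headers(observed)
```

Nothing here sends an attack payload, which is why passive checks are safe to run on every PR.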
- Created by Simon Bennetts in 2010 as a fork of Paros Proxy; for over a decade it was one of OWASP's flagship projects
- ZAP left OWASP in 2023 and, since September 2024, is maintained as "ZAP by Checkmarx" (Simon Bennetts and the core team joined Checkmarx; the project remains open source)
- Java + plenty of add-ons; CLI + GUI + Docker image
- Course pins ZAP v2.15.x (April 2026 stable)
- Two main scan modes:
  - Baseline: passive scan, no attacks (fast, ~1-2 min)
  - Full scan: active scan (sends payloads, may break the target; staging only)

```bash
# Baseline scan (this is Lab 5 Task 1.1)
docker run -t -v "$PWD:/zap/wrk" ghcr.io/zaproxy/zaproxy:stable \
  zap-baseline.py -t http://juice-shop:3000 -r baseline-report.html
```

The first DAST run scans only what an anonymous user sees. The real vulns live behind login.

```yaml
# zap-auth.yaml: ZAP Automation Framework
env:
  contexts:
    - name: juice-shop
      urls: [http://juice-shop:3000]
      authentication:
        method: json
        parameters:
          loginPageUrl: http://juice-shop:3000/#/login
          loginRequestUrl: http://juice-shop:3000/rest/user/login
          loginRequestBody: '{"email": "admin@juice-sh.op", "password": "admin123"}'
        verification:
          method: response
          loggedInRegex: "authentication"
      users:
        - name: admin
          credentials:
            username: admin@juice-sh.op
            password: admin123
jobs:
  - type: spider
    parameters: { context: juice-shop, user: admin }
  - type: activeScan
    parameters: { context: juice-shop, user: admin }
```

- Lab 5 ships this config pre-written as plumbing: students fill in the credentials and run the framework
- Authenticated scan finds 10-20× more issues than unauth: the math of attack surface
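
The `verification` block is what tells the scanner whether it is still logged in. The sketch below shows that check in isolation. The response bodies are illustrative assumptions about the shape of a JSON login reply, and `STRICT_REGEX` is a hypothetical tightening, not part of the Lab 5 config.

```python
import json
import re

LOOSE_REGEX = "authentication"  # same value as loggedInRegex in the config
STRICT_REGEX = r'"authentication"\s*:\s*\{\s*"token"'  # hypothetical stricter check


def is_logged_in(body: str, pattern: str) -> bool:
    """Mimic response-based verification: logged in if the pattern matches the body."""
    return re.search(pattern, body) is not None


def bearer_header(body: str) -> dict[str, str]:
    """Build the header an authenticated request would carry (assumed JSON shape)."""
    token = json.loads(body)["authentication"]["token"]
    return {"Authorization": f"Bearer {token}"}


# Illustrative bodies only; real responses may differ.
ok_body = '{"authentication": {"token": "aaa.bbb.ccc", "umail": "admin@juice-sh.op"}}'
error_body = '{"error": "authentication failed"}'

header = bearer_header(ok_body)
loose_false_positive = is_logged_in(error_body, LOOSE_REGEX)
strict_rejects_error = not is_logged_in(error_body, STRICT_REGEX)
```

A loose pattern can match an error message that merely contains the word, in which case the scanner believes it is authenticated and quietly scans the anonymous surface. It is worth confirming in the report that authenticated-only URLs were actually reached.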
- Recall Lecture 1: Juice Shop is the canonical "deliberately broken" web app, ~100 documented vulnerabilities
- Why it's perfect for SAST+DAST learning:
  - Real Node.js/Angular/SQLite stack: Semgrep has rules for all of these
  - Realistic auth (JWT, OAuth, MFA challenges)
  - ZAP finds many but not all
  - Semgrep finds many but not all
  - The remainder is what teaches you why you need both
- The Juice Shop scoreboard (`/#/score-board`) lets you cross-check tool findings against ground truth: did the scanner actually find the SQLi, or just the form field?
- March 28, 2018: Drupal discloses CVE-2018-7600. Severity: highly critical. CVSS 9.8
- The bug: request parameters whose keys begin with `#` were passed unsanitized into Form API render arrays, so an unauthenticated attacker could inject properties such as callbacks and achieve RCE with a specially crafted request
- What SAST + DAST would have done:
  - SAST: A taint-based scanner (Semgrep with taint-mode rules, CodeQL) tracking user input → eval-like sinks would have flagged the path within the Form API code
  - DAST: A fuzzer hitting form endpoints with mutated input would have observed the RCE
- Impact: once a public proof of concept appeared in mid-April 2018, mass scanning and exploitation of Drupal sites began within hours. A scan in June 2018 still found ~115,000 vulnerable sites
- The lesson: SAST and DAST both can catch a Drupalgeddon. Whether they will depends on rules + crawler depth. Diversity of tooling is the resilience
- January 11, 2024: GitLab discloses CVE-2023-7028 (CVSS 10.0): account takeover via password reset, where the reset email could be sent to an unverified email address controlled by the attacker
- The bug: the password-reset endpoint accepted multiple email addresses and sent the reset link to all of them, including one the attacker injected
- SAST? Probably misses it: the bug is in business logic, not in a syntactic pattern
- DAST? Would catch it only if the scan exercises the password-reset flow and the rules check the email-validation logic; most don't out of the box
- The honest lesson: SAST + DAST are necessary but not sufficient. Some bugs need bug bounties or manual pentest. Threat modeling (L2) is the only structured way to discover this class

```mermaid
flowchart LR
    S["SAST hit:<br/>cursor.execute on user input<br/>at api/search.py:42"] -->|correlate| C["High-confidence finding"]
    D["DAST hit:<br/>SQLi confirmed at<br/>/search?q=' OR '1'='1"] -->|correlate| C
    C --> Action["Fix immediately<br/>real exploit + exact line"]
    style C fill:#F44336,color:#fff
```

- Why correlation is the strongest signal:
  - SAST alone = "this could be a bug"
  - DAST alone = "this is a bug, but where?"
  - Both = "this is a bug, at this line, with this payload"
- Bonus task in Lab 5 produces exactly this kind of correlation report: finding one vuln in Juice Shop with both tools and writing it up
- Lab 10 (DefectDojo) automates the correlation across all tool outputs
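
A first-pass correlation can be sketched as a join on CWE ID, which both kinds of report typically carry. The record shapes and rule names below are simplified assumptions for illustration. Matching on CWE alone is coarse, so a real write-up still needs a human to confirm that the flagged line actually serves the flagged URL.

```python
def correlate(sast_findings: list[dict], dast_alerts: list[dict]) -> list[dict]:
    """Pair every DAST alert with the SAST findings that share its CWE."""
    by_cwe: dict[int, list[dict]] = {}
    for finding in sast_findings:
        by_cwe.setdefault(finding["cwe"], []).append(finding)

    pairs = []
    for alert in dast_alerts:
        for finding in by_cwe.get(alert["cwe"], []):
            pairs.append({
                "cwe": alert["cwe"],
                "where": finding["location"],  # exact line, from SAST
                "proof": alert["url"],         # working payload, from DAST
            })
    return pairs


sast_findings = [
    {"rule": "python-sql-concat", "cwe": 89, "location": "api/search.py:42"},
    {"rule": "hardcoded-secret", "cwe": 798, "location": "config.py:7"},
]
dast_alerts = [
    {"alert": "SQL Injection", "cwe": 89, "url": "/search?q=' OR '1'='1"},
    {"alert": "CSP header not set", "cwe": 693, "url": "/"},
]

correlated = correlate(sast_findings, dast_alerts)
```

Only the CWE-89 pair survives. The hardcoded secret and the missing header each appear in one report only, so they stay in the single-tool buckets.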

```yaml
# .github/workflows/sec.yml: extends Lecture 4's pipeline
jobs:
  sast:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@b4ffde6...
      - uses: returntocorp/semgrep-action@v1
        with:
          config: p/owasp-top-ten
          generateSarif: '1'
  dast:
    needs: [sast]   # don't waste resources if SAST gates fail
    runs-on: ubuntu-latest
    services:
      app:
        image: ghcr.io/${{ github.repository }}/juice-shop:${{ github.sha }}
        ports: ["3000:3000"]
    steps:
      - uses: zaproxy/action-baseline@v0.13.0
        with:
          target: http://localhost:3000
```

- The orchestration:
  - SAST on PR: fast, fail on ERROR severity
  - DAST baseline on PR: passive, catches headers/config
  - DAST full scan nightly: active, against staging, finds the deep bugs
- Don't run full-scan on every PR: it'll take 30 min, will frustrate devs, and dev environments often can't withstand the load
- IAST puts an agent inside the running app (JVM, .NET CLR, Node.js V8) and watches tainted data flow
- Tools: Contrast Security, HCL AppScan, Synopsys Seeker, Veracode IAST, plus open-source pieces (Aquila for Node.js)
- Where IAST shines:
  - Apps with complex internal flows where SAST can't track and DAST can't reach
  - Fewer FPs than SAST (because the data actually flowed)
  - Catches things at QA time, not just pre-deploy
- Why most teams skip it:
  - Commercial licenses (mostly)
  - Performance overhead (5-20%)
  - Language coverage gaps (great Java, weak others)
- This course teaches SAST + DAST. IAST is a "know-it-exists" topic: it's on senior interview questions
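
"Watching tainted data flow" can be illustrated with a toy taint tracker. This is a sketch of the idea only: real IAST agents instrument the runtime instead of subclassing `str`, and the `Tainted` class and `run_query` sink are hypothetical names. The toy also loses track of taint through f-strings and `.format()`, which is part of why production agents hook much lower in the stack.

```python
class Tainted(str):
    """A string that remembers it came from an untrusted source."""

    def __add__(self, other):
        return Tainted(str.__add__(self, other))

    def __radd__(self, other):
        return Tainted(str.__add__(other, self))


def run_query(sql: str) -> str:
    """Sink: refuse any SQL text that untrusted data has flowed into."""
    if isinstance(sql, Tainted):
        raise ValueError("tainted data reached the SQL sink")
    return "executed"


request_param = Tainted("x' OR '1'='1")  # source: marked at the HTTP boundary

# Taint propagates through concatenation on either side of the operator.
unsafe_sql = "SELECT * FROM items WHERE name = '" + request_param + "'"
safe_sql = "SELECT * FROM items WHERE name = ?"

try:
    run_query(unsafe_sql)
    blocked = False
except ValueError:
    blocked = True
```

Because the alarm fires only when marked data really arrives at the sink during execution, the false-positive rate is lower than with pattern matching on source.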
- The first scan of a real codebase produces hundreds of findings. Don't try to fix everything; triage first
- A practical 3-bucket sort:
  - Red: confirmed exploitable (SAST + DAST agree, OR DAST alone with a successful payload) → fix this week
  - Orange: plausible (SAST only, severity HIGH+) → review in the next sprint
  - Blue: low confidence / pattern noise → tune the rule, batch-suppress with reason
- Use diff scanning to keep new findings green and tackle the legacy backlog separately
- The full triage workflow is the topic of Lecture 10; in Lab 5 you'll see the raw numbers
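
The three buckets translate directly into a small decision function. The field names (`sast_hit`, `dast_hit`, `payload_succeeded`, `severity`) are hypothetical, since each tool and tracker names these differently.

```python
HIGH_OR_ABOVE = {"HIGH", "CRITICAL"}


def bucket(finding: dict) -> str:
    """Sort one finding into the three triage buckets described above."""
    sast, dast = finding["sast_hit"], finding["dast_hit"]
    # Confirmed: both tools agree, or DAST alone landed a working payload.
    if dast and (sast or finding["payload_succeeded"]):
        return "confirmed: fix this week"
    # Plausible: SAST only, but severe enough to deserve a human look.
    if sast and not dast and finding["severity"] in HIGH_OR_ABOVE:
        return "plausible: review next sprint"
    return "low confidence: tune or suppress with reason"


examples = [
    {"sast_hit": True, "dast_hit": True, "payload_succeeded": True, "severity": "HIGH"},
    {"sast_hit": False, "dast_hit": True, "payload_succeeded": True, "severity": "MEDIUM"},
    {"sast_hit": True, "dast_hit": False, "payload_succeeded": False, "severity": "HIGH"},
    {"sast_hit": True, "dast_hit": False, "payload_succeeded": False, "severity": "LOW"},
]
buckets = [bucket(f) for f in examples]
```

Encoding the rule as code keeps triage decisions consistent between reviewers and makes the policy itself reviewable.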
- Lab 5 (this week):
  - Task 1 (6 pts): Run ZAP baseline + authenticated scan against Juice Shop; analyze the report
  - Task 2 (4 pts): Run Semgrep against Juice Shop's source with `p/owasp-top-ten` + `p/javascript`
  - Bonus (2 pts): Find one vulnerability that appears in BOTH the Semgrep and ZAP reports; write the correlation up
- Lecture 6 (next week): IaC Security with Checkov + KICS + Trivy on Terraform/Pulumi/Ansible. The next layer down: secure the infra your app runs on

Books:

| Book | Why |
|---|---|
| Web Application Security, Andrew Hoffman (O'Reilly, 2020) | The single most accessible book on what DAST is testing for |
| The Web Application Hacker's Handbook, Stuttard & Pinto (2nd ed., Wiley, 2011) | Older but still the canonical reference for what DAST scanners try to do |
| Building Secure & Reliable Systems, Google (O'Reilly, 2020, free PDF) | Ch. 13 "Testing Code" on testing at scale |

Talks & specs:

- "Semgrep: Easy Customization for Modern Codebases", Drew Dennison (r2c/Semgrep), Black Hat 2021
- "DAST in 2024: Beyond the Spider", Simon Bennetts (Checkmarx/ZAP), AppSec EU 2024
- Semgrep Registry: every public ruleset
- OWASP ZAP Documentation
- OWASP Benchmark: ground-truth comparison of SAST tools

Takeaways:

| # | Insight |
|---|---|
| 1 | SAST reads code at rest; DAST watches code in motion. Either alone has known gaps. |
| 2 | Semgrep rules look like the code they match, so anyone on your team can write one. |
| 3 | Authenticated DAST finds 10-20× more than unauth. Wire it up in Lab 5; don't skip it. |
| 4 | Correlation across SAST + DAST is the highest-confidence finding type. Lab 5 bonus produces this. |
| 5 | Diff-scan is the sustainability discipline: gate new findings, schedule the backlog. |
| 6 | Heartbleed would have been caught by either tool with the right rule. Coverage > frequency. |

"The bug is in the code. The exploit is in the running app. To find both, look at both." (paraphrased from every AppSec engineer ever)