Skip to content

Commit 0e739a1

Browse files
jandro996devflow.devflow-routing-intake
andauthored
Implement SCA Reachability runtime detection: report vulnerable classes and callsites via telemetry (#11352)
Implement SCA Reachability: detect vulnerable library classes at runtime Adds a new SCA Reachability subsystem that reports which vulnerable library classes were actually loaded at runtime, reducing false positives from static dependency scanning. Gated on DD_APPSEC_SCA_ENABLED. Key components: - Gradle task downloads GHSA enrichments from sca-reachability-database and generates sca_cves.json bundled in the agent jar at build time - ClassFileTransformer (observation-only) detects when vulnerable classes are loaded, resolves JAR versions via pom.properties, and checks semver ranges using ComparableVersion (Maven semantics) - ScaReachabilityCollector bridges the transformer and telemetry without circular dependencies, following the WafMetricCollector pattern - ScaReachabilityPeriodicAction reports hits on each app-dependencies-loaded heartbeat by adding reachability metadata to existing dependency entries Commit sca_cves.json as versioned resource; update generateScaCvesJson task The Gradle task now writes to src/main/resources/ and runs only when -PrefreshSca is passed or the file is absent, so CI builds never need network access to the private sca-reachability-database repo. Fix Path B classpath scan for Java 9+: fall back to java.class.path On Java 9+, the system classloader (jdk.internal.loader.ClassLoaders$AppClassLoader) no longer extends URLClassLoader, so the URLClassLoader chain walk misses all main classpath entries. Add a fallback that reads java.class.path to cover this case, deduplicating with a HashSet<URL> to avoid scanning the same JAR twice. Add Java 9+ test for Path B classpath fallback; make method package-private Test verifies: (1) system classloader is not URLClassLoader on Java 9+, and (2) findArtifactVersionInClasspath finds artifacts via java.class.path fallback. Applies to Java 9 and all subsequent JDKs (permanent JDK design change). Implement method-level symbol detection with ASM bytecode injection When sca_cves.json contains symbols with method != null, the transformer injects a static callback at method entry using ASM. The callback fires the first time the method is called and reports via ScaReachabilityCallback (bootstrap classloader, accessible from any application class). Key changes: - ScaReachabilityCallback in agent-bootstrap: bootstrap-visible callback with runtime dedup (vulnId|artifact|methodName) and handler registration - ScaReachabilityTransformer: injectMethodCallbacks() uses ByteBuddy ASM to inject INVOKESTATIC at first line number of each target method; processClass() routes class-level vs method-level symbols separately - ScaReachabilityHit: adds symbolName + line fields; existing constructor defaults to <clinit>/1 for class-level hits (backward compatible) - ScaReachabilityPeriodicAction: buildMetadataValue() now uses hit.symbolName() and hit.line() instead of hardcoded values - 6 tests: ASM injection, callback fires on method call only, dedup, multiple methods, safe method not reported, class-level unaffected Retransform classes for method-level detection: already-loaded and version-unresolved Two cases required deferred retransformation: 1. Classes already loaded at startup (before transformer registered): the bytecode callback cannot be injected without retransformClasses() 2. Classes where DependencyResolver returned empty deps at load time (version not yet resolvable): empty results are now not cached to allow retries ScaReachabilityTransformer now stores Instrumentation and exposes performPendingRetransforms() called on each telemetry heartbeat via a Runnable callback in ScaReachabilityCollector.periodicWorkCallback. Classes are queued via: - pendingRetransform (Class<?> queue) from checkAlreadyLoadedClasses - pendingRetransformNames (String set) from processClass on empty deps Fix: remove incorrect dedup from injectCallbacks; update invariants retransformClasses() always starts from the ORIGINAL class file bytes, not from the previously-transformed bytes. A dedup check in injectCallbacks() that blocked re-injection on the second pass caused the callback to be removed (the class was returned to its original, un-instrumented state). The authoritative dedup for method-level hits is ScaReachabilityCallback.reported (bootstrap-side), which persists across retransformations regardless of how many times transform() is called on the same class. Also update .claude-invariants.md: retransformClasses is now used (for method-level only), the cache constraint clarified, and the dedup invariant documents the two-level approach (transformer for class-level, bootstrap for method-level). pr-review: fix null guard, encapsulate periodicWorkCallback, update Javadoc, add retransform tests - performPendingRetransforms(): early return when instrumentation is null (unit test safety) - ScaReachabilityCollector: encapsulate periodicWorkCallback as private with getter/setter - ScaReachabilityTransformer class Javadoc: update dedup description from (vulnId,artifact) pair to (vulnId,artifact,symbolName) tuple; document two-level dedup strategy - Add 3 tests for performPendingRetransforms(): no-op with null inst, retransformClasses called for pending queue, no-op when both queues empty Fix two Codex review issues: java.nio in premain and transitive JAR resolution P1: Replace StandardCharsets.UTF_8 with "UTF-8" string in ScaCveDatabase.load(). java.nio.* is forbidden during premain (bootstrap_design_guidelines.md) because it can trigger premature provider initialization before the app configures the runtime. P2: Add classpath fallback in resolveVersionForArtifact() for entries where the vulnerable artifact is an aggregator/starter POM whose watched classes live in a transitive dependency JAR (e.g., spring-boot-starter-web watches @controller but @controller is defined in spring-context.jar, not the starter). The new helper first checks the class's own JAR, then falls back to findArtifactVersionInClasspath with a hit cache (classpathArtifactCache). processPathA uses the same helper for consistency. Refactor: extract CLASS_LEVEL_SYMBOL constant and reportClassLevelHitIfPresent helper - Add CLASS_LEVEL_SYMBOL = "<clinit>" constant to avoid magic string repetition (appeared 3 times in the same class; a typo would silently produce wrong symbol names) - Extract reportClassLevelHitIfPresent(entry, version, internalClassName) helper to unify identical class-level symbol matching loops in processPathA, processPathB, and processClass — all three now delegate to the single helper Move CLASS_LEVEL_SYMBOL to ScaReachabilityHit; fix misleading comment Move CLASS_LEVEL_SYMBOL = "<clinit>" to ScaReachabilityHit (internal-api) as a public constant so both the transformer (appsec) and the telemetry payload builder share the canonical definition without cross-module string duplication. The convenience constructor also uses the constant now. ScaReachabilityTransformer delegates to ScaReachabilityHit.CLASS_LEVEL_SYMBOL. Fix misleading comment in processClass: "We enqueue via classBeingRedefined is null here" → explains that classBeingRedefined is null on first class load, preventing direct Class<?> queuing, so scheduleRetransformByName is used instead. Move java.nio comment to usage site; add tests for transitive JAR fallback - ScaCveDatabase: move "java.nio.* forbidden in premain" comment from the imports block to inline at the InputStreamReader construction site (comments in imports are unusual and smola flags verbose placement) - ScaReachabilityTransformer.resolveVersionForArtifact: make package-private for testing; add 4 tests covering the two-step fallback: (1) version from classJarDeps directly (2) classpath fallback when classJarDeps is empty (transitive JAR case) (3) classpathArtifactCache hit on second call (4) null for absent artifact Remove dead visitCode() override and redundant CLASS_LEVEL_SYMBOL alias - Remove empty visitCode() in MethodEntryInjector: the method only called super.visitCode() and its comment was misleading — the actual no-debug-info fallback injection is handled by ensureInjected() in the visitInsn/visitVarInsn/ visitMethodInsn/visitFieldInsn overrides, not here - Remove private CLASS_LEVEL_SYMBOL alias in ScaReachabilityTransformer: the constant is used in exactly one place (reportClassLevelHitIfPresent) and ScaReachabilityHit.CLASS_LEVEL_SYMBOL is self-documenting at that site; the alias added a private field with no benefit after the constant was moved to ScaReachabilityHit in a previous commit Capture callsite for method-level hits (mirrors Python tracer) Per the RFC and Python implementation (dd-trace-py#17156), the telemetry payload path/symbol/line for method-level hits must report the APPLICATION FRAME that called the vulnerable method (the callsite), not the vulnerable method itself. ScaReachabilityCallback.onMethodHit() now walks Thread.getStackTrace() to find the first non-agent, non-JDK frame after the vulnerable class: ScaReachabilityCallback.onMethodHit (skip - us) com.foo.VulnerableClass.method (skip - vulnerable class) com.myapp.UserService.processRequest (CALLSITE - report this) The dotClassName/methodName params are still baked into the bytecode and used only for deduplication (vulnId|artifact|methodName key). The handler receives the callsite's class/method/line for telemetry. Fallback: if no application frame is found (e.g. called from JDK internals), reports the vulnerable symbol itself so the backend knows it was reached. Class-level hits (<clinit>) are unchanged — no callsite at class load time. Move callsite detection from bootstrap to ScaReachabilitySystem handler ScaReachabilityCallback (bootstrap) must stay minimal — complex logic does not belong there. Move findCallsite() to ScaReachabilitySystem which has access to internal-api utilities. The handler runs synchronously so the full call stack is still present: ScaReachabilitySystem handler ScaReachabilityCallback.onMethodHit <vulnerable method> <application callsite> ← reported Uses the same class-prefix predicate as AbstractStackWalker. isNotDatadogTraceStackElement (package-private, so replicated inline) to skip agent/JDK frames, consistent with the IAST trie-based filtering infrastructure used elsewhere in the codebase. Use AbstractStackWalker.isNotDatadogTraceStackElement for callsite filtering Make isNotDatadogTraceStackElement public in AbstractStackWalker so SCA Reachability can use the existing predicate directly rather than duplicating the 3 class-prefix conditions inline. Add tests for ScaReachabilitySystem.findCallsite(); document fallback path ScaReachabilitySystemCallsiteTest covers: - findCallsite returns null when vulnerable class is not on the stack (triggers fallback: handler reports the vulnerable symbol itself) - findCallsite skips the vulnerable class frame and returns the first non-agent frame above it (using java.lang.Thread as a non-agent class guaranteed to be at the top of getStackTrace()) Note on the method-level integration test: TargetClass is in com.datadog.appsec.sca.* (agent namespace) so AbstractStackWalker filters it as agent code and findCallsite() returns null. The test now documents this fallback behaviour explicitly. In production the vulnerable class is always a 3rd-party library (e.g. com.fasterxml.jackson.*) and the happy path fires correctly — verified by ScaReachabilitySystemCallsiteTest. Update ScaReachabilityHit Javadoc to reflect dual callsite/symbol semantics Move findCallsite() after start() — helpers after main public method Use ConcurrentHashMap.newKeySet() instead of verbose newSetFromMap idiom Lazy entryHasMethodLevelSymbol check — avoid stream alloc on normal path The stream().anyMatch() for detecting method-level symbols was computed for every entry unconditionally. It is only needed when version == null (deps not yet resolvable). Moving the check inside the version==null branch eliminates the stream allocation on the common path where the version resolves successfully. Remove Path B from startup scan — JDK symbols are false positive indicators JDK classes (e.g. java.sql.PreparedStatement, protectionDomain==null) are loaded by ANY app that uses JDBC, regardless of which driver is present. Using their presence to infer that a specific library (e.g. PostgreSQL) is "reachable" produces classpath-presence false positives, not runtime reachability signals. Entries that list JDK symbols (e.g. the PostgreSQL advisory) also include library-specific classes (e.g. org.postgresql.ds.PGSimpleDataSource) that Path A correctly detects when those classes are actually loaded. In checkAlreadyLoadedClasses(), classes with no code source (JDK/bootstrap) are now skipped silently. The invariants and KB are updated accordingly. Remove dead processPathB() — never called after Path B removal Fix dedup key to include class name for method-level hits The key vulnId|artifact|methodName collapses hits for different classes in the same artifact that share a method name (e.g. ClassA.parse and ClassB.parse would both map to the same key, suppressing the second). Fix: add dotClassName to the key → vulnId|artifact|dotClassName|methodName so each class+method combination is tracked independently. Add regression test that verifies both hits are reported when the same method name exists in two different vulnerable classes of the same artifact. Implement stateful RFC heartbeat model for SCA telemetry Replace stateless ScaReachabilityCollector (simple hit queue) with ScaReachabilityDependencyRegistry (stateful per-dependency CVE tracking) to comply with the RFC heartbeat specification: 1. When a class from a vulnerable version loads: registerCve() creates a CVE entry with reached=[] and marks the dependency as pending, so the next heartbeat reports metadata:[{"type":"reachability","value":"...reached:[]"}] — signalling the backend that SCA is monitoring this CVE before any symbol is called. 2. When a vulnerable method is called: recordHit() stores the first callsite (RFC: single occurrence sufficient) and marks the dependency as pending. 3. On each heartbeat: ScaReachabilityPeriodicAction drains pending dependencies and re-reports ALL CVEs for each dependency together (both with and without callsites), then clears pending. Empty heartbeat otherwise. Key invariant: whenever any CVE state changes, ALL CVEs for the same dependency are re-reported together so the backend has a complete picture. Add smoke test for SCA Reachability telemetry (APPSEC-62260) Verifies the RFC stateful heartbeat model end-to-end: - jackson-databind:2.6.0 (vulnerable, range < 2.6.7.3) appears in app-dependencies-loaded with metadata reachability entries - GHSA identifier present in the metadata value - reached[] contains a callsite after ObjectMapper is loaded at startup Uses the existing springboot smoke test app (already has jackson-databind:2.6.0) with DD_APPSEC_SCA_ENABLED=true added to JVM args. Add method-level symbols for jackson-databind deserialization CVEs For the 7 jackson-databind CVE entries, adds method-level symbols for the deserialization entry points that actually trigger gadget chains when polymorphic typing is enabled with untrusted input: ObjectMapper.readValue — primary deserialization entry point ObjectMapper.readValues — multiple-value deserialization ObjectReader.readValue — reader-based deserialization variant ObjectReader.readValues — reader-based multiple values Class-level symbols (method=null) are kept alongside the new method-level ones: class load detection signals the library is present; method detection signals the vulnerable code path was actually invoked. 26 method-level symbols added across 7 entries (ObjectMapper + ObjectReader × readValue + readValues × 7 GHSA entries). Add method-level symbols for xstream, log4j, snakeyaml, jackson-mapper-asl Adds method-level detection for 24 entries across 4 libraries where the deserialization/injection entry point is 100% certain: XStream (17 entries): fromXML — THE entry point for all XStream CVEs; triggers gadget chains when deserializing untrusted XML log4j-core (4 entries): info, error, warn, debug, trace, fatal, log — Log4Shell (GHSA-jfh8-c2jp-5v3q) triggers JNDI lookup when log messages contain ${jndi:...} patterns; any Logger method is an entry point snakeyaml (1 entry): load, loadAll — unsafe YAML deserialization; instantiates arbitrary Java classes from untrusted YAML input jackson-mapper-asl (1 entry): readValue, readValues — same deserialization pattern as jackson-databind, applies to the legacy 1.x mapper 56 method-level symbols added. Class-level symbols (method=null) are kept alongside the new method-level ones for dual detection coverage. Fix SCA smoke test, RFC compliance and add heartbeat flow tests - ScaReachabilitySmokeTest: fix find() to look for the entry with reachability metadata — the same dep appears twice (once from DependencyService without metadata, once from SCA with CVE data) - TelemetryRequestBody.writeDependency(): write metadata:[] even when list is empty — null means SCA disabled, empty list means SCA active but no CVEs detected yet (RFC: all deps get metadata:[] at startup) - sca_cves.json: remove class-level symbol from snakeyaml — Spring Boot loads Yaml at startup causing registerCve+recordHit to fire in the same request, preventing the reached:[] heartbeat from being observed - ScaReachabilityPeriodicActionTest: add rfcFullHeartbeatFlow test covering Heartbeats #2-#6 from the RFC spec - TelemetryRequestBodyDependencyMetadataTest: update to reflect that metadata:[] is written (not suppressed) when list is empty fix(smoke): add braces to if statement to satisfy CodeNarc IfStatementBraces rule fix(spotbugs): make periodicWorkCallback private, expose via getter refactor: replace Map<?,?> casts with typed Moshi DTOs in ScaCveDatabase cleanup: remove stale Path A/B terminology after Path B was removed chore: remove .claude-invariants.md from tracking, add to .gitignore fix(forbiddenapis): replace String#split() with pre-compiled Pattern.split() fix: remove class-level symbols from all xstream entries Same issue as snakeyaml: class-level hit fires when XStream loads and registers <clinit> as callsite. First-hit-wins then blocks fromXML() method-level callbacks. Keeping only fromXML gives real application callsite info. feat: emit metadata:[] for all deps in DependencyPeriodicAction when SCA enabled Per RFC: when DD_APPSEC_SCA_ENABLED=true every dependency must be reported with metadata:[] as the initial SCA-active signal. ScaReachabilityPeriodicAction later re-emits the same dependency with actual CVE metadata. The smoke test already handles the resulting duplicate entries by filtering for the entry that carries reachability metadata. fix(spotbugs): replace URL collections with URI to avoid DNS lookups (DMI_COLLECTION_OF_URLS) chore: remove dead ScaReachabilityCollector, fix stale Javadoc, drop redundant parameter refactor: unify dep reporting into ScaReachabilityPeriodicAction when SCA enabled - DependencyPeriodicAction reverts to original (SCA-unaware) - ScaReachabilityPeriodicAction now accepts DependencyService and takes over all app-dependencies-loaded emission when SCA is enabled: 1. Drains registry → map by artifact@version 2. Drains DependencyService → merges with CVE state or emits metadata:[] 3. Emits remaining registry snapshots (existing deps with state changes) - TelemetrySystem registers ScaReachabilityPeriodicAction instead of DependencyPeriodicAction when SCA is enabled — no duplicates possible - Tests cover all three merge paths fix(techdebt): static imports, remove inline java.util refs, replace em-dashes with hyphens fix: restore ScaReachabilityPeriodicActionTest; fix raw type Class[0] to Class<?>[0] fix(techdebt): move pendingRetransformNames to field section; json-escape buildMetadataValue; deduplicate recordHit fix(techdebt): extract depKey helper; delete empty test stub ScaReachabilityTransformerTest fix(thread-safety): use AtomicReference.compareAndSet for first-hit-wins in CveState Replace volatile ScaReachabilityHit hit with AtomicReference<ScaReachabilityHit>. The volatile check-then-assign (if hit == null -> hit = newHit) is not atomic: two threads calling recordHit for the same CVE via different methods could both observe null and both write, with the second overwriting the first. compareAndSet guarantees exactly one callsite is stored regardless of thread interleaving. Add ScaReachabilityDependencyRegistryTest with a 20-thread concurrency test that verifies exactly one callsite is recorded under simultaneous competing writes. fix: remove stale .claude-invariants.md reference from ScaReachabilityTransformer Javadoc fix: wrap onMethodHit in try/catch to prevent exception propagation to application code; fix assertTrue usage in test fix: use knownDeps to enrich CVE snapshots with source/hash from prior DependencyService drains When a CVE fires (class loaded + method called) and DependencyService has not yet resolved the dep in the same heartbeat, Step 3 previously emitted the dep without source/hash. The backend requires source/hash to correlate the entry with a known dependency, so it would silently ignore the CVE data. Fix: ScaReachabilityPeriodicAction now maintains a persistent knownDeps map across heartbeats. Each dep drained from DependencyService is stored there. Step 3 looks up knownDeps to enrich CVE snapshots; if the dep is not yet known (JAR still resolving), re-marks it as pending and retries next heartbeat instead of emitting without source/hash. Add ScaReachabilityDependencyRegistry.markPending() to re-mark a dep as pending without resetting CVE state, used by the retry logic. Tests cover: CVE-after-dep-resolved (uses knownDeps), CVE-before-dep-resolved (retries until dep known), simultaneous dep+CVE (Step 2 merge still works). fix: emit CVE data immediately in Step 3, use knownDeps only for source/hash enrichment The retry-until-resolved approach blocked emission entirely when DependencyService had not yet resolved a JAR, causing system tests to timeout (25s window). The backend can correlate entries by name:version alone, so blocking is not needed. New behavior: always emit in Step 3. Use knownDeps to enrich with source/hash when available (dep was resolved in a prior heartbeat); otherwise emit without source/hash. Subsequent emissions (e.g., after a method hit) will be enriched once knownDeps is populated from DependencyService, giving the backend the correlation data it needs. fix(sca): force snakeyaml class load in smoke test via PostConstruct The smoke test app uses application.properties (not application.yml), so Spring Boot never triggers snakeyaml loading. ScaReachabilityInit instantiates Yaml in @PostConstruct to ensure class-level CVE detection at startup. Also removes the unused ScaReachabilityCollector draft and reverts TelemetrySystem to register both DependencyPeriodicAction and ScaReachabilityPeriodicAction (restoring dual-action mode). Smoke test now passes in ~4s with the default 30s timeout. ci: retrigger CI refactor(sca): remove dead markPending, inline scheduleRetransformByName, fix recordHit TOCTOU - Remove unused markPending() - the retry-via-re-mark flow was superseded by the emit-immediately approach in Step 3 of ScaReachabilityPeriodicAction - Inline private single-statement scheduleRetransformByName() at its only call site - Fix non-atomic containsKey+get in recordHit: use computeIfAbsent directly, which eliminates the race and avoids the unintended pendingReport=true side effect that unconditional registerCve() would have introduced - Update stale Javadoc in ScaReachabilityPeriodicAction referencing the old re-mark-pending retry strategy fix(sca): register only ScaReachabilityPeriodicAction when SCA enabled (Option C) When DD_APPSEC_SCA_ENABLED=true, ScaReachabilityPeriodicAction handles all app-dependencies-loaded emission by merging DependencyService drains with CVE registry state. DependencyPeriodicAction is skipped to avoid: - Duplicate entries per dep per heartbeat - DependencyService queue being drained before ScaReachabilityPeriodicAction can see new JARs (which prevented knownDeps from being populated and caused CVE entries to always emit without source/hash) The previous attempt (commit b0421ab) failed system-tests due to a markPending retry in Step 3 that could delay CVE emission by one heartbeat cycle. That was fixed in 6925358 (immediate emit). The current code is safe. revert: restore pre-existing em dashes in GatewayBridge, ObjectIntrospection, StandardizedLogging fix: hoist dotClassName conversion outside inner loop in processClass fix(sca): deduplicate index entries per class when entry has multiple symbols for same class An entry with symbols [Yaml.load, Yaml.loadAll] indexed once per symbol causing the same ScaEntry to appear twice for org/yaml/snakeyaml/Yaml. processClass then iterated the duplicate and injected bytecode callbacks twice per method, resulting in redundant ScaReachabilityCallback.onMethodHit() calls on every invocation. Fix: track seen class names per entry with a HashSet; add regression test. fix(sca): include version in hit dedup keys to isolate multi-version classloader scenarios ScaReachabilityCallback.reported and ScaReachabilityTransformer.reportedHits keyed hits without the artifact version. In a multi-classloader setup (multiple WARs, OSGi) where two versions of the same library coexist, the first version's hit suppressed all subsequent versions for the same CVE and symbol. Both keys now include version: vulnId|artifact|version|class|method. fix(sca): skip intermediate library frames in callsite detection When the call chain is client -> lib_wrapper -> vulnerable_method, findCallsite was reporting lib_wrapper instead of the client frame. Add sca_stack_exclusion.trie (generated as ScaStackExclusionTrie) and apply it after the vulnerable-class boundary to skip known library packages, matching the approach used by IAST SinkModuleBase. Refactor findCallsite to a testable overload that accepts an explicit StackTraceElement[], and rewrite tests with synthetic stacks. fix(sca): add TODO for inner-class format in GhsaEnrichmentParser Document that replace('.', '/') would produce wrong internal names for inner classes if GHSA uses dot notation (e.g. Outer.Inner instead of Outer$Inner). Current database has no inner class entries so safe for now; track for when database team defines method-symbol format. refactor(sca): defer class processing off class-loading thread; cleanup Addresses Manuel's review comment: all heavyweight processing now runs on the telemetry heartbeat thread instead of inline during class loading. Changes in ScaReachabilityTransformer: - transform() on first load (classBeingRedefined == null) only enqueues a PendingClass(className, jarURL) record and returns null immediately. No JAR I/O (DependencyResolver.resolve), no version lookup, no hit reporting on the class-loading thread. Matches the pattern used by LocationsCollectingTransformer. - New processPendingClassEvents(): drains the queue on the telemetry heartbeat, performs all heavyweight work (JAR reads, version resolution, class-level hit reporting), and populates pendingRetransformNames for method-level symbols. Must run before performPendingRetransforms() so classes queued in one heartbeat are retransformed in the same cycle. - transform() on retransform (classBeingRedefined != null) takes the existing inline path (processClass 4-arg) to inject method-level bytecode - required by the ClassFileTransformer contract. - Tradeoff vs old inline approach: ~10s window at startup during which method calls go undetected. No behavioral difference for class-level (the only active symbols today). - performPendingRetransforms(): use contains()+removeAll() instead of remove() inside the getAllLoadedClasses() loop. Spring Boot fat JARs create multiple LaunchedURLClassLoader instances; the old remove() only matched the first instance, so the application classloader was never retransformed and method-level callbacks never fired. All classloader instances of the same class name now get the callback bytecode injected. Additional cleanups in the same files: - Rename void processClass(3-arg) to reportClassLevelHits to disambiguate it from the byte[]-returning processClass(4-arg) that injects bytecode. - Extract duplicated method-level symbol stream expression into private static helper hasMethodLevelSymbolForClass(List<ScaEntry>, String). - ScaReachabilitySystem.start() periodic work callback updated to call processPendingClassEvents() before performPendingRetransforms(). - ASM injection: replace SIPUSH with visitLdcInsn(line) so line numbers > 32767 (valid in generated code) produce correct bytecode instead of a truncated signed-short value. - Defer dotClassName allocation to the method-level symbol path where it is actually needed. - Update processClass(4-arg) Javadoc to reflect two-phase design: only called on retransform for method-level symbols; class-level hits were already reported by reportClassLevelHits in processPendingClassEvents. - Two new tests in ScaReachabilityMethodLevelTest verify the structural invariants: first load enqueues and returns null; retransform does not re-enqueue. refactor(sca): remove unused 4-arg convenience constructor from ScaReachabilityHit test(sca): add regression tests for multi-classloader retransform fix ScaReachabilityRetransformTest verifies that performPendingRetransforms() retransforms ALL classloader instances of a class, not just the first one. Simulates Spring Boot's multiple LaunchedURLClassLoader instances by returning the same Class<?> twice from getAllLoadedClasses() — with the old remove() approach only one entry was passed to retransformClasses(); with the fix (contains()+removeAll()) both are collected. Also tests the re-queue path: on retransformClasses() failure, all collected classes must be added back to pendingRetransform for the next heartbeat retry. Adds mockito to appsec test dependencies. Makes pendingRetransformNames package-private (consistent with pendingRetransform and pendingClassEvents). fix(sca): widen catch to Throwable in performPendingRetransforms InternalError and other JVM Errors from retransformClasses() would escape a catch(Exception) block and kill the telemetry thread silently. Matches the existing catch(Throwable) in transform() for the same reason. ci: retrigger pipeline fix(sca): log swallowed exceptions in ScaReachabilityCallback at debug level refactor(sca): move generateScaCvesJson into ScaEnrichmentsPlugin in buildSrc Move the inline generateScaCvesJson task from appsec/build.gradle into a proper Gradle plugin (ScaEnrichmentsPlugin) registered as 'dd-trace-java.sca-enrichments'. The plugin is simpler to test and removes ~90 lines of scripting from the build script. The plugin and committed sca_cves.json are a temporary solution — the symbol database will be delivered via Remote Config in a future iteration, at which point both will be removed. test(sca): add ScaEnrichmentsPluginTest; fix processResources wiring Use pluginManager.withPlugin("java") to defer the processResources dependency until after the java plugin has registered the task, avoiding a configuration-time failure when the plugin is applied before java. Add 4 integration tests using GradleFixture/TestKit: - task is SKIPPED when file exists and -PrefreshSca is absent - task attempts to run (not SKIPPED) when -PrefreshSca is set - task attempts to run (not SKIPPED) when file is absent - processResources depends on generateScaCvesJson fix(sca): scope processResources JSON minification to sca_cves.json only fix(sca): address jpbempel review comments - Replace FQN types with proper imports (Set, ConcurrentHashMap in ScaReachabilityCallback; Reader in ScaCveDatabase; Map.Entry in ScaReachabilityDependencyRegistry) - Narrow catch (Throwable) to catch (Exception) in onMethodHit: OOMError and StackOverflowError are JVM conditions that must propagate to the application; catch (Throwable) in performPendingRetransforms stays because retransformClasses() can throw InternalError by spec - Use Strings.getClassName() instead of manual replace('/', '.') - Guard getAllLoadedClasses() loops against null items (class unloading race) nit(sca): use project VisibleForTesting annotation in ScaReachabilityTransformer datadog.trace.api.internal.VisibleForTesting is available transitively via components:annotations (already on classpath via internal-api). Replace package-private comments with the annotation. nit(sca): add @VisibleForTesting to remaining package-private methods resolveVersionForArtifact and injectMethodCallbacks were still using the comment form; align with the annotation applied to the other testing-only members. perf(sca): use StackWalkerFactory for lazy stack evaluation in findCallsite Replace Thread.currentThread().getStackTrace() with StackWalkerFactory.INSTANCE.walk() so that on JDK9+ the stack is evaluated lazily: frames are walked one by one and evaluation stops as soon as the application callsite is found, without materializing the full stack array. Agent frame filtering is now handled by the walker itself (AbstractStackWalker.doFilterStack) rather than manually in the loop. The StackTraceElement[] overload used in tests is preserved; it delegates to the shared findCallsiteInStream helper with the agent filter applied manually on the array. fix(sca): address bric3 review comments on PR #11352 - Extract ASM injection into ScaMethodCallbackInjector (separate concerns) - Replace URLClassLoader walk with java.class.path scan (dead code on agent threads where AgentThreadFactory sets null context classloader; also not a URLClassLoader on Java 9+); add unit tests documenting both constraints - Replace stream with plain loop in hasMethodLevelSymbolForClass (avoids spliterator allocation on each class-load check) - Normalize class-level vs method-level symbol coexistence in ScaCveDatabase.toScaEntry(): drop class-level (method=null) symbol when method-level symbols exist for the same class in the same entry, preventing first-hit-wins hitRef from discarding the more specific callsite - Guard performPendingRetransforms with isModifiableClass() to avoid infinite retry loop for non-modifiable classes (JDK classes, primitive wrappers, etc.) - Disable SCA when telemetry or dependency collection is disabled (Agent.java): ScaReachabilityPeriodicAction is never registered in those cases, so pendingClassEvents would grow unbounded with no drain - Fix generateScaCvesJson up-to-date check: use upToDateWhen to force re-run when -PrefreshSca is set (onlyIf alone can be bypassed by Gradle's up-to-date check) - Add custom URL support to ScaEnrichmentsPlugin via -PscaEnrichmentsUrl - Add upper bound for ScaReachabilityDependencyRegistry (DD_APPSEC_SCA_MAX_TRACKED_DEPENDENCIES, default 1000) with warning log when cap is reached nit(sca): extract isCapExceeded helper, fix resetForTesting, use Strings.getClassName - Extract duplicated cap-exceeded guard into private isCapExceeded(String key) helper, called from both registerCve and recordHit - Add periodicWorkCallback = null to resetForTesting() to prevent test state leakage - Replace inline replace('/', '.') with Strings.getClassName for consistency with processClass test(sca): add registry unit tests; add @VisibleForTesting to addKnownDepForTesting - ScaReachabilityDependencyRegistryTest: add registerCve, drain semantics, cap enforcement (two tests: new keys rejected, existing keys still updated), and resetForTesting clears periodicWorkCallback - ScaReachabilityPeriodicAction: annotate addKnownDepForTesting with @VisibleForTesting fix(sca): prevent metadata:[] overwrite when JAR resolves after CVE was drained When DependencyService resolves a JAR in heartbeat N+1 but the CVE was already drained in heartbeat N (pendingReport=false), Step 2 previously emitted metadata:[] for that dep. If the backend uses last-write-wins semantics, this would silently erase the CVE state reported in HB N. Fix: add peekSnapshot() to ScaReachabilityDependencyRegistry (reads CVE state without clearing pendingReport), and call it in Step 2 when the JAR is newly resolved but no pending snapshot exists for that dep. Class-level-only CVEs are permanently affected (no method hit to re-trigger pendingReport), so the fix is necessary for correctness. Adds regression test: cveRegisteredBeforeDepResolved_step2PeeksCveStateInLaterHeartbeat nit(sca): remove dead isEmpty() guards and add @VisibleForTesting to resetForTesting entriesForClass() always returns a non-empty list or null (never an empty list), so the || entries.isEmpty() checks in transform(), checkAlreadyLoadedClasses(), and processPendingClassEvents() are dead code that could mislead readers. Also annotate resetForTesting() with @VisibleForTesting, consistent with the rest of the project's convention for package-private test helpers. Merge branch 'master' into alejandro.gonzalez/sca-reachability nit(sca): add TODO for future trie consideration in ScaCveDatabase.entriesForClass Co-authored-by: devflow.devflow-routing-intake <devflow.devflow-routing-intake@kubernetes.us1.ddbuild.io>
1 parent e5f383b commit 0e739a1

47 files changed

Lines changed: 4713 additions & 12 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

.github/CODEOWNERS

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -90,6 +90,8 @@
9090
/dd-trace-api/src/main/java/datadog/trace/api/EventTracker.java @DataDog/asm-java
9191
/internal-api/src/main/java/datadog/trace/api/gateway/ @DataDog/asm-java
9292
/internal-api/src/main/java/datadog/trace/api/http/ @DataDog/asm-java
93+
/internal-api/src/main/java/datadog/trace/api/telemetry/ScaReachability* @DataDog/asm-java
94+
/telemetry/src/main/java/datadog/telemetry/sca/ @DataDog/asm-java
9395
**/appsec/ @DataDog/asm-java
9496
**/*CallSite*.java @DataDog/asm-java
9597
**/*CallSite*.groovy @DataDog/asm-java

.gitignore

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -53,6 +53,8 @@ out/
5353
# Claude Code local custom settings #
5454
#####################################
5555
.claude/*.local.*
56+
.claude-invariants.md
57+
.claude-status.md
5658

5759
# Vim #
5860
#######

buildSrc/build.gradle.kts

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,11 @@ gradlePlugin {
6363
id = "dd-trace-java.instrumentation-naming"
6464
implementationClass = "datadog.gradle.plugin.naming.InstrumentationNamingPlugin"
6565
}
66+
67+
create("sca-enrichments-plugin") {
68+
id = "dd-trace-java.sca-enrichments"
69+
implementationClass = "datadog.gradle.plugin.sca.ScaEnrichmentsPlugin"
70+
}
6671
}
6772
}
6873

Lines changed: 127 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,127 @@
1+
package datadog.gradle.plugin.sca
2+
3+
import datadog.gradle.sca.GhsaEnrichmentParser
4+
import groovy.json.JsonOutput
5+
import groovy.json.JsonSlurper
6+
import java.net.HttpURLConnection
7+
import java.net.URL
8+
import org.gradle.api.GradleException
9+
import org.gradle.api.Plugin
10+
import org.gradle.api.Project
11+
12+
/**
13+
* Registers the [generateScaCvesJson] task that downloads GHSA enrichments from
14+
* `sca-reachability-database` and generates `sca_cves.json` bundled in the appsec JAR.
15+
*
16+
* This is a **temporary** build-time approach. The symbol database will be delivered
17+
* via Remote Config in a future iteration, at which point this plugin and the committed
18+
* `sca_cves.json` file will be removed.
19+
*
20+
* Usage: `apply plugin: 'dd-trace-java.sca-enrichments'`. The task runs only when
21+
* `-PrefreshSca` is passed or the output file is absent; CI uses the committed copy.
22+
*/
23+
@Suppress("unused")
24+
class ScaEnrichmentsPlugin : Plugin<Project> {
25+
26+
companion object {
27+
private const val SCA_ENRICHMENTS_API_DEFAULT =
28+
"https://api.github.com/repos/DataDog/sca-reachability-database/contents/enrichments"
29+
}
30+
31+
override fun apply(project: Project) {
32+
val outputFile = project.file("src/main/resources/sca_cves.json")
33+
34+
val generateTask =
35+
project.tasks.register("generateScaCvesJson") {
36+
description =
37+
"Downloads GHSA enrichments from sca-reachability-database and updates " +
38+
"src/main/resources/sca_cves.json. Run with -PrefreshSca to force a refresh. " +
39+
"Override the source URL with -PscaEnrichmentsUrl=<url>. " +
40+
"sca_cves.json is committed to the repo so CI does not need network access."
41+
group = "build"
42+
outputs.file(outputFile)
43+
// upToDateWhen: when -PrefreshSca is set, always consider outputs stale (force re-run).
44+
outputs.upToDateWhen { !project.hasProperty("refreshSca") }
45+
// onlyIf: skip entirely when the file already exists and no refresh was requested,
46+
// so that normal builds (no network, no -PrefreshSca) never touch GitHub.
47+
onlyIf { project.hasProperty("refreshSca") || !outputFile.exists() }
48+
49+
doLast {
50+
val token = System.getenv("GITHUB_TOKEN")
51+
val apiUrl =
52+
project.findProperty("scaEnrichmentsUrl")?.toString() ?: SCA_ENRICHMENTS_API_DEFAULT
53+
54+
logger.lifecycle("Fetching GHSA enrichment index from $apiUrl ...")
55+
@Suppress("UNCHECKED_CAST")
56+
val fileList = githubFetch(apiUrl, token) as List<Map<String, Any>>
57+
val ghsaFiles =
58+
fileList.filter {
59+
it["name"]?.toString()?.endsWith(".json") == true && it["type"] == "file"
60+
}
61+
logger.lifecycle("Found ${ghsaFiles.size} enrichment files")
62+
63+
val entries = mutableListOf<Any>()
64+
ghsaFiles.forEach { fileInfo ->
65+
val ghsaId = fileInfo["name"]!!.toString().removeSuffix(".json")
66+
val rawContent = githubFetchRaw(fileInfo["download_url"]!!.toString(), token)
67+
entries.addAll(GhsaEnrichmentParser.parse(ghsaId, rawContent))
68+
}
69+
70+
outputFile.writeText(JsonOutput.toJson(mapOf("version" to 1, "entries" to entries)))
71+
logger.lifecycle(
72+
"sca_cves.json: ${entries.size} entries from ${ghsaFiles.size} GHSA files")
73+
logger.lifecycle(
74+
"Remember to commit src/main/resources/sca_cves.json after updating the database.")
75+
}
76+
}
77+
78+
// Defer wiring until after the java plugin adds processResources.
79+
project.pluginManager.withPlugin("java") {
80+
project.tasks.named("processResources") {
81+
dependsOn(generateTask)
82+
doLast {
83+
// Minify only sca_cves.json — not all JSON files in the module output.
84+
project
85+
.fileTree(mapOf("dir" to outputs.files.asPath, "includes" to listOf("**/sca_cves.json")))
86+
.forEach { f -> f.writeText(JsonOutput.toJson(JsonSlurper().parse(f))) }
87+
}
88+
}
89+
}
90+
}
91+
92+
private fun githubConnect(url: String, token: String?): HttpURLConnection {
93+
val connection = URL(url).openConnection() as HttpURLConnection
94+
connection.setRequestProperty("Accept", "application/vnd.github+json")
95+
connection.setRequestProperty("X-GitHub-Api-Version", "2022-11-28")
96+
if (!token.isNullOrEmpty()) {
97+
connection.setRequestProperty("Authorization", "Bearer $token")
98+
}
99+
connection.connectTimeout = 10_000
100+
connection.readTimeout = 30_000
101+
val code = connection.responseCode
102+
if (code != 200) {
103+
throw GradleException(
104+
"GitHub API returned HTTP $code for $url.\n" +
105+
"Unauthenticated rate limit is 60 req/hr. Set GITHUB_TOKEN to raise it.")
106+
}
107+
return connection
108+
}
109+
110+
private fun githubFetch(url: String, token: String?): Any {
111+
val conn = githubConnect(url, token)
112+
return try {
113+
JsonSlurper().parse(conn.inputStream)
114+
} finally {
115+
conn.disconnect()
116+
}
117+
}
118+
119+
private fun githubFetchRaw(url: String, token: String?): String {
120+
val conn = githubConnect(url, token)
121+
return try {
122+
conn.inputStream.bufferedReader().readText()
123+
} finally {
124+
conn.disconnect()
125+
}
126+
}
127+
}
Lines changed: 83 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,83 @@
1+
package datadog.gradle.sca
2+
3+
import com.fasterxml.jackson.databind.JsonNode
4+
import com.fasterxml.jackson.databind.ObjectMapper
5+
6+
/**
7+
* Parses GHSA enrichment JSON files from the sca-reachability-database into the internal
8+
* sca_cves.json format consumed by SCA Reachability at runtime.
9+
*
10+
* Key transformations:
11+
* - Filters entries to JVM language only
12+
* - Expands multi-package GHSA entries into N records (one per Maven artifact), because
13+
* each artifact may have different version ranges for the same set of class symbols
14+
* - Converts class FQNs to JVM internal format (slashes) so the ClassFileTransformer
15+
* can do O(1) map lookups without per-class string conversion
16+
* - Sets method=null for all symbols — field exists for forward compatibility when the
17+
* database adds method-level symbols in the future (see APPSEC-62260)
18+
*/
19+
object GhsaEnrichmentParser {
20+
21+
private val mapper = ObjectMapper()
22+
23+
/**
24+
* Parses a single GHSA enrichment file.
25+
*
26+
* @param ghsaId the GHSA identifier (e.g. "GHSA-645p-88qh-w398"), used as vuln_id
27+
* @param jsonContent the raw JSON content of the enrichment file
28+
* @return list of sca_cves.json entry maps, one per affected Maven artifact
29+
*/
30+
fun parse(ghsaId: String, jsonContent: String): List<Map<String, Any?>> {
31+
val root = mapper.readTree(jsonContent)
32+
require(root.isArray) { "GHSA enrichment file $ghsaId must be a JSON array, got ${root.nodeType}" }
33+
34+
val entries = mutableListOf<Map<String, Any?>>()
35+
36+
for (entry in root) {
37+
if (entry.path("language").asText() != "jvm") continue
38+
39+
val symbols = extractSymbols(entry)
40+
if (symbols.isEmpty()) continue
41+
42+
for (pkg in entry.path("package")) {
43+
if (pkg.path("ecosystem").asText() != "maven") continue
44+
val artifact = pkg.path("name").asText().takeIf { it.isNotEmpty() } ?: continue
45+
val versionRanges = pkg.path("version_range").map { it.asText() }
46+
47+
entries += mapOf(
48+
"vuln_id" to ghsaId,
49+
"artifact" to artifact,
50+
"version_ranges" to versionRanges,
51+
"symbols" to symbols,
52+
)
53+
}
54+
}
55+
56+
return entries
57+
}
58+
59+
private fun extractSymbols(entry: JsonNode): List<Map<String, Any?>> {
60+
val symbols = mutableListOf<Map<String, Any?>>()
61+
val imports = entry.path("ecosystem_specific").path("imports")
62+
if (imports.isMissingNode || !imports.isArray) return symbols
63+
64+
for (importGroup in imports) {
65+
for (symbol in importGroup.path("symbols")) {
66+
if (symbol.path("type").asText() != "class") continue
67+
val pkg = symbol.path("value").asText().takeIf { it.isNotEmpty() } ?: continue
68+
val name = symbol.path("name").asText().takeIf { it.isNotEmpty() } ?: continue
69+
70+
// JVM internal format (slashes) — avoids per-class conversion in the
71+
// ClassFileTransformer hot path at runtime.
72+
// TODO(APPSEC-62260): verify inner-class format when database adds method-level symbols.
73+
// If GHSA uses dot notation for inner classes (e.g. name="Outer.Inner"), the replace below
74+
// produces com/example/Outer/Inner instead of the correct com/example/Outer$Inner.
75+
// When the database team defines the format, update this to handle the $ separator.
76+
val internalName = "$pkg.$name".replace('.', '/')
77+
symbols += mapOf("class" to internalName, "method" to null)
78+
}
79+
}
80+
81+
return symbols
82+
}
83+
}
Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
package datadog.gradle.plugin.sca
2+
3+
import datadog.gradle.plugin.GradleFixture
4+
import org.assertj.core.api.Assertions.assertThat
5+
import org.gradle.testkit.runner.TaskOutcome
6+
import org.junit.jupiter.api.BeforeEach
7+
import org.junit.jupiter.api.Test
8+
9+
class ScaEnrichmentsPluginTest : GradleFixture() {
10+
11+
@BeforeEach
12+
fun setup() {
13+
writeSettings("""rootProject.name = "test-appsec"""")
14+
writeRootProject(
15+
"""
16+
plugins {
17+
java
18+
id("dd-trace-java.sca-enrichments")
19+
}
20+
"""
21+
)
22+
}
23+
24+
@Test
25+
fun `generateScaCvesJson is SKIPPED when file exists and refreshSca is not set`() {
26+
file("src/main/resources/sca_cves.json").also {
27+
it.parentFile.mkdirs()
28+
it.writeText("{\"version\":1,\"entries\":[]}")
29+
}
30+
31+
val result = run("generateScaCvesJson")
32+
33+
assertThat(result.task(":generateScaCvesJson")?.outcome).isEqualTo(TaskOutcome.SKIPPED)
34+
}
35+
36+
@Test
37+
fun `generateScaCvesJson attempts to run when refreshSca is set even if file exists`() {
38+
file("src/main/resources/sca_cves.json").also {
39+
it.parentFile.mkdirs()
40+
it.writeText("{}")
41+
}
42+
43+
// With -PrefreshSca the onlyIf condition is true; task will fail at the GitHub fetch
44+
// (no network in tests) but must NOT be SKIPPED
45+
val result = run("generateScaCvesJson", "-PrefreshSca", expectFailure = true)
46+
47+
assertThat(result.task(":generateScaCvesJson")?.outcome)
48+
.isNotNull
49+
.isNotEqualTo(TaskOutcome.SKIPPED)
50+
}
51+
52+
@Test
53+
fun `generateScaCvesJson attempts to run when output file does not exist`() {
54+
// File absent: onlyIf returns true; task will fail at GitHub fetch but must not be SKIPPED
55+
val result = run("generateScaCvesJson", expectFailure = true)
56+
57+
assertThat(result.task(":generateScaCvesJson")?.outcome)
58+
.isNotNull
59+
.isNotEqualTo(TaskOutcome.SKIPPED)
60+
}
61+
62+
@Test
63+
fun `processResources depends on generateScaCvesJson`() {
64+
file("src/main/resources/sca_cves.json").also {
65+
it.parentFile.mkdirs()
66+
it.writeText("{\"version\":1,\"entries\":[]}")
67+
}
68+
69+
val result = run("processResources")
70+
71+
// generateScaCvesJson must appear as SKIPPED (file exists, no -PrefreshSca)
72+
assertThat(result.task(":generateScaCvesJson")?.outcome).isEqualTo(TaskOutcome.SKIPPED)
73+
assertThat(result.task(":processResources")?.outcome).isEqualTo(TaskOutcome.SUCCESS)
74+
}
75+
}

0 commit comments

Comments
 (0)