Skip to content

Commit 47d70b3

Browse files
authored
fix(appsec): Go runtime _dd.appsec.enabled missing on aws.lambda (#1213)
## Problem Two related bugs caused `_dd.appsec.enabled` and other AppSec tags to be absent from the `aws.lambda` invocation span for Go (and Java) runtimes when using extension-side App & API Protection. The symptom was flaky integration tests: sometimes all AppSec tags were missing, and even when that race was won, `_dd.appsec.enabled` was consistently absent from `aws.lambda` while the inferred trigger span (`aws.lambda.url`, `aws.httpapi`, etc.) always had it. ## Root causes ### Bug 1 — race condition: AppSec context deleted by placeholder span Go's tracer emits a placeholder `aws.lambda` span with `resource = "dd-tracer-serverless-span"` alongside its child spans in `/v0.4/traces`. `AppSecProcessor::service_entry_span_mut` was matching this placeholder (it has `name = "aws.lambda"` and `request_id` in meta). Depending on tokio scheduling, the placeholder could reach `Processor::process_span` _after_ `/runtime/invocation/response` had set `response_seen = true`. In that case, `process_span` would tag the placeholder — harmless, since `ChunkProcessor` drops it — but then **delete the AppSec context**. When `process_on_platform_runtime_done` later sent the extension-built `aws.lambda` span via `send_ctx_spans`, the context was gone and no tags were applied. **Fix:** filter placeholder spans out of `service_entry_span_mut` by excluding spans whose `resource == INVOCATION_SPAN_RESOURCE` (`"dd-tracer-serverless-span"`). These spans are always dropped before reaching the backend; tagging them is both pointless and harmful. ### Bug 2 — `_dd.appsec.enabled` never pre-set on the invocation span `enrich_ctx_at_platform_done` calls `inferrer.complete_inferred_spans`, which propagates `_dd.appsec.enabled` from the invocation span to the inferred trigger span via `propagate_appsec`. However, AppSec has not yet run on the invocation span at that point, so `_dd.appsec.enabled` is not in `invocation_span.metrics`. `propagate_appsec` falls back to the `serverless_appsec_enabled` config flag for the _inferred_ span (so it always got the tag) but **never sets it on the invocation span itself**. If AppSec's context was unavailable at flush time for any reason, `aws.lambda` shipped without the tag. This also explains an existing `// todo(duncanista): Add missing metric tags for ASM` comment at that exact location. **Fix:** pre-set `_dd.appsec.enabled = 1.0` on the invocation span in `enrich_ctx_at_platform_done` when AAP is enabled, before calling `complete_inferred_spans`. This ensures the inferred span inherits from the actual metric value rather than the config fallback, and guarantees the tag is present even when the AppSec security context cannot be found at flush time. ## Why Go/Java only Only Go and Java use the placeholder span pattern. Python and Node emit their `aws.lambda` span directly in their tracer payload with `resource = function_name`, so `service_entry_span_mut` correctly identifies it, and the context-deletion race cannot happen. ## Changes | File | Change | |---|---| | `bottlecap/src/traces/mod.rs` | Make `INVOCATION_SPAN_RESOURCE` `pub(crate)` | | `bottlecap/src/appsec/processor/mod.rs` | Filter placeholder spans in `service_entry_span_mut` | | `bottlecap/src/lifecycle/invocation/processor.rs` | Pre-set `_dd.appsec.enabled` on invocation span before `complete_inferred_spans` | ## Testing - Existing `appsec_processor_test` integration test passes. - The race condition is no longer reproducible in Go + Lambda URL integration tests. - `_dd.appsec.enabled` is now consistently present on the `aws.lambda` span.
1 parent 2c240b0 commit 47d70b3

3 files changed

Lines changed: 159 additions & 3 deletions

File tree

bottlecap/src/appsec/processor/mod.rs

Lines changed: 46 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -151,10 +151,18 @@ impl Processor {
151151
/// Returns the first `aws.lambda` span from the provided trace, if one
152152
/// exists.
153153
///
154+
/// Placeholder spans (resource == `INVOCATION_SPAN_RESOURCE`) emitted by
155+
/// Go and Java tracers are excluded: they are always dropped by the chunk
156+
/// processor before reaching the backend, so tagging them would waste the
157+
/// `AppSec` context and trigger a premature context deletion that would leave
158+
/// the real, extension-built `aws.lambda` span untagged.
159+
///
154160
/// # Returns
155161
/// The span on which security information will be attached.
156162
pub fn service_entry_span_mut(trace: &mut [Span]) -> Option<&mut Span> {
157-
trace.iter_mut().find(|span| span.name == "aws.lambda")
163+
trace.iter_mut().find(|span| {
164+
span.name == "aws.lambda" && span.resource != crate::traces::INVOCATION_SPAN_RESOURCE
165+
})
158166
}
159167

160168
/// Processes an intercepted [`Span`].
@@ -812,4 +820,41 @@ mod tests {
812820
result
813821
);
814822
}
823+
824+
#[test]
825+
fn service_entry_span_mut_skips_placeholder_lambda_span() {
826+
let mut trace = vec![
827+
Span {
828+
name: "aws.lambda".into(),
829+
resource: crate::traces::INVOCATION_SPAN_RESOURCE.into(),
830+
span_id: 1,
831+
..Default::default()
832+
},
833+
Span {
834+
name: "aws.lambda".into(),
835+
resource: "real.lambda.invocation".into(),
836+
span_id: 2,
837+
..Default::default()
838+
},
839+
];
840+
841+
let selected = Processor::service_entry_span_mut(&mut trace)
842+
.expect("expected non-placeholder aws.lambda span");
843+
844+
assert_eq!(selected.name, "aws.lambda");
845+
assert_ne!(selected.resource, crate::traces::INVOCATION_SPAN_RESOURCE);
846+
assert_eq!(selected.span_id, 2);
847+
}
848+
849+
#[test]
850+
fn service_entry_span_mut_returns_none_for_only_placeholder() {
851+
let mut trace = vec![Span {
852+
name: "aws.lambda".into(),
853+
resource: crate::traces::INVOCATION_SPAN_RESOURCE.into(),
854+
span_id: 1,
855+
..Default::default()
856+
}];
857+
858+
assert!(Processor::service_entry_span_mut(&mut trace).is_none());
859+
}
815860
}

bottlecap/src/lifecycle/invocation/processor.rs

Lines changed: 112 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -615,7 +615,6 @@ impl Processor {
615615
);
616616
}
617617

618-
// todo(duncanista): Add missing metric tags for ASM
619618
// Add dynamic and trigger tags
620619
context
621620
.invocation_span
@@ -626,6 +625,19 @@ impl Processor {
626625
context.invocation_span.meta.extend(trigger_tags);
627626
}
628627

628+
// Ensure _dd.appsec.enabled is present on the invocation span when AAP is enabled.
629+
// complete_inferred_spans (called below) propagates this metric from the invocation
630+
// span to the inferred trigger span. AppSec's process_span will set it again from the
631+
// security context when it runs, but this baseline guarantees the tag is always present
632+
// even when the context cannot be found at flush time.
633+
if self.config.serverless_appsec_enabled {
634+
context
635+
.invocation_span
636+
.metrics
637+
.entry("_dd.appsec.enabled".to_string())
638+
.or_insert(1.0);
639+
}
640+
629641
self.inferrer
630642
.complete_inferred_spans(&context.invocation_span);
631643

@@ -2334,4 +2346,103 @@ mod tests {
23342346
"no contexts should be ready to send yet"
23352347
);
23362348
}
2349+
2350+
fn setup_appsec() -> Processor {
2351+
let aws_config = Arc::new(AwsConfig {
2352+
region: "us-east-1".into(),
2353+
aws_lwa_proxy_lambda_runtime_api: Some("***".into()),
2354+
function_name: "test-function".into(),
2355+
sandbox_init_time: Instant::now(),
2356+
runtime_api: "***".into(),
2357+
exec_wrapper: None,
2358+
initialization_type: "on-demand".into(),
2359+
});
2360+
let config = Arc::new(config::Config {
2361+
service: Some("test-service".to_string()),
2362+
serverless_appsec_enabled: true,
2363+
..config::Config::default()
2364+
});
2365+
let tags_provider = Arc::new(provider::Provider::new(
2366+
Arc::clone(&config),
2367+
LAMBDA_RUNTIME_SLUG.to_string(),
2368+
&HashMap::from([("function_arn".to_string(), "test-arn".to_string())]),
2369+
));
2370+
let (service, handle) =
2371+
dogstatsd::aggregator::AggregatorService::new(dogstatsd::metric::EMPTY_TAGS, 1024)
2372+
.expect("failed to create aggregator service");
2373+
tokio::spawn(service.run());
2374+
let propagator = Arc::new(DatadogCompositePropagator::new(Arc::clone(&config)));
2375+
let (durable_context_tx, _) = tokio::sync::mpsc::channel(1);
2376+
Processor::new(
2377+
tags_provider,
2378+
config,
2379+
aws_config,
2380+
handle,
2381+
propagator,
2382+
durable_context_tx,
2383+
)
2384+
}
2385+
2386+
#[tokio::test]
2387+
async fn enrich_ctx_sets_appsec_enabled_when_aap_enabled() {
2388+
let mut p = setup_appsec();
2389+
let request_id = String::from("req-appsec");
2390+
p.on_invoke_event(request_id.clone());
2391+
p.on_platform_start(request_id.clone(), chrono::Utc::now());
2392+
2393+
let ctx = p
2394+
.enrich_ctx_at_platform_done(&request_id, Status::Success)
2395+
.expect("context must be present");
2396+
2397+
assert_eq!(
2398+
ctx.invocation_span.metrics.get("_dd.appsec.enabled"),
2399+
Some(&1.0),
2400+
"_dd.appsec.enabled must be 1.0 when AAP is enabled"
2401+
);
2402+
}
2403+
2404+
#[tokio::test]
2405+
async fn enrich_ctx_does_not_set_appsec_enabled_when_aap_disabled() {
2406+
let mut p = setup();
2407+
let request_id = String::from("req-no-appsec");
2408+
p.on_invoke_event(request_id.clone());
2409+
p.on_platform_start(request_id.clone(), chrono::Utc::now());
2410+
2411+
let ctx = p
2412+
.enrich_ctx_at_platform_done(&request_id, Status::Success)
2413+
.expect("context must be present");
2414+
2415+
assert!(
2416+
!ctx.invocation_span
2417+
.metrics
2418+
.contains_key("_dd.appsec.enabled"),
2419+
"_dd.appsec.enabled must not be set when AAP is disabled"
2420+
);
2421+
}
2422+
2423+
#[tokio::test]
2424+
async fn enrich_ctx_does_not_override_existing_appsec_enabled() {
2425+
let mut p = setup_appsec();
2426+
let request_id = String::from("req-appsec-preset");
2427+
p.on_invoke_event(request_id.clone());
2428+
p.on_platform_start(request_id.clone(), chrono::Utc::now());
2429+
2430+
// Pre-set a different value to verify or_insert does not overwrite it.
2431+
p.context_buffer
2432+
.get_mut(&request_id)
2433+
.expect("context must exist")
2434+
.invocation_span
2435+
.metrics
2436+
.insert("_dd.appsec.enabled".to_string(), 0.0);
2437+
2438+
let ctx = p
2439+
.enrich_ctx_at_platform_done(&request_id, Status::Success)
2440+
.expect("context must be present");
2441+
2442+
assert_eq!(
2443+
ctx.invocation_span.metrics.get("_dd.appsec.enabled"),
2444+
Some(&0.0),
2445+
"pre-existing _dd.appsec.enabled value must not be overwritten"
2446+
);
2447+
}
23372448
}

bottlecap/src/traces/mod.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -38,7 +38,7 @@ const DNS_LOCAL_HOST_ADDRESS_URL_PREFIX: &str = "127.0.0.1";
3838
const AWS_XRAY_DAEMON_ADDRESS_URL_PREFIX: &str = "169.254.79.129";
3939

4040
// Name of the placeholder invocation span set by Java and Go tracers
41-
const INVOCATION_SPAN_RESOURCE: &str = "dd-tracer-serverless-span";
41+
pub(crate) const INVOCATION_SPAN_RESOURCE: &str = "dd-tracer-serverless-span";
4242

4343
#[allow(clippy::doc_markdown)]
4444
/// Header used for additional tags when sending APM data to the Datadog intake

0 commit comments

Comments
 (0)