feat!: integrate obfuscation to trace exporter [APMSP-2764]#1819
feat!: integrate obfuscation to trace exporter [APMSP-2764]#1819
Conversation
📚 Documentation Check Results📦
|
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
🔒 Cargo Deny Results📦
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1819 +/- ##
==========================================
- Coverage 71.70% 71.61% -0.09%
==========================================
Files 429 429
Lines 67916 68095 +179
==========================================
+ Hits 48697 48767 +70
- Misses 19219 19328 +109
🚀 New features to boost your workflow:
|
|
✅ Tests 🎉 All green!❄️ No new flaky tests detected 🎯 Code Coverage (details) 🔗 Commit SHA: 50c0469 | Docs | Datadog PR Page | Was this helpful? React with 👍/👎 or give us feedback! |
|
@codex review |
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f2621f3041
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| client_side_stats.store(Arc::new(StatsComputationStatus::Enabled { | ||
| stats_concentrator: stats_concentrator.clone(), | ||
| cancellation_token: cancellation_token.clone(), | ||
| obfuscation_active: new_obfuscation_active, | ||
| })); |
There was a problem hiding this comment.
Recreate stats worker when obfuscation mode flips
When /info changes obfuscation support, this branch only updates client_side_stats and leaves the running StatsExporter untouched. The exporter’s obfuscation_active flag is fixed at construction time in create_and_start_stats_worker, so it keeps sending the old datadog-obfuscation-version header state even after process_traces_for_stats switches to the new obfuscation behavior. In the toggle scenario (e.g., agent enables obfuscation after startup), stats payload resources and header version can diverge, which makes the agent interpret stats under the wrong format/version.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
This PR integrates obfuscation support into the trace exporter for client-side stats computation. It enables SDKs to obfuscate spans before sending them to the agent when the agent supports it, reducing data transfer between SDKs and the agent.
Changes:
- Adds
obfuscate_resource_for_statsfunction to obfuscate resource names for SQL, Redis, Cassandra, and Valkey spans - Implements obfuscation state tracking through the stats computation lifecycle
- Creates an
ObfuscatedStatSpanwrapper to provide obfuscated resources to the stats concentrator without modifying original spans - Adds HTTP header
datadog-obfuscation-version: 1when obfuscation is active - Adds schema field for agent's obfuscation version support
- Fixes a typo in documentation ("will" → "while")
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| libdd-trace-stats/src/span_concentrator/mod.rs | Exports the stat_span module publicly for use in the stats exporter |
| libdd-trace-obfuscation/src/obfuscate.rs | Adds obfuscate_resource_for_stats function with comprehensive tests for SQL, Redis, Cassandra, and Valkey obfuscation |
| libdd-data-pipeline/src/trace_exporter/stats.rs | Implements obfuscation state management and applies obfuscation to spans during stats computation |
| libdd-data-pipeline/src/trace_exporter/mod.rs | Passes obfuscation state through the stats handler call chain |
| libdd-data-pipeline/src/stats_exporter.rs | Adds obfuscation header to HTTP requests when obfuscation is active, includes test coverage |
| libdd-data-pipeline/src/agent_info/schema.rs | Adds obfuscation_version field to capture agent capabilities |
| libdd-data-pipeline/Cargo.toml | Adds dependency on libdd-trace-obfuscation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…scation feature flag
…tate between StatsExporter and stats enum
This config field is needed for CSS obfuscation. In fact this is the only needed obfuscation config field AFAIK. ### What does this PR do? - adds a "sql_obfuscation_mode" field to /info endpoint for using it when obfuscating stats in libraries. ### Motivation - finish [integrating obfuscation to the trace exporter](DataDog/libdatadog#1819) ### Describe how you validated your changes - [x] unit test - [x] system test ### Additional Notes Co-authored-by: oscar.ledauphin <oscar.ledauphin@datadoghq.com>
| if obfuscation_active { | ||
| let span_type: &str = span.r#type.borrow(); | ||
| let resource: &str = span.resource.borrow(); | ||
| let dbms_hint: Option<&str> = span.meta.get("db.type").map(|v| v.borrow()); | ||
| let obfuscated_resource = | ||
| libdd_trace_obfuscation::obfuscate::obfuscate_resource_for_stats( | ||
| span_type, resource, dbms_hint, | ||
| ); | ||
| let wrapper = ObfuscatedStatSpan { | ||
| inner: span, | ||
| obfuscated_resource, | ||
| }; | ||
| stats_concentrator.add_span(&wrapper); | ||
| } else { | ||
| stats_concentrator.add_span(span); | ||
| } |
There was a problem hiding this comment.
I think the obfuscation should be done in the SpanConcentrator in the aggregation key computation. This will allow you to remove the ObfuscatedStatSpan
| client_side_stats.store(Arc::new(StatsComputationStatus::Enabled { | ||
| stats_concentrator: stats_concentrator.clone(), | ||
| cancellation_token: cancellation_token.clone(), | ||
| obfuscation_active: new_obfuscation_active, |
There was a problem hiding this comment.
If you want to avoid having to recreate the StatsExporter when obfuscation status is updated, you could use an Arc<AtomicBool> or ArcSwap<AtomicBool> (since it's not updated often). And share between the trace exporter and the stats exporter.
| let span_type: &str = span.r#type.borrow(); | ||
| let resource: &str = span.resource.borrow(); | ||
| let dbms_hint: Option<&str> = span.meta.get("db.type").map(|v| v.borrow()); | ||
| let sql_obfuscation_mode = get_agent_info() | ||
| .and_then(|info| { | ||
| info.info.config.as_ref().and_then(|config| { | ||
| config | ||
| .obfuscation | ||
| .as_ref() | ||
| .map(|obfuscation_cfg| obfuscation_cfg.sql_obfuscation_mode) | ||
| }) | ||
| }) | ||
| .unwrap_or_default(); | ||
| let config = StatsObfuscationConfig { | ||
| sql_obfuscation_mode, | ||
| }; | ||
| let obfuscated_resource = | ||
| libdd_trace_obfuscation::obfuscate::obfuscate_resource_for_stats( | ||
| span_type, resource, dbms_hint, config, | ||
| ); | ||
| let wrapper = ObfuscatedStatSpan { | ||
| inner: span, | ||
| obfuscated_resource, | ||
| }; |
There was a problem hiding this comment.
The obfuscation logic should live in the SpanConcentrator and be controlled by a obfuscation_enabled attribute in the SpanConcentrator (it can use the Arc from the stats exporter if it's easier). I don't think you need the ObfuscatedStatSpan either. This has the benefit of sharing the logic with serverless who also use the SpanConcentrator.
| { | ||
| obfuscation_active.load(Ordering::Relaxed) | ||
| } else { | ||
| unreachable!() |
There was a problem hiding this comment.
Use an error! log instead. We want to avoid panicking as much as possible.
| #[cfg(feature = "stats-obfuscation")] | ||
| let obfuscation_active = if let StatsComputationStatus::Enabled { | ||
| obfuscation_active, .. | ||
| } = &**status | ||
| { | ||
| obfuscation_active.load(Ordering::Relaxed) | ||
| } else { |
There was a problem hiding this comment.
This is a bit convoluted because of the stats-obfuscation feature, I don't have all the background but is it required ?
What does this PR do?
Integrate the work that has been done on obfuscate_span into the trace exporter
Motivation
Let the sdks obfuscate spans before sending them to the agent when possible, reducing the amount of data transfered from the sdks to the agent.
Additional Notes
Anything else we should know when reviewing?
How to test the change?
TODO
obfuscate more than just sql and redisnot for stats