Conversation
Rust's cdylib linker emits a version script with `local: *` that hides all non-Rust symbols, preventing `custom_labels_current_set_v2` from appearing in the dynamic symbol table. Without a dynsym entry, external readers (e.g. the eBPF profiler) cannot locate the thread-local slot. Add a supplementary version script with an explicit `global:` entry for the symbol, which takes precedence over the `local: *` wildcard. Also force lld explicitly, since merging multiple version scripts is not supported by GNU ld. Also adds a temporary dummy FFI wrapper around `ThreadContext::attach` to keep the TLSDESC access live during verification. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🎉 All green!❄️ No new flaky tests detected 🎯 Code Coverage (details) 🔗 Commit SHA: dcf5548 | Docs | Datadog PR Page | Give us feedback! |
…mbol Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 05868a50b9
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const SYMBOL: &str = "otel_thread_ctx_v1"; | ||
|
|
||
| fn cdylib_path() -> PathBuf { | ||
| PathBuf::from(env!("CDYLIB_PROFILE_DIR")).join("liblibdd_otel_thread_ctx_ffi.so") |
There was a problem hiding this comment.
Point ELF tests at Cargo's deps output directory
During cargo test, Cargo emits cdylib artifacts under target/<profile>/deps, but cdylib_path() currently looks in target/<profile>. That means readelf is invoked on a non-existent file in normal test runs, so these new ELF-property tests fail even when the library is built correctly. Resolve the path from the deps directory (or otherwise discover the real artifact path) so the assertions run against the actual .so.
Useful? React with 👍 / 👎.
e2bb632 to
e98d81f
Compare
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
e98d81f to
75add55
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1915 +/- ##
==========================================
- Coverage 71.74% 71.74% -0.01%
==========================================
Files 434 435 +1
Lines 69940 69994 +54
==========================================
+ Hits 50181 50214 +33
- Misses 19759 19780 +21
🚀 New features to boost your workflow:
|
This reverts commit abf827e.
📚 Documentation Check Results📦
|
🔒 Cargo Deny Results📦
|
ivoanjo
left a comment
There was a problem hiding this comment.
This looks quite reasonable. Some notes:
-
I would strongly recommend getting one of the dotnet folks (which I believe are intended first users of this API) to give a pass on the PR
-
I think we're missing an end-to-end example in C or C++. In particular, one that sets both the process and thread context, and thus can be picked up correctly by the external reader.
Right now we're kinda assuming that callers will "just know" how to put this together, and suspect that will cause a bunch of confusion. Thus, it would be very helpful to provide some docs + a fully working example that could be used as a model.
-
As I mentioned, I'm not convinced about the whole "expose the full structure", see my comment on why.
| // Export the TLSDESC thread-local variable to the dynamic symbol table so | ||
| // external readers (e.g. the eBPF profiler) can locate it. Rust's cdylib | ||
| // linker applies a version script with `local: *` that hides all symbols | ||
| // not explicitly whitelisted, and also causes lld to relax the TLSDESC |
There was a problem hiding this comment.
| // not explicitly whitelisted, and also causes lld to relax the TLSDESC | |
| // not explicitly allowlisted, and also causes lld to relax the TLSDESC |
:)
| // struct has the right total size. | ||
| #[repr(C)] | ||
| struct ThreadContextRecord { | ||
| pub struct ThreadContextRecord { |
There was a problem hiding this comment.
I'm not quite sure about this part. In particular, I don't think we're giving callers the tools that they need to directly use this without getting "burned" right now: no docs, no validation. ![]()
My "gut feeling" is: if callers haven't asked for this yet, let's not give it to them. 🤔
If they want to bang at the bits directly, I think in that situation they might as well prefer to reimplement it from scratch, e.g. why half-go through libdatadog if you want maximum control anyway?
There was a problem hiding this comment.
I'm not quite sure about this part. In particular, I don't think we're giving callers the tools that they need to directly use this without getting "burned" right now: no docs, no validation.
I would say that there is some documentation (and warnings!) in the documentation of attach. When talking with dotnet people, we thought it might make sense for performance to try to do everything on the SDK side for context switching, without going through native. I agree that then using libdatadog is questionable, but I think you'll still have to manage the whole TLSDESC/elf business. It might be reasonable to use libdatadog to attach the first context and then update it in place.
Also the structure is defined per the spec anyway, so it's not going to change and is "public" (as is, defined somewhere on the internet).
All of that being said, I agree that making it visible here is stronger, as we guarantee our FFI functions always return pointer to contexts as per the spec. Additionally, the in-place-update-from-the-sdk was really mentioned as "we can try that later in the future in mandated", so I would be totally fine reverting the changes to otel-thread-ctx, removing the bit about raw update, making the record type an entirely opaque pointer, and only discuss this again after there's a need for it, or at least some benchmarks.
Why do you dotnet guys think @gleocadie ?
There was a problem hiding this comment.
I think you'll still have to manage the whole TLSDESC/elf business.
I was thinking that if we're building a .so from C/C++, it's even easier to not use libdatadog for this part -- no need to fight linker, just declare a public thread-local symbol.
But yeah, as I mentioned above -- my suggestion is "gut feeling we're not gonna need it" -- if y'all want to try this path, we can, but I would suggest avoiding any "let's expose this now just-in-case-for-later" things ;)
There was a problem hiding this comment.
I was thinking that if we're building a .so from C/C++, it's even easier to not use libdatadog for this part -- no need to fight linker, just declare a public thread-local symbol.
That's true... although you still need the process context part, and re-implementing this in C/C++ starts to be a tad less trivial. But I agree that if you only need to swap the TLS slot, you're probably better off with a thin C/C++ shared lib than messing around with the Rust build 👍
There was a problem hiding this comment.
Yeap, I agree you'd still probably want to use the libdatadog process context impl, assuming whatever downstream consumer we're talking about is pulling libdatadog already as a dependency anyway.
What does this PR do?
This PR adds a basic FFI for the OTel thread-level context feature: create a new context, attach, detach, and update in place.
We also make
ThreadContextRecordpublic, or at least exposed in the FFI. The rationale is that:ThreadContextRecordis a way to document its expected memory layout.Generated C header
Motivation
OTel thread-level context has been implemented in #1791 in order to provide better interop with the OTel eBPF profiler. The first user is supposed to be dd-trace-rs, but it turns out the dotnet SDK people are interested in using it as well (and eventually other non-Rust SDKs will use it and thus require an FFI).
Additional Notes
N/A
How to test the change?
There's a test to check that the TLS symbol is properly handled. For real usage, we plan to check when integrating in dotnet (or whichever is the first SDK to use it).