Skip to content

feat: otel thread ctx FFI#1915

Open
yannham wants to merge 23 commits intomainfrom
yannham/otel-thread-ctx-ffi
Open

feat: otel thread ctx FFI#1915
yannham wants to merge 23 commits intomainfrom
yannham/otel-thread-ctx-ffi

Conversation

@yannham
Copy link
Copy Markdown
Contributor

@yannham yannham commented Apr 23, 2026

What does this PR do?

This PR adds a basic FFI for the OTel thread-level context feature: create a new context, attach, detach, and update in place.

We also make ThreadContextRecord public, or at least exposed in the FFI. The rationale is that:

  1. it's imposed by the spec, so it should not be a liability regarding breaking changes: we can't really touch it anyway.
  2. as mentioned in the doc of the FFI, there's a potential for SDK updating themselves the contexts without going through libdatadog at all after publication. In this usage mode, the export of the C struct ThreadContextRecord is a way to document its expected memory layout.
Generated C header
// Copyright 2026-Present Datadog, Inc. https://www.datadoghq.com/
// SPDX-License-Identifier: Apache-2.0


#ifndef DDOG_OTEL_THREAD_CTX_H
#define DDOG_OTEL_THREAD_CTX_H

#pragma once

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/**
 * In-memory layout of a thread-level context.
 *
 * **CAUTION**: The structure MUST match exactly the OTel thread-level context specification.
 * It is read by external, out-of-process code. Do not re-order fields or modify in any way,
 * unless you know exactly what you're doing.
 *
 * # Synchronization
 *
 * Readers are async-signal handlers. The writer is always stopped while a reader runs.
 * Sharing memory with a signal handler still requires some form of synchronization, which is
 * achieved through atomics and compiler fence, using `valid` and/or the TLS slot as
 * synchronization points.
 *
 * - The writer stores `valid = 0` *before* modifying fields in-place, guarded by a fence.
 * - The writer stores `valid = 1` *after* all fields are populated, guarded by a fence.
 * - `valid` starts at `1` on construction and is never set to `0` except during an in-place
 *   update.
 */
typedef struct ddog_ThreadContextRecord {
  /**
   * Trace identifier; all-zeroes means "no trace".
   */
  uint8_t trace_id[16];
  /**
   * Span identifier.
   */
  uint8_t span_id[8];
  /**
   * Whether the record is ready/consistent. Always set to `1` except during in-place update
   * of the current record.
   */
  uint8_t valid;
  uint8_t _reserved;
  /**
   * Number of populated bytes in `attrs_data`.
   */
  uint16_t attrs_data_size;
  /**
   * Packed variable-length key-value records.
   *
   * It's a contiguous list of blocks with layout:
   *
   * 1. 1-byte `key_index`
   * 2. 1-byte `val_len`
   * 3. `val_len` bytes of a string value.
   *
   * # Size
   *
   * Currently, we always allocate the max recommended size. This potentially wastes a few
   * hundred bytes per thread, but it guarantees that we can modify the context in-place
   * without (re)allocation in the hot path. Having a hybrid scheme (starting smaller and
   * resizing up a few times) is not out of the question.
   */
  uint8_t attrs_data[ddog_MAX_ATTRS_DATA_SIZE];
} ddog_ThreadContextRecord;

#ifdef __cplusplus
extern "C" {
#endif // __cplusplus

/**
 * Allocate and initialise a new thread context.
 *
 * Returns a non-null owned handle that must eventually be released with
 * `ddog_otel_thread_ctx_free`.
 */
struct ddog_ThreadContextRecord *ddog_otel_thread_ctx_new(const uint8_t (*trace_id)[16],
                                                          const uint8_t (*span_id)[8],
                                                          const uint8_t (*local_root_span_id)[8]);

/**
 * Free an owned thread context.
 *
 * # Safety
 *
 * `ctx` must be a valid non-null pointer obtained from `ddog_otel_thread_ctx_new` or
 * `ddog_otel_thread_ctx_detach`, and must not be used after this call. In particular, `ctx`
 * must not be currently attached to a thread.
 */
void ddog_otel_thread_ctx_free(struct ddog_ThreadContextRecord *ctx);

/**
 * Attach `ctx` to the current thread. Returns the previously attached context if any, or null
 * otherwise.
 *
 * # Safety
 *
 * `ctx` must be a valid non-null pointer obtained from this API. Ownership of `ctx` is
 * transferred to the TLS slot: the caller must not drop `ctx` while it is still actively
 * attached.
 *
 * ## In-place update
 *
 * The preferred method to update the thread context in place is [ddog_otel_thread_ctx_update].
 *
 * If calling into native code is too costly, it is possible to update an attached context
 * directly in-memory without going through libdatadog (contexts are guaranteed to have a
 * stable address through their lifetime). **HOWEVER, IF DOING SO, PLEASE BE VERY CAUTIOUS OF
 * THE FOLLOWING POINTS**:
 *
 * 1. The update process requires a [seqlock](https://en.wikipedia.org/wiki/Seqlock)-like
 *    pattern: [ThreadContextRecord::valid] must be first set to `0` before the update and set
 *    to `1` again at the end. Additionally, depending on your language's memory model, you
 *    might need specific synchronization primitives (compiler fences, atomics, etc.), since
 *    the context can be read by an asynchronous signal handler at any point in time. See the
 *    [Otel thread context
 *    specification](https://github.com/open-telemetry/opentelemetry-specification/pull/4947)
 *    for more details.
 * 2. Only update the context from the thread it's attached to. Contexts are designed to be
 *    attached, written to and read from on the same thread (whether from signal code or
 *    program code). Thus, they are NOT thread-safe. Given the current specification, I don't
 *    think it's possible to safely update an attached context from a different thread, since
 *    the signal handler doesn't assume the context can be written to concurrently from another
 *    thread.
 */
struct ddog_ThreadContextRecord *ddog_otel_thread_ctx_attach(struct ddog_ThreadContextRecord *ctx);

/**
 * Remove the currently attached context from the TLS slot.
 *
 * Returns the detached context (caller now owns it and must release it with
 * `ddog_otel_thread_ctx_free`), or null if the slot was empty.
 */
struct ddog_ThreadContextRecord *ddog_otel_thread_ctx_detach(void);

/**
 * Update the currently attached context in-place.
 *
 * If no context is currently attached, one is created and attached, equivalent to calling
 * `ddog_otel_thread_ctx_new` followed by `ddog_otel_thread_ctx_attach`.
 */
void ddog_otel_thread_ctx_update(const uint8_t (*trace_id)[16],
                                 const uint8_t (*span_id)[8],
                                 const uint8_t (*local_root_span_id)[8]);

#ifdef __cplusplus
}  // extern "C"
#endif  // __cplusplus

#endif  /* DDOG_OTEL_THREAD_CTX_H */

Motivation

OTel thread-level context has been implemented in #1791 in order to provide better interop with the OTel eBPF profiler. The first user is supposed to be dd-trace-rs, but it turns out the dotnet SDK people are interested in using it as well (and eventually other non-Rust SDKs will use it and thus require an FFI).

Additional Notes

N/A

How to test the change?

There's a test to check that the TLS symbol is properly handled. For real usage, we plan to check when integrating in dotnet (or whichever is the first SDK to use it).

yannham and others added 6 commits April 22, 2026 18:18
Rust's cdylib linker emits a version script with `local: *` that hides
all non-Rust symbols, preventing `custom_labels_current_set_v2` from
appearing in the dynamic symbol table. Without a dynsym entry, external
readers (e.g. the eBPF profiler) cannot locate the thread-local slot.

Add a supplementary version script with an explicit `global:` entry for
the symbol, which takes precedence over the `local: *` wildcard. Also
force lld explicitly, since merging multiple version scripts is not
supported by GNU ld.

Also adds a temporary dummy FFI wrapper around `ThreadContext::attach`
to keep the TLSDESC access live during verification.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

  • Base Branch: origin/main
  • PR Branch: origin/yannham/otel-thread-ctx-ffi

Summary by Rule

Rule Base Branch PR Branch Change

Annotation Counts by File

File Base Branch PR Branch Change

Annotation Stats by Crate

Crate Base Branch PR Branch Change
clippy-annotation-reporter 5 5 No change (0%)
datadog-ffe-ffi 1 1 No change (0%)
datadog-ipc 21 21 No change (0%)
datadog-live-debugger 6 6 No change (0%)
datadog-live-debugger-ffi 10 10 No change (0%)
datadog-profiling-replayer 4 4 No change (0%)
datadog-remote-config 3 3 No change (0%)
datadog-sidecar 56 56 No change (0%)
libdd-common 10 10 No change (0%)
libdd-common-ffi 12 12 No change (0%)
libdd-data-pipeline 5 5 No change (0%)
libdd-ddsketch 2 2 No change (0%)
libdd-dogstatsd-client 1 1 No change (0%)
libdd-profiling 13 13 No change (0%)
libdd-telemetry 19 19 No change (0%)
libdd-tinybytes 4 4 No change (0%)
libdd-trace-normalization 2 2 No change (0%)
libdd-trace-obfuscation 8 8 No change (0%)
libdd-trace-stats 1 1 No change (0%)
libdd-trace-utils 15 15 No change (0%)
Total 198 198 No change (0%)

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@datadog-datadog-prod-us1-2
Copy link
Copy Markdown

datadog-datadog-prod-us1-2 Bot commented Apr 23, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 25.64%
Overall Coverage: 71.74% (+0.00%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: dcf5548 | Docs | Datadog PR Page | Give us feedback!

@yannham yannham marked this pull request as ready for review April 23, 2026 14:48
@yannham yannham requested review from a team as code owners April 23, 2026 14:48
…mbol

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 05868a50b9

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

const SYMBOL: &str = "otel_thread_ctx_v1";

fn cdylib_path() -> PathBuf {
PathBuf::from(env!("CDYLIB_PROFILE_DIR")).join("liblibdd_otel_thread_ctx_ffi.so")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Point ELF tests at Cargo's deps output directory

During cargo test, Cargo emits cdylib artifacts under target/<profile>/deps, but cdylib_path() currently looks in target/<profile>. That means readelf is invoked on a non-existent file in normal test runs, so these new ELF-property tests fail even when the library is built correctly. Resolve the path from the deps directory (or otherwise discover the real artifact path) so the assertions run against the actual .so.

Useful? React with 👍 / 👎.

@yannham yannham requested a review from ivoanjo April 23, 2026 15:42
@yannham yannham force-pushed the yannham/otel-thread-ctx-ffi branch from e2bb632 to e98d81f Compare April 23, 2026 16:06
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@yannham yannham force-pushed the yannham/otel-thread-ctx-ffi branch from e98d81f to 75add55 Compare April 23, 2026 16:36
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 24, 2026

Codecov Report

❌ Patch coverage is 25.64103% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.74%. Comparing base (530cd96) to head (dcf5548).
⚠️ Report is 9 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1915      +/-   ##
==========================================
- Coverage   71.74%   71.74%   -0.01%     
==========================================
  Files         434      435       +1     
  Lines       69940    69994      +54     
==========================================
+ Hits        50181    50214      +33     
- Misses      19759    19780      +21     
Components Coverage Δ
libdd-crashtracker 66.00% <ø> (+0.05%) ⬆️
libdd-crashtracker-ffi 34.47% <ø> (+0.38%) ⬆️
libdd-alloc 98.77% <ø> (ø)
libdd-data-pipeline 85.86% <ø> (+0.21%) ⬆️
libdd-data-pipeline-ffi 71.94% <ø> (+1.23%) ⬆️
libdd-common 79.41% <ø> (ø)
libdd-common-ffi 73.87% <ø> (ø)
libdd-telemetry 68.06% <ø> (ø)
libdd-telemetry-ffi 19.37% <ø> (ø)
libdd-dogstatsd-client 82.64% <ø> (ø)
datadog-ipc 76.31% <ø> (+0.04%) ⬆️
libdd-profiling 81.61% <ø> (ø)
libdd-profiling-ffi 64.36% <ø> (ø)
datadog-sidecar 29.34% <ø> (+0.22%) ⬆️
datdog-sidecar-ffi 8.41% <ø> (+1.21%) ⬆️
spawn-worker 54.69% <ø> (ø)
libdd-tinybytes 93.16% <ø> (ø)
libdd-trace-normalization 81.71% <ø> (ø)
libdd-trace-obfuscation 87.26% <ø> (ø)
libdd-trace-protobuf 68.25% <ø> (ø)
libdd-trace-utils 89.27% <ø> (+0.02%) ⬆️
datadog-tracer-flare 86.88% <ø> (ø)
libdd-log 74.69% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@yannham yannham requested review from gleocadie April 24, 2026 09:52
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 27, 2026

📚 Documentation Check Results

⚠️ 283 documentation warning(s) found

📦 libdd-otel-thread-ctx-ffi - 276 warning(s)

📦 libdd-otel-thread-ctx - 2 warning(s)

📦 tools - 5 warning(s)


Updated: 2026-04-27 13:39:33 UTC | Commit: 3e47f45 | missing-docs job results

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 27, 2026

🔒 Cargo Deny Results

⚠️ 4 issue(s) found, showing only errors (advisories, bans, sources)

📦 libdd-otel-thread-ctx-ffi - 4 error(s)

Show output
error[unsound]: Rand is unsound with a custom logger using `rand::rng()`
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:104:1
    │
104 │ rand 0.8.5 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ unsound advisory detected
    │
    ├ ID: RUSTSEC-2026-0097
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0097
    ├ It has been reported (by @lopopolo) that the `rand` library is [unsound](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library) (i.e. that safe code using the public API can cause Undefined Behaviour) when all the following conditions are met:
      
      - The `log` and `thread_rng` features are enabled
      - A [custom logger](https://docs.rs/log/latest/log/#implementing-a-logger) is defined
      - The custom logger accesses `rand::rng()` (previously `rand::thread_rng()`) and calls any `TryRng` (previously `RngCore`) methods on `ThreadRng`
      - The `ThreadRng` (attempts to) reseed while called from the custom logger (this happens every 64 kB of generated data)
      - Trace-level logging is enabled or warn-level logging is enabled and the random source (the `getrandom` crate) is unable to provide a new seed
      
      `TryRng` (previously `RngCore`) methods for `ThreadRng` use `unsafe` code to cast `*mut BlockRng<ReseedingCore>` to `&mut BlockRng<ReseedingCore>`. When all the above conditions are met this results in an aliased mutable reference, violating the Stacked Borrows rules. Miri is able to detect this violation in sample code. Since construction of [aliased mutable references is Undefined Behaviour](https://doc.rust-lang.org/stable/nomicon/references.html), the behaviour of optimized builds is hard to predict.
    ├ Announcement: https://github.com/rust-random/rand/pull/1763
    ├ Solution: Upgrade to >=0.10.1 OR <0.10.0, >=0.9.3 OR <0.9.0, >=0.8.6 (try `cargo update -p rand`)
    ├ rand v0.8.5
      └── (dev) libdd-common v4.0.0
          └── libdd-common-ffi v32.0.0
              └── libdd-otel-thread-ctx-ffi v1.0.0

error[vulnerability]: Name constraints for URI names were incorrectly accepted
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:119:1
    │
119 │ rustls-webpki 0.103.10 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0098
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0098
    ├ Name constraints for URI names were ignored and therefore accepted.
      
      Note this library does not provide an API for asserting URI names, and URI name constraints are otherwise not implemented.  URI name constraints are now rejected unconditionally.
      
      Since name constraints are restrictions on otherwise properly-issued certificates, this bug is reachable only after signature verification and requires misissuance to exploit.
      
      This vulnerability is identified as [GHSA-965h-392x-2mh5](https://github.com/rustls/webpki/security/advisories/GHSA-965h-392x-2mh5). Thank you to @1seal for the report.
    ├ Solution: Upgrade to >=0.103.12, <0.104.0-alpha.1 OR >=0.104.0-alpha.6 (try `cargo update -p rustls-webpki`)
    ├ rustls-webpki v0.103.10
      └── rustls v0.23.37
          ├── hyper-rustls v0.27.7
          │   └── libdd-common v4.0.0
          │       └── libdd-common-ffi v32.0.0
          │           └── libdd-otel-thread-ctx-ffi v1.0.0
          ├── libdd-common v4.0.0 (*)
          └── tokio-rustls v0.26.0
              ├── hyper-rustls v0.27.7 (*)
              └── libdd-common v4.0.0 (*)

error[vulnerability]: Name constraints were accepted for certificates asserting a wildcard name
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:119:1
    │
119 │ rustls-webpki 0.103.10 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0099
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0099
    ├ Permitted subtree name constraints for DNS names were accepted for certificates asserting a wildcard name.
      
      This was incorrect because, given a name constraint of `accept.example.com`, `*.example.com` could feasibly allow a name of `reject.example.com` which is outside the constraint.
      This is very similar to [CVE-2025-61727](https://go.dev/issue/76442).
      
      Since name constraints are restrictions on otherwise properly-issued certificates, this bug is reachable only after signature verification and requires misissuance to exploit.
      
      This vulnerability is identified as [GHSA-xgp8-3hg3-c2mh](https://github.com/rustls/webpki/security/advisories/GHSA-xgp8-3hg3-c2mh). Thank you to @1seal for the report.
    ├ Solution: Upgrade to >=0.103.12, <0.104.0-alpha.1 OR >=0.104.0-alpha.6 (try `cargo update -p rustls-webpki`)
    ├ rustls-webpki v0.103.10
      └── rustls v0.23.37
          ├── hyper-rustls v0.27.7
          │   └── libdd-common v4.0.0
          │       └── libdd-common-ffi v32.0.0
          │           └── libdd-otel-thread-ctx-ffi v1.0.0
          ├── libdd-common v4.0.0 (*)
          └── tokio-rustls v0.26.0
              ├── hyper-rustls v0.27.7 (*)
              └── libdd-common v4.0.0 (*)

error[vulnerability]: Reachable panic in certificate revocation list parsing
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:119:1
    │
119 │ rustls-webpki 0.103.10 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0104
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0104
    ├ A panic was reachable when parsing certificate revocation lists via [`BorrowedCertRevocationList::from_der`]
      or [`OwnedCertRevocationList::from_der`].  This was the result of mishandling a syntactically valid empty
      `BIT STRING` appearing in the `onlySomeReasons` element of a `IssuingDistributionPoint` CRL extension.
      
      This panic is reachable prior to a CRL's signature being verified.
      
      Applications that do not use CRLs are not affected.
      
      Thank you to @tynus3 for the report.
    ├ Solution: Upgrade to >=0.103.13, <0.104.0-alpha.1 OR >=0.104.0-alpha.7 (try `cargo update -p rustls-webpki`)
    ├ rustls-webpki v0.103.10
      └── rustls v0.23.37
          ├── hyper-rustls v0.27.7
          │   └── libdd-common v4.0.0
          │       └── libdd-common-ffi v32.0.0
          │           └── libdd-otel-thread-ctx-ffi v1.0.0
          ├── libdd-common v4.0.0 (*)
          └── tokio-rustls v0.26.0
              ├── hyper-rustls v0.27.7 (*)
              └── libdd-common v4.0.0 (*)

advisories FAILED, bans ok, sources ok

📦 libdd-otel-thread-ctx - ✅ No issues

📦 tools - ✅ No issues


Updated: 2026-04-27 13:41:12 UTC | Commit: 3e47f45 | dependency-check job results

Copy link
Copy Markdown
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks quite reasonable. Some notes:

  1. I would strongly recommend getting one of the dotnet folks (which I believe are intended first users of this API) to give a pass on the PR

  2. I think we're missing an end-to-end example in C or C++. In particular, one that sets both the process and thread context, and thus can be picked up correctly by the external reader.

    Right now we're kinda assuming that callers will "just know" how to put this together, and suspect that will cause a bunch of confusion. Thus, it would be very helpful to provide some docs + a fully working example that could be used as a model.

  3. As I mentioned, I'm not convinced about the whole "expose the full structure", see my comment on why.

// Export the TLSDESC thread-local variable to the dynamic symbol table so
// external readers (e.g. the eBPF profiler) can locate it. Rust's cdylib
// linker applies a version script with `local: *` that hides all symbols
// not explicitly whitelisted, and also causes lld to relax the TLSDESC
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// not explicitly whitelisted, and also causes lld to relax the TLSDESC
// not explicitly allowlisted, and also causes lld to relax the TLSDESC

:)

// struct has the right total size.
#[repr(C)]
struct ThreadContextRecord {
pub struct ThreadContextRecord {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure about this part. In particular, I don't think we're giving callers the tools that they need to directly use this without getting "burned" right now: no docs, no validation. :hurtrealbad:

My "gut feeling" is: if callers haven't asked for this yet, let's not give it to them. 🤔

If they want to bang at the bits directly, I think in that situation they might as well prefer to reimplement it from scratch, e.g. why half-go through libdatadog if you want maximum control anyway?

Copy link
Copy Markdown
Contributor Author

@yannham yannham Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure about this part. In particular, I don't think we're giving callers the tools that they need to directly use this without getting "burned" right now: no docs, no validation. :hurtrealbad:

I would say that there is some documentation (and warnings!) in the documentation of attach. When talking with dotnet people, we thought it might make sense for performance to try to do everything on the SDK side for context switching, without going through native. I agree that then using libdatadog is questionable, but I think you'll still have to manage the whole TLSDESC/elf business. It might be reasonable to use libdatadog to attach the first context and then update it in place.

Also the structure is defined per the spec anyway, so it's not going to change and is "public" (as is, defined somewhere on the internet).

All of that being said, I agree that making it visible here is stronger, as we guarantee our FFI functions always return pointer to contexts as per the spec. Additionally, the in-place-update-from-the-sdk was really mentioned as "we can try that later in the future in mandated", so I would be totally fine reverting the changes to otel-thread-ctx, removing the bit about raw update, making the record type an entirely opaque pointer, and only discuss this again after there's a need for it, or at least some benchmarks.

Why do you dotnet guys think @gleocadie ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you'll still have to manage the whole TLSDESC/elf business.

I was thinking that if we're building a .so from C/C++, it's even easier to not use libdatadog for this part -- no need to fight linker, just declare a public thread-local symbol.

But yeah, as I mentioned above -- my suggestion is "gut feeling we're not gonna need it" -- if y'all want to try this path, we can, but I would suggest avoiding any "let's expose this now just-in-case-for-later" things ;)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that if we're building a .so from C/C++, it's even easier to not use libdatadog for this part -- no need to fight linker, just declare a public thread-local symbol.

That's true... although you still need the process context part, and re-implementing this in C/C++ starts to be a tad less trivial. But I agree that if you only need to swap the TLS slot, you're probably better off with a thin C/C++ shared lib than messing around with the Rust build 👍

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap, I agree you'd still probably want to use the libdatadog process context impl, assuming whatever downstream consumer we're talking about is pulling libdatadog already as a dependency anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants