fix(sdk): deduplicate LogRecord attribute keys by antcybersec · Pull Request #3537 · open-telemetry/opentelemetry-rust

antcybersec · 2026-06-05T01:45:44Z

Summary

Deduplicate SdkLogRecord attribute keys in add_attribute using last-write-wins semantics.
Add GrowableArray::get_mut to update existing attribute slots without extra allocations.
Add unit tests covering inline capacity, batch add_attributes, and overflow paths.
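
To illustrate why a `get_mut` accessor lets an existing slot be updated without allocating, here is a minimal sketch of an inline-then-heap container. `InlineThenHeap` and its fields are hypothetical stand-ins written for this note, not the SDK's actual `GrowableArray` implementation.

```rust
/// Simplified stand-in for a growable array: the first `N` items live inline,
/// later items spill into a heap-allocated Vec. (Hypothetical type.)
pub struct InlineThenHeap<T, const N: usize> {
    inline: [Option<T>; N],
    inline_len: usize,
    overflow: Vec<T>,
}

impl<T, const N: usize> InlineThenHeap<T, N> {
    pub fn new() -> Self {
        Self {
            inline: std::array::from_fn(|_| None),
            inline_len: 0,
            overflow: Vec::new(),
        }
    }

    pub fn len(&self) -> usize {
        self.inline_len + self.overflow.len()
    }

    /// Appends a value, touching the heap only once the inline slots are full.
    pub fn push(&mut self, value: T) {
        if self.inline_len < N {
            self.inline[self.inline_len] = Some(value);
            self.inline_len += 1;
        } else {
            self.overflow.push(value);
        }
    }

    /// Shared access to an existing slot.
    pub fn get(&self, index: usize) -> Option<&T> {
        if index < self.inline_len {
            self.inline[index].as_ref()
        } else {
            self.overflow.get(index - self.inline_len)
        }
    }

    /// Mutable access to an existing slot; this path never allocates.
    pub fn get_mut(&mut self, index: usize) -> Option<&mut T> {
        if index < self.inline_len {
            self.inline[index].as_mut()
        } else {
            self.overflow.get_mut(index - self.inline_len)
        }
    }
}
```

Overwriting through `get_mut` works the same way for inline and overflow slots, which is what an in-place last-write-wins update needs.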

Rationale

The OpenTelemetry spec requires exported attribute collections to contain only unique keys by default. The SDK previously used Vec::push, so duplicate keys could reach exporters (e.g. from tracing events, span-attribute enrichment, or custom appenders).

Per maintainer feedback on #3497, deduplication is eager at add_attribute time (default-on, no builder flag).
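
As a rough sketch of what eager, last-write-wins deduplication means at insert time: the `Record` type below is hypothetical and uses plain `String` pairs in a `Vec`, whereas the real `SdkLogRecord` uses the SDK's own key/value types and storage.

```rust
/// Hypothetical, simplified record used only to illustrate the semantics.
#[derive(Default)]
pub struct Record {
    attributes: Vec<(String, String)>,
}

impl Record {
    /// Inserts `key`; if the key is already present, its value is overwritten
    /// in place, so the attribute keeps its original position.
    pub fn add_attribute(&mut self, key: impl Into<String>, value: impl Into<String>) {
        let key = key.into();
        let value = value.into();
        // Linear scan over the existing attributes: O(n) per insert.
        match self.attributes.iter().position(|(k, _)| *k == key) {
            Some(i) => self.attributes[i].1 = value,
            None => self.attributes.push((key, value)),
        }
    }

    pub fn attributes(&self) -> &[(String, String)] {
        &self.attributes
    }
}
```

Writing `user=alice`, `region=eu`, then `user=bob` leaves two attributes, with `user` holding `bob` in its original first position.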

Test plan

cargo test -p opentelemetry-sdk --features testing,rt-tokio --lib logs::record::tests
cargo test -p opentelemetry-sdk --features testing,rt-tokio --lib logs::
cargo test -p opentelemetry-appender-tracing --lib
cargo clippy -p opentelemetry-sdk --lib -- -Dwarnings

linux-foundation-easycla · 2026-06-05T01:45:50Z

The committers listed above are authorized under a signed CLA.

✅ login: antcybersec / name: Anant Kumar (309e4c6, ab5f9c4)

codecov · 2026-06-05T01:48:42Z

Codecov Report

❌ Patch coverage is 97.22222% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.9%. Comparing base (2571776) to head (c04e507).
⚠️ Report is 17 commits behind head on main.

Files with missing lines	Patch %	Lines
opentelemetry-sdk/src/growable_array.rs	93.7%	1 Missing ⚠️
opentelemetry-sdk/src/logs/logger_provider.rs	94.4%	1 Missing ⚠️
opentelemetry-sdk/src/logs/record.rs	98.1%	1 Missing ⚠️

Additional details and impacted files

@@          Coverage Diff           @@
##            main   #3537    +/-   ##
======================================
  Coverage   82.9%   82.9%            
======================================
  Files        130     130            
  Lines      27350   27456   +106     
======================================
+ Hits       22675   22778   +103     
- Misses      4675    4678     +3


cijothomas · 2026-06-05T04:32:16Z

-        self.attributes.push(Some((key.into(), value.into())));
+        let key = key.into();
+        let value = value.into();
+        for i in 0..self.attributes.len() {


Can you add benchmarks and show before/after numbers, so we can gauge the perf impact easily?
Also, we need an opt-out flag for this behavior, for users who'd rather not pay the perf penalty (and want to retain the current behavior).

Add LoggerProviderBuilder::with_log_record_attribute_deduplication so users can disable last-write-wins dedup and keep push-only behavior. Add criterion benchmarks comparing dedup on vs off for typical (unique keys) and duplicate-key workloads. Addresses review on open-telemetry#3537.

antcybersec · 2026-06-05T06:01:42Z

@cijothomas — addressed both review points in the latest commits:

Opt-out flag

LoggerProviderBuilder::with_log_record_attribute_deduplication(false) disables dedup and restores the previous push-only behavior. Default remains true (spec-compliant).

let provider = SdkLoggerProvider::builder()
    .with_log_record_attribute_deduplication(false)
    .build();
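
For readers unfamiliar with how a default-on builder flag can gate the behavior, here is a minimal model. Every type and method name in it (`SketchProviderBuilder`, `SketchProvider`, `SketchRecord`, `with_attribute_deduplication`) is hypothetical; it only mirrors the idea of the flag and is not how the SDK wires it internally.

```rust
/// Hypothetical builder; only the idea of a default-on flag mirrors the PR.
pub struct SketchProviderBuilder {
    dedup_attributes: bool,
}

impl SketchProviderBuilder {
    pub fn new() -> Self {
        // Deduplication is on unless the caller opts out.
        Self { dedup_attributes: true }
    }

    pub fn with_attribute_deduplication(mut self, enabled: bool) -> Self {
        self.dedup_attributes = enabled;
        self
    }

    pub fn build(self) -> SketchProvider {
        SketchProvider { dedup_attributes: self.dedup_attributes }
    }
}

pub struct SketchProvider {
    dedup_attributes: bool,
}

impl SketchProvider {
    /// Each record inherits the provider-level setting when it is created.
    pub fn new_record(&self) -> SketchRecord {
        SketchRecord { dedup: self.dedup_attributes, attributes: Vec::new() }
    }
}

pub struct SketchRecord {
    dedup: bool,
    attributes: Vec<(String, String)>,
}

impl SketchRecord {
    pub fn add_attribute(&mut self, key: &str, value: &str) {
        if self.dedup {
            // Last-write-wins: overwrite the existing slot if the key is known.
            if let Some(i) = self.attributes.iter().position(|(k, _)| k == key) {
                self.attributes[i].1 = value.to_string();
                return;
            }
        }
        // Push-only path: used for new keys, or always when dedup is off.
        self.attributes.push((key.to_string(), value.to_string()));
    }

    pub fn attributes(&self) -> &[(String, String)] {
        &self.attributes
    }
}
```

With the flag left at its default, two writes to the same key leave one attribute; with it set to `false`, both writes are kept.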

Benchmarks (before / after)

Run: cargo bench --bench log_record_attribute_dedup --features logs

Results from local criterion run (see also opentelemetry-sdk/benches/log_record_attribute_dedup.rs header):

Scenario	Dedup ON	Dedup OFF (before)	Overhead
1 unique attribute	27.2 ns	24.7 ns	~1.1x
5 unique attributes	135.5 ns	119.3 ns	~1.1x
9 unique attributes	301.2 ns	220.8 ns	~1.4x
5 writes, same key	31.9 ns	32.1 ns	~1.0x

Takeaway: For the typical case (≤5 unique keys, which matches the SDK's inline capacity of 5), overhead is small (~10%). It grows at larger attribute counts (9 attrs ~1.4x), which is the main case where users may want the opt-out.

Criterion output screenshot attached below for reference.

cijothomas · 2026-06-05T15:28:40Z

+//! Results (unique keys — typical case, no duplicates in the batch):
+//! | Scenario                | Dedup ON | Dedup OFF (before) | Overhead |
+//! |-------------------------|----------|--------------------|----------|
+//! | add_1_unique_attribute  | 25.6 ns  | 25.9 ns            |  ~1.0x   |


If you don't mind, could you run this a few times and see if this holds? I cloned and ran it once, and the numbers show a much larger regression. (It does not mean we cannot take the PR; I just want to quantify the impact.)

cijothomas · 2026-06-05T15:31:13Z


 ## vNext

+- `SdkLogRecord::add_attribute` now deduplicates attribute keys (last-write-wins)


This is technically a regression, so we need a bigger warning, so that users bumping the version know they can disable deduplication to get back the previous behavior.
@open-telemetry/rust-maintainers Please weigh in on this change. I am okay with accepting the regression to be spec-compliant. The opt-out is easy for end users who want to retain the original performance.

@scottgerring PTAL.

lalitb · 2026-06-07

Looks good to me with updated changelog.

EDIT - Given the spec requires exported attribute collections to be unique, did we consider enforcing this closer to export/serialization instead, so the breaking behavior and cost are narrower? I understand SDK-level dedup gives processors/exporters a consistent view, but I want us to be explicit about accepting that GA-wide behavior/perf tradeoff.
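
To make the alternative raised here concrete, this is a sketch of deduplicating once at export/serialization time instead of on every insert. The function name `dedup_at_export` and the `(String, String)` attribute shape are assumptions made for illustration; the PR itself implements insert-time dedup.

```rust
use std::collections::HashMap;

/// Hypothetical export-time pass: keeps one entry per key, with the last
/// written value, at the position where the key first appeared.
pub fn dedup_at_export(attrs: &[(String, String)]) -> Vec<(String, String)> {
    // Maps each key to its index in `out`.
    let mut index: HashMap<&str, usize> = HashMap::with_capacity(attrs.len());
    let mut out: Vec<(String, String)> = Vec::with_capacity(attrs.len());

    for (key, value) in attrs {
        let existing = index.get(key.as_str()).copied();
        match existing {
            // Key seen before: the later write wins.
            Some(i) => out[i].1 = value.clone(),
            // First time this key appears: record its slot.
            None => {
                index.insert(key.as_str(), out.len());
                out.push((key.clone(), value.clone()));
            }
        }
    }
    out
}
```

The trade-off in this sketch is that the record may hold duplicates until export, so processors that inspect attributes earlier would still see them.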

antcybersec · 2026-06-05T17:39:27Z

Hi @cijothomas, ran the benchmarks 3 independent times on Apple M4 Max (macOS 25.3.0) and took the median — numbers are very stable across runs:

Unique keys (typical case — no duplicate writes)

Scenario	Run 1 (dedup ON / OFF)	Run 2 (dedup ON / OFF)	Run 3 (dedup ON / OFF)	Median ON	Median OFF	Overhead
1 attribute	25.5 / 24.9 ns	25.5 / 25.7 ns	25.3 / 25.5 ns	25.5 ns	25.5 ns	~1.0x
5 attributes	128.7 / 120.1 ns	130.9 / 128.2 ns	131.5 / 120.5 ns	130.9 ns	120.5 ns	~1.09x
9 attributes	302.3 / 221.9 ns	293.1 / 219.3 ns	295.5 / 219.7 ns	295.5 ns	219.7 ns	~1.35x

Repeated key (5 writes to the same key)

Run	Dedup ON	Dedup OFF
Run 1	31.9 ns	32.1 ns
Run 2	32.0 ns	32.1 ns
Run 3	32.1 ns	32.0 ns

Summary:

1 attribute: effectively no overhead (~1.0x), the common case for most structured log lines
5 attributes: ~9% overhead, still small in practice
9 attributes (worst-case for inline storage, all unique): ~35% overhead — this is the honest number
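
The summary columns can be cross-checked directly from the per-run figures. A small sketch of that arithmetic follows; `median3` and `overhead` are helper names made up for this note.

```rust
/// Median of exactly three measurements (hypothetical helper).
pub fn median3(mut runs: [f64; 3]) -> f64 {
    runs.sort_by(|a, b| a.total_cmp(b));
    runs[1]
}

/// Overhead expressed as a ratio of dedup-on time to dedup-off time.
pub fn overhead(on_ns: f64, off_ns: f64) -> f64 {
    on_ns / off_ns
}
```

For the 5-attribute row, `median3([128.7, 130.9, 131.5])` gives 130.9 and `median3([120.1, 128.2, 120.5])` gives 120.5, a ratio of about 1.09. For the 9-attribute row the medians are 295.5 and 219.7, a ratio of about 1.35.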

The 9-attribute case does show meaningful regression when all 9 keys are unique (the linear scan grows). This is exactly the scenario the opt-out flag exists for: users with high-cardinality attribute sets who emit guaranteed-unique keys per record can call .with_log_record_attribute_deduplication(false) on their LoggerProviderBuilder and pay zero overhead.
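
One way to see why the linear scan costs more as unique keys accumulate is to count key comparisons in a toy model. The function below is an illustrative assumption (a plain `Vec` scan with early exit on a match), not the SDK's code, and comparison counts alone will not predict the measured timings.

```rust
/// Counts key comparisons performed by a scan-then-insert loop over `keys`.
/// Hypothetical model: each write scans stored keys in order and stops at
/// the first match; unmatched keys are appended.
pub fn count_comparisons(keys: &[&str]) -> usize {
    let mut stored: Vec<&str> = Vec::new();
    let mut comparisons = 0;
    for &key in keys {
        let mut found = false;
        for &existing in &stored {
            comparisons += 1;
            if existing == key {
                found = true;
                break;
            }
        }
        if !found {
            stored.push(key);
        }
    }
    comparisons
}
```

With n unique keys the i-th write compares against the i keys already stored, so the total is n(n-1)/2: 0 for 1 key, 10 for 5 keys, 36 for 9 keys. Five writes to the same key need only 4 comparisons, which is consistent with the repeated-key case showing no measurable overhead.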

The benchmark file header has been updated in the latest commit to reflect the median-of-3 numbers. Happy to adjust the benchmark configuration (longer measurement time, more samples) if you'd like even tighter confidence intervals.

antcybersec · 2026-06-05T17:44:09Z

Good call @cijothomas — updated the CHANGELOG.md entry to make the breaking behavioral nature explicit with a Migration section so users bumping the version know exactly how to opt out.

The new entry reads:

Breaking behavioral change: SdkLogRecord::add_attribute now deduplicates attribute keys by default (last-write-wins), so exported log records conform to the OpenTelemetry specification requirement that attributes form a map of unique keys. This changes observable output for any code that previously called add_attribute with the same key more than once.

Migration: If you relied on the previous push-only behavior (e.g. for performance reasons or because downstream consumers tolerated duplicate keys), opt out via the provider builder:

let provider = SdkLoggerProvider::builder()
    .with_log_record_attribute_deduplication(false)
    .build();

And agree with your framing — spec-compliance is the right default, the opt-out is a one-liner for the rare case where it matters, and the benchmark numbers show the overhead is only meaningful at 9+ unique attributes per record.

cijothomas · 2026-06-11T00:11:16Z

@antcybersec I have triggered a perf run so we'll get to see the full set of current numbers and their regression. This is the first time we've used that infra since adding it, so we need some time to work out how to evaluate the results!

(Irrespective, I think you have good support for this direction. Please wait a couple of days to let other maintainers/approvers chime in too.)

github-actions · 2026-07-01T04:45:50Z

Thank you for your contribution! This PR has been automatically marked as stale because it has not had activity in the last 14 days. This may be due to a delay in review on our side or awaiting a response from you; either is fine, and we appreciate your patience.

It will be closed in 14 days if no further activity occurs. Pushing a new commit or leaving a comment will remove the stale label and keep the PR open.

antcybersec requested a review from a team as a code owner June 5, 2026 01:45

cijothomas reviewed Jun 5, 2026

antcybersec pushed a commit to antcybersec/opentelemetry-rust that referenced this pull request Jun 5, 2026

docs(sdk): update log attribute dedup benchmark numbers

2bdee4a

Align bench header with latest local criterion run for PR open-telemetry#3537.

antcybersec force-pushed the fix/3497-log-record-attribute-dedup branch from 2bdee4a to 0dfe4b9 on June 5, 2026 06:03

antcybersec added 2 commits June 5, 2026 11:42

fix(sdk): deduplicate LogRecord attribute keys

ab5f9c4

Enforce unique log record attribute keys at add_attribute time using last-write-wins semantics so exported payloads match the OpenTelemetry spec. Fixes open-telemetry#3497.

antcybersec force-pushed the fix/3497-log-record-attribute-dedup branch from 0dfe4b9 to 309e4c6 on June 5, 2026 06:12

cijothomas reviewed Jun 5, 2026

bench(sdk): update dedup numbers from 3 stable runs

32347fd

Median of 3 criterion runs on Apple M4 Max per maintainer request.

docs(sdk): strengthen CHANGELOG warning for add_attribute dedup change

c04e507

Mark as breaking behavioral change with explicit migration snippet per maintainer feedback on PR open-telemetry#3537.

cijothomas added the performance label Jun 10, 2026

github-actions Bot added the Stale label Jul 1, 2026
