[API] Baggage - Spec alignment changes#7051
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #7051 +/- ##
==========================================
+ Coverage 89.86% 90.01% +0.14%
==========================================
Files 276 276
Lines 14007 14193 +186
==========================================
+ Hits 12588 12776 +188
+ Misses 1419 1417 -2
Flags with carried forward coverage won't be shown. Click here to find out more.
|
encoded value when injecting not '+' - replaced by tests: ValidateCharacterOutsideBaggageOctetIsPercentEncodedOnInject, RoundTripValueWithSpacePreservedAsSpace, ValidateSpecialCharactersInjectionForValue
…e comprehensive test
|
What I found while writing and editing the tests:
Before moving forward:
|
…dingly. keys will automatically not have these delimiters
|
@martincostello Could I get a quick initial review on these tests before I start working on the changes for the propogator? Thanks! |
|
Seems reasonable from a skim-read. |
Raw Benchmark DataBefore my changesAfter my changesNotable ChangesBefore After |
|
@Kielek as mentioned in #7009 Baggage needs more tests and requirements that need to be fixed. This PR is a collection of those missing changes that the spec mentions. This is a bigger LOC change so let me also give a more verbose description of the changes. Lot of this will quote my previous comment: Keys
Values
Let me know if you'd like more information. |
|
@nabutabu, after merges with main some tests started failing. Could you please check what wrong happened? What is more, at the current state (post main merge) I have asked Codex/gpt5.5 to review changes. The results [P1] Value encoding is not W3C percent encodingFile: Lines: 283-292 The encoder treats raw percent signs and non-ASCII characters as safe. W3C requires [P1] UrlDecode still converts plus to spaceFile: Lines: 326-330
[P2] Valid OWS around keys is rejectedFile: Lines: 209-214 The PR validates the raw key span before trimming. W3C allows OWS around commas and around the equals separator, so valid headers like [P2] Invalid keys are silently renamedFile: Lines: 283-292
Spec References
|
|
Im on it, will update as soon as I can! |
…e space with keys
…only checking no exceptions
[P1] Value encoding is not W3C percent encodingThe encoder was using negative sets (like InvalidCharArray and InvalidValueArray) which did not incorporate non-ascii values. I have changed the propogator to instead use positive sets to clearly indicate allowed values. Non-ASCII characters are encoded as UTF-8 bytes per RFC 3986 §2.1 [P1] UrlDecode still converts plus to spaceA custom decoder is needed when Extracting values. This is because:
[P2] Valid OWS around keys is rejectedKeys should be trimmed before being dropped. Changed. [P2] Invalid keys are silently renamedThis is a known consequence of this PR and is exactly why Benchmarks
Will commit the changes soon to represent the positive set changes |
- custom decoder for value when extracting - key is trimmed before validity check
|
Some minor, but still worth to fix comment related to UTF-16 chars. I am not sure how I could find this manually, but... The updated Expected wire encoding for U+1F600 is: This matters because the value is corrupted after an inject/extract round trip, and the emitted header does not follow the W3C requirement to percent-encode values outside W3C References
Relevant W3C points:
For U+1F600, the value is outside Test Patchdiff --git a/test/OpenTelemetry.Api.Tests/Context/Propagation/BaggagePropagatorTests.cs b/test/OpenTelemetry.Api.Tests/Context/Propagation/BaggagePropagatorTests.cs
--- a/test/OpenTelemetry.Api.Tests/Context/Propagation/BaggagePropagatorTests.cs
+++ b/test/OpenTelemetry.Api.Tests/Context/Propagation/BaggagePropagatorTests.cs
@@ -957,21 +957,24 @@ public void ValidateNonAsciiValueIsUtf8PercentEncodedOnInject()
- [Fact]
- public void RoundTripNonAsciiValuePreservesOriginalString()
+ [Theory]
+ [InlineData("caf\u00E9")]
+ [InlineData("emoji-\U0001F600")]
+ public void RoundTripNonAsciiValuePreservesOriginalString(string value)
{
var propagationContext = new PropagationContext(
default,
- new Baggage(new Dictionary<string, string> { { "key", "caf\u00E9" } }));
+ new Baggage(new Dictionary<string, string> { { "key", value } }));
var carrier = new Dictionary<string, string>();
this.baggage.Inject(propagationContext, carrier, Setter);
var extracted = this.baggage.Extract(default, carrier, Getter).Baggage.GetBaggage();
var entry = Assert.Single(extracted);
Assert.Equal("key", entry.Key);
- Assert.Equal("caf\u00E9", entry.Value);
+ Assert.Equal(value, entry.Value);
}
[Theory]
[InlineData("\u00E9", "%C3%A9")]
[InlineData("\u4E2D", "%E4%B8%AD")]
[InlineData("\u00A3", "%C2%A3")]
+ [InlineData("\U0001F600", "%F0%9F%98%80")]
public void ValidateNonAsciiUtf8EncodingIsCorrect(string character, string expectedEncoding) |
|
And hopefully last one, related to the I think the best way, will be to go with go-lang approach, but there is no-hard requirement for this. Following notes supported by Codex WhyBefore this PR, .NET used form-url encoding/decoding for both keys and values:
That gave .NET-to-.NET round-trip compatibility for invalid keys, but it was not W3C-compatible. The updated PR correctly stops decoding keys during extraction. However, injection still percent-encodes invalid key characters. That creates silent key mutation:
This is worse than dropping because the baggage item appears to survive propagation, but under a different key. It can also cause collisions, for example local keys Spec basisW3C defines baggage keys as
That grammar allows only ASCII token characters. Spaces, Also, Cross-language behaviorOpenTelemetry Go follows the behavior I think .NET should adopt:
Relevant Go source:
Java extraction also skips invalid keys and does not decode keys; only values are decoded:
RecommendationFor W3C baggage propagation, .NET should:
Suggested release-note wording:
|
|
I can't make diff comments at the moment due to a GitHub outage so I'll put them here in a PR comment:
char.IsAsciiDigit(c) || c is (>= 'A' and <= 'F') or (>= 'a' and <= 'f'); |
…r, simplify isHexDigit
…u/opentelemetry-dotnet into mulitple-baggage-spec-reconciles
…non-BMP characters like emoji's now
I agree, the Go-lang approach here is the best one. This is the trade-off based on which we have to make a decision, on
The change would look like: would be replaced by This change to follow the Go-implementation actually does not violate any of the issues that we are addressing as part of this PR:
Unrelated but worth mentioning, @cijothomas mentioned here that there are plans to sunset Propogators, but there's no timeline on when that happens. Regardless of whether we use the Go implementation or not, I think it's good idea to keep the test suite and skip tests wherever needed based on our decision. Edit: Go lang approach requires Edit: Commit: 9ccd2c1 changes current implementation to follow Go. Let me know if we wish to change it back! |
Towards #5689
Fixes #5500
Changes
Adds a more verbose test suite that ensures spec alignment and then changes the
BaggagePropogatorto follow these testsMerge requirement checklist
CHANGELOG.mdfiles updated for non-trivial changes