-
Notifications
You must be signed in to change notification settings - Fork 859
Decode the OTEL_RESOURCE_ATTRIBUTES environment variable according to the spec #6461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Decode the OTEL_RESOURCE_ATTRIBUTES environment variable according to the spec #6461
Conversation
| private static string DecodeValue(string baggageEncoded) | ||
| { | ||
| var bytes = new List<byte>(); | ||
| for (int i = 0; i < baggageEncoded.Length; i++) | ||
| { | ||
| if (baggageEncoded[i] == '%' && i + 2 < baggageEncoded.Length && IsHex(baggageEncoded[i + 1]) && IsHex(baggageEncoded[i + 2])) | ||
| { | ||
| string hex = baggageEncoded.Substring(i + 1, 2); | ||
| bytes.Add(Convert.ToByte(hex, 16)); | ||
|
|
||
| i += 2; | ||
| } | ||
| else if (baggageEncoded[i] == '%') | ||
| { | ||
| return baggageEncoded; // Bad percent triplet -> return original value | ||
| } | ||
| else | ||
| { | ||
| if (!IsBaggageOctet(baggageEncoded[i])) | ||
| { | ||
| return baggageEncoded; // non-encoded character not baggage octet encoded -> return original value | ||
| } | ||
|
|
||
| bytes.Add((byte)baggageEncoded[i]); | ||
| } | ||
| } | ||
|
|
||
| return new UTF8Encoding(false, false).GetString(bytes.ToArray()); | ||
| } | ||
|
|
||
| private static bool IsHex(char c) => | ||
| (c >= '0' && c <= '9') || | ||
| (c >= 'a' && c <= 'f') || | ||
| (c >= 'A' && c <= 'F'); | ||
|
|
||
| private static bool IsBaggageOctet(char c) => | ||
| c == 0x21 || | ||
| (c >= 0x23 && c <= 0x2B) || | ||
| (c >= 0x2D && c <= 0x3A) || | ||
| (c >= 0x3C && c <= 0x5B) || | ||
| (c >= 0x5D && c <= 0x7E); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use WebUtility.UrlDecode() here, rather than hand-roll our own?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can, but there is one difference. The UrlDecode method would replace + in the value string with space in the decoded string. This behavior is not part of the specification which uses only percent encoding (unlike application/x-www-form-urlencoded data, where + should be decoded into space). The current specification links to this.
However, client libraries in other languages seem to mostly use UrlDecode. If you prefer to follow their (incorrect 😄) approach I can change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martincostello, I would consider this as a bug. See: #5689
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should go with the simpler option first (i.e. the slightly non-compliant WebUtility.UrlDecode() for consistency), then in a follow-up we can add a compliant implementation (maybe using the optimised/tested code from this method itself and amending as appropriate) as shared code, then update all the appropriate places at once to use it and fix the spec-drift all together?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with both approaches. I don't feel qualified to decide this 😄 Could you tell me which decoding to use? @Kielek @martincostello
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hannahhaering, any option that you can start with bugfixing baggage propagator with proper de/encoding? Based on this we can adjust this PR.
I would like to avoid introducing intentional bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can try to do this. I can start doing this in a few days (maybe earlier).
I will mark this PR as draft in the meantime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented a new helper class for percent encoding. I used this to encode and decode the baggage in the propagator and in the resource detector.
I also added some unit tests. I added a test for non-ascii character decoding, this makes a the linter script fail. Any way around this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kielek I updated the tests and the linter doesn't complain anymore. I think this PR is ready. Do you have any other remarks/comments?
test/OpenTelemetry.Tests/Resources/OtelEnvResourceDetectorTests.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: Martin Costello <martin@martincostello.com>
| { | ||
| var attributes = new List<KeyValuePair<string, object>>(); | ||
|
|
||
| string[] rawAttributes = resourceAttributes.Split(AttributeListSplitter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These string[] and string allocations could be removed with span operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered using spans but decided against it since this method is only used for config parsing. Supporting mixed .NET targets (some without full span support) would require two versions, adding unnecessary complexity and reducing readability. I don’t think the trade-off is worth it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying this is the only place in the whole codebase that splits strings and shouldn't?
Polyfills can be easily created for targets that miss some features. I've done that more than once.
| var key = rawKeyValuePair.Substring(0, indexOfFirstEquals).Trim(); | ||
| var value = rawKeyValuePair.Substring(indexOfFirstEquals + 1).Trim(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
string allocations could be removed with span operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| var key = rawKeyValuePair.Substring(0, indexOfFirstEquals).Trim(); | |
| var value = rawKeyValuePair.Substring(indexOfFirstEquals + 1).Trim(); | |
| var key = rawKeyValuePair.AsSpan(0, indexOfFirstEquals).Trim().ToString(); | |
| var value = rawKeyValuePair.AsSpan(indexOfFirstEquals + 1).Trim().ToString(); |
Co-authored-by: Piotr Kiełkowicz <pkiekowicz@splunk.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #6461 +/- ##
==========================================
- Coverage 86.69% 86.60% -0.09%
==========================================
Files 258 258
Lines 11910 11885 -25
==========================================
- Hits 10325 10293 -32
- Misses 1585 1592 +7
Flags with carried forward coverage won't be shown. Click here to find out more.
|
|
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
|
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
|
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
|
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
|
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
|
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
|
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
|
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
| continue; | ||
| } | ||
|
|
||
| var splitKeyValue = keyValuePair.Split([KeyValueSplitter], 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that indexOfFirstEquals is already found above, might as well just use that.
|
|
||
| var splitKeyValue = keyValuePair.Split([KeyValueSplitter], 2); | ||
| var key = splitKeyValue[0].Trim(); | ||
| var value = splitKeyValue[1].Trim(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use spans to save some allocations. Same as #6461 (comment)
| currentLength >= MaxBaggageLength || currentItemCount >= MaxBaggageItems; | ||
|
|
||
| private static bool IsValidKeyValuePair(string key, string value) => | ||
| !string.IsNullOrEmpty(key) && !string.IsNullOrEmpty(value) && TokenRegex().IsMatch(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key is a token, which also has its limits:
- https://www.w3.org/TR/baggage/#key
- definition of token in [RFC7230]
| currentLength >= MaxBaggageLength || currentItemCount >= MaxBaggageItems; | ||
|
|
||
| private static bool IsValidKeyValuePair(string key, string value) => | ||
| !string.IsNullOrEmpty(key) && !string.IsNullOrEmpty(value) && TokenRegex().IsMatch(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid regex on hot path.
| currentLength >= MaxBaggageLength || currentItemCount >= MaxBaggageItems; | ||
|
|
||
| private static bool IsValidKeyValuePair(string key, string value) => | ||
| !string.IsNullOrEmpty(key) && !string.IsNullOrEmpty(value) && TokenRegex().IsMatch(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is empty value not allowed?
| } | ||
| } | ||
|
|
||
| return new UTF8Encoding(false, false).GetString(bytes.ToArray()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The baggage string has always been ASCII only before and after conversion. Why switching to List<byte> and UTF8Encoding, instead of keep using StringBuilder like in the original approach?
| #if NET | ||
| bytes.Add(byte.Parse(hex, System.Globalization.NumberStyles.HexNumber, System.Globalization.CultureInfo.InvariantCulture)); | ||
| #else | ||
| bytes.Add(Convert.ToByte(hex.ToString(), 16)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be a lot of string allocations to generate a string for each hex number in the hot path.
| } | ||
| else if (baggageEncoded[i] == '%') | ||
| { | ||
| return baggageEncoded; // Bad percent triplet -> return original value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec called out that it must be replaced with the replacement code point (U+FFFD). https://www.w3.org/TR/baggage/#value
When decoding the value, percent-encoded octet sequences that do not match the UTF-8 encoding scheme MUST be replaced with the replacement code point (U+FFFD).
Although it didn't say what to do for invalid encoded values, I personally I think it's better to keep non percent-encoded triplets as is. For reference, WebUtility.UrlDecode("123+abc%G1%") returns 123 abc%G1%.
Fixes #3395 #5479
Changes
Added percent-decoding of the
OTEL_RESOURCE_ATTRIBUTESvariable according to the spec, adhering to the W3C Baggage format. As in the Go implementation, I retained the original value in case decoding fails.Merge requirement checklist
CHANGELOG.mdfiles updated for non-trivial changes