
Add map and array attribute value type support #1656

Merged 27 commits into master on Sep 16, 2020

Conversation

kbrockhoff (Member)

Description: Map and array attribute value types, which were recently added to the specification, had not yet been implemented in the collector. This PR implements them across all tracing receivers and exporters.

Link to tracking Issue: #1590

Testing: Added map and array attribute values to golden dataset. All unit and correctness tests continue to pass.

Documentation: Public methods have comments.


codecov bot commented Aug 27, 2020

Codecov Report

Merging #1656 into master will increase coverage by 0.22%.
The diff coverage is 95.52%.


@@            Coverage Diff             @@
##           master    #1656      +/-   ##
==========================================
+ Coverage   91.67%   91.89%   +0.22%     
==========================================
  Files         262      261       -1     
  Lines       18602    18681      +79     
==========================================
+ Hits        17053    17167     +114     
+ Misses       1117     1082      -35     
  Partials      432      432              
Impacted Files Coverage Δ
translator/trace/protospan_translation.go 92.02% <92.36%> (+72.02%) ⬆️
internal/goldendataset/generator_commons.go 90.00% <100.00%> (+1.53%) ⬆️
internal/goldendataset/resource_generator.go 98.41% <100.00%> (+0.18%) ⬆️
internal/goldendataset/span_generator.go 98.95% <100.00%> (+0.02%) ⬆️
translator/conventions/opentelemetry.go 100.00% <100.00%> (ø)
translator/internaldata/oc_to_resource.go 100.00% <100.00%> (ø)
translator/internaldata/resource_to_oc.go 87.87% <100.00%> (-4.72%) ⬇️
translator/internaldata/traces_to_oc.go 87.87% <100.00%> (+5.65%) ⬆️
translator/trace/jaeger/traces_to_jaegerproto.go 88.36% <100.00%> (+0.12%) ⬆️
translator/trace/zipkin/traces_to_zipkinv2.go 93.51% <100.00%> (+0.95%) ⬆️
... and 15 more


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6b4dce0...18f5ce4.

@tigrannajaryan tigrannajaryan self-assigned this Sep 2, 2020
Review threads (outdated, resolved):
testbed/testbed/test_case.go
translator/conventions/opentelemetry.go
translator/internaldata/oc_to_resource.go
}

// DetermineValueType returns the native OTLP attribute type the string translates to.
func DetermineValueType(value string, omitSimpleTypes bool) pdata.AttributeValueType {
Member

I am not sure I understand why this smartness exists in our translation. I was not aware we had this for Zipkin, and I don't understand why we extend this to other translations either.
Do you know why we had this for Zipkin and why we would want it for other formats? Jaeger has proper support for all data types we need, I don't know why we would need a smart conversion like this.
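For context, the "smartness" under discussion looks roughly like the sketch below: a string value is probed with increasingly permissive parsers, and the first successful parse decides the attribute type. The names, the enum, and the precedence order are illustrative only, not the collector's actual implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// ValueType is a stand-in for pdata.AttributeValueType (hypothetical enum,
// for illustration only).
type ValueType int

const (
	TypeString ValueType = iota
	TypeInt
	TypeDouble
	TypeBool
	TypeMap
	TypeArray
)

// determineValueType guesses the richest OTLP-like type a string value
// could encode. This is a sketch of the heuristic being questioned, not
// the PR's code.
func determineValueType(value string) ValueType {
	if _, err := strconv.ParseInt(value, 10, 64); err == nil {
		return TypeInt
	}
	if _, err := strconv.ParseFloat(value, 64); err == nil {
		return TypeDouble
	}
	if value == "true" || value == "false" {
		return TypeBool
	}
	trimmed := strings.TrimSpace(value)
	if json.Valid([]byte(trimmed)) {
		if strings.HasPrefix(trimmed, "{") {
			return TypeMap
		}
		if strings.HasPrefix(trimmed, "[") {
			return TypeArray
		}
	}
	return TypeString
}

func main() {
	fmt.Println(determineValueType("1234"), determineValueType(`{"a":1}`))
}
```

The pitfall the reviewer raises falls out of the sketch directly: a tag whose value is legitimately the string "1234" is silently retyped as an integer, so the conversion is not faithful for round-tripping.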

dest.UpsertBool(key, bVal)
case pdata.AttributeValueMAP:
var attrs map[string]interface{}
err := json.Unmarshal([]byte(val), &attrs)
Member

I don't think we should try to treat every string that looks like JSON as JSON and convert it to a map. This can be unexpected behavior. The recommendation in the spec is for emitters to use JSON if they have no other way to represent structured data, but there is no recommendation for readers to try to parse it back. I do not think this is the Collector's business. If backends want to do this they are free to, but we are overloading the Collector by doing it here.
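A minimal sketch of the behavior being objected to (the helper name is hypothetical, not the collector's API): any string that happens to unmarshal as a JSON object is promoted to a map attribute, so a reader can no longer distinguish a genuine map from a string that merely looks like JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// promoteJSONString attempts to reinterpret a string attribute value as a
// map, as the diff above does. ok is false when the value is not a JSON
// object, in which case the caller keeps the original string.
func promoteJSONString(val string) (attrs map[string]interface{}, ok bool) {
	if err := json.Unmarshal([]byte(val), &attrs); err != nil {
		return nil, false
	}
	return attrs, true
}

func main() {
	m, ok := promoteJSONString(`{"user":"alice","count":2}`)
	fmt.Println(ok, m["user"]) // the original string value is irreversibly retyped
	_, ok = promoteJSONString("plain text")
	fmt.Println(ok)
}
```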

@tigrannajaryan (Member)

@kbrockhoff OK, I think I understand where the problem is coming from.

You extended the correctness tests to include array and map values, and for those values to pass through our translations intact you had to add these autodetecting translators for formats that don't support arrays and maps, which is not only Zipkin but virtually everything else.

I understand why you did it, but I believe it is based on a slightly false premise: that no matter what combination of receiver and exporter you use, the Collector guarantees that the mirroring translations are always lossless. So if you generate OTLP data, translate it to Zipkin, send it to the Collector (which internally translates it to OTLP again, then to Zipkin again), and the testbed translates it one last time to OTLP to compare, then we expect the data to arrive intact.

This premise is only partially correct: we only expect it to hold for representable data. If a particular format is not capable of representing a data type that OTLP can, then we should not expect the translation to be lossless.

In this particular case, arrays and maps are not representable anywhere except OTLP, so we do not need to test such data for any format other than OTLP.

I believe we need to slightly modify the correctness tests to account for this and generate only the values that the particular format can represent. For OTLP we will generate the full set of possible values; for other formats we will generate a subset.

I do not suggest doing this in this PR since it may be a bigger change. What I suggest doing in this PR is to avoid generating array and map values in the correctness tests for now. Add support for arrays and maps elsewhere (as you did), but do not make them part of the correctness-test goldensets.

In a later PR we can add an extended goldenset for OTLP only.

What do you think?
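The lossy direction described above, supporting maps and arrays on export without promising a round trip, can be sketched as follows. The function name is illustrative; the real PR operates on pdata values rather than plain interfaces.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// flattenComplexAttr serializes a map or array attribute value to a JSON
// string so it can be carried as an ordinary string tag in formats like
// Zipkin, Jaeger, or OC that lack map/array value kinds. The translation
// is deliberately one-way: readers keep the string as-is.
func flattenComplexAttr(v interface{}) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, _ := flattenComplexAttr([]int{1, 2, 3})
	fmt.Println(s) // [1,2,3]
}
```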

@tigrannajaryan (Member)

Just to clarify: I believe it is not necessary to do these complicated data type conversions for Jaeger and OC, since they already have their own type systems for attributes which, although not as rich as OTLP's, are good enough to limit our support to.

It seems the smartness in the Zipkin translations was driven by bug fixes I did not look into, but I am assuming it was necessary, so we can keep it to avoid breaking things we fixed.

@kbrockhoff (Member, Author)

It was relatively simple to change the validator to support comparing array and map attributes against attributes that have been converted to strings, so I left the array and map attributes in the golden dataset.

@kbrockhoff (Member, Author)

I think we should support conversion of Zipkin string values to their respective types, and it makes sense for this to be configurable. It could be done with a configuration parameter on the Zipkin receiver specifying whether to leave all values as strings or to do automatic conversions. If this approach is agreeable, I will implement it as part of this PR.
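A minimal sketch of what such a toggle might look like on the receiver's config struct; the field and mapstructure tag names here are hypothetical, for illustration only.

```go
package main

import "fmt"

// Config sketches the proposed Zipkin receiver option: a flag choosing
// between leaving all tag values as strings and attempting heuristic
// conversion to typed attributes (bool/int/double/map/array).
type Config struct {
	// ParseStringTags enables automatic type conversion of string tag
	// values. When false (the zero-value default), values stay strings.
	ParseStringTags bool `mapstructure:"parse_string_tags"`
}

func main() {
	var cfg Config // zero value: conversions disabled by default
	fmt.Println(cfg.ParseStringTags)
}
```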

@tigrannajaryan (Member)

> I think we should support conversion of Zipkin string values to their respective types, and it makes sense for this to be configurable. It could be done with a configuration parameter on the Zipkin receiver specifying whether to leave all values as strings or to do automatic conversions. If this approach is agreeable, I will implement it as part of this PR.

I am not completely sure we want to do this, but even if we do I'd prefer it to be a separate PR. Large PRs are difficult to review.

Can we go with the simpler approach where we don't have the complicated automatic type conversions first and if we see the need to have them we add them in the future?

We can also discuss in the SIG meeting.

@kbrockhoff (Member, Author)

I will change to the simpler approach.

@tigrannajaryan (Member) left a comment

Sorry, I forgot to discuss this in the SIG meeting. I will continue replying to the comments offline, but if you want to discuss this live we can set up a meeting before next week's SIG meeting.

Review thread on translator/internaldata/oc_to_resource.go (outdated, resolved)
@tigrannajaryan (Member) left a comment

Looks good to me now.
Please fix the conflict and we can merge.

@tigrannajaryan tigrannajaryan merged commit 471c4a6 into open-telemetry:master Sep 16, 2020
@tigrannajaryan (Member)

Thanks a lot @kbrockhoff !

MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this pull request Nov 11, 2021
to provide consistent naming across the code base, deprecate pusher in
favor of exporter naming convention.

Signed-off-by: ldelossa <ldelossa@redhat.com>

Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>
hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this pull request Apr 27, 2023