-
Notifications
You must be signed in to change notification settings - Fork 258
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modify Span status according to specification #224
Modify Span status according to specification #224
Conversation
tigrannajaryan
commented
Sep 25, 2020
•
edited
Loading
edited
- Renamed Status.code field to Status.deprecated_code.
- Renamed StatusCode enum to Status.DeprecatedStatusCode.
- Introduce new Status.code with corresponding StatusCode enum.
- Added guidelines about how to use the fields to ensure backward compatibility.
3544d3a
to
f2a978e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
f2a978e
to
77fffd8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this avoid being a breaking change of a stable API when the name of the field and its values are being changed?
Also, since JSON is allowed, renaming is not even wire-compatible. |
JSON is not stable |
// | ||
// This field is deprecated and is replaced by `code` field below. See backward | ||
// compatibility notes below. According to our stability guarantees this field | ||
// will be removed in 12 months, on Sep 26, 2021. All usage of old senders and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this PR but just noticed since it's defined very clearly here with a date - 12 months seems like a pretty short time for API stability post-GA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12 months is our formal definition of "stable": https://github.com/open-telemetry/community/blob/47813530864b9fe5a5146f466a58bd2bb94edc72/maturity-matrix.yaml#L57
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need a new stability called "GA", that has stronger requirements.
Names of the fields are not exposed in OpenTelemetry API. Compatibility from protocol perspective means wire compatibility. Names of the fields do not exist on the wire level. We could keep the name of the field and of the enum values if there is a big desire to keep compatibility on the build level. I am not sure it is worth it and it was not part of the compatibility promise in the past (Collector codebase hides these details on purpose so that usage of OTLP protobuf messages in the codebase is not affected). I am open to re-considering this and it does not affect the logic. |
What about OTLP/JSON? |
@tigrannajaryan tooling does rely on these names, so it is a breaking change for every library that relies on the proto names to have to update. When these files (Common and Trace) were officially marked stable in August, I assumed that included stable for tooling, as well. What I'm not clear on is why this enum is now being deprecated at all, nor why it's being marked as deprecated by prefixing the word to everything. The comment directly above this code states It's unclear to me in https://github.com/open-telemetry/oteps/blob/master/text/trace/0136-error_flagging.md#status-mapping-schema if these codes are actually intended to be removed at this time.
What I did get from the OTEP was the addition of a code field I could be missing something as it's difficult to keep up. If so, I'd love to have more context on this PR. I greatly welcome the additions but I'm unclear on the motivation of the deprecation. |
@bryannaegele the OTEP's language may not be clear but the specification is clear: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#status There is only one code - the StatusCanonicalCode, there is no other code in the spec that we need to also support in the protocol. There is certainly no intent to have 2 different codes in the API or in the protocol. The OTEP's motivation was that the old enum did not make sense. This has been discussed in multiple meetings and in the OTEP.
I am not strongly opposed to keeping the old field name and old enum name and value names. The reason I used "deprecated_" as a prefix was to discourage its usage. If there is no strong oposition I am happy to modify this PR to restore the names of the deprecated field and use new names for the new field and enum. |
@tigrannajaryan That doc changed 7 days ago where the API spec clearly stated that these status codes were to be used and thus available to other uses, such as messaging https://github.com/open-telemetry/opentelemetry-specification/blame/3e66e6fcc3bb8070db7ccbdd30005ae9f35ac29c/specification/trace/semantic_conventions/http.md#L40-L63. So the ability to set the status with these semantic conventions were in the API spec and does introduce a breaking change. These status codes being set are also still in the referenced document for HTTP clients https://github.com/open-telemetry/opentelemetry-specification/blame/master/specification/trace/semantic_conventions/http.md#L109-L116. Pardon my confusion, but this does seem more than a change to the protos and in fact represents a breaking API change, as well. The Commons were marked stable, so I hope you can understand why I'm questioning these changes. |
If this change must be done for some reason, I'm simply asking what is the replacement for marking statuses on a span beyond |
I just caught up and found https://github.com/open-telemetry/opentelemetry-specification/pull/966/files from 4 days ago, so this conversation is apparently moot and we'll just have to deal with the breaking changes. |
The updated spec does not define anything for this. These has been discussed extensively in the spec SIG and error SIG. This PR is a consequence, it is not something new. This simply makes the corresponding change in the protocol.
Again, I am happy to keep the old field names in the protocol, but language SDK's have to deal with the spec change anyway. OTLP exporters in SDK will also have to comply with Compatibility requirements for senders that I added in this PR before GA. I don't think we want to release an SDK that has outdated understanding of Span Status. |
We can't all be in every SIG meeting :) We'll deal with it. As you noted, we're going to have to remove all this from the SDK anyhow, so it's better to just get the pain over with. I'm just looking for guidance at this point on how to guide users on how to substitute the functionality. For users that are already leveraging this functionality, what is the upgrade path for them? Are we just saying "Hey, error status codes aren't a thing anymore so you'll have to figure something out" or is there guidance/patterns we can provide? Asking as a user and a maintainer 😆 |
Maybe we should add a compatibility semantic convention to ease transition at least? |
@bryannaegele best to ask this in spec repo or gitter where it will get more eyeballs. |
@Oberon00 Can you please clarify what do you mean by this? (I was about to merge this PR, just noticed you comment, so will hold for now). |
I have put a bit more thought in to this and I believe the transition in codebases will be easier to do if we keep the names of the deprecated field and enum unchanged. I am going to revert that part of the change. @bryannaegele FYI. I am going to also wait for open-telemetry/opentelemetry-specification#1069 resolution and do it in one PR. There is no point in introducing the field with wrong names and risk someone start using it and then immediately break it. |
I updated this PR and removed "canonical" adjective to be inline with open-telemetry/opentelemetry-specification#1081 I am using "code value" since "code" is unfortunately already used by a deprecated field which have to keep around for now. |
I would like to know what others think about the field names. The new name looks awkward to me. I would rather prefer to rename the old fields and use I am going to open an unrelated PR which legitimizes field renaming as soon as it does not break wire compatibility (which is fine up until we say OTLP/JSON is stable). |
FYI, regarding field name guarantees: #225 |
Given that #225 got several approvals and is about to be merged I am inclined more and more to rename the fields since we no longer are limited by the compatibility guarantees. I am going to that unless I hear objections. |
1. Renamed Status.code field to Status.deprecated_code. 2. Renamed Status.code.StatusCode enum to Status.DeprecatedStatusCode. 3. Introduce new Status.code with corresponding StatusCanonicalCode enum. 4. Added guidelines about how to use the fields to ensure backward compatibility.
We want to keep the old field and enum names.
This reverts commit b3b01e1.
This reverts commit 6b9a7a4.
53cd0a6
to
bc0050f
Compare
@open-telemetry/specs-approvers @open-telemetry/specs-trace-approvers please review. I made a few changes back and forth in this PR. The PR description now reflects the latest state. |
@open-telemetry/specs-approvers @open-telemetry/specs-trace-approvers please review. Implementation in the Collector cannot proceed until this is merged. The Collector is currently non-compliant with the spec because of this. |
I think there are already enough approvals, aren't there? |
@Oberon00 I re-requested reviews. Approvers previously saw and approved a different version of this PR and I think differences are pretty significant, so it is important for them to confirm they are still OK. |
@tigrannajaryan Then you should dismiss the reviews in question to make this clearer. |
PR is significantly changed, please re-review.