Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add peer.service semantic convention to indicate the name of a target… #652

Merged
merged 23 commits into from
Jun 30, 2020

Conversation

anuraaga
Copy link
Contributor

… remote service.

This adds an attribute to indicate the name of a remote service. While a service will often know its own name locally, a client generally does not, only having something like a hostname. But a hostname doesn't have a 1:1 mapping with a remote service because services can be exposed on multiple hostnames. Allowing the client to provide a service name is especially important for services that are not themselves traced, often a database such as Redis. There would be no way to render features like a service graph for such cases without this attribute.

I know OpenTracing left the actual meaning of service somewhat ambiguous - I can see why since it is hard to define :) I tried to add a bit more explanation, but let me know if it makes sense, any tips that could make it more clear, or on the flip side whether it makes sense to just leave it ambiguous and up to users completely.

Fixes #648

@@ -67,6 +67,15 @@ If `net.transport` is `"unix"` or `"pipe"`, the absolute path to the file repres
If there is no such file (e.g., anonymous pipe),
the name should explicitly be set to the empty string to distinguish it from the case where the name is just unknown or not covered by the instrumentation.

## General remote service attributes

These attributes may be used for any remote operation that applies to a target service. Users will generally define what a service is, though instrumentation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think mentioning "target" creates a bit of confusion, because peer.service attribute can be added both to client and server spans. I think we already should have a shared definition of what "peer" means, so it does not need to be repeated.

In addition, I am not sure we need to wordsmith the meaning of "service" either because it should have already been done for resource.service.name. The meaning of peer.service is identical, just for the remove peer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I tried simplifying, let me know if this looks like what you were going for.

@carlosalberto carlosalberto added the area:semantic-conventions Related to semantic conventions label Jun 12, 2020
@carlosalberto carlosalberto added the spec:trace Related to the specification/trace directory label Jun 19, 2020
Anuraag Agrawal and others added 4 commits June 22, 2020 17:33
Copy link
Contributor Author

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, finally updated

@@ -67,6 +67,15 @@ If `net.transport` is `"unix"` or `"pipe"`, the absolute path to the file repres
If there is no such file (e.g., anonymous pipe),
the name should explicitly be set to the empty string to distinguish it from the case where the name is just unknown or not covered by the instrumentation.

## General remote service attributes

These attributes may be used for any remote operation that applies to a target service. Users will generally define what a service is, though instrumentation
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I tried simplifying, let me know if this looks like what you were going for.

@@ -67,6 +67,15 @@ If `net.transport` is `"unix"` or `"pipe"`, the absolute path to the file repres
If there is no such file (e.g., anonymous pipe),
the name should explicitly be set to the empty string to distinguish it from the case where the name is just unknown or not covered by the instrumentation.

## General remote service attributes

These attributes may be used for any remote operation that applies to a service. Users will generally define what a service is, though instrumentation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please clarify what should be done if a closer matching specific semantic convention already defines a dedicated attribute like the RPC semantic conventions with their rpc.service attribute.

a) Should only the dedicated, specific attribute be set?
b) Should both be set to the same value? (probably not)
c) Does peer.service have a different semantic? If so, which?
d) Should rpc.service (and others, if any) be removed and instead reference/require peer.service?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I'm thinking I should remove the point about automatically falling back to anything. I consider this a knob for users to customize how their traces look since in the end they'll often know best. One example is where the same proto file is mounted in a proxy and backend, then this really needs to be set by a user such that they can be differentiated.

On the flip side, I can't think of a good reason to fallback anything automatic anymore.

Will your point be cleared up if I remove the point about fallback?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, removed the point about fallback, let me know if this could still use more detail

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anuraaga I was not concerned about the fallback wording but about the collision with existing, more specific attributes (at least rpc.service). This should be clarified. Which of the options I mentioned above a)-d) sounds most reasonable to you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd lean towards b) since this is basically a spot for users to customize - if they customized to be the same as the rpc service, we may as well preserve that as they probably have a reason for it. It's why I thought the fallback wording could be related - we could have some dedupe semantics when doing something automatic, but for this field since it's for users, I think we could give them free reign. Let me know if that makes sense, and if I can explain that better. I'm balancing vs not wordsmithing too much or would be happy to add some examples. @yurishkuro any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it would be the same value, it should be fine to just not set peer.service in this case as it doesn't add any value and it's optional anyway. If users have a named service that is one hierarchy level above the RPC endpoint's service identifier, setting different values could make sense. I could, for example, think of a ShopBackend service (= peer.service) that exposes a Pricing RPC endpoint (= rpc.service) with a getPrice method (= rpc.method) among other RPC services. It would then be up to the backend, however, to put them in the right hierarchy or if that's not supported prefer one of the service names over the other.
Since this is not obvious, please mention that both service names could be the same (if the RPC endpoint is directly exposed, suggesting to leave the peer.service out) or different (if there is a logical hierarchy encompassing one or multiple RPC endpoints, thus both attributes should be specified) either here or in the RPC semconv.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I went ahead and added a note to rpc.md :)


As an example, given a process deployed as `QuoteService`, this would be the name that goes into the `service.name` resource attribute which applies to the entire process.
This process could expose two RPC endpoints, one called `CurrencyQuotes` (= `rpc.service`) with a method called `getMeanRate` (= `rpc.method`) and the other endpoint called `StockQuotes` (= `rpc.service`) with two methods `getCurrentBid` and `getLastClose` (= `rpc.method`).

Generally, a user SHOULD not set `peer.service` to a fully qualified RPC service name as that will be redundant with `rpc.service`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Generally, a user SHOULD not set `peer.service` to a fully qualified RPC service name as that will be redundant with `rpc.service`.
Generally, a user SHOULD not set `peer.service` if it would be equal to the RPC service name as that would be redundant with `rpc.service`.

Or do we want users to set peer.service to the unqualified name of the RPC service?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks or the suggestions - for this one, I don't think we should encourage anything and leave it to the user. Personally, I find an unqualified service name to be easier to use in monitoring than the fully qualified (fully qualified I use to ensure no code collisions in code, not since it looks nice in dashboards, it often doesn't).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is yet another difference between the fully qualified RPC service name (that is used to connect to the service for example) and the fully qualified name of the (Java/Go/...) class that implements the service.

Copy link
Contributor Author

@anuraaga anuraaga Jun 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - but I don't think people generally design their proto package names based on what they want to see in monitoring. In practice, I often see Java shops use a long FQDN proto package, C++ shops use a one-worder or so, and I've seen the package change from a one-worder to a FQDN because code stopped compiling due to a collision. Letting the user decouple this programming concept from monitoring seems nice for them. It's also nice that we have something to fallback to when they didn't though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a) s/SHOULD not/SHOULD NOT/ (this is how RFC keywords work)
b) I am not sure about this sentence at all. If anything, I would rather people not send rpc.service, but do send peer.service, as the latter can be used in dependency diagrams, but rpc.service is meh (instrumentation at Uber just sets span.name={rpc.service}::{rpc.method} and does not send those extra tags at all, there's little use for them).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arminru felt pretty strongly about adding this line. I'm ok with either though :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the sentence emphasizes the wrong thing. I am not bothered by redundancy, because it is coincidental. Earlier we explain that service name is a reference to a deployment unit, like "k8s service". So I agree with the SHOULD NOT part, but because peer.service and rpc.service are completely different things. The paragraph above this one explains it quite well with the QuoteService example.

Ideally we should have this explanation in resource#service.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - didn't tweak resource.md, but I think I get what you're saying. I tweaked to remove the point about redundancy and tie into the above paragraph, hopefully this helps

specification/trace/semantic_conventions/span-general.md Outdated Show resolved Hide resolved
specification/trace/semantic_conventions/span-general.md Outdated Show resolved Hide resolved
Co-authored-by: Christian Neumüller <christian+github@neumueller.me>
Co-authored-by: Christian Neumüller <christian+github@neumueller.me>

| Attribute name | Notes and examples |
| :-------------- | :-------------------------------------------------------------------------------- |
| `peer.service` | The (hypothetical) [`service.name`](../../resource/semantic_conventions/README.md#service) of the remote service. SHOULD be equal to the actual `service.name` resource attribute of the remote service if any. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why "hypothetical"?

Copy link
Member

@Oberon00 Oberon00 Jun 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the other end might not be instrumented at all, e.g. see the Redis example below. The redis process will probably not be instrumented with OpenTelemetry.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does not explain the term "hypothetical" - there's a real service on the other side, instrumented or not. If I am calling a service and, for example, use name X to discover that service, then peer.service=X and it's not hypothetical.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I just meant the service.name resource attribute is hypothetical.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave it simply as service.name without qualifiers, as the qualifier only makes the situation more confusing. If instrumentation doesn't know the peer service name, then the tag is optional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the "hypothetical" is confusing, especially if you also read the next sentence, but I'm also OK with removing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and removed hypothetical, don't think it's confusing, but I guess it's not needed either so can reduce the text a bit.

Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>
CHANGELOG.md Outdated Show resolved Hide resolved
Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>
anuraaga and others added 2 commits June 30, 2020 16:03
Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>
Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>
Anuraag Agrawal and others added 2 commits June 30, 2020 16:32
Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>
Copy link
Contributor Author

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated TOC / cleanups - thanks for the help!

@bogdandrutu bogdandrutu merged commit b70565d into open-telemetry:master Jun 30, 2020
codeboten pushed a commit to codeboten/opentelemetry-specification that referenced this pull request Jul 20, 2020
open-telemetry#652)

* Add peer.service semantic convention to indicate the name of a target remote service.

* Apply suggestions from code review

Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>

* Simplify

* Remove fallback wording

* Needs to be configured using instrumentation

* CHANGELOG

* Clarify relationship with rpc.service and peer.service and some examples

* Clarify example

* Move peer.service / rpc.service relationship explanation to rpc doc.

* Apply suggestions from code review

Co-authored-by: Christian Neumüller <christian+github@neumueller.me>

* Update specification/trace/semantic_conventions/span-general.md

Co-authored-by: Christian Neumüller <christian+github@neumueller.me>

* Cleanup

* Tweak

* Update specification/trace/semantic_conventions/rpc.md

Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>

* Update CHANGELOG.md

Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>

* Apply suggestions from code review

Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>

* Update specification/trace/semantic_conventions/span-general.md

Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>

* TOC

* Apply suggestions from code review

Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>

Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>
Co-authored-by: Christian Neumüller <christian+github@neumueller.me>
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions spec:trace Related to the specification/trace directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Semantic convention for a remote service name
6 participants