Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add process.runtime.name / version describing executing runtime #882

Merged
merged 22 commits into from
Sep 22, 2020

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Aug 26, 2020

Fixes #830

Changes

Adds attributes to fill the name and version of the runtime, e.g. openjdk 14.0.1, dotnet-framework 4.5, cpython 3.3, etc.

I tried to collect runtime names from various sources like wikipedia, sdkman, and random googling, but am not familiar enough with many of the languages to know how representative these are, let me know! Naming wise, I used a hyphen instead of a space for any name that had a space and otherwise just lowercased.

@Oberon00 Oberon00 changed the title Add process.runtime.name / version to capture information about execu… Add process.runtime.name / version describing executing runtime Aug 26, 2020
@Oberon00 Oberon00 added area:semantic-conventions Related to semantic conventions release:after-ga Not required before GA release, and not going to work on before GA spec:resource Related to the specification/resource directory labels Aug 26, 2020
| Value | Description |
| --- | --- |
| `nodejs` | NodeJS |
| `browser` | Web Browser |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the user agent string be reported in this case? Or some other browser name/version identification? Or would that name be part of the version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User agent parsing is very hairy, so my instinct is to not make the SDK fill in anything smarter here than browser, and we probably should put the user agent string as the process.runtime.version (unless any browser instrumentation developers know of a better one :) ). User agent will also be present in HTTP spans, but if we don't include it in Resource as well we'd miss it on internal spans. I'm not that familiar with browser instrumentation though so happy to hear more thoughts!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think explicitly saying that in case of browser the version SHOULD/MUST be set to the user agent (although https://en.wikipedia.org/wiki/User_agent#Deprecation_of_User-Agent_header; JS may have better APIs to get browser version information than the user agent string)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, let me know if it's clear, I don't know how well the new API is supported right now so I'm leaving it out for now but if it seems worth mentioning let me know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry had forgotten to git push >< Now it has my new line about browser.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not use the "user-agent" in the version. The runtime:

  • name - should probably be the name of the browser (Chrome/Firefox/etc)
  • version - should probably the version of that browser

Or maybe completely exclude browser from this list and have a dedicated section for browser, there may be other things we need.

I don't have big experience with browsers so I may be wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As browser is a drastically different runtime (even the opentelemetry-js distribution is different for server vs browser) I definitely want to include it somewhere. I feel as if the current recommendation provides good information to backends, and is SHOULD - is it ok to iterate on the details per language? The only language I'm deeply familiar with here is Java, everything else is me fumbling through the dark to provide a starting point for others, but I think a need for polishing by each language SIG is a feature of every language except for Java here, not just this one. Should I go ahead and file an issue for each language to check this table after merging this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be good, or maybe just leave TODOs instead of providing things for all the other languages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added TODOs with linked issues. I hope what's here can be helpful for the language owners, though might not be but lean towards not clearing it out since it's at least something I think.

Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we aim to have a comprehensive list of language runtimes in process.md? If so it would be probably useful to invite language SIG maintainers to add the lists for missing languages.

@@ -12,5 +12,84 @@
| process.command | The command used to launch the process (i.e. the command name). On Linux based systems, can be set to the zeroth string in `proc/[pid]/cmdline`. On Windows, can be set to the first parameter extracted from `GetCommandLineW`. | `cmd/otelcol` | See below |
| process.command_line | The full command used to launch the process. The value can be either a list of strings representing the ordered list of arguments, or a single string representing the full command. On Linux based systems, can be set to the list of null-delimited strings extracted from `proc/[pid]/cmdline`. On Windows, can be set to the result of `GetCommandLineW`. | Linux: `[ cmd/otecol, --config=config.yaml ]`, Windows: `cmd/otecol --config=config.yaml` | See below |
| process.owner | The username of the user that owns the process. | `root` | No |
| process.runtime.name | The name of the runtime of this process. For compiled native binaries with a significant runtime component, e.g., Go, this SHOULD be the name of the compiler. | `openjdk` | No |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For compiled native binaries with a significant runtime component, e.g., Go, this SHOULD be the name of the compiler

Note that we also have telemetry.sdk.language which already records the language, so it is good that we have the distinction clearly specified here.

significant runtime component

I do not fully understand the reference to a "significant" runtime. What's significant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am looking at Go's goroutines and GC as significant runtime for a compiled language, vs just a C++ app using only libc. Of course, if the C++ app links in some complex libraries that have functionalities similar to those, it's a bit hairy, but I think including version for Go compiler is important. Any suggestions on wodering to reflect this better are appreciated!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anuraaga I think there is no need for additional justification. It should be enough to say "For compiled native binaries this SHOULD be the name of the compiler."

Regardless of how complicated the runtime is, it is useful to know the compiler name/version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, removed that point!

specification/resource/semantic_conventions/process.md Outdated Show resolved Hide resolved
@anuraaga
Copy link
Contributor Author

Do we aim to have a comprehensive list of language runtimes in process.md? If so it would be probably useful to invite language SIG maintainers to add the lists for missing languages.

Sorry I thought I responded to this but can't find it - yeah I think having language SIGs add lists for missing languages would be great! For this iteration, I went through our current beta languages and ruby just because I know it has many types of runtimes.

Copy link
Member

@arminru arminru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @anuraaga for gathering pre-defined well-known values!
This will help instrumentations to be consistent (although this could likely be implemented directly by the SDKs) and allow back ends to use that information for other means than just simply displaying the plain string.
I think this PR is ready to be merged as-is, SIGs can always extend the tables if there are any missing runtimes.

specification/resource/semantic_conventions/process.md Outdated Show resolved Hide resolved
Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>

| Value | Description |
| --- | --- |
| `openjdk` | Oracle OpenJDK |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you checked this list against values returned by java.vendor system property? I suspect that will be the most common source of this information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found that e.g. zulu jdk has a vendor like Azul Systems (sorry on phone right now so don't remember the exact string but it was something like this and very different than what I wrote in this table either way). So I'm wavering between whether these are descriptive or not. We could define this as "the value of the java.vendor` property and be done with it. Or we can expect instrumentation to parse these strings into listed constants like listed here. I'm leaning towards the latter since I don't think there is any way to avoid a vendor string from changing despite being effectively the same JDK imementation, but don't have a strong preference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If in case of Java we want process.runtime.name to be mean "which java distribution is in use", then I think we should use java.vendor verbatim (not parsing) as part of process.runtime.name. I don't know if this is enough. E.g. in case of Zulu and Zing, do they both have Azul Systems as vendor? I would expect so. Then we need something else in addition to vendor name.

Copy link
Contributor Author

@anuraaga anuraaga Sep 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for calling this out. I collected a bunch of properties.

https://gist.github.com/anuraaga/c79219a7e4401515c3de998f9d077be0

It looks like the only way to properly express the vendor distribution (which is important, some distributions have totally different runtime components like corretto, openj9) is a concatenation of java.vendor and java.vendor.version. But this doesn't really work because adoptopenjdk is AdoptOpenJDK AdoptOpenJDK in that case.

I'm starting to think I also need to add runtime.description attribute which contains a probably language-specific, descriptive name of a runtime, for cases where they still all fork off of a very large common component like the openjdk distributions. so

runtime.name = java.runtime.name
runtime.version = java.runtime.version
runtime.description = java.vendor + java.vendor.version

It's ok - but still not ideal because version strings aren't consistent either :/ But probably best we can do, backends can probably deal with this well enough.

OpenJDK Runtime Environment, 11.0.8+10, N/A 18.9 (18.9 I think is release date)
OpenJDK Runtime Environment, 11.0.8+10-LTS, Azul Systems, Inc. Zulu11.41+23-CA
OpenJDK Runtime Environment, 11.0.8+10, AdoptOpenJDK AdoptOpenJDK (sigh)
Android Runtime, 0.9, Android Project 1.2.3 (guessing based on https://android.googlesource.com/platform/dalvik/+/gingerbread/vm/Properties.c#206)

How does this sound?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is much better, but still misses the info that IBM J9 VM was used :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point - how about java.vm.vendor concatenated with java.vm.version? It seems it will remove AdoptOpenJDK from their OpenJ9 JVM but it's probably a better handling of the corner case.

@@ -12,5 +12,85 @@
| process.command | The command used to launch the process (i.e. the command name). On Linux based systems, can be set to the zeroth string in `proc/[pid]/cmdline`. On Windows, can be set to the first parameter extracted from `GetCommandLineW`. | `cmd/otelcol` | See below |
| process.command_line | The full command used to launch the process. The value can be either a list of strings representing the ordered list of arguments, or a single string representing the full command. On Linux based systems, can be set to the list of null-delimited strings extracted from `proc/[pid]/cmdline`. On Windows, can be set to the result of `GetCommandLineW`. | Linux: `[ cmd/otecol, --config=config.yaml ]`, Windows: `cmd/otecol --config=config.yaml` | See below |
| process.owner | The username of the user that owns the process. | `root` | No |
| process.runtime.name | The name of the runtime of this process. For compiled native binaries, this SHOULD be the name of the compiler. | `openjdk` | No |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid confusion: is Java a "compiled" in the sense of this line?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anuraaga ping :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally updating this PR :) I think no, I use compiled native to try to give that impression. Is there a more clear term?

@Oberon00
Copy link
Member

Oberon00 commented Sep 11, 2020

A question: Why is this in the process namespace? We already discovered in #789 (comment) that just because something is part of a process, it is not necessarily sensible to put it in the process namespace (e.g. we also don't use process.telemetry.sdk). So I'd suggest naming this just runtime.* instead of process.runtime.* unless there are other arguments.

EDIT: It seems that (nearly?) all resources could be sensibly put under the process namespace (e.g. process.k8s.pod.name: the name of the Kubernets pod in which the process runs -- yet we named it just k8s.pod.name). So it seems we should be careful when to use process and when not, otherwise it becomes random.

@anuraaga
Copy link
Contributor Author

@Oberon00 Thanks that's a great question. I would say the biggest reason to adding to another namespace was because I wasn't sure this is such a significant attribute that it "deserves" a root namespace - in particular I think it means adding another .md file just for this so wanted to avoid it. This isn't a great reason :) Maybe we can add a point somewhere that top-level namespaces are expected to grow so there's no need to worry about that.

@Oberon00
Copy link
Member

I wasn't sure this is such a significant attribute that it "deserves" a root namespace

I don't know if significance is a criterion to decide whether to use a sub- or root-namespace, but in any case I think runtime is quite significant. I would not be surprised if further attributes appear in the future either.

If we don't want jvm to be a top-level namespace, for example, we might one day have runtime.jvm.gc_strategy.

@anuraaga
Copy link
Contributor Author

@Oberon00 Thanks for the pointer!

Comment on lines 102 to 103
| `rubymri` | Ruby MRI |
| `yarv` | YARV |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Practically speaking, these are the same thing. YARV is the bytecode interpreter for MRI.

| `rubymri` | Ruby MRI |
| `yarv` | YARV |
| `graalvm` | GraalVM |
| `ironruby` | IronRuby |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't support IronRuby in OTel Ruby.

Suggested change
| `ironruby` | IronRuby |

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should that matter? I guess we don't support many of the things here (yet). But maybe some 3rd party vendor will at some point.

However, I'm wondering if ironruby can even be considered a runtime in it's own right. It is more like an addon for a .NET CLR, isn't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Practically speaking, I don't know of anyone using it, or the other implementations below. There's a lot of abandoned alternative Ruby implementations out there. It doesn't seem valuable to catalog them all here.

The active implementations that I know of are JRuby, TruffleRuby and MRI. Although JRuby and TruffleRuby can both be considered "addons" for the JVM, it is important to distinguish them - the implementation technologies and functionality are quite different.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, it feels more useful to have a few well known examples here rather than an exhaustive list of all possible values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although JRuby and TruffleRuby can both be considered "addons" for the JVM, it is important to distinguish them

SGTM, but in that case, information about the JVM would still be interesting, right? I wonder how we could represent that in semantic conventions. Using an array for each value? E.g. process.runtime.name=["OpenJDK Runtime Environment", "JRuby"]

Comment on lines 107 to 111
| `macruby` | MacRuby |
| `maglev` | MagLev |
| `mruby` | Mruby |
| `rubinius` | Rubinius |
| `rubymotion` | RubyMotion |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of these are supported by OTel Ruby.

Suggested change
| `macruby` | MacRuby |
| `maglev` | MagLev |
| `mruby` | Mruby |
| `rubinius` | Rubinius |
| `rubymotion` | RubyMotion |

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should not matter, we are not writing these semantic conventions solely for our current implementations 😃

| --- | --- |
| `rubymri` | Ruby MRI |
| `yarv` | YARV |
| `graalvm` | GraalVM |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be TruffleRuby rather than GraalVM.


***Ruby Runtimes:***

TODO(<https://github.com/open-telemetry/opentelemetry-ruby/issues/388>): Confirm the contents here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Practically speaking, I think we should use the following mapping:

Name Constant
process.runtime.name RUBY_ENGINE
process.runtime.version RUBY_VERSION
process.runtime.description RUBY_DESCRIPTION

For MRI 2.7.1, for example, this looks like:

Name Value
process.runtime.name ruby
process.runtime.version 2.7.1
process.runtime.description ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-darwin19]

For TruffleRuby this looks like:

Name Value
process.runtime.name truffleruby
process.runtime.version 2.6.2
process.runtime.description truffleruby (Shopify) 20.0.0-dev-92ed3059, like ruby 2.6.2, GraalVM CE Native [x86_64-darwin]

Anuraag Agrawal and others added 5 commits September 18, 2020 12:46
@anuraaga
Copy link
Contributor Author

@fbogsany @tsloughter Thanks! I've updated erlang / ruby.

Maintainers, I'm hoping we can merge this and let follow up PRs take care of the remaining TODOs by the way :) Seems to have many approvals.

Comment on lines +42 to +45
| Value | Description |
| --- | --- |
| `gc` | Go compiler |
| `gccgo` | GCC Go frontend |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These seem to cover the Go language implementations that I'm aware of. I'm not sure we've ever tested with gccgo, but as an example it is fine.

Version information can be obtained from the runtime.Version() function and should indicate either a semver identifier or a commit hash and date in the case of a non-release build of the toolchain.

@bogdandrutu bogdandrutu merged commit 59f56b0 into open-telemetry:master Sep 22, 2020
tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-specification that referenced this pull request Nov 5, 2020
This topic has come up at least 3 times now. I believe a clarification is
warranted. The clarification is aligned with previous recommendations:
open-telemetry#789 (comment)
open-telemetry#882 (comment)
open-telemetry#1194 (comment)
bogdandrutu pushed a commit that referenced this pull request Nov 10, 2020
This topic has come up at least 3 times now. I believe a clarification is
warranted. The clarification is aligned with previous recommendations:
#789 (comment)
#882 (comment)
#1194 (comment)
jack-berg pushed a commit to jack-berg/semantic-conventions that referenced this pull request Nov 7, 2023
AlexanderWert pushed a commit to jack-berg/semantic-conventions that referenced this pull request Nov 10, 2023
jack-berg pushed a commit to jack-berg/semantic-conventions that referenced this pull request Nov 16, 2023
jack-berg pushed a commit to jack-berg/semantic-conventions that referenced this pull request Nov 16, 2023
jsuereth pushed a commit to jack-berg/semantic-conventions that referenced this pull request Nov 16, 2023
jsuereth pushed a commit to jack-berg/semantic-conventions that referenced this pull request Nov 16, 2023
jsuereth pushed a commit to open-telemetry/semantic-conventions that referenced this pull request Nov 16, 2023
jsuereth pushed a commit to open-telemetry/semantic-conventions that referenced this pull request Nov 16, 2023
joaopgrassi pushed a commit to dynatrace-oss-contrib/semantic-conventions that referenced this pull request Jan 10, 2024
joaopgrassi pushed a commit to dynatrace-oss-contrib/semantic-conventions that referenced this pull request Jan 10, 2024
pyohannes pushed a commit to pyohannes/semantic-conventions that referenced this pull request Jan 17, 2024
pyohannes pushed a commit to pyohannes/semantic-conventions that referenced this pull request Jan 17, 2024
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions release:after-ga Not required before GA release, and not going to work on before GA spec:resource Related to the specification/resource directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Semantic attributes for language runtime
10 participants