-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kubernetes: How to populate resource attributes based on attributes, labels and transformation #1756
base: main
Are you sure you want to change the base?
Kubernetes: How to populate resource attributes based on attributes, labels and transformation #1756
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love the overall direction. This makes a lot of sense
We briefly discussed this during the k8s SemConv WG meeting. There is no consensus on the document yet, but we decided to recommend moving it from the |
docs/resource/k8s.md
Outdated
- `pod.annotation[resource.opentelemetry.io/service.version]` | ||
- `pod.label[app.kubernetes.io/version]` (well-known label | ||
[app.kubernetes.io/version](https://kubernetes.io/docs/reference/labels-annotations-taints/#app-kubernetes-io-version)) | ||
- `if (contains(container docker image tag, '/') == false) container docker image tag` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we bail if the tag has /
? The service attribute conventions don't define a format for version. If none of the other options work, wouldn't using the tag be better (not sure what the sdk defaults to)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure about this part of the spec.
I took this from the way the operator logic - collector does it a bit differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've found the reason - it was a guard for a port number separator as can be seen here: https://github.com/distribution/reference/blob/284a39eaf3368476e0a4c6114a0eec61220acdd9/reference_test.go#L250
I've updated the spec to refer to that library - which is already used by the collector for container.image.id
-
... with the notable difference that I think we should include the digest if available, it's the best source of truth for the service version
docs/resource/k8s.md
Outdated
Choose the first value found: | ||
|
||
- `pod.annotation[resource.opentelemetry.io/service.instance.id]` | ||
- `concat([k8s.namespace.name, k8s.pod.name, k8s.container.name], '.')` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed service.instance.id
in the semconv meeting and had some doubts about the scenario where a container is restarted without a pod recreate (e.g. kubelet restarting a container for OOM). The confusion was whether this id should change to reflect that a new instance of the container was started which is what would happen i think if we left this to an sdk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The convention for service.instance.id
states that it MUST be unique for each instance of the same service.namespace,service.name pair (in other words service.namespace,service.name,service.instance.id triplet MUST be globally unique).
For a pod which is owned by a statefulset this is not the case. I guess the question is whether this id should be used to differentiate between different instances of a service at a given time or across all instances ever run. In case of a statefulset, the id is not unique enough to differentiate between instances of a pod with same sts ordinal but k8s will only run one instance of a pod with a given ordinal (the id will remain same after an sts pod is recreated). If we want a different id across pod recreates, switching to pod uid instead of name would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've generally interpreted "globally unique" here to mean unique at a given time, rather than unique across all time. IMO this is somewhat moot since SDKs are going to start generating unique IDs for service.instance.id anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this is somewhat moot since SDKs are going to start generating unique IDs for service.instance.id anyways.
Are sdks going to ignore any service.instance.id
values passed in with the env var OTEL_RESOURCE_ATTRIBUTES
? My understanding is that the documented values are currently being calculated and set in this var by operator when auto-instrumenting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this is somewhat moot since SDKs are going to start generating unique IDs for service.instance.id anyways.
Are sdks going to ignore any
service.instance.id
values passed in with the env varOTEL_RESOURCE_ATTRIBUTES
? My understanding is that the documented values are currently being calculated and set in this var by operator when auto-instrumenting.
no, SDKs must honor whatever attributes are set using OTEL_RESOURCE_ATTRIBUTES
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've generally interpreted "globally unique" here to mean unique at a given time, rather than unique across all time.
that is also my understanding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @jsuereth for the feedback on this question, which I have now included in the PR itself:
Note that the container restart count is not included in the
service.instance.id
calculation, because it makes
troubleshooting harder when the ID changes on every restart, e.g. in a crash loop.
@open-telemetry/specs-semconv-approvers the service page is automatically generated - how can I move the content of this PR to the service page? |
maybe as subsection(s) under https://github.com/open-telemetry/semantic-conventions/blob/main/docs/resource/README.md#service |
…d prod (deployment.envirnoment.name)
604cdc9
to
d850c85
Compare
@jsuereth @tigrannajaryan thanks for the positive feedback in the entities SIG call. A added the notes about the object hierarchy for |
@jinja2 thanks for the detailed review - everything should be addressed now |
Fixes #236
This is a new take on the issue. #349 has been created before and not completed.
So what's the difference now?
service.instance.id
has been completed - which was effectively a blocker before.Merge requirement checklist
[chore]