Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Environment variables to provide K8s container and pod name #4140

Open
iskiselev opened this issue Jul 11, 2024 · 7 comments
Open

Environment variables to provide K8s container and pod name #4140

iskiselev opened this issue Jul 11, 2024 · 7 comments
Assignees
Labels
sig-issue spec:miscellaneous For issues that don't match any other spec label triage:deciding:tc-inbox

Comments

@iskiselev
Copy link

iskiselev commented Jul 11, 2024

What are you trying to achieve?
There is work to implement container id detector for K8s environment for dotnet open-telemetry/opentelemetry-dotnet-contrib#1699 and js open-telemetry/opentelemetry-js-contrib#1962. To make possible resolution of container id, resource detector need to know container name and pod name. There is no standard way to pass that information into pods.

Can we define otel-specific environment variable to pass that information so that different language detectors may work the same?
Suggested names either:

  • KUBERNETES_CONTAINER_NAME and KUBERNETES_POD_NAME
  • OTEL_KUBERNETES_CONTAINER_NAME and OTEL_KUBERNETES_POD_NAME

What did you expect to see?
Environment variable name added to https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/sdk-environment-variables.md

Additional context.
The container.id resource can be fetched using k8sattributeprocessor, but it requires to use collector and do additional configuration. Using k8sattributeprocessor is much more efficient if many pods need to resolve container.id.
Implementation of logic in resource detector use http call to k8s service to resolve container.id (and proper rights should be granted). That approach is relatively slow and may conflict with requirement

Resource detection logic is expected to complete quickly since this code will be
run during application initialization. Errors should be handled as specified in

Otel semantic convection already specified resource attribute for required data: k8s.pod.name and k8s.container.name. There is no well-defined way to use resource from other resource detectors - and we still don't know how they can be set. We can try hack with setting them through OTEL_RESOURCE_ATTRIBUTES environment variable, but after it the code would require duplicate standard mechanics of extracting resource from that environment variable and additional resource would be included in each trace, which may be not desirable. There is no way in language agent to filter out some resources from OTEL_RESOURCE_ATTRIBUTES (still can be achieved with collectors - but in that case k8sattributeprocessor would be better approach.)

@danielgblanco
Copy link
Contributor

FYI there's a moratorium on adding any new env variables, see #2891 (comment)

@iskiselev
Copy link
Author

@danielgblanco, looks like we should not be affected by strong form of moratorium - as we are asking to add 2 environment variables with primitive string type with predefined name.
I'm not sure if it should be blocked as "non-essential use-case".

@dashpole
Copy link
Contributor

dashpole commented Aug 7, 2024

Having in-process resource detectors make k8s api calls seems like a poor UX. It requires additional k8s rbac permissions, and individual pod get requests can generate a lot of traffic for the k8s apiserver at scale (especially if workloads are crash-looping). I would recommend this behavior be opt-in if we do add it.

@jsuereth
Copy link
Contributor

jsuereth commented Aug 7, 2024

The current entities WG proposal for improving resource detection would include a generic mechanism for injecting a set of resource attributes via env variables such that you would NOT conflict with other resource detection.

See: https://docs.google.com/document/d/1QvkLQra8nPst740sSTwAhehMtLxP7q3XP4TBH-qrwEQ/edit

@danielgblanco
Copy link
Contributor

I agree @dashpole and, in fact, this is the way we (widely) handle it at scale at Skyscanner where our deployment pipelines inject env vars that our OTel SDK config bundle then translates into resource attributes. With up to 100k pods running and high pod churn we try to avoid the K8s resource detector and favour env vars where we can. We still use the k8sattributeprocessor in Collectors, but only for certain cases where we need to get attributes from pod IPs pushing telemetry data. Looking forward to the proposal from the Entities WG!

@jsuereth
Copy link
Contributor

Here's the official OTEP: open-telemetry/oteps#264

@jsuereth
Copy link
Contributor

To expand on my previous comment - The TC believes we need a feature to solve this issue but it may not take the form proposed here. This will be completed as part of the Entities WG. See the OTEP proposal above for how we think this will be solved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig-issue spec:miscellaneous For issues that don't match any other spec label triage:deciding:tc-inbox
Projects
Status: No status
Development

No branches or pull requests

6 participants