Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Resources.Container] Add Kubernetes support #1699

Open
wants to merge 39 commits into
base: main
Choose a base branch
from

Conversation

joegoldman2
Copy link
Contributor

Fixes #1562.
Only tested manually at the moment.

For significant contributions please make sure you have completed the following items:

  • Appropriate CHANGELOG.md updated for non-trivial changes
  • Design discussion issue #
  • Changes in public API reviewed

Copy link

codecov bot commented Apr 27, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 44 lines in your changes missing coverage. Please review.

Project coverage is 71.03%. Comparing base (71655ce) to head (0c3bad3).
Report is 359 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1699      +/-   ##
==========================================
- Coverage   73.91%   71.03%   -2.88%     
==========================================
  Files         267      315      +48     
  Lines        9615    11743    +2128     
==========================================
+ Hits         7107     8342    +1235     
- Misses       2508     3401     +893     
Flag Coverage Δ
unittests-Exporter.Geneva 53.20% <ø> (?)
unittests-Exporter.InfluxDB 94.65% <ø> (?)
unittests-Exporter.Instana 71.24% <ø> (?)
unittests-Exporter.OneCollector 91.51% <ø> (?)
unittests-Exporter.Stackdriver 75.73% <ø> (?)
unittests-Extensions 79.33% <ø> (?)
unittests-Extensions.AWS 77.24% <ø> (?)
unittests-Extensions.Enrichment 100.00% <ø> (?)
unittests-Instrumentation.AWS 80.59% <ø> (?)
unittests-Instrumentation.AWSLambda 87.96% <ø> (?)
unittests-Instrumentation.AspNet 74.85% <ø> (?)
unittests-Instrumentation.AspNetCore 85.27% <ø> (?)
unittests-Instrumentation.ElasticsearchClient 79.87% <ø> (?)
unittests-Instrumentation.EntityFrameworkCore 55.49% <ø> (?)
unittests-Instrumentation.EventCounters 76.36% <ø> (?)
unittests-Instrumentation.GrpcNetClient 79.61% <ø> (?)
unittests-Instrumentation.Hangfire 93.58% <ø> (?)
unittests-Instrumentation.Http 81.08% <ø> (?)
unittests-Instrumentation.Owin 83.43% <ø> (?)
unittests-Instrumentation.Process 100.00% <ø> (?)
unittests-Instrumentation.Quartz 78.94% <ø> (?)
unittests-Instrumentation.Runtime 100.00% <ø> (?)
unittests-Instrumentation.SqlClient 91.89% <ø> (?)
unittests-Instrumentation.StackExchangeRedis 68.02% <ø> (?)
unittests-Instrumentation.Wcf 48.91% <ø> (?)
unittests-PersistentStorage 65.44% <ø> (?)
unittests-Resources.AWS 74.08% <100.00%> (?)
unittests-Resources.Azure 78.35% <ø> (?)
unittests-Resources.Container 57.72% <49.42%> (?)
unittests-Resources.Gcp 72.54% <ø> (?)
unittests-Resources.Host 51.16% <ø> (?)
unittests-Resources.Process 81.81% <ø> (?)
unittests-Resources.ProcessRuntime 82.35% <ø> (?)
unittests-Sampler.AWS 87.97% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/OpenTelemetry.Resources.AWS/AWSEKSDetector.cs 57.89% <100.00%> (ø)
...y.Resources.Container/Models/K8sContainerStatus.cs 100.00% <100.00%> (ø)
...OpenTelemetry.Resources.Container/Models/K8sPod.cs 100.00% <100.00%> (ø)
...lemetry.Resources.Container/Models/K8sPodStatus.cs 100.00% <100.00%> (ø)
...esources.Container/ContainerResourceEventSource.cs 21.42% <21.42%> (ø)
...Telemetry.Resources.Container/ContainerDetector.cs 75.94% <73.46%> (ø)
...elemetry.Resources.Container/K8sMetadataFetcher.cs 0.00% <0.00%> (ø)

... and 311 files with indirect coverage changes

@joegoldman2 joegoldman2 changed the title Add Kubernetes support in Container Resource Detector [ResourceDetectors.Container] Add Kubernetes support Apr 27, 2024
@@ -18,6 +21,13 @@ public class ContainerResourceDetector : IResourceDetector
private const string Filepath = "/proc/self/cgroup";
private const string FilepathV2 = "/proc/self/mountinfo";
private const string Hostname = "hostname";
private const string K8sServiceHostKey = "KUBERNETES_SERVICE_HOST";
private const string K8sServicePortKey = "KUBERNETES_SERVICE_PORT_HTTPS";
private const string K8sNamespaceKey = "KUBERNETES_NAMESPACE";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this environment variable and KUBERNETES_CONTAINER_NAME available in all Kubernetes environments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I know, KUBERNETES_NAMESPACE and KUBERNETES_CONTAINER_NAME are not available by default in Kubernetes. The user will have to provide them using the downward API.

var @namespace = Environment.GetEnvironmentVariable(K8sNamespaceKey);
var hostname = Environment.GetEnvironmentVariable(K8sHostnameKey);
var containerName = Environment.GetEnvironmentVariable(K8sContainerNameKey);
var url = $"https://{host}:{port}/api/v1/namespaces/{@namespace}/pods/{hostname}";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the guarantee that this will be a well-formed URL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing I think but in this case an exception will be thrown, catched and logged using the EventSource.

private const string K8sServicePortKey = "KUBERNETES_SERVICE_PORT_HTTPS";
private const string K8sNamespaceKey = "KUBERNETES_NAMESPACE";
private const string K8sHostnameKey = "HOSTNAME";
private const string K8sContainerNameKey = "KUBERNETES_CONTAINER_NAME";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tricky part here - k8s doesn't have any standard variable to provide container name. It is entirely based on the user. I don't see any issues that extractor would expect it to be named "KUBERNETES_CONTAINER_NAME" - but probably it may be better to be passed by user to detector as constructor argument.
If it is passed by user, we can also report it as additional attribute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's the same for KUBERNETES_NAMESPACE in this case, which is not standard as well. Perhaps we can let the user supply the values via the constructor but also keep automatic detection via the environment variables if they are not supplied. What do you think?

If it is passed by user, we can also report it as additional attribute.

I'm not sure that's the purpose of this detector. From my point of view, the user can already use the env variable OTEL_RESOURCE_ATTRIBUTES or implement his own detector to add this resource attribute.

@Kielek Kielek added the comp:resources.container Things related to OpenTelemetry.Resources.Container label May 6, 2024
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label May 15, 2024
@iskiselev
Copy link
Contributor

Not stale. I'll try to finish review this week.

@iskiselev
Copy link
Contributor

Logic / code wise it looks good to me. But I fully support concerns of @Kielek about establishing and following standards across OTEL implementation. And it is tricky for that logic.

We are working on container resource detector package. Talking in terms defined in container and k8s resources, in this PR logic to extract container.id in Kuberentes environment was implemented. To do it, we need to know: k8s.container.name, k8s.namespace.name, and k8s.pod.name. There is well defined way to extract k8s.namespace.name, semi-working way to resolve k8s.pod.name and no way to resolve k8s.container.name. To overcome that, we suggest use user-defined environment variables KUBERNETES_CONTAINER_NAME and KUBERNETES_POD_NAME (with advices how to configure it in README.md.

There are attributes that "MAY be configurable via a dedicated environment variable" per resources and variables list, but all OTEL used variables use OTEL_ prefix.

Should that package be responsible for both cgroup containers and k8s container? Or should it be separate packages? Is it ok for detector to identify resource in another category? How detectors / detector package can reuse data collected by another resource detector?

SKD spec suggests:

Custom resource detectors related to generic platforms (e.g. Docker, Kubernetes) or vendor specific environments (e.g. EKS, AKS, GKE) MUST be implemented as packages separate from the SDK.
Resource detector packages MAY detect resource information from multiple possible sources and merge the result using the Merge operation described above.

With all of that: @Kielek, do you think we can merge it in current state (we still have pre-release of the package), and open issues here and semantic-conventions repo to discuss it? If not, I can suggest to create special user-provided configuration to enable K8s container resolution and do not resolve it from environment variable internally. Something that would be used like

var tracerProvider = Sdk.CreateTracerProviderBuilder()
                        // other configurations
                        .ConfigureResource(resource => resource
                            .AddContainerDetector(
                               settings => settings.EnableK8sResolution(
                                 Environment.GetVaraible("KUBERNETES_CONTAINER_NAME"),
                                 Environment.GetVaraible("KUBERNETES_POD_NAME")))
                        .Build();

@lachmatt
Copy link
Contributor

As this is something that can already be done by collector, as noted in this comment, I think it'd be better to have consensus on this approach in broader opentelemetry community, as suggested by @Kielek, before merging this PR.
I assume scenario with applications running in k8s cluster sending telemetry directly to the backend (e.g without collector deployed) is not that common.

@iskiselev
Copy link
Contributor

If I read k8s processor code correct, it in fact set container id if it was not set. For it, AttributeK8SContainerName should be set. Same is documented.

@iskiselev
Copy link
Contributor

iskiselev commented Jun 13, 2024

I have an idea how we can finish this PR and still be in full compliance with already approved semantic conventions and idea behind k8s processor implementation. Though it will have one small hack (that can be improved later once proper API will be provided).

K8s processor imply that user will provide k8s.container.name to be able to resolve container id. And user can do it already, using OTEL_RESOURCE_ATTRIBUTES environment variable. All it need to do is to add proper pairs there. Something like:

apiVersion: v1
kind: Pod
metadata:
  name: container-resolve-demo
spec:
  serviceAccountName: pod-reader-account
  volumes:
  - name: shared-data
    emptyDir: {}
  containers:
  - name: &container_name my-container-name
    image: ubuntu:latest
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
    env:
    - name: OTEL_RESOURCE_ATTRIBUTES_K8S_CONTAINER_NAME
      value: *container_name
    - name: OTEL_RESOURCE_ATTRIBUTES_K8S_POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: OTEL_RESOURCE_ATTRIBUTES
      value: k8s.pod.name=$(OTEL_RESOURCE_ATTRIBUTES_K8S_POD_NAME),k8s.container.name=$(OTEL_RESOURCE_ATTRIBUTES_K8S_CONTAINER_NAME)

Now, instead of accessing some environment variables, we just need to get attributes from OTEL_RESOURCE_ATTRIBUTES - effectively use one resource detector in another one. I have not seen it anywhere, but we can do something like:

internal static class K8sSemanticConventions
{
    public const string AttributePodName = "k8s.pod.name";
    public const string AttributeContainerName = "k8s.container.name";
}

private string? ExtractK8sContainerId()
{
  // Is there any way to do it better?
  var environmentVariableAttributes = ResourceBuilder.CreateEmpty()
          .AddEnvironmentVariableDetector().Build().Attributes;
  var podName = environmentVariableAttributes
      .FirstOrDefault(atr => atr.Key == K8sSemanticConventions.AttributeContainerName)
      .Value?.ToString();
  var containerName = environmentVariableAttributes
      .FirstOrDefault(atr => atr.Key == K8sSemanticConventions.AttributePodName)
      .Value?.ToString();
  ...
}

@joegoldman2
Copy link
Contributor Author

joegoldman2 commented Jun 14, 2024

And user can do it already, using OTEL_RESOURCE_ATTRIBUTES environment variable.

With this approach, the pod name and container name will be part of the Resource (as it's quite common to enable the env variable detector) and will be included in all traces/metrics/logs, which is not necessarily the desired behavior. 😕

@iskiselev
Copy link
Contributor

@joegoldman2, currently I see several options:

  • start new discussion in opentelemetry-specification about new environment variable
  • make customer pass container name manually as parameter
  • use container name as already specified resource
  • move K8s logic to separate detector that would require container name on constructor (and somehow make container detector never override container-id detected by controller)
  • do nothing, suggesting to use processor feature on collector (not everyone use collector!)

Or some combination of that approaches. I don't like last option (do nothing).

Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 22, 2024
@joegoldman2
Copy link
Contributor Author

@iskiselev I'm not really sure what to do next? Do you want to initiate a new discussion in opentelemetry-specification about new environment variables?

@github-actions github-actions bot added the comp:resources.aws Things related to OpenTelemetry.Resources.AWS label Jun 28, 2024
Copy link
Contributor

github-actions bot commented Jul 6, 2024

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jul 6, 2024
@joegoldman2
Copy link
Contributor Author

Not stale. Waiting for @iskiselev's feedback.

Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jul 14, 2024
@iskiselev
Copy link
Contributor

Not stale. open-telemetry/opentelemetry-specification#4140 opened to discuss next step.
@joegoldman2, if new config environment variables will be approved, we will still need to make it configurable though factory method parameters per:
https://github.com/open-telemetry/opentelemetry-specification/blob/27bc9ffad2b386626367dd2f98d0ba2ddf35a18f/specification/configuration/sdk-environment-variables.md?plain=1#L54

@github-actions github-actions bot removed the Stale label Jul 15, 2024
@joegoldman2
Copy link
Contributor Author

Ok, so I guess the next step is to define the desired API to allow users to set the values programmatically, right?

@iskiselev
Copy link
Contributor

@joegoldman2, I already suggested API at #1699 (comment), but any would works once it allows to specify values.
I'm not sure, whether we will see any move open-telemetry/opentelemetry-specification#4140. With program API we should be able to merge this PR even without environment values support if consensus would not be reached.
Btw, do you see an option to merge it already (with @Kielek approval) or is it requires anything else? I still fully agree with @Kielek , that we should either get common agreement on environment values usage and document them - or do not use them at all.

@iskiselev
Copy link
Contributor

And looks like new environment variable would not be added any time soon: open-telemetry/opentelemetry-specification#2891 (comment). So, using API-based values may be the only option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:resources.aws Things related to OpenTelemetry.Resources.AWS comp:resources.container Things related to OpenTelemetry.Resources.Container
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect container id reported
9 participants