Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a mechanism to ignore certain receivers when deriving service from config #527

Closed
BinaryFissionGames opened this issue Nov 10, 2021 · 9 comments · Fixed by #558
Closed
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@BinaryFissionGames
Copy link

BinaryFissionGames commented Nov 10, 2021

The specific receiver configuration that sparked this issue:

receivers:
      kubeletstats/node:
        auth_type: serviceAccount
        collection_interval: 1m
        endpoint: ${K8S_NODE_NAME}:10250
        insecure_skip_verify: true
        k8s_api_config:
          auth_type: serviceAccount
        metric_groups:
        - node
      kubeletstats/pods:
        auth_type: serviceAccount
        collection_interval: 1m
        endpoint: ${K8S_NODE_NAME}:10250
        insecure_skip_verify: true
        k8s_api_config:
          auth_type: serviceAccount
        metric_groups:
        - pod

This fails, because the service derives that there are two endpoints, each with port 10250, giving this error:

{"level":"error","ts":1636559029.2833467,"logger":"controllers.OpenTelemetryCollector","msg":"failed to reconcile services","error":"failed to reconcile the expected services: failed to apply changes: Service \"opentelemetry-node-agent-collector\" is invalid: spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:10250, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}","stacktrace":"github.com/open-telemetry/opentelemetry-operator/controllers.(*OpenTelemetryCollectorReconciler).Reconcile\n\t/workspace/controllers/opentelemetrycollector_controller.go:147\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:214"}

The problem here is that endpoint doesn't define an endpoint for something connecting to the collector, rather it's an endpoint that is scraped by the collector. These ports shouldn't be mapped in the service, since they are not for inbound connections.

Currently the workaround for this is to not use endpoint for this specific scenario and have the receiver figure it out, but I could see this being a problem with other "scraping" receivers as well.

Ideally, there'd be a way to skip gathering ports from certain receivers, or even a way to disable the port auto-detection as a whole.

@jpkrohling jpkrohling added good first issue Good for newcomers bug Something isn't working labels Nov 11, 2021
@mritunjaysharma394
Copy link
Contributor

Hi @jpkrohling, I have understood some parts of the issue and some parts I am processing more, I would love to work on this issue 😄

@jpkrohling
Copy link
Member

I think @VineethReddy02 worked recently on this so he might have a clear understanding of what needs to be done, but from what I remember, the problem is that we are using a generic parser for this receiver, which is a scraper, and we shouldn't parse scrapers at all.

If you need more help, let me or @VineethReddy02 know.

@shree007
Copy link
Contributor

@jpkrohling
i may take as second choice.😊

@jsirianni
Copy link
Member

I can assist in replicating this issue if needed. The workaround for me was to use spec.hostNetwork: true instead of setting endpoint (this is specific to the kubeletstats receiver).

Note: a missing or empty endpoint will cause the hostname on which the collector is running to be used as the endpoint. If the hostNetwork flag is set, and the collector is running in a pod, this hostname will resolve to the node's network namespace.

@mritunjaysharma394
Copy link
Contributor

Thanks a lot for all the help and making it much more easier for me to understand it better @jpkrohling 🥺

I will try to replicate the issue to understand it even better and then will suggest the approach to fix this as PR and will surely ping you or @VineethReddy02 or @jsirianni for any help that is needed, thanks a lot again for all the help to everyone 😄

@VineethReddy02
Copy link
Contributor

I'm able to reproduce this issue. It looks like we expect endpoint field in the config only for receivers to expose the port via k8s services. So this needs to be handled with a special validation in the generic receiver parser to ignore scarpers with the field endpoint. To be more precise the fix needs to be added here

@mritunjaysharma394
Copy link
Contributor

Hi @VineethReddy02, thanks a lot for this detailed help, I think I understood some part of it but I have a few questions regarding how to ignore scrapers? And do we need another case before default which determines if we have to expose the port via k8s services or its just an if-else condition below default that will handle the validation? And if there are any existing or similar examples of such a validation that ignores scrapers, then that can be a great help in co-relating it with 😄

Thanks a lot again for all the help!

@VineethReddy02
Copy link
Contributor

The scrapers are by default ignored. In operator, we have custom logic for all receivers that aren't scrapers to expose the service ports as provided in receiver config, so the leftover receivers are considered as scrapers. The root cause behind this issue is we have a generic_receiver parser that looks for a field endpoint and tries exposing it. So this is corner case a receiver i.e. is a scraper contains a field endpoint. So should add a check if the receiver is kubeletstats ignore the endpoint field.

@mritunjaysharma394
Copy link
Contributor

Thanks a lot for all the help and guidance @VineethReddy02 , I have created a PR which is an attempt to work on it after reading about it more from here https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/kubeletstatsreceiver

Looking forward for the feedback to make it better, thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
6 participants