datadog: discover host inventory tags from environment or metrics stream #29700
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Related issue: the host metadata exporter does not seem to respect the configured or discovered host name, using the internal cloud-provider host ID instead: #29866
This issue is closely related to #29866 (comment). It looks like the exporter has some incomplete code for tag discovery on GCP as part of opentelemetry-mapping-go, and some hacks for AWS tagging in opentelemetry-collector-contrib's exporter/datadogexporter/internal/hostmetadata, but nothing consistent or clear.
Hey, thanks for this issue. In general I agree this is not supported, but let me try to reply to some of the parts of your message to clarify and expand on what that means:
I mentioned this briefly on #29741 (comment): we are working on something like this. It's not going to work exactly like this (as I said on #29866 (comment), this is not how …
We are working on improving our docs as well, both on the hostname part as well as the host metadata work I mentioned above. Stay tuned!
We have a common repository for the OpenTelemetry mapping since we also reuse it in the Datadog Agent's OTLP ingest implementation; that's why it's in a separate repository. No matter how we do it, the implementation is going to be 'hidden' from one of the two repositories (Agent and Collector). If you think there is a way to improve visibility around this, I am happy to take any specific feedback you have.
Agreed, this is not something I can personally help with since it's a general Datadog aspect, but I can relay the feedback.
We do map some of them (https://github.com/DataDog/opentelemetry-mapping-go/blob/a7afc4a370f8df1ada06e2af22fde3ee1d0dd84e/pkg/otlp/attributes/attributes.go#L28-L95) and other users are using this mapping. If this isn't working for you, it might be a bug on our end or a misconfiguration on yours.
@mx-psi The mappings you list at the end of your comment work fine for telemetry. The issue is that there's no promotion of important ones like `cloud.availability_zone` to host tags in the host inventory.

It appears that your recent changes in #30680 may offer a workaround for this, once suitably documented, going by the changelog entry at https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/tag/v0.93.0.
If I understand correctly, this allows the telemetry stream to mark a resource as being a source for DD host tags, and to copy any resource attributes it wants to appear as host tags by giving them the designated prefix.
@mx-psi I've tested the new functionality added to the DD exporter and can indeed see host tags delivered, but:

a. it seems to break the node configuration info; and
b. the tags are only delivered after a very long delay.

Very delayed tag sending

It seems to have a bug: if there is only one resource with the marker attribute set, the host tags take a very long time to appear. At a guess, if the telemetry payload carrying the marker attribute arrives after the initial host metadata payload has already been sent, an updated payload is not sent again promptly.
With only one resource having the marker attribute, host metadata was sent initially (then no further for 30min); with two or more resources having it, the results differed.
Confirmed that if I wait long enough, DD will eventually send the extra tags; there's just a long delay. So it looks likely there's a missed cache invalidation in there, where it doesn't send host metadata when discovering dynamic tags for the first time.

Host info lost when tags enabled

Interestingly, it seems to send the tags separately from the rest of the metadata, and omits other host info when tags are sent: the payload that has a non-empty tag list carries little else, while the host info payloads with non-empty system details carry no tags.
Also noteworthy: this appears to clobber the host system info sent to DD, which is normally populated (per the screenshot above), so the tags feature looks to introduce a regression too.

Configuration

The config snippet I'm using to set the attributes is roughly as follows.
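(A sketch only: the `datadog.host.use_as_metadata` marker attribute and `datadog.host.tag.` prefix here are as I understand them from the v0.93.0 changelog, and the tag names and values are placeholders rather than my exact config.)

```yaml
processors:
  resource/dd-host-tags:
    attributes:
      # Mark this resource as a source of Datadog host metadata / host tags
      # (marker attribute name as described in the v0.93.0 changelog; assumption).
      - key: datadog.host.use_as_metadata
        value: true
        action: upsert
      # Attributes with the assumed datadog.host.tag. prefix should be copied
      # to host tags; the tag names and env-var values below are placeholders.
      - key: datadog.host.tag.availability-zone
        value: ${env:NODE_ZONE}
        action: upsert
      - key: datadog.host.tag.instance-type
        value: ${env:NODE_INSTANCE_TYPE}
        action: upsert
```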
@mx-psi Not sure if you saw the above ^ outcome of testing the new host-tags functionality.
@ringerc I was on vacation, thanks for the ping. Will take me some time to go through the resulting backlog but will try to have a look at this by end of next week |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping the code owners.

Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
This issue has been closed as inactive because it has been stale for 120 days with no activity. |
Component(s)
exporter/datadog
Is your feature request related to a problem? Please describe.
When the datadog exporter sends a host inventory entry to Datadog's backend, it does not have any auto-discovered tags associated with it. Nor could I find clear documentation of the set of tags Datadog expects (like availability-zone), or of the structure/format of the values it accepts for each cloud provider.
This causes all hosts to show up in the host inventory in the "no-availability-zone" group, with no tags.
The collector may have pipelines configured with processors to enrich the metrics and logs streams with appropriate resource attributes like `cloud.availability_zone`. These appear to be ignored by the Datadog exporter's `host_metadata` handling, and there's no clear, documented means I can find of setting them appropriately.

Describe the solution you'd like
If the `datadog.host_metadata.hostname_source` option is set to `first_resource`, OpenTelemetry semantic conventions should be used to map the standard tags on the resource payload to Datadog's internally expected tags for the host metadata. This mapping should be clearly documented - even if it's just a link to the relevant part of the code from the README, not hidden away in some other repo's golang code.

This would "do the right thing" for a daemonset-based collector that uses the cloud discovery resource processor and/or k8s attributes processor.
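For example, a daemonset-style collector configured roughly like this (a sketch only; the OTLP receiver and GCP detector are just examples, and the exact wiring depends on the deployment):

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}

processors:
  # Discovers cloud.provider, cloud.region, cloud.availability_zone, host.id,
  # etc. from the cloud provider's metadata endpoint.
  resourcedetection:
    detectors: [env, gcp]
    timeout: 2s
  # Adds k8s.node.name, k8s.namespace.name, k8s.pod.name, ...
  k8sattributes: {}
  batch: {}

exporters:
  datadog:
    api:
      key: ${env:DD_API_KEY}
    host_metadata:
      enabled: true
      hostname_source: first_resource

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [resourcedetection, k8sattributes, batch]
      exporters: [datadog]
```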
Ideally some criterion should be supported to specify a specific metric that should be matched for these values, rather than just picking whatever comes first. That would ensure more stable and reliable node tags.
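Purely to illustrate the request (this option does not exist today; the `resource_selector` keys below are hypothetical), something like:

```yaml
exporters:
  datadog:
    host_metadata:
      hostname_source: first_resource
      # Hypothetical, not an existing option: only use resources that match
      # this selector as the source of host metadata and host tags.
      resource_selector:
        metric_names: [system.cpu.load_average.1m]
        required_attributes:
          k8s.node.name: ${env:K8S_NODE_NAME}
```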
The set of host tags that the datadog backend ascribes special meanings to should be clearly documented.
The interaction of any explicit list of tags set as `datadog.host_metadata.tags` with auto-discovered tags should be documented.

Describe alternatives you've considered
It could be possible to inject the tags manually by setting `datadog.host_metadata.tags` using env-vars injected into the collector's DaemonSet workload via external means.
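For example, something like the following, where the env-vars would have to be injected by whatever deploys the DaemonSet (a sketch, with placeholder tag names):

```yaml
exporters:
  datadog:
    host_metadata:
      enabled: true
      # Static host tags; the collector expands ${env:...} at config load time,
      # so the values must be provided as environment variables on the pod.
      tags:
        - "availability-zone:${env:NODE_ZONE}"
        - "instance-type:${env:NODE_INSTANCE_TYPE}"
```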
But this is difficult and impractical. Not all of that information is necessarily known by whatever is deploying the workload, or available in the format that Datadog expects. It's also hard to know which tags DD actually expects to have values, and what "spelling" of those values it expects for e.g. cloud provider zone names.
The kube downward API is not suitable for this because most of the desirable information is present as labels on the kube Node, but not easily injected into the workload definition (DaemonSet etc.) or Pod. The downward API does not provide a means of injecting labels from the containing node into a workload. So while Node labels like `topology.kubernetes.io/zone`, `topology.kubernetes.io/region` and `node.kubernetes.io/instance-type` are present, they are not easily mapped to env-vars that can be interpolated into the `tags` values.
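For example, a DaemonSet pod can expose its own node's name via the downward API, but there is no fieldPath that reaches the Node's labels (sketch of a pod spec fragment):

```yaml
# spec.nodeName is available, but metadata.labels refers to the Pod's own
# labels, not the Node's, so node labels such as topology.kubernetes.io/zone
# can't be injected this way.
env:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  - name: POD_ZONE_LABEL
    valueFrom:
      fieldRef:
        # This reads the Pod's label; there is no equivalent fieldPath for
        # the labels of the Node the pod is scheduled onto.
        fieldPath: metadata.labels['topology.kubernetes.io/zone']
```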
The plugin doesn't appear to support using the kube API to discover and map node metadata. Nor should it, really; it'd be better to delegate this to the resource processors.
The kube downward API doesn't support mapping node labels to pod workloads: kubernetes/kubernetes#40610. Even if it did, that'd be verbose and unnecessary configuration when the collector should be able to query the kube apiserver to get this info, or read it via a processor.
Available workarounds are very ugly, see e.g. https://gmaslowski.com/kubernetes-node-label-to-pod/
Additional context
See how a sample set of metrics has sensible resource attributes, but these aren't reflected in the host tags or mapped to Datadog's "standard" tag names?
Also note the hostname is the internal cloud provider ID of the node, even though I actually set "datadog.hostname" in the config to the k8s node name.