add metrics, metrics analysis, and concurrency #37
Reminder: This is a public repo
This PR adds a new command to export metrics and their associated tags. There will always be far more metrics than dashboards or monitors, so running this command takes quite a while in our Datadog account. I tried to speed it up by adding some concurrency, but even a small amount gets us rate limited by Datadog. A future PR should add retries on 429s from Datadog. Docs on that here.
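For that follow-up, here is a rough sketch of what retrying on 429s with a small worker pool could look like. It is not the code in this PR: `RateLimited`, `with_retries`, `export_all`, and the `fetch` callable are all hypothetical names, and `retry_after` stands in for whatever wait hint the rate-limited response carries.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimited(Exception):
    """Hypothetical error a fetch function raises when it sees an HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited (429)")
        self.retry_after = retry_after  # seconds to wait, if the server gave a hint


def with_retries(fetch, name, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(name), backing off and retrying whenever it raises RateLimited."""
    for attempt in range(max_attempts):
        try:
            return fetch(name)
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Prefer the server's hint; otherwise use exponential backoff with jitter.
            delay = err.retry_after
            if delay is None:
                delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


def export_all(fetch, names, workers=2, **retry_kwargs):
    """Fetch tags for every metric name using a small, bounded thread pool."""
    names = list(names)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda n: with_retries(fetch, n, **retry_kwargs), names)
        return dict(zip(names, results))
```

Keeping `workers` low and letting the backoff absorb the occasional 429 is probably a better trade than raising concurrency, since every rejected request still costs a round trip.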
This PR also adds a new command, `metrics-analysis`, that writes a file, `analysis.json`. The output of this command is an ordered list of tags across all metrics with the given prefix. So you can run it on an individual metric (`datadog-exporter metrics-analysis system.cpu.user`) or on multiple metrics with the same prefix (`datadog-exporter metrics-analysis system.cpu`).

The goal with the analysis command is that we can take a high-cardinality metric namespace like `server.*` and run this command to get a list of all custom tags used by that namespace. We can feed that into Metrics without Limits and exclude the high-cardinality tags.

One caveat: this never seems to output the `host` tag, and I'm not sure why. It appears the `host` tag is a special Datadog thing that has its own API and doesn't show up in the requests for all other tags. Just a way for them to extract money from people who don't even want the tag but can't remove it.
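To make the analysis step concrete, here is a minimal sketch of the idea, not the actual implementation. The function name `analyze_tags`, the input shape (metric name mapped to a list of `key:value` tags), and the ordering (most widely used tag key first) are all assumptions for illustration.

```python
from collections import Counter


def analyze_tags(metric_tags, prefix):
    """Return (tag_key, count) pairs for metrics whose name starts with prefix.

    metric_tags maps a metric name to its list of "key:value" tags (assumed shape).
    Each tag key is counted once per metric. Results are sorted by count,
    highest first, with ties broken alphabetically.
    """
    counts = Counter()
    for name, tags in metric_tags.items():
        if not name.startswith(prefix):
            continue
        # Keep only the key part of "key:value"; a bare tag is its own key.
        keys = {tag.split(":", 1)[0] for tag in tags}
        counts.update(keys)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    sample = {
        "system.cpu.user": ["env:prod", "region:us"],
        "system.cpu.idle": ["env:dev"],
        "system.mem.used": ["env:prod", "pod:a"],
    }
    print(analyze_tags(sample, "system.cpu"))
```

Because the match is a plain string prefix, `system.cpu` would also pick up a metric named `system.cpufreq`, so a trailing dot may be worth adding when a whole namespace is intended.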