tikv should provide an option to return less metrics #12355
Comments
ref #12355 Signed-off-by: glorv <glorvs@163.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
ref tikv#12355 Signed-off-by: glorv <glorvs@163.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> Signed-off-by: Yu Juncen <yujuncen@pingcap.com>
ref tikv#12355, ref tikv#12417 Signed-off-by: glorv <glorvs@163.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> Signed-off-by: Yu Juncen <yujuncen@pingcap.com>
I have a few questions about reducing the number of series:
This is one step toward reducing the cost of TiDB Cloud. The goal is to reduce the total size to 1/10 if possible.
We want to reduce the overall size of the metrics stored in Prometheus.
@glorv Thanks for the information. If these metrics can be ignored, how about simply removing them? I'm worried about the usefulness of providing a complex level switch in TiKV, as other people may simply find it hard to configure. IMO, for the purpose of reducing metric cost, there are also other options:
I'm not sure whether these could be better alternatives to the current implementation in #12732, if they were not considered previously. Maybe they are worth researching later. Anyway, it's always good to start with #12732 as the first iteration. Great job!
@zhangjinpeng1987 @BusyJay @kevin-xianliu, maybe we can consider VictoriaMetrics. It provides memory, storage space, and performance improvements. If we replace Prometheus with VictoriaMetrics, it seems we can keep all metrics at a much lower storage cost (7x less than Prometheus). What do you think?
I also considered this, as histograms are the biggest reason for so many metrics. In my test, we may reduce histogram size in 3 ways:
I have consulted some colleagues. They thought all of the metrics were useful in some specific scenarios, so we still need to provide the ability to return all metrics dynamically.
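As a rough illustration of why histograms dominate the series count: in the Prometheus data model, each histogram exposes one series per bucket (including the implicit `+Inf` bucket) plus a `_sum` and a `_count` series, for every label combination. The sketch below only does that arithmetic; the function name and the bucket and label counts are hypothetical and not taken from TiKV.

```python
def histogram_series(finite_buckets: int, label_combinations: int) -> int:
    """Estimate the number of time series produced by one histogram.

    Per label combination, a Prometheus histogram exposes:
    the finite buckets, the +Inf bucket, the _sum series and the _count series.
    """
    per_combination = finite_buckets + 1 + 2  # +Inf bucket, then _sum and _count
    return per_combination * label_combinations


# Hypothetical numbers, for illustration only.
before = histogram_series(finite_buckets=20, label_combinations=10)
after = histogram_series(finite_buckets=10, label_combinations=10)
print(before, after)
```

Under these assumed numbers, halving the bucket count takes the histogram from 230 series to 130.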
Just came up with a simpler way to "filter core metrics" without touching the TiKV code base: we can write allowlist rules in Prometheus.yaml in TiDB Cloud and drop others. When we want to collect more metrics, we can update the Prometheus config file and reload it. Advantages:
What do you think?
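To make the allowlist idea concrete: in Prometheus this is typically done with `metric_relabel_configs`, using `action: keep` on the `__name__` label with a fully anchored regex. The Python sketch below only models that keep/drop decision on scraped text; the metric names in the allowlist are hypothetical placeholders, not a vetted set of core TiKV metrics, and the helper names are made up for this example.

```python
import re

# Hypothetical allowlist; placeholder names, not a real "core metrics" list.
ALLOWLIST = re.compile(
    r"tikv_example_requests_total"
    r"|tikv_example_duration_seconds_(bucket|sum|count)"
)


def metric_name(sample_line: str) -> str:
    """Return the metric name from one text-exposition sample line."""
    # The name ends at the first '{' (start of labels) or space (start of value).
    return re.split(r"[{ ]", sample_line, maxsplit=1)[0]


def keep_allowed(exposition: str, allow: re.Pattern = ALLOWLIST) -> list:
    """Keep only samples whose metric name fully matches the allowlist."""
    kept = []
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        # fullmatch mirrors the anchored matching of Prometheus relabel regexes.
        if allow.fullmatch(metric_name(line)):
            kept.append(line)
    return kept
```

Widening the collection later would then only mean editing the allowlist pattern, which corresponds to updating and reloading the Prometheus config rather than changing TiKV.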
I would also like to change the TiKV source as little as possible.
Besides these drawbacks, I'm also in favor of this approach as it is more flexible. @BusyJay PTAL.
👍 Both VictoriaMetrics and configuring prometheus.yaml LGTM. It also seems VictoriaMetrics can be configured as the storage for Prometheus, so it may reduce the complexity a lot. On the other hand, the allowlist can also be used to find metrics that we think are important but are not in practice. We can remove them in the end.
Feature Request
Is your feature request related to a problem? Please describe:
Currently, TiKV produces too many metrics, which can consume a lot of network bandwidth and storage space.
Describe the feature you'd like:
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Migration Strategy: