
External Labels Overwrite existing Labels in Thanos #1579

Open

ebini opened this issue Sep 27, 2019 · 17 comments


ebini commented Sep 27, 2019

Hi,

I'm using a Prometheus federation installation which scrapes the data from 3 Prometheus instances. (Don't ask why; this is necessary at the moment.)

External Labels are:

Prometheus Federate:
app="federate"
env="prod"

Prometheus 1
app="abc"
env="dev"

Prometheus 2
app="abc"
env="test"

Prometheus 3
app="abc"
env="prod"
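
For illustration, this corresponds to the external_labels section in each instance's prometheus.yml, roughly as in this sketch (the file layout is an assumption; the values are taken from the list above):

    # Prometheus Federate (sketch)
    global:
      external_labels:
        app: federate
        env: prod

    # Prometheus 1 (Prometheus 2 and 3 differ only in the env value)
    global:
      external_labels:
        app: abc
        env: dev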

In the Prometheus Federation my queries look fine:
For example, on Prometheus (Federate) the query
mymetric{env="prod"}
returns the correct metric
mymetric{app="abc",env="prod"} = 0

But on the Thanos Query Gateway, the announced LabelSet of the Prometheus Federate is app="federate" and env="prod".

It seems to overwrite the existing labels in the metrics.
So if I do the same query:
mymetric{env="prod"}
I now get 3 metrics, all with the same labels:
mymetric{app="federate",env="prod"} = 0
mymetric{app="federate",env="prod"} = 0
mymetric{app="federate",env="prod"} = 0

It seems Thanos "overwrites" the existing labels with the announced ones.
I don't know if this is a bug or if I have done something wrong.
Perhaps you can help.
(Note: at the moment I have to scrape a Prometheus federation installation and cannot switch to Prometheus instances with the Thanos sidecar installed.)

Thanks in advance

@krasi-georgiev (Contributor)

Maybe try playing with the dedup labels in the querier.

https://github.com/thanos-io/thanos/blob/master/docs/components/query.md#deduplication

@GiedriusS (Member)

Hi, perhaps you need to specify --query.replica-label=app on Thanos Query?
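
For illustration, a minimal sketch of how that flag could be passed, assuming Thanos Query runs as a container whose arguments are defined in YAML (the deployment layout and store address are hypothetical; the flag itself is documented in the Thanos Query docs linked above):

    # Hypothetical container args for the Thanos Query component
    args:
      - query
      - --query.replica-label=app                 # treat "app" as a replica label, ignored during deduplication
      - --store=prometheus-federate-sidecar:10901 # hypothetical store endpoint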


ebini commented Oct 7, 2019

Thanks, but it doesn't change anything.
It still seems that a direct query to Prometheus doesn't return the same result as querying through Thanos.

And I don't have duplicate data in the Prometheus instances; these are not redundant Prometheus instances.


MarcMielke commented Dec 10, 2019

Same problem. It looks like the ingesting Prometheus instance wins and overwrites the external label(s) already set in the Prometheus instances being scraped from the federation endpoint. honor_labels seems to be ignored. Here is what I tried (where cluster is the external label):

    - job_name: monitoring/gke-preprod
      honor_timestamps: true
      honor_labels: true
      scrape_interval: 60s
      scrape_timeout: 30s
      metrics_path: /federate
      scheme: http
      params:
        match[]:
        - '{__name__=~".+"}'
      static_configs:
      - targets:
        - xx.xx.xx.xx:9090
        labels:
          cluster: gke-preprod


stale bot commented Jan 11, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 11, 2020
@QianChenglong (Contributor)

This problem is a bit annoying, hopefully it can be fixed.

@stale stale bot removed the stale label Jan 13, 2020

stale bot commented Feb 12, 2020

This issue/PR has been automatically marked as stale because it has not had recent activity. Please comment on status otherwise the issue will be closed in a week. Thank you for your contributions.

@stale stale bot added the stale label Feb 12, 2020
@stale stale bot closed this as completed Feb 19, 2020
@ashleyprimo

Hi 👋 - has anyone found a fix for this?

@GiedriusS GiedriusS reopened this Apr 6, 2022

ldb commented Apr 19, 2022

Hello,
we are having the same problem:

A federated Prometheus instance scrapes local Kubernetes pods and sets the external label env=dev-k8s.
However, it also scrapes metrics from EC2 instances which all contain the label env=dev-ec2.

When querying that instance directly, we can find the metrics using the label env=dev-ec2. However, when using the Thanos Query Frontend or Querier, all metrics have the label env=dev-k8s.

According to the Prometheus documentation, this should not happen:

Note that any globally configured "external_labels" are unaffected by this
setting. In communication with external systems, they are always applied only
when a time series does not have a given label yet and are ignored otherwise.

In other words, external labels should only be added to a metric if they do not already exist, or not?
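
To make the scenario concrete, here is a rough sketch of the federating instance's configuration as described above (the job name, target address, and the use of honor_labels/static labels are assumptions):

    global:
      external_labels:
        env: dev-k8s                     # set by this federating Prometheus
    scrape_configs:
      - job_name: ec2-nodes              # hypothetical job name
        honor_labels: true
        static_configs:
          - targets: ['10.0.0.1:9100']   # hypothetical EC2 target
            labels:
              env: dev-ec2               # label already present on these series

Querying this Prometheus directly returns env=dev-ec2 for those series; going through Thanos returns env=dev-k8s instead.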

@stale stale bot removed the stale label Apr 19, 2022

stale bot commented Aug 13, 2022

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Aug 13, 2022

ebini commented Aug 16, 2022

In my opinion this is still an issue.

@stale stale bot removed the stale label Aug 16, 2022

stale bot commented Nov 13, 2022

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@pahaeanx

This is still an issue in 0.31.0 and the 0.32 RC: the Thanos sidecar seems to always "apply" external_labels, even when a time series already has such a label set. This makes querying locally and via Thanos give different results, as explained here: #1579 (comment)

@GiedriusS (Member)

Yes, this is by design and I don't think there's anything to do here. Otherwise, the deduplication cannot work correctly. Please see: #6564, #6257 (comment).
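
A short illustration of that constraint, roughly (label values hypothetical): the querier identifies replicas purely by their announced external label sets.

    # Announced external label sets of two HA replicas of the same Prometheus:
    store_a: {cluster: prod, replica: "0"}
    store_b: {cluster: prod, replica: "1"}
    # With --query.replica-label=replica the querier drops "replica" and merges
    # series whose remaining labels are identical. If individual series could keep
    # conflicting values for "cluster", the two copies would no longer match and
    # deduplication would break.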

@pahaeanx

Thank you for the quick reply! I'm still a bit confused, however, as this seems like a valid use case.

This is a rather annoying problem in some situations and I think there's no easy workaround, unfortunately.

@fpetkovski (Contributor)

Can you change your external labels to not have conflicts with series labels?

@pahaeanx

Unfortunately, no. These external labels are used as selectors.

In our case it happens with a setup like this:

external_labels:
  cluster: A
  replica: 1
  
scrape_configs:
- job_name: 'a-single-job-for-some-other-cluster'
  static_configs:
  - targets: ['target:9100']
    labels:
      cluster: B
      
... a bunch of scrape jobs for targets in our cluster (A).

As this Prometheus only has (literally) one job scraping another "cluster" (using the term lightly here), this seems like the perfect solution. And indeed, querying metrics on the Prometheus itself yields the correct cluster=B label. Using Thanos yields cluster=A, as the sidecar indiscriminately applies the external label(s).
