Are huge amounts of 'sample timestamp out of order' logs normal? #3411
Comments
This happens when Prometheus sends old samples for series for which Cortex already has newer samples; Cortex can only ingest samples for a given series in timestamp order.
On restart, Prometheus tries to push all data from its WAL to the remote write endpoint; it doesn't remember how much it had already pushed before the restart. Another source of "sample timestamp out of order" is running multiple Prometheus servers that push the same metrics without proper HA deduplication configured in Cortex (https://cortexmetrics.io/docs/guides/ha-pair-handling/).
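For context, the HA deduplication described in that guide boils down to electing one replica per cluster and dropping writes from the other replicas before they reach the ingesters. A minimal sketch of the idea (this is not Cortex's actual ha_tracker implementation, which also handles leader failover via a KV store; the names here are illustrative):

```go
package main

import "fmt"

// haDedup is a toy version of HA replica election: the first replica seen for
// a cluster becomes the leader, and samples from any other replica of the same
// cluster are dropped instead of being ingested (ingesting them would produce
// "out of order" / duplicate errors, since both replicas scrape the same targets).
type haDedup struct {
	elected map[string]string // cluster label -> elected replica label
}

func newHADedup() *haDedup {
	return &haDedup{elected: map[string]string{}}
}

// accept reports whether a write from the given cluster/replica pair should be ingested.
func (h *haDedup) accept(cluster, replica string) bool {
	leader, ok := h.elected[cluster]
	if !ok {
		h.elected[cluster] = replica // elect the first replica we see
		return true
	}
	return leader == replica
}

func main() {
	d := newHADedup()
	fmt.Println(d.accept("dc1", "prom-a")) // true: prom-a becomes the leader for dc1
	fmt.Println(d.accept("dc1", "prom-b")) // false: non-leader replica, samples dropped
	fmt.Println(d.accept("dc1", "prom-a")) // true: the leader keeps writing
}
```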
See above.
It can indicate frequent Prometheus restarts or a misconfiguration on the Cortex side.
Not that I know of. But you're right that it's logging too much, and we can improve it.
Thanks for explaining, but this lasts for a LONG time. I restarted my Cortex 11 minutes ago and the logs are still full of these messages. How long should it take for them to stop?
That depends on the cause of the problem. If you have multiple Prometheus servers sending the same series (common in an HA setup), then it will only stop once Cortex is properly configured. See the linked article above.
No, I do not have a setup like that. I have a single Prometheus instance per DC, and there should be no duplicates. For example: I restarted 2 out of 3 Cortex instances, and the two restarted ones are spamming their logs with copious amounts of these warnings.
Do you have multiple DCs? Is it possible that they are sending the same series to Cortex? Those …
You may want to use …
No, it is not possible. Examples:
Labels include fleets and hostnames. Each DC has one Prometheus instance, and each scrapes only its own DC.
Looks like it calmed down for the most part on one of the hosts 30 minutes after the restart. The other one is still spamming. I might add that all Prometheus instances push to the same cluster, but the one Cortex instance that has not been restarted does not log any warnings like that. Zero.
What would be the point of that if they are already labeled with …? The issue is not about uniqueness. If it was, I would be getting these warnings ALL THE TIME, but I do not. I get them only when I restart a Cortex instance, and only on that instance. Why would only that instance get duplicates, when my Prometheus instances have all 3 Cortex instances configured to push to?
Like right now, the instance I restarted yesterday is fine:
So if the issue really was that I get duplicates from different DCs, these logs would be FULL of them.
We are experiencing this issue as well. Same symptoms as described above: running perfectly happily for months, then after an issue with the etcd database, Cortex was brought back into working order and now there are many "sample timestamp out of order for series" messages from all our ingesters (lasting for at least 6 hours now). According to the Prometheus instance suspected of sending this traffic, around 0.1% of its series are being rejected (based on the prometheus_remote_storage_failed_samples_total metric), but that is enough to trigger our alerting. Unsure if this is relevant, but an interesting note is that Cortex is set up for multi-tenancy and, as far as we can tell, all the series that get flagged as out of order are coming from our largest tenant and the Prometheus mentioned above. All of the smaller tenants (smaller by at least an order of magnitude) are not seeing any series rejected with this out-of-order message.
Today it was my turn to debug "sample timestamp out of order" errors. However, I discovered there were actually two Prometheus instances sending! Looking at …
I see the same issue; I am getting errors related to "out of bounds" and "duplicate sample for timestamp". My Cortex environment is still in the POC stage, so I shut it down for an hour and redeployed. Even then I can still see these errors. I would like to request answers to the questions below.
Current architecture: …
This typically happens when you have clashing series exported by the very same exporter. We've seen cAdvisor exporting clashing series in some conditions.
If it is clashing series from the same exporter, how do I address the "out of order sample" errors? Also, can someone help me ingest metrics at a one-minute interval (rather than a data point every 15s, I am interested in one every 60s)?
Some general guidance.
"sample timestamp out of order" means: for a single time series, Cortex has received a sample with a timestamp older than the latest timestamp already received. Cortex requires samples to be pushed in timestamp order.
"duplicate sample for timestamp" means: for a single time series, Cortex has received two samples with the exact same timestamp but different values. This is not allowed in Cortex.
"sample out of bounds" means: Cortex has received a sample with a timestamp "too far in the past" (e.g. in the blocks storage you can ingest samples up to 1h older than the most recent timestamp received for the same tenant).
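To make the three cases concrete, here is a minimal sketch of how an incoming sample can be classified, with the bound check done first (an illustration only, not the actual Cortex/Prometheus TSDB code; the names and thresholds are made up for the example):

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errOutOfBounds = errors.New("sample out of bounds")
	errOutOfOrder  = errors.New("sample timestamp out of order")
	errDuplicate   = errors.New("duplicate sample for timestamp")
)

// checkSample mimics the order of validations described above: the sample is
// first checked against the lower time bound, then against the latest sample
// already ingested for the same series.
func checkSample(minValidTime, lastTS int64, lastVal float64, ts int64, val float64) error {
	if ts < minValidTime {
		return errOutOfBounds // too far in the past for the current head block
	}
	if ts < lastTS {
		return errOutOfOrder // older than the newest sample of this series
	}
	if ts == lastTS && val != lastVal {
		return errDuplicate // same timestamp, different value
	}
	return nil // in order (an exact duplicate is treated as a no-op here)
}

func main() {
	// The series already has a sample at t=1000 with value 1; the head accepts t >= 400.
	fmt.Println(checkSample(400, 1000, 1, 900, 2))  // sample timestamp out of order
	fmt.Println(checkSample(400, 1000, 1, 1000, 2)) // duplicate sample for timestamp
	fmt.Println(checkSample(400, 1000, 1, 300, 2))  // sample out of bounds
	fmt.Println(checkSample(400, 1000, 1, 1100, 2)) // <nil>: accepted
}
```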
Are there any plans to completely eliminate or at least relax this limitation? The existing behavior is especially problematic for Loki, as it leads to loss of log records that cannot be entirely fixed on the client side.
No. Time series are compressed by computing the delta from the previous sample, so re-ordering requires exponentially more work.
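As a rough illustration of why ordering matters for the storage format, here is a toy delta-encoded chunk (a simplified sketch, not the actual XOR/Gorilla chunk encoding Prometheus uses): the encoder only keeps the last timestamp around, so an older sample cannot be slotted in without decoding and rewriting everything after it.

```go
package main

import (
	"errors"
	"fmt"
)

// deltaChunk is a toy append-only chunk: it stores the first timestamp and
// then, for each subsequent sample, only the delta to the previous timestamp.
type deltaChunk struct {
	first  int64
	last   int64
	deltas []int64
	n      int
}

// append only accepts strictly increasing timestamps; inserting an older
// sample would mean recomputing every stored delta after the insertion point.
func (c *deltaChunk) append(ts int64) error {
	if c.n > 0 && ts <= c.last {
		return errors.New("sample timestamp out of order")
	}
	if c.n == 0 {
		c.first = ts
	} else {
		c.deltas = append(c.deltas, ts-c.last)
	}
	c.last = ts
	c.n++
	return nil
}

func main() {
	c := &deltaChunk{}
	for _, ts := range []int64{1000, 1015, 1030} {
		fmt.Println(c.append(ts)) // <nil> three times: appended in order
	}
	fmt.Println(c.append(1020)) // rejected: older than the last appended timestamp
	fmt.Println(c.deltas)       // [15 15]
}
```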
@bboreham Just a side note to this: what's the general fix / expectation on the Prometheus side for these errors? I have a tenancy proxy in between my Prometheus and Cortex, and when I have a bad series, this effectively backlogs and completely stops ingestion on the node until I have manually removed the series using the admin API on Prometheus. Is there an expectation or option in Prometheus to drop the remote_write for a series if it gets too many failures? In my case, I have a series that basically keeps sending and has flooded my proxy.
@DandyDeveloper the message in the original description says "sample timestamp out of order". If your symptoms are different, please open a different issue.
Closing this as I don't think there is anything to fix; it can serve as documentation if people need to dig in to what "sample out of order" does or doesn't mean. |
@pracucci - Sorry to put this on a closed thread, but figured it fits for documentation purposes. What is the "order" (pun intended) that these errors are raised in? For example, suppose I have a sample that is out of bounds, out of order, AND is a duplicate sample? |
A single timestamp can't be both duplicate and out of order. Out of bounds is checked before out of order, here: cortex/vendor/github.com/prometheus/prometheus/tsdb/head.go, lines 1319 to 1357 (commit 12d1cb1).
@bboreham - What about something like this sequence? Obviously data integrity issues aside, but curious what Cortex error would come up with something like this:
Would that third sample be "out of order" or "duplicate timestamp with new value"? Likewise, if for some reason the same metric(s) got processed twice:
What would the third & fourth samples be?
Out of order. It only keeps the last sample around for comparison; everything else is highly compressed and unavailable unless you run a query.
Came across "err: out of order sample" in the ingester following a Cortex maintenance window. During this time Prometheus was denied access to Cortex, with the expectation that after the maintenance, and within 2 hours, Prometheus would restore the missing values from its WAL. It looks like for the 5 minutes after Prometheus stops sending, Cortex replicates the last known value in the time series, effectively filling in bogus data. After Prometheus connectivity is restored, Cortex will fill in the later values that were missing, but the 5 minutes of auto-generated values remain and collide with the real values sent by Prometheus, hence the "err: out of order sample". This is all illustrated in the image below. I wonder if someone can shed some light on why Cortex maintains an interval of bogus values. If that is to approximate possible missing values in the time series, why aren't the values coming from Prometheus allowed to overwrite them after the fact?
Filling in empty values for up to 5 minutes is standard PromQL behavior - https://www.robustperception.io/staleness-and-promql. It is interesting to ponder why Prometheus did not supply values during that time; perhaps there are clues in your Prometheus logs. (Please open a new issue if you want to continue the conversation.)
Thanks for the insights. You are right, the Cortex query now shows the increasing pattern instead of a flat value; I guess there were cached query results lingering right after recovery that led me to believe otherwise.
Description
Every time I modify the Cortex configuration and restart the nodes, they generate ungodly amounts of logs like this:
I assume this is because the upstream Prometheus instance is retrying pushes of the metrics that failed while the node was down.
It generates quite a lot of them...
Questions