Memory leak in ingester after it has failed to write indexes to memcache #1852
I have seen similar symptoms, and I don't know the cause. I am planning to set up https://github.com/conprof/conprof to fetch heap and goroutine profiles, which should give us some solid leads. If you could set that up too, even better.
Ok, I'll look into conprof and I hope I can provide the desired profiling results within this week.
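For anyone who wants similar data without running conprof, below is a minimal sketch of the same idea: periodically scrape the heap and goroutine profiles over HTTP and keep them as timestamped files. It assumes the ingester exposes the standard /debug/pprof endpoints, and `http://ingester:80` is a placeholder for the real address.

```go
// profilescrape.go: a minimal stand-in for conprof - periodically fetch heap
// and goroutine profiles from an ingester's /debug/pprof endpoints.
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

const target = "http://ingester:80" // assumption: the ingester's HTTP address

func fetch(path, name string) error {
	resp, err := http.Get(target + path)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	// Write each profile to a timestamped file, e.g. heap-144100.pb.gz.
	f, err := os.Create(fmt.Sprintf("%s-%s.pb.gz", name, time.Now().Format("150405")))
	if err != nil {
		return err
	}
	defer f.Close()

	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	for range time.Tick(1 * time.Minute) {
		if err := fetch("/debug/pprof/heap", "heap"); err != nil {
			fmt.Fprintln(os.Stderr, "heap profile:", err)
		}
		if err := fetch("/debug/pprof/goroutine", "goroutine"); err != nil {
			fmt.Fprintln(os.Stderr, "goroutine profile:", err)
		}
	}
}
```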
Here's the report of another OOM kill, this time with heap & goroutine profiles as requested. I took a profile (goroutine + heap allocs) at 14.41h, when everything was fine, and another at 14.48h, shortly before the pod got OOM killed by Kubernetes. I have attached them as a zip. Log messages for this pod around this time (after filtering the noisy log messages):
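To see what grew between the 14.41h and 14.48h snapshots, one option is to subtract the earlier heap profile from the later one, which is the same trick `go tool pprof -base` uses internally. A sketch using the github.com/google/pprof/profile package, with placeholder file names:

```go
// heapdiff.go: subtract an earlier heap profile from a later one, so that
// `go tool pprof diff.pb.gz` shows only the growth between the two snapshots.
package main

import (
	"log"
	"os"

	"github.com/google/pprof/profile"
)

func load(path string) *profile.Profile {
	f, err := os.Open(path)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	p, err := profile.Parse(f)
	if err != nil {
		log.Fatal(err)
	}
	return p
}

func main() {
	before := load("heap-1441.pb.gz") // placeholder file names
	after := load("heap-1448.pb.gz")

	before.Scale(-1) // negate the baseline so merging yields the delta
	diff, err := profile.Merge([]*profile.Profile{after, before})
	if err != nil {
		log.Fatal(err)
	}

	out, err := os.Create("diff.pb.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()
	if err := diff.Write(out); err != nil {
		log.Fatal(err)
	}
}
```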
From your zip:
There is an option …

Dividing the two numbers out gives an average of 273MB in buffers for each call, which seems high - do you configure Prometheus to send thousands of samples at once?

If you could look at the …

You should also see some benefit from the gRPC improvements brought in with #1801.

I would link this to #665 - ingester could stop accepting new data when it got above a certain size. But this seems tricky to balance in a live environment.

PS: those graphs are really difficult to read; the raw profile would be much better. Feature request to conprof I guess.
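For illustration only, the #665 idea could look roughly like the sketch below: shed incoming writes once the process heap passes a configured limit, so remote writes back off instead of piling up until the OOM killer fires. This is not Cortex's implementation, and the 30GiB limit is an arbitrary assumption.

```go
// memlimit.go: sketch of the idea in #665 - reject incoming writes while the
// heap is over a hard limit, instead of letting the pod get OOM killed.
package main

import (
	"context"
	"runtime"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

const heapLimitBytes = 30 << 30 // assumption: 30GiB, below the 40GB pod limit

// memoryLimitInterceptor returns RESOURCE_EXHAUSTED while the heap is above
// the limit, so clients retry later rather than adding to the pressure.
func memoryLimitInterceptor(
	ctx context.Context,
	req interface{},
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (interface{}, error) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m) // note: this briefly stops the world; a cached value would be cheaper
	if m.HeapAlloc > heapLimitBytes {
		return nil, status.Error(codes.ResourceExhausted, "ingester heap over limit, rejecting writes")
	}
	return handler(ctx, req)
}

func main() {
	// The interceptor would be installed on the ingester's gRPC server;
	// service registration is omitted in this sketch.
	_ = grpc.NewServer(grpc.UnaryInterceptor(memoryLimitInterceptor))
}
```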
I have new data: a heap dump from an ingester in the middle of rapid heap growth. It's using 28GB in total, of which 7.68GB is in buffers allocated in …

However, this time I can see this is in 63,996 buffers, giving 120KB per buffer: see #1897 for an explanation of what was keeping them alive.

This ingester was limited to 20 incoming gRPC streams, so that idea can be ruled out.
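For context, a per-connection stream cap like the one mentioned above is something grpc-go exposes directly. A generic sketch (not Cortex's actual server setup; the port and the message-size cap are assumptions):

```go
// streamslimit.go: capping concurrent incoming gRPC streams per connection,
// the kind of limit referred to above. Generic grpc-go, not Cortex's own code.
package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
)

func main() {
	lis, err := net.Listen("tcp", ":9095") // assumption: a typical gRPC listen port
	if err != nil {
		log.Fatal(err)
	}

	srv := grpc.NewServer(
		grpc.MaxConcurrentStreams(20), // at most 20 streams per client connection
		grpc.MaxRecvMsgSize(100<<20),  // assumption: cap on a single incoming push message
	)
	// Service registration (e.g. the ingester's Push API) would go here.
	log.Fatal(srv.Serve(lis))
}
```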
This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.
@bboreham or @pracucci Could you please reopen this issue? Today, using Cortex v0.6.1, this has happened again. Looking at the logs, I do not necessarily believe that the memcache timeout is really causing it, as it happens quite a few times in the course of the day and hasn't caused issues for other ingesters. @bboreham Have you seen these memory leaks on the ingester recently?
Today, using a Cortex v0.7.0 release candidate, this has happened again.
This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.
The problem:
Even though I have been running Cortex v0.3.0 for a couple of weeks already, I have recently been facing OOM-killed ingester replicas more frequently (roughly 1-2 times every day, which is quite a lot given that I only run 5 big ingesters).
Pod RAM request/limit in k8s: 40GB
Average RAM usage: 14-18GB
First OOM kill at: 11:29:45am
Second OOM kill at: 11:32:15am
Incident details (RAM usage, log messages etc.)
Log messages:
Since there was an unusual amount of log messages, I was interested in what kind of messages they were. I had to filter out the noisy messages anyway to get to the actually interesting ones:
4k of the 22k log messages were "context cancelled" (QueryStream error)
3k messages because of "fingerprint collision detected, mapping to new fingerprint"
15k messages because of "sample timestamp out of order" (which was likely caused by this faulty ingester)
After filtering out these messages I had 31 messages left:
In short:
Every time I see these ingester OOM kills, it seems to start with memcached timeouts. In this case the ingester couldn't put its indexes on the "index-cache-write" memcache because it timed out. Afterwards the memory seems to leak very quickly, which makes it hard to provide a heap dump. If the information provided in this issue is not sufficient to track the leak, I'd be happy to provide whatever is needed to figure it out - please let me know what else I can do.
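Since the leak ramps up too quickly for a manual heap dump, a small watcher that captures a profile automatically once memory crosses a threshold might help. A sketch, assuming cgroup v1 memory accounting, a local /debug/pprof endpoint on the ingester, and an arbitrary 25GiB threshold:

```go
// heapwatch.go: sketch of a sidecar that saves a heap profile as soon as the
// container's memory crosses a threshold, for leaks too fast to catch by hand.
// Paths, URL and threshold are assumptions, not Cortex settings.
package main

import (
	"io"
	"log"
	"net/http"
	"os"
	"strconv"
	"strings"
	"time"
)

const (
	usageFile = "/sys/fs/cgroup/memory/memory.usage_in_bytes" // cgroup v1 accounting
	pprofURL  = "http://localhost:80/debug/pprof/heap"        // assumed ingester pprof endpoint
	threshold = 25 << 30                                      // 25GiB, well below the 40GB pod limit
)

func currentUsage() (uint64, error) {
	raw, err := os.ReadFile(usageFile)
	if err != nil {
		return 0, err
	}
	return strconv.ParseUint(strings.TrimSpace(string(raw)), 10, 64)
}

func main() {
	for range time.Tick(5 * time.Second) {
		used, err := currentUsage()
		if err != nil || used < threshold {
			continue
		}
		resp, err := http.Get(pprofURL)
		if err != nil {
			log.Println("fetching heap profile:", err)
			continue
		}
		f, err := os.Create("heap-" + time.Now().Format("150405") + ".pb.gz")
		if err != nil {
			log.Fatal(err)
		}
		if _, err := io.Copy(f, resp.Body); err != nil {
			log.Println("saving heap profile:", err)
		}
		resp.Body.Close()
		f.Close()
		log.Printf("captured heap profile at %d bytes of memory usage", used)
		return // one capture is enough for this sketch
	}
}
```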