
labelsHashToGlobal is not being garbage collected #7003

Open
kchestnov opened this issue Aug 16, 2024 · 0 comments

Labels
bug Something isn't working

Comments

@kchestnov

What's wrong?

Grafana Agent 0.39.0, running in flow mode as a StatefulSet with clustering enabled, does not release some of the memory consumed by GetOrAddGlobalRefID and GetOrAddLink.

I believe this is because not all cases are covered by this function: https://github.com/grafana/agent/blob/v0.39.0/service/labelstore/service.go#L239

From the graphs I can confirm that agent_labelstore_global_ids_count does not decrease over time; the only thing that brings it down is a restart.
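
For illustration only, here is a minimal sketch of the kind of time-based eviction that would let a labels-hash map shrink when series disappear without ever triggering a cleanup. The names (store, entry, getOrAdd, sweep) and the TTL-based approach are assumptions for the example, not the agent's actual labelstore implementation:

```go
// Illustrative sketch only: a time-based sweep for a labelsHashToGlobal map,
// assuming entries record when they were last touched. This is NOT the
// agent's labelstore code; it only shows the kind of eviction that would
// allow such a map to shrink when series stop being written.
package main

import (
	"sync"
	"time"
)

type entry struct {
	globalID uint64
	lastSeen time.Time
}

type store struct {
	mut                sync.Mutex
	labelsHashToGlobal map[uint64]entry
	nextID             uint64
}

// getOrAdd returns the global ID for a labels hash, creating one if needed,
// and refreshes the last-seen timestamp either way.
func (s *store) getOrAdd(labelsHash uint64) uint64 {
	s.mut.Lock()
	defer s.mut.Unlock()
	e, ok := s.labelsHashToGlobal[labelsHash]
	if !ok {
		s.nextID++
		e = entry{globalID: s.nextID}
	}
	e.lastSeen = time.Now()
	s.labelsHashToGlobal[labelsHash] = e
	return e.globalID
}

// sweep removes entries not seen within ttl. If a series disappears and
// nothing ever touches its entry again, a sweep like this is the only path
// that frees it.
func (s *store) sweep(ttl time.Duration) {
	s.mut.Lock()
	defer s.mut.Unlock()
	cutoff := time.Now().Add(-ttl)
	for h, e := range s.labelsHashToGlobal {
		if e.lastSeen.Before(cutoff) {
			delete(s.labelsHashToGlobal, h)
		}
	}
}

func main() {
	s := &store{labelsHashToGlobal: make(map[uint64]entry)}
	s.getOrAdd(42)
	// In a long-running service this would run from a ticker goroutine.
	s.sweep(30 * time.Minute)
}
```

The heap profile below shows where the retained memory sits: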

~/grafana_heaps: go tool pprof grafana-agent-1_heap.out*                                                                                                            
File: grafana-agent
Build ID: 54e00906a3fcdad65d14ca518d1296d9b729021b
Type: inuse_space
Time: Jul 25, 2024 at 12:11pm (CEST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 14010.80MB, 86.70% of 16160.64MB total
Dropped 427 nodes (cum <= 80.80MB)
Showing top 10 nodes out of 75
      flat  flat%   sum%        cum   cum%
 5452.44MB 33.74% 33.74%  5452.44MB 33.74%  github.com/grafana/agent/service/labelstore.(*service).GetOrAddGlobalRefID
 5072.37MB 31.39% 65.13%  5072.37MB 31.39%  github.com/grafana/agent/service/labelstore.(*service).GetOrAddLink
 1164.51MB  7.21% 72.33%  1164.51MB  7.21%  github.com/prometheus/prometheus/storage/remote.labelsToLabelsProto.func1
  501.75MB  3.10% 75.44%   501.75MB  3.10%  github.com/prometheus/prometheus/model/labels.(*ScratchBuilder).Labels (inline)
  457.26MB  2.83% 78.27%   457.26MB  2.83%  github.com/prometheus/prometheus/model/labels.(*Builder).Labels
  336.94MB  2.08% 80.35%   336.94MB  2.08%  github.com/prometheus/prometheus/scrape.newScrapePool.func1
  281.97MB  1.74% 82.10%   803.55MB  4.97%  github.com/prometheus/prometheus/storage/remote.(*QueueManager).StoreSeries
     270MB  1.67% 83.77%      270MB  1.67%  github.com/prometheus/prometheus/scrape.(*scrapeCache).addRef
  260.37MB  1.61% 85.38%   260.37MB  1.61%  github.com/golang/snappy.Encode
  213.19MB  1.32% 86.70%   213.19MB  1.32%  github.com/prometheus/common/model.LabelSet.Merge
(pprof) 
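
For reference, a heap profile like the one above can be captured with a small helper. This sketch assumes the agent's HTTP server exposes Go's standard /debug/pprof handlers and listens on localhost:12345; both are assumptions about the deployment, so adjust the address as needed:

```go
// Minimal sketch for capturing a heap profile like the one above. Assumes the
// agent's HTTP server serves Go's standard /debug/pprof endpoints on
// localhost:12345 (an assumption; adjust for your deployment).
package main

import (
	"io"
	"net/http"
	"os"
)

func main() {
	resp, err := http.Get("http://localhost:12345/debug/pprof/heap")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, err := os.Create("grafana-agent-1_heap.out")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	// The saved file can then be inspected with `go tool pprof`, as shown above.
	if _, err := io.Copy(out, resp.Body); err != nil {
		panic(err)
	}
}
```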

[Attached screenshots: image (4), image (5), image (6)]

Steps to reproduce

Run grafana-agent in a high-cardinality environment in flow mode.

System information

k8s 1.26.12

Software version

v0.39.0

Configuration

No response

Logs

No response

@kchestnov added the bug label on Aug 16, 2024