Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: metrics still taking up too much memory when metrics.UseNilMetrics=true #1973

Merged
merged 4 commits into from
Sep 7, 2021

Conversation

qiangmzsx
Copy link
Contributor

Fix metrics taking up too much memory
We call broker.Open() multiple times, and we have the problem of excessive memory usage.
The crawl flame graph can be seen as follows.
image

The fix is to turn off the call to registerMetrics when setting metrics.UseNilMetrics=true.

Fix metrics taking up too much memory
@qiangmzsx qiangmzsx requested a review from bai as a code owner June 24, 2021 13:44
@ghost ghost added the cla-needed label Jun 24, 2021
@dnwe
Copy link
Collaborator

dnwe commented Jun 24, 2021

@qiangmzsx whilst the suggested change is probably fine, I'm surprised that you're hitting a problem here. How many brokers are in the cluster you're connecting to? Are you showing allocs in your pprof or actual inuse? Can you share the full pprof file via https://share.polarsignals.com/ maybe?

Essentially:

a) all metrics registered by broker.Open will be unregistered during a broker.Close and become eligible for GC thereafter

b) if you're already setting metrics.UseNilMetrics = true then go-metrics is using empty struct's for each metric type, with no-op funcs (e.g., NilGauge) so I'm surprised that you'd be seeing several hungred meg of allocations.

@qiangmzsx
Copy link
Contributor Author

@qiangmzsx whilst the suggested change is probably fine, I'm surprised that you're hitting a problem here. How many brokers are in the cluster you're connecting to? Are you showing allocs in your pprof or actual inuse? Can you share the full pprof file via https://share.polarsignals.com/ maybe?

Essentially:

a) all metrics registered by broker.Open will be unregistered during a broker.Close and become eligible for GC thereafter

b) if you're already setting metrics.UseNilMetrics = true then go-metrics is using empty struct's for each metric type, with no-op funcs (e.g., NilGauge) so I'm surprised that you'd be seeing several hungred meg of allocations.

I am glad to receive such a quick reply from you. I'll put the heap file of the capture in the attachment.
The reason for this scenario is that I am using sarama for monitoring services.
heap.zip

@qiangmzsx qiangmzsx closed this Jun 24, 2021
@qiangmzsx
Copy link
Contributor Author

I'm very sorry for my mistake, I clicked close by mistake, actually I didn't close. But I don't know how to deal with it anymore.
I'd like to explain that calling b.Open() has turned on metrics.UseNilMetrics = true, but calling b.registerMetrics() does not change this.
@dnwe

@qiangmzsx qiangmzsx reopened this Jun 24, 2021
@qiangmzsx
Copy link
Contributor Author

@dnwe
For better understanding, I will also write the performance test results on.

func BenchmarkBroker_Open(b *testing.B) {
	mb := NewMockBroker(nil, 0)
	broker := NewBroker(mb.Addr())
	// Set the broker id in order to validate local broker metrics
	broker.id = 0
	metrics.UseNilMetrics = false
	conf := NewTestConfig()
	conf.Version = V1_0_0_0
	for i := 0; i < b.N; i++ {
		err := broker.Open(conf)
		if err != nil {
			b.Fatal(err)
		}
		broker.Close()
	}
}

func BenchmarkBroker_No_Metrics_Open(b *testing.B) {
	mb := NewMockBroker(nil, 0)
	broker := NewBroker(mb.Addr())
	broker.id = 0
	metrics.UseNilMetrics = true
	conf := NewTestConfig()
	conf.Version = V1_0_0_0
	for i := 0; i < b.N; i++ {
		err := broker.Open(conf)
		if err != nil {
			b.Fatal(err)
		}
		broker.Close()
	}
}

One can get.

$ go test -bench=. -benchmem -test.benchmem  -run=broker_test -test.bench Open
goos: windows
goarch: amd64
pkg: github.com/Shopify/sarama
cpu: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
BenchmarkBroker_Open-8                      2768            383208 ns/op           66256 B/op        126 allocs/op
BenchmarkBroker_No_Metrics_Open-8           3538            345948 ns/op            8390 B/op         63 allocs/op
PASS
ok      github.com/Shopify/sarama       2.458s

What we need to see is that the memory usage in this monitoring scenario is not visible in a short period of time and requires two or three weeks or even a month of uninterrupted operation.

@qiangmzsx
Copy link
Contributor Author

qiangmzsx commented Jun 28, 2021

func BenchmarkBroker_Open(b *testing.B) {
	mb := NewMockBroker(nil, 0)
	broker := NewBroker(mb.Addr())
	// Set the broker id in order to validate local broker metrics
	broker.id = 0
	metrics.UseNilMetrics = false
	conf := NewTestConfig()
	conf.Version = V1_0_0_0
	for i := 0; i < 100000; i++ {
		err := broker.Open(conf)
		if err != nil {
			b.Fatal(err)
		}
		broker.Close()
	}
}

func BenchmarkBroker_No_Metrics_Open(b *testing.B) {
	mb := NewMockBroker(nil, 0)
	broker := NewBroker(mb.Addr())
	broker.id = 0
	metrics.UseNilMetrics = true
	conf := NewTestConfig()
	conf.Version = V1_0_0_0
	for i := 0; i < 100000; i++ {
		err := broker.Open(conf)
		if err != nil {
			b.Fatal(err)
		}
		broker.Close()
	}
}

To make it more obvious, I set the loop to call the Open function 100000 times.

BenchmarkBroker_Open:
image

BenchmarkBroker_No_Metrics_Open:
image

BenchmarkBroker_No_Metrics_Open.out.zip
BenchmarkBroker_Open.out.zip

@dnwe
Copy link
Collaborator

dnwe commented Jul 12, 2021

@qiangmzsx are you able to sign the CLA?

@qiangmzsx
Copy link
Contributor Author

@qiangmzsx are you able to sign the CLA?

I have signed and completed the CLA。

image

@qiangmzsx
Copy link
Contributor Author

@dnwe Do I need any more information from my side?

@dnwe
Copy link
Collaborator

dnwe commented Aug 19, 2021

@qiangmzsx did you see the failing gofmt check?

broker_test.go:1076: File is not gofmt-ed with -s (gofmt)

@qiangmzsx
Copy link
Contributor Author

@qiangmzsx did you see the failing gofmt check?

broker_test.go:1076: File is not gofmt-ed with -s (gofmt)
done.

@ghost ghost removed the cla-needed label Aug 31, 2021
@dnwe dnwe changed the title Fix metrics taking up too much memory fix: metrics still taking up too much memory when metrics.UseNilMetrics=true Aug 31, 2021
@bai
Copy link
Contributor

bai commented Sep 7, 2021

Thanks for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants