Reduce Garbage Collection pressure #1010
I think I have identified a potentially quite significant improvement here... Another (tiny) optimisation:
but note that report merging already copies the lhs, so the above really should just be
(This won't save much since presumably copying an empty report is cheap). Yet another, more substantial optimisation: merging should be n-ary, not binary, to avoid creation and copying of intermediate data structures. |
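For illustration, here is a minimal sketch of what an n-ary merge could look like, using a simplified stand-in for the report type (the names and fields below are assumptions, not Scope's actual API). The point is that a single result is allocated and filled once, instead of the intermediate reports a pairwise fold would create and immediately discard:

```go
// Sketch only: a simplified stand-in for the report type, to illustrate the
// n-ary merge idea; names and fields are assumptions, not Scope's API.
package sketch

// Report stands in for the real report, reduced to a single map for brevity.
type Report struct {
	Latest map[string]string
}

// Merge is the existing binary, copying merge (simplified): every pairwise
// merge allocates and fills a fresh result.
func (r Report) Merge(other Report) Report {
	out := Report{Latest: make(map[string]string, len(r.Latest)+len(other.Latest))}
	for k, v := range r.Latest {
		out.Latest[k] = v
	}
	for k, v := range other.Latest {
		out.Latest[k] = v
	}
	return out
}

// MergeAll merges n reports in one pass. It allocates a single result sized
// for the final report instead of the n-1 intermediates a pairwise fold
// (r1.Merge(r2).Merge(r3)...) would create and discard.
func MergeAll(reports []Report) Report {
	size := 0
	for _, r := range reports {
		size += len(r.Latest)
	}
	out := Report{Latest: make(map[string]string, size)}
	for _, r := range reports {
		for k, v := range r.Latest {
			out.Latest[k] = v
		}
	}
	return out
}
```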
re merging reports in newest-to-oldest order... Note that it's actually the probe-side timestamps that matter here, since that is what |
In the optimal case this will only allocate memory equivalent to the final merged report size, and we will usually get close to that optimum. As an additional optimisation, note that the blank report we start with and modify does not have to be a persistent map, including all its components. Though perhaps it needs "freezing", i.e. converting to a ps map at the end... it depends on what happens to the result. |
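A minimal sketch of that approach, assuming the latest-value data is keyed by string and using a persistent-map package such as github.com/mndrix/ps for the final "freeze"; mergeThenFreeze is a hypothetical helper, not existing Scope code:

```go
// Sketch only: accumulate into an ordinary mutable map, then "freeze" the
// result into a persistent map once, at the end.
package sketch

import "github.com/mndrix/ps"

// mergeThenFreeze merges many latest-value maps into one plain Go map, so the
// total allocation is roughly the size of the final merged result, and only
// converts to a persistent map at the very end.
func mergeThenFreeze(maps []map[string]interface{}) ps.Map {
	acc := map[string]interface{}{}
	for _, m := range maps {
		for k, v := range m {
			acc[k] = v
		}
	}
	// "Freeze": build the persistent map in one pass over the accumulator.
	frozen := ps.NewMap()
	for k, v := range acc {
		frozen = frozen.Set(k, v)
	}
	return frozen
}
```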
Another thing which could potentially make a difference: Field allocations on Node initialization. All topologies use the same generic type of node, but their field usage is different. However, some fields (e.g. Metrics, Controls) are initialized with non-nil values, which will cause allocations even if they're not used. It might be worth initializing those fields to nil (or migrating the Metrics and StringSets to immutable data structures, whose nil value is global). |
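For illustration, a sketch of the eager-versus-lazy initialization trade-off, using simplified stand-ins for the field types (the real Metrics and Controls types are richer):

```go
// Sketch only: simplified stand-ins for the node's field types, to show the
// cost difference between eager and lazy (nil) initialization.
package sketch

type Node struct {
	Metrics  map[string]float64 // stand-in for the real Metrics type
	Controls map[string]string  // stand-in for the real Controls type
}

// MakeNodeEager allocates every field up front: two heap allocations per node
// even for topologies that never touch Metrics or Controls.
func MakeNodeEager() Node {
	return Node{
		Metrics:  map[string]float64{},
		Controls: map[string]string{},
	}
}

// MakeNodeLazy leaves the fields nil; reading a nil map is safe in Go, and
// the allocation only happens on first write.
func MakeNodeLazy() Node {
	return Node{}
}

// AddMetric allocates the Metrics map lazily, on first use.
func (n *Node) AddMetric(key string, value float64) {
	if n.Metrics == nil {
		n.Metrics = map[string]float64{}
	}
	n.Metrics[key] = value
}
```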
Here are some notes I've been collecting over the weekend. The Scope App flamegraphs/profiles I've obtained show that
Tongue-in-cheek:
|
Here's the object allocation profile and flamegraph of the Scope standalone app while monitoring Weave Cloud after the recent optimizations (namely #1728, #1720, #1709):
pprof.localhost:4040.alloc_objects.alloc_space.001.pb.gz
I think I am going to implement mutable variants of the merge functions next. |
After seeing the positive results in #1732 and the lack of results in #1724, we have learned that:
Thus, it would be worth considering using key-value slices with fixed-size strings as maps (e.g. LatestMaps). Something along the lines of:

```go
const maxSize = 128

type fixedSizedString struct {
	content [maxSize]byte
	used    int
}

type kv struct {
	k, v fixedSizedString
}

type fixedMap []kv
```

To allow for unbounded strings we could add an optional additional slice, which we only allocate if the string doesn't fit in maxSize (which should rarely happen, if at all). Also, for quicker access, the keys could be ordered (allowing O(log(n)) binary search) or we could keep a hash to speed up the O(n) comparison:

```go
const maxSize = 128

type fixedSizedString struct {
	content [maxSize]byte
	used    int
	extra   []byte
}

type kv struct {
	k, v  fixedSizedString
	kHash int64
}

type fixedMap []kv
```
|
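For illustration, a sketch of how the cached kHash could be used during lookup, so most entries are rejected with a single integer comparison; the helpers below are assumptions layered on the snippet above, not existing Scope code:

```go
// Illustrative helpers only (not existing Scope code).

// hashString is a 64-bit FNV-1a hash, converted to int64 to match kHash above.
func hashString(s string) int64 {
	var h uint64 = 14695981039346656037
	for i := 0; i < len(s); i++ {
		h ^= uint64(s[i])
		h *= 1099511628211
	}
	return int64(h)
}

// equal compares a fixedSizedString (including its optional overflow slice)
// against an ordinary Go string.
func (f fixedSizedString) equal(s string) bool {
	if f.used+len(f.extra) != len(s) {
		return false
	}
	return string(f.content[:f.used]) == s[:f.used] && string(f.extra) == s[f.used:]
}

// lookup scans the slice linearly, using the hash to skip non-matching keys
// cheaply; with ordered keys this could be a binary search instead.
func (m fixedMap) lookup(key string) (fixedSizedString, bool) {
	h := hashString(key)
	for _, e := range m {
		if e.kHash == h && e.k.equal(key) {
			return e.v, true
		}
	}
	return fixedSizedString{}, false
}
```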
Here's another idea to reduce garbage collection pressure. Cache the keys in the LatestMap like @tomwilkie did here: |
Another possibility is to intern string keys in general |
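A minimal sketch of what interning could look like, assuming a process-wide table guarded by a mutex (package and function names are hypothetical); the win is that repeated keys share one backing string, so there are fewer live string headers and backing arrays for the GC to trace:

```go
// Sketch only: a process-wide intern table; names are assumptions.
package intern

import "sync"

var (
	mu    sync.Mutex
	table = map[string]string{}
)

// Intern returns a canonical copy of s; repeated keys across reports then
// point at the same backing string.
func Intern(s string) string {
	mu.Lock()
	defer mu.Unlock()
	if canonical, ok := table[s]; ok {
		return canonical
	}
	table[s] = s
	return s
}
```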
Most of this has been done now. |
We can see from #812 (specifically #812 (comment)) and, more importantly, from #854 (specifically #854 (comment)) that the current main performance bottleneck is garbage collector pressure: