feat: daemon: automatically set GOMEMLIMIT if it is unset #9451

Status: Open, 1 commit to be merged into base: master

Conversation

Jorropo
Contributor

@Jorropo Jorropo commented Dec 3, 2022

I have a rather big collection of profiles where someone claims that Kubo is OOMing on X GiB. Then you open the profile and it is using half of that. This is due to the default GOGC=100, which gives a heap target of 200% of the live set: Go will only run the GC once the heap is twice as big as the previous live set.
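The arithmetic behind that observation can be sketched as follows. This is a simplified model, not the runtime's actual pacer: the helper name `heapTarget` is hypothetical, and it ignores GC roots such as goroutine stacks and globals, which the real runtime also takes into account.

```go
package main

import "fmt"

// heapTarget approximates the heap size at which the next GC cycle starts,
// given the live heap left after the previous cycle and the GOGC percentage.
// Hypothetical helper: it ignores stacks and globals for simplicity.
func heapTarget(liveBytes, gogc uint64) uint64 {
	return liveBytes + liveBytes*gogc/100
}

func main() {
	const gib = 1 << 30
	live := uint64(4 * gib)
	// With GOGC=100 the heap may grow to about twice the live set
	// before the collector runs again.
	fmt.Printf("live=%d GiB, next GC at ~%d GiB\n", live/gib, heapTarget(live, 100)/gib)
}
```

Under this model, a node whose live set is only half of the machine's RAM can still touch the full amount between two GC cycles, which matches the profiles described above.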

This situation happens more often than it should (almost always), because many parts of Kubo are memory garbage factories.

Adding a GOMEMLIMIT helps by spending more and more CPU on running the GC more often when memory is about to run out. It is not healthy to run at the edge of the limit, because the GC will then run continuously, killing performance. So this doesn't double the effective memory usable by Kubo, but we should expect to be able to use roughly 1.5x to 1.75x as much before performance falls off drastically.

Closes: #8798

@Jorropo Jorropo requested a review from ajnavarro December 3, 2022 15:05
@Jorropo Jorropo force-pushed the magic-soft-memory-limit branch 4 times, most recently from d177aeb to 4c0bec8 Compare December 3, 2022 16:19
@Jorropo Jorropo self-assigned this Dec 4, 2022
@dokterbob
Contributor

Would it perhaps make sense to default this to something (based on) the resource manager's limit? https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgrmaxmemory

@Jorropo Jorropo force-pushed the magic-soft-memory-limit branch from 4c0bec8 to 6b3e242 Compare December 5, 2022 09:31
Member

@ajnavarro ajnavarro left a comment

Added some minor comments.

Shall we add some documentation about GOMEMLIMIT and how it is used in Kubo here: https://github.com/ipfs/kubo/blob/master/docs/environment-variables.md ?

cmd/ipfs/daemon.go (resolved)
cmd/ipfs/daemon.go (resolved)
go.mod (outdated, resolved)
@BigLep
Contributor

BigLep commented Dec 6, 2022

@Jorropo : I like the idea of this, but in practice is this actually going to help with resource manager complaints, since the go-libp2p resource manager does its own state tracking for resource usage? (I agree it will help Kubo use fewer system resources.)

If it's not going to help with the errors/alarming, I suggest we push this out to next release since we're already trying to squeeze a lot in between now and 2022-12-08 and would rather divert our attention to other items for 0.18.

@Jorropo
Contributor Author

Jorropo commented Dec 6, 2022

@BigLep this does not change anything about the error logging. It helps with people reporting OOMs.

@dokterbob
Contributor

> @BigLep this does not change anything about the error logging. It helps with people reporting OOMs.

Also, ideally, significantly fewer GC disruptions on high-use nodes.

@BigLep
Contributor

BigLep commented Jan 31, 2023

2023-01-31: we pushed to next iteration because of concerns about getting caught in CPU death spiral and not actually dying. This requires extra discussion and we don't have bandwidth for it now.

@Jorropo Jorropo force-pushed the magic-soft-memory-limit branch 2 times, most recently from 63c4002 to a3fffe5 Compare March 30, 2023 05:03
@Jorropo Jorropo force-pushed the magic-soft-memory-limit branch from a3fffe5 to 4d9489a Compare May 3, 2023 14:23
@Jorropo Jorropo requested a review from a team as a code owner May 3, 2023 14:23
@Jorropo Jorropo force-pushed the magic-soft-memory-limit branch from 4d9489a to 819d8e3 Compare May 4, 2023 04:29
@dokterbob
Contributor

❤️

@Jorropo
Contributor Author

Jorropo commented Jun 29, 2023

I checked: the CPU death spiral was caused by some other bug.
The Go runtime limits itself to GCing at no more than 1 Hz.
I don't know of any such issue in Kubo v0.22.
The magic might be a bit dodgy. I think it's reasonable, but it needs to be looked at carefully.

@Jorropo Jorropo force-pushed the magic-soft-memory-limit branch from 819d8e3 to 8011af2 Compare June 30, 2023 03:15
@BigLep BigLep mentioned this pull request Aug 3, 2023
@BigLep BigLep mentioned this pull request Nov 9, 2023
11 tasks
@BigLep
Contributor

BigLep commented Jan 3, 2024

@Jorropo : what are the next steps here? Is this realistically going to get merged within the next month?

@Jorropo Jorropo removed their assignment Mar 4, 2024
Successfully merging this pull request may close these issues.

Add a more aggressive GC watchdog to help in bursty memory usage situations
4 participants