Fix Memory Leak In New Timecache Implementations #528

Merged: 2 commits merged into libp2p:master from fixBackgroundSweep on Mar 15, 2023

nisdas (Contributor) commented Mar 15, 2023

In our latest release of Prysm, we updated to v0.9.2 of go-libp2p-pubsub. Within a few hours of the release,
we received multiple reports of memory leaks, with Prysm nodes repeatedly crashing from OOM. After fetching a few profiles, this is what we found:

[image: profile screenshot, not preserved]

The majority of heap usage came from the first seen cache. After digging into it and into #523, the PR that introduced it,
we found that the background routine was not clearing out expired entries from the cache. This is because it used NewTimer instead of NewTicker to drive the sweep, so only one event was ever sent on the channel and the sweep never ran again.

This PR fixes the bug and updates the test so that this case is actually checked.

vyzo (Collaborator) left a comment

Ouch! Thank you, release imminent.

vyzo merged commit 340387e into libp2p:master Mar 15, 2023
nisdas deleted the fixBackgroundSweep branch March 15, 2023 12:51
yhassanzadeh13 added a commit to yhassanzadeh13/go-libp2p-pubsub that referenced this pull request Jul 3, 2023
* feat: expire messages from the cache based on last seen time (libp2p#513)

* feat: expire messages from the cache based on last seen time

* chore: minor renaming

* fix: messages should not be found after expiration

* chore: editorial

* fix: use new time cache strategy consistently

* fix: default to old time cache and add todo for background gc

* chore: update to go-libp2p v0.25 (libp2p#517)

* Update to go-libp2p v0.25

* Use go 1.19

* chore: update go version and dependencies (libp2p#516)

* fix(timecache): remove panic in first seen cache on Add (libp2p#522)

* Refactor timecache implementations (libp2p#523)

* reimplement timecache for sane and performant behaviour

* remove seenMessagesMx, take advantage of new tc api

* fix timecache tests

* fix typo

* store expiry, don't make life difficult

* refactor common background sweep procedure for both impls

* add godocs to TimeCache

* Default validator support (libp2p#525)

* add default validator support

* add an implementation for basic seqno as nonce validation

* missing return

* the nonce belongs to the origin peer

* add note about rust predicament

* add seqno validator tests

* minor test tweak, ensure at least 1ms before replay

* Fix Memory Leak In New Timecache Implementations (libp2p#528)

* fix bug

* add for last seen cache

* chore: Update .github/workflows/stale.yml [skip ci]

* chore: Update .github/workflows/stale.yml [skip ci]

* upgrades libp2p version

* upgrades libp2p version

---------

Co-authored-by: Mohsin Zaidi <2236875+smrz2001@users.noreply.github.com>
Co-authored-by: Marco Munizaga <git@marcopolo.io>
Co-authored-by: RichΛrd <info@richardramos.me>
Co-authored-by: Hlib Kanunnikov <hlibwondertan@gmail.com>
Co-authored-by: vyzo <vyzo@hackzen.org>
Co-authored-by: Nishant Das <nishdas93@gmail.com>
Co-authored-by: GitHub <noreply@github.com>
yhassanzadeh13 added a commit to yhassanzadeh13/go-libp2p-pubsub that referenced this pull request Feb 20, 2024
* feat: expire messages from the cache based on last seen time (libp2p#513)

* feat: expire messages from the cache based on last seen time

* chore: minor renaming

* fix: messages should not be found after expiration

* chore: editorial

* fix: use new time cache strategy consistently

* fix: default to old time cache and add todo for background gc

* chore: update to go-libp2p v0.25 (libp2p#517)

* Update to go-libp2p v0.25

* Use go 1.19

* chore: update go version and dependencies (libp2p#516)

* fix(timecache): remove panic in first seen cache on Add (libp2p#522)

* Refactor timecache implementations (libp2p#523)

* reimplement timecache for sane and performant behaviour

* remove seenMessagesMx, take advantage of new tc api

* fix timecache tests

* fix typo

* store expiry, don't make life difficult

* refactor common background sweep procedure for both impls

* add godocs to TimeCache

* Default validator support (libp2p#525)

* add default validator support

* add an implementation for basic seqno as nonce validation

* missing return

* the nonce belongs to the origin peer

* add note about rust predicament

* add seqno validator tests

* minor test tweak, ensure at least 1ms before replay

* Fix Memory Leak In New Timecache Implementations (libp2p#528)

* fix bug

* add for last seen cache

* chore: Update .github/workflows/stale.yml [skip ci]

* chore: Update .github/workflows/stale.yml [skip ci]

* bump golang.org/x/net from 0.4.0 to 0.7.0 (libp2p#520)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.4.0 to 0.7.0.
- [Release notes](https://github.com/golang/net/releases)
- [Commits](golang/net@v0.4.0...v0.7.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix: topicscore params can't be set for dynamically subscribed topic (libp2p#540)

* fix: topicscore params can't be set for a topic subscribed after gossipsub is initialized

* chore:address review comments

* Revert "fix: topicscore params can't be set for dynamically subscribed topic (libp2p#540)" (libp2p#541)

This reverts commit aa5fd79.

* remove usage of deprecated peerid.Pretty method (libp2p#542)

* chore: update go-libp2p to v0.32 (libp2p#548)

* chore: Update .github/workflows/stale.yml [skip ci]

* make tidy

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Mohsin Zaidi <2236875+smrz2001@users.noreply.github.com>
Co-authored-by: Marco Munizaga <git@marcopolo.io>
Co-authored-by: RichΛrd <info@richardramos.me>
Co-authored-by: Hlib Kanunnikov <hlibwondertan@gmail.com>
Co-authored-by: vyzo <vyzo@hackzen.org>
Co-authored-by: Nishant Das <nishdas93@gmail.com>
Co-authored-by: GitHub <noreply@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Prem Chaitanya Prathi <chaitanyaprem@gmail.com>
Co-authored-by: Sukun <sukunrt@gmail.com>