Avoid guid pool exhaustion from guid leaks by syncing it with SM #92

Merged: 1 commit, Oct 25, 2022

almaslennikov
Collaborator

If a node running pods that use the IB network is restarted, those pods'
GUIDs are deleted from UFM but persist in ib-kubernetes.
They become unusable and can exhaust the GUID pool if it is configured to be small.
The fix is to sync the GUID pool with UFM, since some GUIDs may have become free to use.

Signed-off-by: amaslennikov <amaslennikov@nvidia.com>
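
The idea in the description can be illustrated with a minimal Go sketch. This is not the code merged in this PR: the `Pool` type, its `Allocate` and `SyncWithSM` methods, and the `inUse` map standing in for the list of GUIDs reported by the subnet manager (UFM) are all hypothetical names chosen for illustration. It only shows the general technique of releasing locally held GUIDs that the subnet manager no longer knows about.

```go
package main

import (
	"fmt"
	"sort"
)

// Pool is a hypothetical, simplified GUID pool. It maps each allocated
// GUID to the UID of the pod that holds it.
type Pool struct {
	allocated map[string]string
}

// NewPool returns an empty pool.
func NewPool() *Pool {
	return &Pool{allocated: make(map[string]string)}
}

// Allocate records guid as held by podUID and rejects double allocation.
func (p *Pool) Allocate(guid, podUID string) error {
	if owner, ok := p.allocated[guid]; ok {
		return fmt.Errorf("guid %s already allocated to pod %s", guid, owner)
	}
	p.allocated[guid] = podUID
	return nil
}

// SyncWithSM releases every locally allocated GUID that is absent from
// inUse (assumed here to be the set of GUIDs the subnet manager still
// reports) and returns the released GUIDs in sorted order.
func (p *Pool) SyncWithSM(inUse map[string]bool) []string {
	var released []string
	for guid := range p.allocated {
		if !inUse[guid] {
			// Deleting the current key while ranging over a map is safe in Go.
			delete(p.allocated, guid)
			released = append(released, guid)
		}
	}
	sort.Strings(released)
	return released
}
```

For example, if the pool holds GUIDs `...:01` and `...:02` and the subnet manager reports only `...:01` after a node restart, `SyncWithSM` releases `...:02`, which can then be allocated to a new pod instead of staying leaked.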

@coveralls

coveralls commented Oct 20, 2022

Pull Request Test Coverage Report for Build 3287401876

Warning: This coverage report may be inaccurate.

We've detected an issue with your CI configuration that might affect the accuracy of this pull request's coverage report.
To ensure accuracy in future PRs, please see these guidelines.
A quick fix for this PR: rebase it; your next report should be accurate.

  • 45 of 60 (75.0%) changed or added relevant lines in 3 files are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-1.4%) to 83.392%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| pkg/sm/plugins/noop/noop.go | 0 | 4 | 0.0% |
| pkg/sm/plugins/ufm/ufm.go | 26 | 30 | 86.67% |
| pkg/guid/guid_pool.go | 19 | 26 | 73.08% |

| Files with Coverage Reduction | New Missed Lines | % |
| pkg/guid/guid_pool.go | 2 | 89.29% |

Totals
Change from base Build 3104941568: -1.4%
Covered Lines: 477
Relevant Lines: 572
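
The figures in the report are consistent with each other, which a short Go sketch can check. The type and function names below (`fileCoverage`, `reportRows`, `sumRows`, `percent`) are hypothetical helpers written for this illustration and are not part of Coveralls or ib-kubernetes. The numbers are copied from the report.

```go
package main

// fileCoverage holds one row of the "Changes Missing Coverage" table.
type fileCoverage struct {
	name    string
	covered int
	changed int
}

// reportRows reproduces the three per-file rows from the report.
func reportRows() []fileCoverage {
	return []fileCoverage{
		{"pkg/sm/plugins/noop/noop.go", 0, 4},
		{"pkg/sm/plugins/ufm/ufm.go", 26, 30},
		{"pkg/guid/guid_pool.go", 19, 26},
	}
}

// sumRows adds up covered and changed/added lines across all rows.
func sumRows(rows []fileCoverage) (covered, changed int) {
	for _, r := range rows {
		covered += r.covered
		changed += r.changed
	}
	return covered, changed
}

// percent returns part/total as a percentage, or 0 when total is 0.
func percent(part, total int) float64 {
	if total == 0 {
		return 0
	}
	return float64(part) / float64(total) * 100
}
```

Summing the rows gives 45 covered out of 60 changed or added lines, matching the 75.0% in the summary, and `percent(477, 572)` agrees with the overall 83.392% to the three decimals shown.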

💛 - Coveralls

Collaborator

@adrianchiris adrianchiris left a comment

minor nits otherwise LGTM

5 participants