-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NETOBSERV-1805: threads are leaking with continous adding and deleting pods [BP 1.7] #427
NETOBSERV-1805: threads are leaking with continous adding and deleting pods [BP 1.7] #427
Conversation
@msherif1234: This pull request references NETOBSERV-1805 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/ok-to-test |
New image: It will expire after two weeks. To deploy this build, run from the operator repo, assuming the operator is running: USER=netobserv VERSION=f9c17c7 make set-agent-image |
/lgtm |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## release-1.7 #427 +/- ##
==============================================
Coverage ? 30.00%
==============================================
Files ? 50
Lines ? 4090
Branches ? 0
==============================================
Hits ? 1227
Misses ? 2756
Partials ? 107
Flags with carried forward coverage won't be shown. Click here to find out more.
|
6a7ab06
to
3973f3f
Compare
/ok-to-test |
New image: It will expire after two weeks. To deploy this build, run from the operator repo, assuming the operator is running: USER=netobserv VERSION=0b40ac4 make set-agent-image |
3973f3f
to
7cd27d3
Compare
/ok-to-test |
New image: It will expire after two weeks. To deploy this build, run from the operator repo, assuming the operator is running: USER=netobserv VERSION=d71f5c2 make set-agent-image |
@@ -31,6 +31,7 @@ type Watcher struct { | |||
linkSubscriberAt func(ns netns.NsHandle, ch chan<- netlink.LinkUpdate, done <-chan struct{}) error | |||
mutex *sync.Mutex | |||
netnsWatcher *fsnotify.Watcher | |||
nsDone map[string]chan struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those new channels seem never closed, isn't it another source of leak?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added defer in sendUpdates to close them
@@ -191,9 +195,15 @@ func (w *Watcher) netnsNotify(ctx context.Context, out chan Event) { | |||
} | |||
if event.Op&fsnotify.Create == fsnotify.Create { | |||
ns := filepath.Base(event.Name) | |||
log.WithField("netns", ns).Debug("netns notification") | |||
log.WithField("netns", ns).Debug("netns create notification") | |||
w.nsDone[ns] = make(chan struct{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when creating a new channel for map key ns
, is there any risk of collision / leak if there was already previously a channel stored in w.nsDone[ns]
? Might be safer to first check if it isn't empty, close the old channel ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that will be an issue as we suppose to know about netns creation or deletion once
7cd27d3
to
95c1388
Compare
/ok-to-test |
New image: It will expire after two weeks. To deploy this build, run from the operator repo, assuming the operator is running: USER=netobserv VERSION=21e1544 make set-agent-image |
95c1388
to
fc43a82
Compare
/ok-to-test |
New image: It will expire after two weeks. To deploy this build, run from the operator repo, assuming the operator is running: USER=netobserv VERSION=74d90ea make set-agent-image |
pkg/ifaces/watcher.go
Outdated
if event.Op&fsnotify.Remove == fsnotify.Remove { | ||
ns := filepath.Base(event.Name) | ||
log.WithField("netns", ns).Debug("netns delete notification") | ||
w.nsDone[ns] <- struct{}{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a risk that this code is called after the channel was closed and deleted?
ie. w.sendUpdates
finishes, and then we get a fsnotify.Remove?
(just asking)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the code in sendUpdates is blocking won't finish till done is happen which is triggered via fsnotify
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a check around the remove to be safe
…g pods Signed-off-by: Mohamed Mahmoud <mmahmoud@redhat.com> (cherry picked from commit 49d3add)
fc43a82
to
f99b0d1
Compare
/ok-to-test |
New image: It will expire after two weeks. To deploy this build, run from the operator repo, assuming the operator is running: USER=netobserv VERSION=fd27950 make set-agent-image |
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: msherif1234 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
66df1f8
into
netobserv:release-1.7
Signed-off-by: Mohamed Mahmoud mmahmoud@redhat.com
(cherry picked from commit 6111b98)
Description
Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.