pkg/netns: do not loop forever
This is not as simple as one might think: apparently there are cases where
it is impossible to remove the file even though umount() worked fine.
We fixed one issue that ran into this[1], but there seems to be
another[2] problem whose cause is not yet known.

Regardless of the real fix for issue[2], add a timeout so that we do not
hang/loop forever. If we are not able to remove the file after 60s, give up
and report an error. Leaking these files is not great, as the netns
references stay around, but it will not prevent containers from running. It
only leaks resources.

[1] https://issues.redhat.com/browse/RHEL-59620
[2] containers/podman#24487

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Luap99 committed Nov 8, 2024
1 parent 50126e3 commit ef5388b
Showing 1 changed file with 7 additions and 5 deletions.
12 changes: 7 additions & 5 deletions pkg/netns/netns_linux.go
@@ -273,8 +273,10 @@ func UnmountNS(nsPath string) error {
 		return fmt.Errorf("failed to unmount NS: at %s: %w", nsPath, err)
 	}

-	for {
-		if err := os.Remove(nsPath); err != nil {
+	var err error
+	// wait for up to 60s in the loop
+	for range 6000 {
+		if err = os.Remove(nsPath); err != nil {
 			if errors.Is(err, unix.EBUSY) {
 				// mount is still busy, sleep a moment and try again to remove
 				logrus.Debugf("Netns %s still busy, try removing it again in 10ms", nsPath)
@@ -283,12 +285,12 @@ func UnmountNS(nsPath string) error {
 			}
 			// If path does not exists we can return without error.
 			if errors.Is(err, unix.ENOENT) {
-				break
+				return nil
 			}
 			return fmt.Errorf("failed to remove ns path: %w", err)
 		}
-		break
+		return nil
 	}

-	return nil
+	return fmt.Errorf("failed to remove ns path (timeout after 60s): %w", err)
 }
