-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up unix socket #739
Clean up unix socket #739
Conversation
On MacOS it seems subsequent attempts to bind to the abstract Unix socket fail without first explicitly removing it. This issue seems to be MacOS-specific as it doesn't occur on an Ubuntu VM (using `lima`). Fixes sigstore#738 Signed-off-by: Paul Thomson <thomsonp83@gmail.com>
Codecov Report
@@ Coverage Diff @@
## main #739 +/- ##
==========================================
- Coverage 54.33% 54.18% -0.15%
==========================================
Files 36 36
Lines 2251 2257 +6
==========================================
Hits 1223 1223
- Misses 936 942 +6
Partials 92 92
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
Not sure if the |
The error log message is:
|
cmd/app/grpc.go
Outdated
if err := os.RemoveAll(LegacyUnixDomainSocket); err != nil { | ||
log.Logger.Fatal(err) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of this, try adding lis.SetUnlinkOnClose(true)
on line 120 after the defer lis.Close()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I don't think that will work. It seems the @
in the socket address means it's in the abstract namespace which doesn't belong to a file on the filesystem, so not sure why a readonly filesystem won't let me remove it (yet will let it be created?)
Also, digging in to what os.RemoveAll
does, leads to this (https://github.com/golang/go/blob/e49e8764553bf510b5d9f6fb38aeec0850ec6672/src/os/file_unix.go#L301):
e := ignoringEINTR(func() error {
return syscall.Unlink(name)
})
so it seems that it's trying to do an unlink already? And lastly, looking into the lis.SetUnlinkOnClose(true)
call, it seems it won't unlink due to it being an abstract domain socket (i.e. starting with @
):
func (ln *UnixListener) close() error {
// The operating system doesn't clean up
// the file that announcing created, so
// we have to clean it up ourselves.
// There's a race here--we can't know for
// sure whether someone else has come along
// and replaced our socket name already--
// but this sequence (remove then close)
// is at least compatible with the auto-remove
// sequence in ListenUnix. It's only non-Go
// programs that can mess us up.
// Even if there are racy calls to Close, we want to unlink only for the first one.
ln.unlinkOnce.Do(func() {
if ln.path[0] != '@' && ln.unlink {
syscall.Unlink(ln.path)
}
})
return ln.fd.Close()
}
I'm not super well versed on the nuances of Unix Domain Sockets so will need to do more digging unless someone else has ideas...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this:
https://github.com/golang/go/blob/e49e8764553bf510b5d9f6fb38aeec0850ec6672/src/os/file_unix.go#L307
probably explains why the os.RemoveAll
fails with RO... Whereas I guess creating the socket in the abstract namespace only needs to create a file descriptor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After further review, abstract domain sockets are really a linux-only feature. We should probably wrap your proposed change with a check for if runtime.GOOS != "linux"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so the question for me still remains, how are these being handled on MacOS, as they still don't seem to be bound to a file on the filesystem. And I'd like to get to the bottom of why it's hanging about when it doesn't on Linux, but that might be getting a bit too far off course.
I guess I'm wondering now if:
a) there's any point in "fixing" this since it's the legacy GRPC server, like does it have a future anyway? and
b) if this is intended to be run in containers anyway is MacOS nuance something that we should worry about?
I'll just add the condition you've suggested for now, and at least then if the legacy server goes away so does the unnecessary mess :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how are these being handled on MacOS
🤦 in the most obvious way, just a file in the local dir... will make the change now
MacOS doesn't have abstract unix domain sockets, and so creates a file in the local dir which isn't removed when the server stops (due to the handling of sockets beginning with '@') This file needs to be removed on subsequent runs otherwise there's a bind error Signed-off-by: Paul Thomson <thomsonp83@gmail.com>
On MacOS it seems subsequent attempts to bind to the abstract Unix
socket fail without first explicitly removing it. This issue seems to be
MacOS-specific as it doesn't occur on an Ubuntu VM (using
lima
).Fixes #738
Signed-off-by: Paul Thomson thomsonp83@gmail.com