-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os: FreeBSD system crashes when adding regular files to kqueue #19093
Comments
CL https://golang.org/cl/36800 mentions this issue. |
This changes the os package to use the runtime poller for file I/O where possible. When a system call blocks on a pollable descriptor, the goroutine will be blocked on the poller but the thread will be released to run other goroutines. When using a non-pollable descriptor, the os package will continue to use thread-blocking system calls as before. For example, on GNU/Linux, the runtime poller uses epoll. epoll does not support ordinary disk files, so they will continue to use blocking I/O as before. The poller will be used for pipes. Since this means that the poller is used for many more programs, this modifies the runtime to only block waiting for the poller if there is some goroutine that is waiting on the poller. Otherwise, there is no point, as the poller will never make any goroutine ready. This preserves the runtime's current simple deadlock detection. This seems to crash FreeBSD systems, so it is disabled on FreeBSD. This is issue 19093. Using the poller on Windows requires opening the file with FILE_FLAG_OVERLAPPED. We should only do that if we can remove that flag if the program calls the Fd method. This is issue 19098. Update #6817. Update #7903. Update #15021. Update #18507. Update #19093. Update #19098. Change-Id: Ia5197dcefa7c6fbcca97d19a6f8621b2abcbb1fe Reviewed-on: https://go-review.googlesource.com/36800 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
Do you have a stacktrace of the FreeBSD system? |
@ianlancetaylor Can you please post some more information about the crash? The stack trace, output of |
Sorry for not noticing your earlier message. I do not have a stack trace of the system. I am accessing the system using https://godoc.org/golang.org/x/build/cmd/gomote, which runs FreeBSD in a VM, and I am deducing that the system has crashed because it stops responding for a while, and when it does start responding the system uptime has been reset to a very small number. I assume that what is happening is that the VM is being discarded and restarted. Perhaps @bradfitz can say something more about that. This is how I reproduce it (I just verified that it still fails like this for me). First, remove these lines from src/os/file_unix.go:
Then run
When I try this it gets to the point where it prints
which is the point where it starts running cmd/go built with the new polling code. At that point the run stops and gomote prints
Further attempts to use
Then after a while the gomote usually recovers, but running With luck you can create the problem on a real FreeBSD system, or on a VM which you fully control, by simply editing src/os/file_unix.go as suggested and then running |
I suppose gomote & the build system could be modified to watch the VM's serial console & log that, but I don't have time for that at the moment. For debugging this, somebody should just use a FreeBSD VM directly, not via gomote. |
I'll try it on a real FreeBSD system. But when you've done it using gomote, did you notice if there was anything in |
The gomote proxy server that owns the VM destroys it immediately if it fails its heartbeats. The user (Ian, here) would have no way to check those things before the machine was nuked. |
I can't reproduce on FreeBSD head or stable/10 amd64. Can you please figure out the exact version that gomote is using, and any nondefault configurations? |
I can reproduce it on 11.0-RELEASE. This turns out to be a duplicate of FreeBSD PR 24923, even though the stack traces look different. Here's the stacktrace:
This bug was fixed in head by r310302 and later MFCed to stable/11 by r310578, but no update was ever issued for FreeBSD 11.0. I checked with secteam, and they don't plan to issue an update. So you should continue to use your current workaround until 11.1 is released, which will happen sometime in June. https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214923 |
@asomers, thanks for the investigation and update! |
Btw, the x/build/cmd/debugnewvm does the serial console thing, btw, but gomote doesn't yet. But sounds like it's not needed now, since the bug has been fixed. @ianlancetaylor, do you want to conditionally enable this behavior based on FreeBSD version? In the meantime I can update our builder image from 11.0 to 11.1. We still have our 9.3 builders running too, which would test the fallback path if you make it $FREE_VERSION >= 11.1. |
Now that FreeBSD 12 is out, it would be nice if files were finally added to the poller by default. @bradfitz @ianlancetayler, any news about enabling this conditionally dependending on FreeBSD version? |
@fuzxxl No news on that. Do you know what kernel version fixes the problem? |
@ianlancetaylor According to @bradfitz previous comment, r310302 fixed this issue. This was already part of FreeBSD-11.1 apparently. I just tested this on my FreeBSD 12-RELEASE system and it did crash or fail with polling enabled for regular files. |
I've commented out these lines: Lines 126 to 128 in 58a17b4
And ran ./all.bash on FreeBSD 10.4 and 12.0, they both pass. I've previously asked @bradfitz if we could drop the 10.3 builder for this. (#25289 (comment) - no, not before go 1.13 where we officially drop FreeBSD 10 support). I do have an additional concern: kqueue supports EVFILT_READ on Vnodes (regular files) but not EVFILT_WRITE. We however always register for both events. When running the following program to read a large file with package main
import (
"io"
"os"
"sync"
)
func main() {
f, err := os.Open("/boot/kernel/kernel")
if err != nil {
panic(err)
}
defer f.Close()
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
for {
b := make([]byte, 1024)
_, err := f.Read(b)
if err != nil {
if err == io.EOF {
break
}
panic(err)
}
}
}()
wg.Wait()
}
|
Does |
This might be possible. A FreeBSD programmer told me that files are always ready to read/write, so this mechanism might not work as-is. I think kqueue has to be combined with AIO using |
That is correct. Plain files are always ready for reading and writing. AIO works with kqueue because the asynchronous part is the actual read or write; with sockets the asynchronous part is waiting before you start the read or write. But you're out of luck on OSX; EVFILT_AIO isn't supported there. |
Thanks. It seems possible that we should continue avoiding the poller for regular files on FreeBSD, and should perhaps do the same on the other *BSDs. |
I could be wrong, but I thought that the same was true on Linux; poll(2) will always indicate that plain files are ready for reading and writing. |
On GNU/Linux we use |
Yes, both events return immediately as far as I can tell. The EVFILT_READ event actually reports the offset from the current position to end of file. It can be seen decreasing in the In the sample program (with a short time.Sleep after the for loop), once EOF is reached only the EVFILT_WRITE keeps being reported as ready. If I also drop the EVFILT_WRITE registration from the netpoller, then the fd is no longer reported as ready by kevent once it is read to the end. Is this something we wish to pursue? The added benefit would be not requiring an OS thread when reading regular files. diff --git a/src/runtime/netpoll_kqueue.go b/src/runtime/netpoll_kqueue.go
index fdaa1cd80d..88afaba44b 100644
--- a/src/runtime/netpoll_kqueue.go
+++ b/src/runtime/netpoll_kqueue.go
@@ -38,9 +38,9 @@ func netpollopen(fd uintptr, pd *pollDesc) int32 {
ev[0].fflags = 0
ev[0].data = 0
ev[0].udata = (*byte)(unsafe.Pointer(pd))
- ev[1] = ev[0]
- ev[1].filter = _EVFILT_WRITE
- n := kevent(kq, &ev[0], 2, nil, 0, nil)
+ //ev[1] = ev[0]
+ //ev[1].filter = _EVFILT_WRITE
+ n := kevent(kq, &ev[0], /*2*/ 1, nil, 0, nil)
if n < 0 {
return -n
}
|
We would only be able to avoid an OS thread in a case where the We can't simply drop the Perhaps we should call |
Thanks for explaining, you're referring to this, correct? go/src/internal/poll/fd_unix.go Lines 165 to 172 in a3b0144
The manpages for NetBSD, OpenBSD, DragonFly and OS X all indicate |
Adding a test means adding another system call, so we should only do it on systems where we can't avoid it in some other way. |
Change https://golang.org/cl/156379 mentions this issue: |
I've tested only on OpenBSD 6.4 and NetBSD 8.0 using
Attached is the full OpenBSD Update: Dragonfly 5.4.1 behaves the same (first kevent returns 2 filters, and then no more). |
The kqueue based netpoller always registers file descriptors with EVFILT_READ and EVFILT_WRITE. However only EVFILT_READ notification is supported for regular files. On FreeBSD a regular file is always reported as ready for writing, resulting in a busy wait. On Darwin, Dragonfly, NetBSD and OpenBSD, a regular file is reported as ready for both reading and writing only once. Updates #19093 Change-Id: If284341f60c6c2332fb5499637d4cfa7a4e26b7b Reviewed-on: https://go-review.googlesource.com/c/156379 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change https://golang.org/cl/172937 mentions this issue: |
Follow up CL 156379. Updates #19093 Change-Id: I5ea3177fc5911d3af71cbb32584249e419e9d4a3 Reviewed-on: https://go-review.googlesource.com/c/go/+/172937 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change https://golang.org/cl/371016 mentions this issue: |
This was fixed by https://golang.org/cl/165801 which dropped support for FreeBSD 10. |
This test has worked since CL 165801 (committed March 12, 2019), so stop skipping it. With this, we check that Close makes concurrent I/O operations on pipes return Errclosed on all platforms. Updates #19093. Change-Id: Ic090c70996c115abf80d8f9b93ca2aeaf347c9d8 Reviewed-on: https://go-review.googlesource.com/c/go/+/371016 Trust: Austin Clements <austin@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
Change https://golang.org/cl/373358 mentions this issue: |
This test works on FreeBSD since CL 165801 was submitted. Updates #19093 Change-Id: I45ffeb403c1de4385cdb21b9647f21976061e1ea Reviewed-on: https://go-review.googlesource.com/c/go/+/373358 Trust: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
While testing https://golang.org/cl/36800, it appears that the FreeBSD amd64 system crashes when adding regular disk files to kqueue. This is based on testing using gomote, which creates a VM running FreeBSD 10.1 (I think). I am going to disable using kqueue on FreeBSD for regular files; it will still be used for sockets as usual. I am opening this issue to note the problem in the hopes that it can be fixed by someone more familiar with FreeBSD.
The text was updated successfully, but these errors were encountered: