runtime: SIGSEGV via go test with -race -cover and -tags netgo on alpine linux/arm64 #59369
Comments
CC @thanm
Thanks for the report. I tried reproducing this on our linux/arm64 builder, but so far no luck. It is a bit surprising that you're seeing this only with Go 1.20.2 and not Go 1.20.1. There is a coverage-related bugfix in that sequence of commits ( I looked at the assembly we're generating for the function in question ( Not sure what to do next to debug this, without a way to repro the crash.
Hmmm, intriguing. The only other thing of note was that I could only repro this on
We do have a
(Probably not, though, what with it being
@GeorgeMac, I suspect that this could be a bad interaction between coverage flushing and a race in your test. The test function can exit the goroutine here without ever waiting for the If it takes that exit path, then it seems plausible that the goroutine in the
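To illustrate the kind of fix this hypothesis points at, here is a minimal sketch of a test body that waits for every goroutine it starts before returning, so that nothing is still running when coverage data is flushed at exit. The function `runWorkers` and its counter are hypothetical and are not taken from the flipt package; this shows only the general pattern, not a confirmed fix for this issue.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runWorkers is a hypothetical stand-in for a test body that starts
// background goroutines. It returns only after every goroutine has
// finished, so no goroutine outlives the caller.
func runWorkers(n int) int64 {
	var (
		wg    sync.WaitGroup
		total atomic.Int64
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			total.Add(1) // stands in for the real work done by the goroutine
		}()
	}
	// Without this Wait, the function could return while goroutines
	// are still executing instrumented code.
	wg.Wait()
	return total.Load()
}

func main() {
	fmt.Println(runWorkers(8))
}
```

In a real test the same effect is usually achieved by calling `wg.Wait()` (or receiving from a done channel) on every exit path, including early returns on error.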
It would be helpful to see what other goroutines are running at the time of the crash.
Here are all the goroutine stacks from the crash (below). Is there something more I can do to get you information on other goroutines at the time of the crash? I just changed the
This may be related to #59435? EDIT: See the duplicate issue for a short explanation.
In triage, we think that it might be worth running the program under
Unsure if you wanted me to do the I could run
A bit of googling suggested some flags for strace ( So I have attached a tarball of that output to this issue.
Thanks. Kind of a messy strace, since it includes the Go command and all the stuff it invokes prior to the test executable, but I think I can pick out the necessary bits. In test.trace.4105 I can see the execve("/tmp/go-build1710159565/b001/memory.test"), and in that same trace I see the mmap calls that look like the ones being made by the tsan runtime (see https://github.com/llvm/llvm-project/blob/5c0c2bc1c8d84d01e7a07e1aab9f468b36d071db/compiler-rt/lib/tsan/rtl/tsan_platform.h#L532 that shows a layout of the tsan regions):
but I don't see anything suspicious or that might conflict with Go memory so far. More work needed.
Hi, one more request please, which is to build a test executable and dump out the section headers, e.g.
In my linux-arm64 test binary what is interesting about the offending store is that it is right at the very end of the
and here's the counter variable symbol being written:
which is literally the last thing there. I'm wondering if maybe we have an off-by-one somewhere that is producing an address that is slightly outside the section (in combination with slightly different stuff on alpine maybe). Another thing that might conceivably help is the objdump -tldr output for just the function in question (e.g.
I've attached both outputs below:
Thanks. From my analysis it looks about the same on your system as on my linux/arm64: the faulting address is way at the end of the .noptrbss section, but does seem to be legal (unless I messed up my math).
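The "is it legal" check described here amounts to a range test: a store of a given width is inside a section only if every byte of it falls within the section's address range. A minimal sketch follows; the `section` type, the addresses, and the 4-byte store width are all hypothetical values chosen for illustration, not figures from the attached dumps.

```go
package main

import "fmt"

// section describes a loaded section by its start address and size in bytes.
type section struct {
	name string
	addr uint64
	size uint64
}

// contains reports whether a store of width bytes at target lies entirely
// inside the section, i.e. within [addr, addr+size).
func (s section) contains(target, width uint64) bool {
	if width == 0 || width > s.size {
		return false
	}
	// Compare offsets rather than end addresses to avoid overflow.
	return target >= s.addr && target-s.addr <= s.size-width
}

func main() {
	// Hypothetical layout, for illustration only.
	bss := section{name: ".noptrbss", addr: 0x1000, size: 0x40}

	fmt.Println(bss.contains(0x103c, 4)) // last 4-byte slot in the section
	fmt.Println(bss.contains(0x103d, 4)) // one byte further: spills past the end
}
```

A store to the very last slot passes this check, which matches the observation above that the faulting address is at the end of the section yet still appears to be in bounds.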
http://go.dev/cl/c/go/+/503937 may also fix this issue. The failure modes are very similar, and I am willing to bet this is the same bug.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes. In fact, I can't reproduce it on Go 1.20.1.
What operating system and processor architecture are you using (go env)?
What did you do?
I can comfortably reproduce it in this particular open source package:
https://github.com/flipt-io/flipt/tree/main/internal/storage/oplock/memory
I can consistently reproduce a SIGSEGV when I perform go test -cover -race -tags netgo .
It requires all three of these flags to be present to trigger it.
What did you expect to see?
No segfault. It works for the golang:1.20.1-alpine3.16 docker image. However, golang:1.20.2-alpine3.16 fails consistently with the error seen above.
What did you see instead?
A SIGSEGV fatal error.