runtime: go1.14rc1 fatal error: invalid runtime symbol table: runtime: unexpected return pc for runtime.sigreturn called from 0x7 #37127
Comments
Is there a way that we can reproduce the problem ourselves? |
In ~25 minutes of running production traffic at just above 700 requests per second per application, this occurred 12 times across the fleet (on a different instance each time). Each fatal error occurred about 1 to 4 minutes after the application started and received traffic. The application is a non-trivial HTTP API (GraphQL) gateway that aggregates responses from other services. Each time the specific
Open to ideas for reproducing this case more minimally. Wild, uneducated guess: a similar issue (#27540) calls out profiling as a possible cause, and this application does profile (pprof on an interval, with block and mutex profiles enabled). It has not been an issue until go1.14rc1, though. |
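For readers trying to approximate the profiling setup described above, the following is a minimal sketch of interval-based pprof collection with block and mutex profiles enabled. The helper name captureProfiles, the sampleWindow constant, the sampling rates, and the interval are all assumptions for illustration; the reporter's actual configuration is not shown in this thread, and this program is not known to reproduce the crash.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
	"time"
)

// sampleWindow is a hypothetical CPU profile duration, chosen for illustration.
const sampleWindow = 500 * time.Millisecond

// captureProfiles enables block and mutex profiling, records a CPU profile
// for duration d, and returns the encoded size in bytes of each profile.
// The rates used here are assumptions, not the reporter's settings.
func captureProfiles(d time.Duration) (map[string]int, error) {
	runtime.SetBlockProfileRate(1)     // sample every blocking event
	runtime.SetMutexProfileFraction(1) // sample every mutex contention event

	sizes := make(map[string]int)

	var cpu bytes.Buffer
	if err := pprof.StartCPUProfile(&cpu); err != nil {
		return nil, err
	}
	time.Sleep(d) // the runtime's profiling timer is armed during this window
	pprof.StopCPUProfile()
	sizes["cpu"] = cpu.Len()

	for _, name := range []string{"block", "mutex"} {
		var buf bytes.Buffer
		if err := pprof.Lookup(name).WriteTo(&buf, 0); err != nil {
			return nil, err
		}
		sizes[name] = buf.Len()
	}
	return sizes, nil
}

func main() {
	// Collect profiles on a fixed interval, as a long-running service might.
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for i := 0; i < 2; i++ {
		sizes, err := captureProfiles(sampleWindow)
		if err != nil {
			fmt.Println("profile error:", err)
			return
		}
		fmt.Println("profile sizes:", sizes)
		<-ticker.C
	}
}
```

In a real service the profiles would be shipped somewhere rather than measured, and the loop would run for the lifetime of the process.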
We see this error when testing OpenShift compiled with go1.14rc1 on ppc64le. By default this testing is done with the -race option on, which results in a few dozen errors about unsafe pointer arithmetic, since -race now turns on checkptr testing. If I then run the tests with checkptr=0 along with -race, I see the same error about the invalid pc-encoded symbol table as above. I'm still trying to isolate the conditions that cause this consistently, because the failures sometimes differ from run to run. So far it has only happened when running all the tests; if I run a previously failing test by itself, it does not fail. If I test without -race, different errors occur. |
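As background on the checkptr errors mentioned above: in Go 1.14 the checkptr instrumentation is enabled by default with -race, and it validates unsafe.Pointer conversions and arithmetic at run time. The sketch below shows pointer arithmetic written in the form the unsafe.Pointer rules permit; the helper elemAt is hypothetical and is not taken from OpenShift or the runtime.

```go
package main

import (
	"fmt"
	"unsafe"
)

// elemAt returns the i-th element of s using pointer arithmetic.
// The uintptr round trip happens in a single expression and the result
// stays inside the slice's backing array, which is the pattern the
// unsafe.Pointer rules allow.
func elemAt(s []int64, i int) int64 {
	if i < 0 || i >= len(s) {
		panic("index out of range")
	}
	base := unsafe.Pointer(&s[0])
	p := unsafe.Pointer(uintptr(base) + uintptr(i)*unsafe.Sizeof(s[0]))
	return *(*int64)(p)
}

func main() {
	s := []int64{10, 20, 30}
	fmt.Println(elemAt(s, 2))

	// By contrast, arithmetic whose result leaves the original allocation,
	// such as an offset of len(s) elements from base followed by a
	// dereference, is the kind of code checkptr is designed to report.
}
```

Code that trips checkptr usually has to be rewritten along these lines; disabling the check with checkptr=0, as described in the comment above, only hides the report.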
I think this may be related to CL https://go-review.googlesource.com/c/go/+/212079. The code saving vdsoPC in walltime1/nanotime1 assumes the function has no frame
That said, if my assumption is right, the PPC64 failure is probably a different one. |
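If the hypothesis above is right, the interesting window is a profiling signal arriving while a goroutine is inside the clock-reading path (in go1.14 on Linux, time.Now reaches walltime1/nanotime1 in the runtime). The following is a sketch of a stress loop that exercises that combination. The function name stressClock, the stressWindow constant, and the worker count are assumptions for illustration, and nothing in this thread confirms that such a loop reproduces the crash.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"sync"
	"sync/atomic"
	"time"
)

// stressWindow is a hypothetical run length, chosen for illustration.
const stressWindow = time.Second

// stressClock calls time.Now in a tight loop from several goroutines while a
// CPU profile is being recorded, and returns how many calls completed.
func stressClock(workers int, d time.Duration) (int64, error) {
	var profile bytes.Buffer
	if err := pprof.StartCPUProfile(&profile); err != nil {
		return 0, err
	}
	defer pprof.StopCPUProfile()

	var calls int64
	var wg sync.WaitGroup
	deadline := time.Now().Add(d)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			var n int64
			// Each iteration reads the clock, so profiling signals can
			// land while the goroutine is inside the time-reading code.
			for time.Now().Before(deadline) {
				n++
			}
			atomic.AddInt64(&calls, n)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&calls), nil
}

func main() {
	n, err := stressClock(4, stressWindow)
	if err != nil {
		fmt.Println("profile error:", err)
		return
	}
	fmt.Println("time.Now calls completed:", n)
}
```

Running a loop like this for longer, and on the same base image as the affected service, would be the natural next step when hunting for a smaller reproducer.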
Change https://golang.org/cl/219118 mentions this issue: |
The text in the commit message of CL 219118 included:
Unfortunately, GitHub parses that as "Fix[es] #37127." and automatically closed this issue. I don't think that was intended, so re-opening. /cc @cherrymui |
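The auto-close behavior described above comes from keyword matching on commit messages: a closing keyword followed by an issue reference is enough, regardless of the words in front of it. The sketch below approximates that matching with a regular expression. The pattern and the function closedIssues are simplified assumptions, not GitHub's actual implementation, and the sample messages are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// closingRef approximates closing-keyword detection: one of the keywords
// close/fix/resolve (with common suffixes), followed by "#<number>".
// This is a simplified assumption, not GitHub's real parser.
var closingRef = regexp.MustCompile(`(?i)\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)`)

// closedIssues returns the issue numbers that a commit message would close
// under the approximation above.
func closedIssues(msg string) []string {
	var out []string
	for _, m := range closingRef.FindAllStringSubmatch(msg, -1) {
		out = append(out, m[1])
	}
	return out
}

func main() {
	// The keyword still matches when it is preceded by a hedge such as "May".
	fmt.Println(closedIssues("May fix #37127."))
	// A phrasing without a closing keyword does not match.
	fmt.Println(closedIssues("Updates #37127."))
}
```

Under this approximation, hedging a fix in a commit message requires avoiding the closing keywords altogether.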
Given that the crash is hard to reproduce, I cannot tell for sure whether the problem goes away with this, so "may fix". I guess we can close this and, if it didn't work, reopen it. Or we keep it open and let the reporter confirm. |
I am still working on narrowing down the failure that happens on ppc64le. Should I open a separate issue for that? I believe it began when the page allocator changed in early November, but I am still trying to verify that and find a smaller reproducer. |
@laboger yeah, I think it's better to open a new one. Quick question: do you have profiling turned on? If not, it is clearly a different problem. |
@cherrymui It is not happening with profiling but fails with -race -d=checkptr=0. If I turn off -race it gets a different error. I'll open a new issue. It is the same error output about the symbol table. |
I'm eager to try your fix @cherrymui. Will this be released in something like an |
Yeah, there might be one ~next week or so. You could also check out Go tip, which is very close to 1.14rc1, with just a small number of fixes. Thanks! |
Removing release-blocker, since we think this is probably fixed. If it turns out not to be, we can continue working on this and perhaps issue a fix in a point release. |
Hey all, just tested the changes in tip and wanted to confirm that I no longer see the problem on |
Thanks for confirming @tonyghita! I believe this issue is resolved then. It was waiting on feedback from you, the original reporter. I'll close it because there's nothing left to do. If there's anything else, please let us know! |
Thanks @tonyghita |
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Indeed
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
Built my application with the latest go1.14rc1 image on the official Docker registry, copied it to an alpine-3.8 based image and ran traffic through it.
What did you expect to see?
No crashes. This application has not exhibited this fatal error before go1.14rc1 (currently on go1.13.7).
What did you see instead?