cmd/compile: incorrect compilation of specific functions in wasm #68156
Comments
cc @aclements |
Both
~/cogent/core/examples/basic/bin/web/ > wasm2wat app.wasm -o app.wat
error: label stack exceeds max nesting depth
102a212: error: OnBlockExpr callback failed
This also creates a much easier, faster, and more reliable way to check whether an app will fail, which removes the need for an Android device and the
GOOS=js GOARCH=wasm go build -o app.wasm && wasm2wat app.wasm -o app.wat
|
I have reduced the code necessary to reproduce the issue to this:

package main

import (
	"github.com/jinzhu/copier"
	_ "github.com/yuin/goldmark"
)

func main() {
	copier.Copy(&struct{}{}, struct{}{})
}

You just have to run this command with that code to reproduce the issue:

GOOS=js GOARCH=wasm go build -o app.wasm && wasm2wat app.wasm -o app.wat
|
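The build-and-convert check above can be wrapped in a small Go program. The following is a minimal sketch, not part of the original report: the helper names (wasmBuildCmd, wasm2watCmd, run) and the -run flag are made up for illustration, and it assumes that go and wasm2wat (from WABT) are on PATH. By default it only prints the two commands; with -run it executes them in the current directory and reports whether wasm2wat accepts the output.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// wasmBuildCmd returns the `go build` invocation from the thread, with
// GOOS=js GOARCH=wasm appended to the inherited environment.
// (Hypothetical helper name, for illustration only.)
func wasmBuildCmd(out string) *exec.Cmd {
	cmd := exec.Command("go", "build", "-o", out)
	cmd.Env = append(os.Environ(), "GOOS=js", "GOARCH=wasm")
	return cmd
}

// wasm2watCmd returns the wasm2wat invocation from the thread.
// (Hypothetical helper name, for illustration only.)
func wasm2watCmd(in, out string) *exec.Cmd {
	return exec.Command("wasm2wat", in, "-o", out)
}

// run executes cmd and wraps any failure together with the tool's output.
func run(cmd *exec.Cmd) error {
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("%s: %v\n%s", strings.Join(cmd.Args, " "), err, out)
	}
	return nil
}

func main() {
	// -run is an assumption of this sketch, not a flag of any existing tool.
	doRun := flag.Bool("run", false, "execute the commands instead of printing them")
	flag.Parse()

	build := wasmBuildCmd("app.wasm")
	convert := wasm2watCmd("app.wasm", "app.wat")

	if !*doRun {
		// Dry run: show what would be executed.
		fmt.Println("GOOS=js GOARCH=wasm " + strings.Join(build.Args, " "))
		fmt.Println(strings.Join(convert.Args, " "))
		return
	}
	for _, cmd := range []*exec.Cmd{build, convert} {
		if err := run(cmd); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
	fmt.Println("wasm2wat accepted app.wasm")
}
```

A non-zero exit from the wasm2wat step is the signal of interest here; the build step is expected to succeed even for the failing programs described in this thread.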
Inside of

Also, just to be clear, we are not actually running the wasm file in the latest reproduction of this issue, just trying to process it, so this has nothing to do with these functions causing issues at runtime. This is only about the compiled wasm. |
Actually, you can reproduce the issue even without

package main

import (
	"bytes"

	"github.com/yuin/goldmark"
)

func main() {
	goldmark.New().Convert([]byte{}, &bytes.Buffer{})
}

And the same command as above:

GOOS=js GOARCH=wasm go build -o app.wasm && wasm2wat app.wasm -o app.wat

The only way to get this code to work is to comment out |
Also note that in other situations with more complicated code, even commenting out the contents of |
cc @golang/wasm |
By copying code from goldmark and experimenting, I have gotten this down to a minimal standalone reproducible example, which I have attached here (it is a

Here is a simple list of the steps required to reproduce this issue now:

GOOS=js GOARCH=wasm go build -o app.wasm && wasm2wat app.wasm -o app.wat

You should observe the following error indicating that the compiled wasm is invalid:

Note that this same issue results not only in this error but also

Also note that this is not the only way to reproduce this issue, and there are other combinations of functions that result in the same issue as discussed above. |
@kkoreilly If I understand correctly, this is unrelated to the go command itself, right? It sounds like this is perhaps an issue in the compiler, or elsewhere in the toolchain? I'm asking so we can get the right people to look at it. |
Yes, it is an issue with the compiler for wasm, not the go command itself. |
A further note overall: the wasm2wat command is the most sensitive indicator of the issue, and even when it is just under the threshold of crashing, it produces extremely large .wat files and takes a long time to run. This is consistent with some kind of cumulative "overloading" problem, and might explain the otherwise strange interactions with the copier package, and the fact that even when we eliminated the offending

Chromium on the Android platform appears to be the most sensitive to this overloading problem, as indicated by this report: https://issues.chromium.org/issues/40770034 and others like it. Interestingly, that report contains a reference to this other Go project: https://github.com/bvisness/golang-wasm-turbofan-crash. The author there notes:

So this is consistent with what we are observing. I could not find a prior Go issue report for this case.

Finally, as the above issue notes, the problem arises when this code is being optimized by the TurboFan stage of Chromium wasm loading (https://v8.dev/docs/wasm-compilation-pipeline), which explains the otherwise puzzling delay before the crash appears on an Android device. We were able to modify the Chrome flags to disable the baseline loader and only use TurboFan, which resulted in a deterministic crash after the optimization process presumably ran out of memory.

Given how sensitive wasm-opt and wasm2wat are to this issue, and the clearly large amount of RAM being consumed (wasm-opt went over 48GB on my MacBook Pro with 64GB), it is likely that fixing this branchy code generation will benefit all platforms, both in wasm loading and in reducing the load on the TurboFan step so that the optimized code is available sooner. Therefore, it would seem that this should be a very high priority issue overall! |
@rcoreilly One reason for this is WebAssembly/design#796 (comment). |
@neelance so it looks like maybe there are some possible solutions but we shouldn't be holding our breath? :) |
Also, in the meantime, is there any documentation for how to avoid some of the worst-case scenarios, e.g., for initializing maps or other problematic cases? |
CC @golang/compiler |
UPDATE: I found a much simpler way to reproduce this issue; see #68156 (comment)
Go version
go version go1.22.4 darwin/arm64
Output of go env in your module/workspace:

What did you do?
Apologies for the convoluted steps for reproducing. I tried for several hours to get a more reproducible example, but I could not. Somehow there is some deleterious interaction between these various components, and removing any one of them stops the issue. This may seem like an issue to do with one of these third party libraries, but as you will see, this behavior seems to almost certainly be an issue with the compiler or Chromium itself.
I made a go.work workspace containing local clones of the main/master branches of Cogent Core (https://github.com/cogentcore/core) and goldmark (https://github.com/yuin/goldmark). Then, I went to Cogent Core (cd core) and installed the Cogent Core tool (go install ./cmd/core). Then, I went to the basic example in Cogent Core (cd examples/basic) and ran the program for web using core run web, a command that runs go build using GOOS=js GOARCH=wasm under the hood. Then, I went to port 8080 of my computer's local IP address (http://192.168.1.x:8080; you can run ifconfig to get yours) in Google Chrome on an Android phone (mine is a Pixel 7 Pro running Android 14 with Chrome 126.0.6478.72). I observed that everything worked as expected.

Then, I changed the following line in basic.go in examples/basic from

to

Then, I repeated the core run web step from above, opened a new tab on the Android phone, and went to the same URL again. After waiting for some amount of time (no more than a minute), I observed the issue described below.

What did you see happen?
After following all of the steps above, the browser tab reliably turns from correctly rendering the app to displaying an "Aw, Snap!" error screen (as in https://support.google.com/chrome/answer/95669). Nothing is printed to the console and no further information about a specific error is given.

What did you expect to see?
I expected to see the browser tab continue to correctly render the app.
After several hours of debugging, I have determined that the cause of the issue can be narrowed down to a single function declared in goldmark: util.ReadWhile. When I change the body of this function to just be panic("ReadWhile"), not only does the issue described above cease to happen, the app does not panic. This means that we are actually not even calling this function in the first place, yet somehow its contents are causing bizarre app crashes. I tried several variations of this function: println("ReadWhile") with a return 0, false afterward results in no crashing and no print statement, whereas the same but with fmt.Println("ReadWhile") results in crashing just like the original body of the function. Several variations on the argument and loop structure of the function have not produced any changes in the crashing behavior; you basically have to remove the entire body of the function for it to stop crashing. Changing the name to SomethingElse also does not stop the crashing. I also tried moving this ReadWhile function elsewhere, but somehow I could not replicate the crash without it being in the same place in goldmark.

The fact that the innocuous contents of a function that is never called lead the program to randomly crash without any further details suggests that this is not a coding error on the part of a third-party library but rather some issue with the way that the compiler is building wasm files, or the way that Chromium on Android is handling them. Cogent Core has its own custom wasm_exec.js file, but I verified that the issue still occurs with the standard version contained in misc.

I have only been able to reproduce this issue in Chrome on Android and other Chromium browsers on Android such as Kiwi Browser. It does not occur in Firefox on Android, Safari on iOS, or Chrome on macOS, Ubuntu, or Windows.
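To make the ReadWhile experiments described above concrete, here is a sketch of the three replacement bodies as standalone functions. This is an illustration only: the function names are hypothetical, and the parameter list is an assumption (the report only implies an (int, bool) result through its return 0, false). In the report the function is never called; main calls two of the stubs here only so that the sketch runs.

```go
package main

import "fmt"

// Assumed parameter list for goldmark's util.ReadWhile; only the (int, bool)
// result is implied by the report. All names below are hypothetical.

// readWhilePanic: body replaced with a panic. Reported result: no crash and
// no panic, which suggests the function is never called.
func readWhilePanic(source []byte, index [2]int, pred func(byte) bool) (int, bool) {
	panic("ReadWhile")
}

// readWhileBuiltinPrint: builtin println plus a fixed return. Reported
// result: no crash and nothing printed.
func readWhileBuiltinPrint(source []byte, index [2]int, pred func(byte) bool) (int, bool) {
	println("ReadWhile")
	return 0, false
}

// readWhileFmtPrint: same shape, but with fmt.Println. Reported result:
// crashes just like the original body.
func readWhileFmtPrint(source []byte, index [2]int, pred func(byte) bool) (int, bool) {
	fmt.Println("ReadWhile")
	return 0, false
}

func main() {
	isSpace := func(b byte) bool { return b == ' ' }
	src := []byte("  x")

	// Called here only to show that the stubs compile and return (0, false).
	n, ok := readWhileBuiltinPrint(src, [2]int{0, len(src)}, isSpace)
	fmt.Println(n, ok)
	n, ok = readWhileFmtPrint(src, [2]int{0, len(src)}, isSpace)
	fmt.Println(n, ok)
}
```

The only difference between the second and third stubs is the call to fmt.Println instead of the builtin println, which matches the report's observation that the crash depends on the contents of the function body rather than on whether the function runs.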