[wasi] Cannot find host wasmtime #100394

Closed
pavelsavara opened this issue Mar 28, 2024 · 22 comments · Fixed by #101599
Labels
arch-wasm WebAssembly architecture area-Build-mono blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' Known Build Error Use this to report build issues in the .NET Helix tab os-wasi Related to WASI variant of arch-wasm

Comments

@pavelsavara
Member

pavelsavara commented Mar 28, 2024

Log

    Wasi.Build.Tests.ILStripTests.WasmStripILAfterAOT_TestDefaultAndOverride(stripILAfterAOT: "", expectILStripping: True, singleFileBundle: False) [FAIL]
       Expected 0 exit code but got 255: /root/helix/work/workitem/e/dotnet-latest/dotnet run --no-silent --no-build -c Release
      Standard Output:
      [Release_3ubcj3n5_xxb] WasmAppHost --runtime-config /root/helix/work/workitem/e/wbt artifacts/Release_3ubcj3n5_xxb/bin/Release/net9.0/wasi-wasm/AppBundle/Release_3ubcj3n5_xxb.runtimeconfig.json --no-silent
      [Release_3ubcj3n5_xxb] Error: Cannot find host wasmtime: Tried to look for wasmtime in PATH: /root/helix/work/workitem/e/dotnet-latest, /root/helix/work/correlation/dotnet-latest, /root/helix/work/correlation/xharness-cli, /root/helix/work/correlation/dotnet-cli, /root/helix/work/correlation/wasi-sdk, /root/helix/work/correlation/wasmtime, /home/helixbot/.jsvu/bin, /usr/local/sbin, /usr/local/bin, /usr/sbin, /usr/bin, /sbin, /bin .

Build Information

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=621924

Error Message

{
  "ErrorMessage": "Cannot find host wasmtime",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=621924
Error message validated: [Cannot find host wasmtime]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 3/28/2024 9:09:15 AM UTC

Report

Build | Definition | Test | Pull Request
--- | --- | --- | ---
645396 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100141
645013 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100272
644758 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #101095
640285 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100157
640282 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100938
639606 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100661
639437 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100520
638169 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100407
637618 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100407
637478 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100858
637169 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100847
637144 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100846
635807 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100801
635488 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100386
634844 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100610
634832 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #91042
634711 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100767
633684 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish |
631271 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100618
630000 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100407
629576 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100610
629024 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #98492
628517 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #98492
627438 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #96169
626872 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100517
624831 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100402
623845 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100283
623635 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100141
622955 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #96169
621924 | dotnet/runtime | Wasi.Build.Tests.WasiTemplateTests.ConsoleBuildThenRunThenPublish | #100251

Summary

24-Hour Hit Count | 7-Day Hit Count | 1-Month Count
--- | --- | ---
0 | 0 | 30

@pavelsavara pavelsavara added Known Build Error Use this to report build issues in the .NET Helix tab os-wasi Related to WASI variant of arch-wasm labels Mar 28, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Mar 28, 2024
@pavelsavara pavelsavara added the arch-wasm WebAssembly architecture label Mar 28, 2024
Contributor

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Mar 28, 2024
@pavelsavara pavelsavara added area-Build-mono and removed untriaged New issue has not been triaged by the area owner needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Mar 28, 2024
@pavelsavara pavelsavara added this to the 9.0.0 milestone Mar 28, 2024
@pavelsavara
Member Author

cc @akoeplinger

@akoeplinger
Member

It looks like wasmtime is missing from the correlation payload which suggests to me something goes wrong here:

** Staging system directories before use as helix payloads **
Helix tries to write a `.payload` file to the payload directory, but if that is
not writable then it needs to be staged first. For example:
<HelixDependenciesToStage Condition="'$(NeedsWasmtime)' == 'true'" SourcePath="$(WasmtimeDir)" Include="$(WasmtimeDirForHelixPayload)" />

@maraf
Member

maraf commented Apr 4, 2024

Red herring - Windows investigation - on Windows everything works

From Build.binlog
WasmtimeDir = D:\a\_work\1\s\artifacts\obj\wasmtime\
Downloading from "https://github.com/bytecodealliance/wasmtime/releases/download/v5.0.0/wasmtime-v5.0.0-x86_64-windows.zip" to "D:\a\_work\1\s\artifacts\obj\wasmtime-v5.0.0-x86_64-windows.zip" (3995602 bytes).
[7 images]

From SendToHelix.binlog
WasmtimeDir = D:\a\_work\1\s\artifacts\obj\wasmtime\
WasmtimeDirForHelixPayload = D:\a\_work\1\s\artifacts\obj\helix-staging\\wasmtime

One of the correlation payloads was empty (an empty zip): https://helixde107v0xdeko0k025g8.blob.core.windows.net/helix-job-887530b8-8cc0-4dab-a4c8-242ff41fa82b18102e030f94a06bf/c7d44711-29d4-4ff7-944f-947d4f8dea8a.zip

It all seems correct from the binlog :/

From Build.binlog
image
image
We don't provision wasmtime on Linux. It's preinstalled in the image. Could it be that the copy fails because of some symlink or something?

I tried ignoring WASMTIME_PATH (the system-provided wasmtime) in #100664 and it worked. The correlation payload that is normally empty now contains wasmtime.

@maraf
Member

maraf commented Apr 5, 2024

I'm not seeing it anymore

EDIT: It still happens sometimes, and only on Linux lanes

@pavelsavara pavelsavara added the blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' label Apr 8, 2024
@maraf
Member

maraf commented Apr 9, 2024

It looks like some AzDO machines have a "broken" wasmtime installation. The correlation payload for wasmtime is an empty zip on them, while on the others it contains wasmtime
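
A quick way to tell a "broken" payload from a good one is to check whether the downloaded zip has any file entries at all. The following is a minimal sketch using only the Python standard library; the helper names `is_empty_payload` and `make_zip` are hypothetical and not part of Helix or the runtime repo, and the demo builds its archives in memory instead of downloading a real payload.

```python
import io
import zipfile


def is_empty_payload(zip_bytes):
    """Return True when the archive contains no file entries."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Directory entries end with "/", so only real files count.
        return not any(not name.endswith("/") for name in archive.namelist())


def make_zip(files):
    """Build an in-memory zip from a name -> bytes mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


if __name__ == "__main__":
    print(is_empty_payload(make_zip({})))                       # True
    print(is_empty_payload(make_zip({"wasmtime": b"binary"})))  # False
```

For a real payload, the bytes would come from the downloaded correlation zip instead of `make_zip`.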

@lewing
Member

lewing commented Apr 12, 2024

@dotnet/dnceng this is an intermittent failure on Linux that appears to be container related; nothing is changing on our side

@maraf
Member

maraf commented Apr 12, 2024

Missing wasmtime
image

Working wasmtime
image

The only difference is the Worker Build Number (and the Request Id). Other builds failed with the same worker build number that passed here

@maraf
Member

maraf commented Apr 12, 2024

The content of wasmtime on working case (it's not a symlink or something)
image

@pavelsavara
Member Author

The content of wasmtime on working case (it's not a symlink or something)

Do we have the same for the non-working image?

@maraf
Member

maraf commented Apr 12, 2024

Not yet. I'm starting to think about merging this info into main

@maraf
Member

maraf commented Apr 16, 2024

Info from build
Build
SendToHelix.binlog

Source folder

Exec
    Assembly = Microsoft.Build.Tasks.Core, Version=15.1.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
    Parameters
        Command = ls -lsa /usr/local/wasmtime
    CommandLineArguments = ls -lsa /usr/local/wasmtime
    total 19764
        4 drwxr-xr-x 2 root root                       4096 Mar 26 19:59 .
        4 drwxr-xr-x 1 root root                       4096 Mar 26 19:59 ..
       12 -rw-r--r-- 1 1001 azure_pipelines_docker    12243 Jan 20  2023 LICENSE
        8 -rw-r--r-- 1 1001 azure_pipelines_docker     6610 Jan 20  2023 README.md
    19732 -rwxr-xr-x 1 1001 azure_pipelines_docker 20204640 Jan 20  2023 wasmtime
        4 -rw-r--r-- 1 root root                          6 Mar 26 19:59 wasmtime-version.txt

Correlation folder

Exec
    Assembly = Microsoft.Build.Tasks.Core, Version=15.1.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
    Parameters
        Command = ls -lsa /__w/1/s/artifacts/obj/helix-staging/wasmtime
    CommandLineArguments = ls -lsa /__w/1/s/artifacts/obj/helix-staging/wasmtime
    total 19764
        4 drwxr-xr-x 2 cloudtest_azpcontainer cloudtest_azpcontainer     4096 Apr 16 08:26 .
        4 drwxr-xr-x 4 cloudtest_azpcontainer cloudtest_azpcontainer     4096 Apr 16 08:26 ..
       12 -rw-r--r-- 1 cloudtest_azpcontainer cloudtest_azpcontainer    12243 Jan 20  2023 LICENSE
        8 -rw-r--r-- 1 cloudtest_azpcontainer cloudtest_azpcontainer     6610 Jan 20  2023 README.md
    19732 -rwxr-xr-x 1 cloudtest_azpcontainer cloudtest_azpcontainer 20204640 Jan 20  2023 wasmtime
        4 -rw-r--r-- 1 cloudtest_azpcontainer cloudtest_azpcontainer        6 Mar 26 19:59 wasmtime-version.txt

image

Correlation payload is empty https://helixde107v0xdeko0k025g8.blob.core.windows.net/helix-job-688688a5-2889-4ac0-a64f-0493e42e4d7ef03a1c00e8144989a/df58291e-6f6f-455a-befb-d41e018b2fab.zip?helixlogtype=result

EDIT:

There are two Helix jobs (workloads and no-workloads), and they share correlation payloads. The ls -lsa for the second one prints an empty correlation folder for wasmtime

The target RunInParallelForEachScenario runs two sendtohelixhelp.proj instances in parallel. The first one correctly stages the correlation payloads; the second one skips staging, as the log shows: Target "StageDependenciesForHelix" skipped, due to false condition; (@(__HelixDependenciesToStageThatDontExist->Count()) > 0) was evaluated as (0 > 0).

I don't know yet what deletes the files, so that the first job doesn't pack them correctly
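
To see why the second invocation can skip staging, here is a minimal Python sketch of a condition of that shape. It assumes, based only on the item name `__HelixDependenciesToStageThatDontExist`, that the list is computed from a directory-existence check; the thread does not show the actual definition, and the function names below are hypothetical.

```python
import os
import tempfile


def dependencies_that_dont_exist(staging_dirs):
    """Assumed analogue of the item list: staging dirs not yet on disk."""
    return [path for path in staging_dirs if not os.path.isdir(path)]


def should_stage(staging_dirs):
    """Analogue of the target condition: Count() > 0."""
    return len(dependencies_that_dont_exist(staging_dirs)) > 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        staging = os.path.join(root, "helix-staging", "wasmtime")

        # First invocation: the folder is missing, so staging runs.
        print(should_stage([staging]))  # True

        # Staging has only created the folder so far; nothing is copied yet.
        os.makedirs(staging)

        # Second invocation: the folder exists, so the condition is (0 > 0).
        print(should_stage([staging]))  # False
        print(os.listdir(staging))      # []
```

Under that assumption, an existence check cannot distinguish a fully populated staging folder from one that another invocation has only just created.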

@lewing
Member

lewing commented Apr 24, 2024

Is this still matching the correct failures?

@maraf
Member

maraf commented Apr 25, 2024

@pavelsavara You saw this on lib tests, right? But all of the matches are from WBT 🤔

@maraf

This comment was marked as outdated.

@pavelsavara
Member Author

pavelsavara commented Apr 25, 2024

@pavelsavara You saw this on lib tests, right? But all of the matches are from WBT 🤔

I just saw ls -lsa twice because BuildHelixItems -> PrepareHelixCorrelationPayload_Wasi is called twice: in Build and in Test, which are separate MSBuild runs. I don't know why.

Also, we won't see it after #101392

I'm not sure about that, why do you think so?
It will get downloaded every time

  • as before on Windows
  • now as well on Linux

@maraf
Member

maraf commented Apr 25, 2024

I'm not sure about that, why do you think so?

My bad, sorry

@pavelsavara
Member Author

pavelsavara commented Apr 25, 2024

Well, maybe it will fix it, because you saw it on Linux, which was using the pre-installed wasmtime. Now it will be fresh every time.

@maraf
Member

maraf commented Apr 25, 2024

Well, maybe it will fix it, because you saw it on Linux, which was using the pre-installed wasmtime. Now it will be fresh every time.

That was my earlier thinking, but it now looks like the files are always there and are then deleted from the staging folder

@maraf
Member

maraf commented Apr 25, 2024

Theory

  • sendtohelixhelp.proj runs in parallel for two scenarios (workloads, noworkloads) EDIT: not scenarios, but the same loop
    • The first run executes StageDependenciesForHelix, because @(__HelixDependenciesToStageThatDontExist->Count()) > 0
        1. It creates folders for payloads
        2. It copies (a lot of) files
        3. Copying files finishes
        4. It executes SendToHelixJob, but the zips are already uploaded, so it reuses them
    • The second run doesn't execute StageDependenciesForHelix, because @(__HelixDependenciesToStageThatDontExist->Count()) == 0
        1. Folders for payloads already exist
        2. It passes them to SendToHelixJob, which zips the folders and uploads them

According to this theory, the WASI SDK payloads shouldn't always be complete
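
The interleaving in this theory can be replayed deterministically. The sketch below is only an illustration in Python, not the MSBuild or Helix code: it performs the steps of the two runs by hand in the order the theory suggests, with placeholder file contents and a hypothetical `zip_folder` helper standing in for the packing done by SendToHelixJob.

```python
import io
import os
import tempfile
import zipfile


def zip_folder(folder):
    """Zip the files present in the folder right now; return entry names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(os.listdir(folder)):
            archive.write(os.path.join(folder, name), arcname=name)
    with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
        return archive.namelist()


def simulate():
    with tempfile.TemporaryDirectory() as root:
        staging = os.path.join(root, "wasmtime")

        # First run, step 1: create the payload folder (still empty).
        os.makedirs(staging)

        # Second run: the folder exists, so staging is skipped and the
        # folder is zipped and "uploaded" in its current state.
        uploaded = zip_folder(staging)

        # First run, steps 2-3: the copy finishes only now.
        for name in ("LICENSE", "README.md", "wasmtime"):
            with open(os.path.join(staging, name), "wb") as handle:
                handle.write(b"placeholder")

        # First run, step 4: the zip is already uploaded and gets reused,
        # even though the folder on disk is complete by now.
        return uploaded, sorted(os.listdir(staging))


if __name__ == "__main__":
    uploaded, on_disk = simulate()
    print(uploaded)  # []
    print(on_disk)   # ['LICENSE', 'README.md', 'wasmtime']
```

In this ordering the staging folder looks complete when listed afterwards, while the archive that was uploaded is empty, which is consistent with the ls -lsa output and the empty correlation payload reported earlier in the thread.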

@pavelsavara
Member Author

workloads, noworkloads are not scenarios

@lewing
Member

lewing commented Apr 26, 2024

reopening for a few days to let the fix make it into pr builds

@lewing lewing closed this as completed Apr 26, 2024
@github-actions github-actions bot locked and limited conversation to collaborators May 27, 2024
4 participants