PVF: include host architecture in artifact filenames #2871
On node restart, artifacts not matching current host architecture get pruned
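A minimal sketch of what such a pruning pass could look like, using only the standard library. The filename layout (an artifact name ending in `_<arch>`) and the helper names `matches_host` and `prune_foreign_artifacts` are assumptions made for illustration, not the naming scheme or functions used in this PR.

```rust
use std::io;
use std::path::Path;

/// Hypothetical naming rule: an artifact belongs to this host if its
/// file name ends with `_<host_arch>`.
fn matches_host(file_name: &str, host_arch: &str) -> bool {
    file_name.ends_with(&format!("_{}", host_arch))
}

/// Removes every regular file in `dir` whose name does not match the
/// host architecture and returns how many files were removed.
fn prune_foreign_artifacts(dir: &Path, host_arch: &str) -> io::Result<usize> {
    let mut removed = 0;
    for entry in std::fs::read_dir(dir)? {
        let entry = entry?;
        let name = entry.file_name();
        // Non-UTF-8 names cannot match the rule, so they are pruned too.
        let keep = name.to_str().map_or(false, |n| matches_host(n, host_arch));
        if !keep && entry.file_type()?.is_file() {
            std::fs::remove_file(entry.path())?;
            removed += 1;
        }
    }
    Ok(removed)
}
```

On startup this would be called once with the cache directory and something like `std::env::consts::ARCH`, before any artifact is loaded.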
The CI pipeline was cancelled due to the failure of one of the required jobs.
Wasn't the new VM on the same architecture (x86_64)? Looking at the issue, it seems the new VM didn't support
@ordian I didn't know that! How does wasmtime determine which instructions are available? If this is too much of a complication, we can just remove the artifact persistence. It's indeed not a necessity but an optimization, which, with PolkaVM's greatly reduced compile times, may just be unnecessary.
Internally, it's surely implemented using the CPUID instruction, but I'm pretty sure there are a lot of crates that encapsulate that functionality, and Wasmtime uses one of them. One idea is to include the raw CPUID in the artifact name (after all, that is what "architecture descriptor" would stand for in a wider sense).
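As a rough illustration of an "architecture descriptor" in that wider sense, the sketch below builds a string from the target architecture plus a few CPU features detected at runtime with the standard library's `is_x86_feature_detected!` macro. The helper name `host_descriptor` and the particular feature list are assumptions; this is not what Wasmtime or cranelift do internally, and the list is deliberately incomplete.

```rust
/// Hypothetical helper: describes the host as "<arch>+<feature>+...".
/// Only a small, arbitrary subset of x86 features is probed here.
fn host_descriptor() -> String {
    let mut parts: Vec<&str> = vec![std::env::consts::ARCH];

    // Runtime feature detection is only available through this macro on x86.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("sse4.2") {
            parts.push("sse4.2");
        }
        if std::is_x86_feature_detected!("avx") {
            parts.push("avx");
        }
        if std::is_x86_feature_detected!("avx2") {
            parts.push("avx2");
        }
        if std::is_x86_feature_detected!("bmi2") {
            parts.push("bmi2");
        }
    }

    parts.join("+")
}
```

Two hosts that report the same architecture but differ in any of the probed features would get different descriptors, which is the property the artifact name would need. Features outside the probed list would still go unnoticed.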
Looks like it (well, cranelift) uses
I'm a bit hesitant about the "mostly". 😅 Will do more research.
I'm not sure if CPUID encapsulates all possible features (it seems so based on the code in std, but I'm not sure it's a guarantee), and it looks like it's mainly an x86 thing. I also saw that std checks features a different way on arm. After a call with @eskimor we think it makes sense to re-enable the artifact pruning if there are no objections?
Considering the upcoming changes, I'm for it. A node restart is not something that should happen often. We can revert and get back to pruning artifacts just to be on the safe side. (I just hope that nobody will figure out how to change the host architecture without restarting the node 😁)
Yeah, just saying "architecture" was a little bit dumb of me. Maybe we could use something like this: https://docs.rs/machine-uid/latest/machine_uid/?
That is a great idea! I am just not sure how machine-id would work on a VM. VMs should be reproducible, so it's not clear if the machine-id is guaranteed to be different when e.g. cloning a VM.
Hmm yeah. Can we not request the compilation options that wasmtime uses and then hash these options?
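A sketch of the hashing idea, under the assumption that the compiler's options can be obtained as plain key/value strings. The function names, the flag names used in the usage note, and the `<code_hash>_<fingerprint>` filename layout are all hypothetical, and no wasmtime or cranelift API is called here. `DefaultHasher` is used only to keep the example free of dependencies; its algorithm is not guaranteed to stay the same across Rust releases, so a persisted cache would need a hash with a specified algorithm.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hashes the target name and the (sorted) compile options into a
/// 16-character hex fingerprint. `BTreeMap` keeps iteration order
/// deterministic, so equal option sets give equal fingerprints.
fn settings_fingerprint(target: &str, flags: &BTreeMap<String, String>) -> String {
    let mut hasher = DefaultHasher::new();
    target.hash(&mut hasher);
    flags.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Hypothetical filename layout: `<code_hash>_<fingerprint>`.
fn artifact_file_name(code_hash: &str, fingerprint: &str) -> String {
    format!("{}_{}", code_hash, fingerprint)
}
```

For example, a map containing an illustrative flag such as `"has_avx2" => "true"` yields a different fingerprint than the same map with `"false"`, so an artifact compiled for one configuration would simply not be found under the other.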
Looks like cranelift exposes it here, and it implements
Indeed, it looks quite doable. We can also get what cranelift sees as the host target this way. The only issue I see is wasmtime possibly overriding these settings for some reason. It seems safer to remove the artifact persistence, as it's not really needed anyway?
Personally, I would suggest doing what I proposed in the original PR: never persist the cache between node restarts (by e.g. just blindly deleting everything on startup), and just use a long randomly generated hex string as the filename to guarantee that there can't be any collisions on the disk. This is dead simple, and even if the artifacts are not cleared out for some reason, it won't try to accidentally load them.
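The two parts of that suggestion can be sketched with the standard library alone. All names here (`random_hex`, `reset_cache_dir`, `new_artifact_path`, the `artifact-` prefix) are hypothetical. The randomness comes from the randomly seeded `RandomState` hasher, which is only a dependency-free stand-in; a real implementation would presumably use a proper random number generator.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::io;
use std::path::{Path, PathBuf};

/// Returns `words * 16` hex characters. Each `RandomState` is created
/// with fresh random keys, so every word differs between calls.
/// Stand-in only: not a cryptographically secure generator.
fn random_hex(words: usize) -> String {
    (0..words)
        .map(|_| format!("{:016x}", RandomState::new().build_hasher().finish()))
        .collect()
}

/// Blindly deletes the whole cache directory (if present) and recreates
/// it empty, so nothing survives a node restart.
fn reset_cache_dir(dir: &Path) -> io::Result<()> {
    match std::fs::remove_dir_all(dir) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    std::fs::create_dir_all(dir)
}

/// Picks a fresh, random file name inside the cache directory.
fn new_artifact_path(dir: &Path) -> PathBuf {
    dir.join(format!("artifact-{}", random_hex(4)))
}
```

Because the name is never derived from the code hash or the host, a stale file left behind by a failed cleanup can never be looked up by a later run; the mapping from code to path lives only in memory.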
I like that. I'm going to start implementing it because this issue should be fixed in tomorrow's release.
I raised #2895. If that looks fine, I can close this PR.
Considering the complexity of #2871 and the discussion therein, as well as the further complexity introduced by the hardening in #2742, as well as the eventual replacement of wasmtime by PolkaVM, it seems best to remove this persistence as it is creating more problems than it solves.
## Related
Closes #2863
Potential Follow-ups / Open Questions
This PR attempts an immediate fix for #2863. There are some potential follow-ups as well as open questions:
RuntimeConstruction errors?
I am not sure if these are real concerns. We can either implement these follow-ups and leave a note above the exec params about the concerns, or we just skip doing this. @s0me0ne-unkn0wn
Related
Closes #2863