Rust 1.78 introduced a panic in our application #41

Closed
elpiel opened this issue May 7, 2024 · 13 comments

@elpiel

elpiel commented May 7, 2024

First of all, thank you very much for working on this allocator (it seems that it's still the only one feature-rich enough for wasm in Rust)!

After a recent upgrade to Rust 1.78 our application started getting a panic message:

panicked at /rust/deps/dlmalloc-0.2.6/src/dlmalloc.rs:1198:13:
assertion failed: psize <= size + max_overhead

I've tried re-compiling the wasm module on 1.76 and 1.77 and did not get the panic message.
I believe this could be related to recent changes in the Rust compiler which are causing this misbehavior:

https://blog.rust-lang.org/2024/05/02/Rust-1.78.0.html#asserting-unsafe-preconditions
https://blog.rust-lang.org/2024/05/02/Rust-1.78.0.html#deterministic-realignment

Honestly, I don't feel competent enough to dig into the dlmalloc code and debug our wasm module but I'll be happy to get some guidance or help with debugging this issue.

PS: Our application is open-source and the wasm build can be found here: https://github.com/Stremio/stremio-core-web

@alexcrichton
Owner

cc @SFBdragon and #37, would you be able to help investigate this?

@alexcrichton
Owner

Also, @elpiel, could you detail how to reproduce this with the repository you linked?

@elpiel
Author

elpiel commented May 7, 2024

  1. Make sure you use Rust 1.78
  2. Clone https://github.com/Stremio/stremio-core-web
  3. Build wasm package - ./scripts/build.sh (by default it will build with release), I'm using wasm-pack 0.12.1
  4. Clone https://github.com/Stremio/stremio-web
  5. Update package.json so that the stremio-core-web dependency points to the local stremio-core-web folder, e.g. file:../stremio-core-web
  6. npm i && npm start
  7. open https://localhost:8080 and open the console

@elpiel
Author

elpiel commented May 7, 2024

Although it might not be related, I think it's good to link these issues and repos as well. They mention a nightly bug:

@SFBdragon
Contributor

I'm willing to investigate but I can't say for sure when I'll be able to. Hopefully within a few days.

At this point I'm very tempted to write a "safe" memory allocator to help comprehensively detect allocation issues at runtime, given the (relatively small, I imagine!) number of allocation bugs I've diagnosed in the Rust WASM ecosystem from bug reports.

In the meantime, I'm curious what happens if you substitute dlmalloc for talc, as the latter has more debug assertions (but is actually worse at detecting a straight-up wrong-size deallocation, so it may or may not be informative). @elpiel would it be possible to swap out the allocator temporarily and test that? (disclaimer: I authored Talc, I'm biased 😸 )

@philpax

philpax commented May 8, 2024

We are also seeing this behaviour internally; our WASM modules are dying at random, likely after some allocation of a few dozen kilobytes. I'll try to collect more data and report back.

The initial issue we commented upon was rustwasm/wasm-pack#1389, but I suspect that this is the root cause issue?

@SFBdragon
Contributor

@philpax Thanks for letting us know. Given this seems to be affecting quite a few people, I'll try to make time to dive into this today, then 👍

@philpax

philpax commented May 8, 2024

I wrote a quick test WASM application to determine if it was a fragmentation issue, but I can't reproduce the issue on 1.78.0:

[package]
name = "dlmalloc_test"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
wasm-bindgen = "0.2"

use wasm_bindgen::prelude::*;

static mut LAST_BYTES: Vec<Vec<u8>> = Vec::new();

#[wasm_bindgen]
pub fn allocate(bytes: usize) -> usize {
    unsafe {
        if LAST_BYTES.len() > 10 {
            LAST_BYTES.remove(0);
        }

        LAST_BYTES.push(vec![0; bytes]);

        LAST_BYTES.iter().map(|b| b.len()).sum()
    }
}

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>dlmalloc test</title>
    <script type="module">
      import init, { allocate } from "./pkg/dlmalloc_test.js";

      async function run() {
        await init();
        setInterval(() => {
          let bytesNow = Math.floor(Math.random() * 1000000);
          let bytesActive = allocate(bytesNow);

          let output = document.getElementById("output");
          let msg = `Total: ${bytesActive}b | Now: ${bytesNow}b`;
          output.textContent = msg;
          console.log(msg);
        }, 20);
      }

      run();
    </script>
  </head>
  <body>
    <p id="output"></p>
  </body>
</html>

This runs indefinitely without crashing. I was hoping this would catch it so that I could alter the asserts to print the values, but it looks like it'll take a bit more work to reproduce!

This is definitely using dlmalloc and has the assert available:

# wasm2wat pkg/dlmalloc_test_bg.wasm | grep dlmalloc
  (data (;0;) (i32.const 1048576) "\03\00\00\00\0c\00\00\00\04\00\00\00\04\00\00\00\05\00\00\00\06\00\00\00/rust/deps/dlmalloc-0.2.6/src/dlmalloc.rsassertion failed: psize >= size + min_overhead\00\18\00\10\00)\00\00\00\a8\04\00\00\09\00\00\00assertion failed: psize <= size + max_overhead # rest elided

@SFBdragon
Contributor

SFBdragon commented May 8, 2024

Update:

@elpiel

I don't think this is an issue with dlmalloc. Swapping out dlmalloc for talc as the global allocator triggered almost the exact same error. Talc catches it because a flag it expects to be set for all allocations is not set when the deallocation of a Vec is attempted, which indicates that an incorrect size was passed to the allocator upon deallocation.

Error and stack trace
panicked at /home/sfbea/src/ono/talc/talc/src/talc.rs:472:9:
assertion failed: tag.is_allocated()

Stack:

_callee2$/imports.wbg.__wbg_new_693216e109162396/<@webpack-internal:///../stremio-core-web/stremio_core_web.js:587:27
logError@webpack-internal:///../stremio-core-web/stremio_core_web.js:266:14
_callee2$/imports.wbg.__wbg_new_693216e109162396@webpack-internal:///../stremio-core-web/stremio_core_web.js:586:22
stremio_core_web.wasm.console_error_panic_hook::Error::new::heb2cba416d369fc9@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[53444]:0x1336d7d
stremio_core_web.wasm.console_error_panic_hook::hook_impl::hbe92aeb7afb6029b@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[9978]:0xcc3961
stremio_core_web.wasm.console_error_panic_hook::hook::h0cfcb229301b06a9@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[60654]:0x1397976
stremio_core_web.wasm.core::ops::function::Fn::call::h07a205ee61be0aa3@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[50091]:0x1302595
stremio_core_web.wasm.std::panicking::rust_panic_with_hook::h32c80a64fe4de396@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[19344]:0xf6ed2e
stremio_core_web.wasm.std::panicking::begin_panic_handler::{{closure}}::hd496964d114e98b9@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[26299]:0x10b206c
stremio_core_web.wasm.std::sys_common::backtrace::__rust_end_short_backtrace::h0d4686a7fe3981a4@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[68599]:0x13e0774
stremio_core_web.wasm.rust_begin_unwind@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[53634]:0x1339bdd
stremio_core_web.wasm.core::panicking::panic_fmt::hc7427f902a13f1a9@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[57652]:0x1370979
stremio_core_web.wasm.core::panicking::panic::hb157b525de3fe68d@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[52705]:0x132ba81
stremio_core_web.wasm.talc::talc::Talc<O>::free::h00e3ff37d0215b67@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[6195]:0xaf965c
stremio_core_web.wasm.<talc::talck::Talck<R,O> as core::alloc::global::GlobalAlloc>::dealloc::hf4fb32ab738e7540@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[28568]:0x1107b8b
stremio_core_web.wasm.__rust_dealloc@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[34263]:0x11b9c22
stremio_core_web.wasm.<alloc::alloc::Global as core::alloc::Allocator>::deallocate::h430e1721e944ddbb@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[19706]:0xf8246c
stremio_core_web.wasm.<alloc::raw_vec::RawVec<T,A> as core::ops::drop::Drop>::drop::h3a92634a5e1074d2@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[21145]:0xfcca66
stremio_core_web.wasm.core::ptr::drop_in_place<alloc::raw_vec::RawVec<u8>>::he15df48ab3dfb497@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[61670]:0x13a4c00
stremio_core_web.wasm.core::ptr::drop_in_place<alloc::vec::Vec<u8>>::h92cd1545d304c2bf@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[54762]:0x1349f80
stremio_core_web.wasm.core::ptr::drop_in_place<alloc::string::String>::he0a0b1c199465632@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[61665]:0x13a4af7
stremio_core_web.wasm.<stremio_core_web::env::WebEnv as stremio_core::runtime::env::Env>::fetch::{{closure}}::{{closure}}::h7787c07ad41fb979@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[17388]:0xeff876
stremio_core_web.wasm.core::result::Result<T,E>::and_then::h39fbc5c1734cf473@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[14645]:0xe4d647
stremio_core_web.wasm.<stremio_core_web::env::WebEnv as stremio_core::runtime::env::Env>::fetch::{{closure}}::h326e257a4183af35@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[22294]:0x1004885
stremio_core_web.wasm.<T as futures_util::fns::FnOnce1<A>>::call_once::h25166f17a6c2d287@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[49556]:0x12f984a
stremio_core_web.wasm.<futures_util::fns::MapOkFn<F> as futures_util::fns::FnOnce1<core::result::Result<T,E>>>::call_once::{{closure}}::hc0082c7641d5c791@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[55762]:0x13575a1
stremio_core_web.wasm.core::result::Result<T,E>::map::h3b36496dc5c0ec43@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[12734]:0xdbb53a
stremio_core_web.wasm.<futures_util::fns::MapOkFn<F> as futures_util::fns::FnOnce1<core::result::Result<T,E>>>::call_once::hb8fd52ca1ea8c7da@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[62712]:0x13b1a69
stremio_core_web.wasm.<futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll::h404b8f95be571bbc@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[6436]:0xb1d17b
stremio_core_web.wasm.<futures_util::future::future::Map<Fut,F> as core::future::future::Future>::poll::h3c61dc2a433b45cd@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[47695]:0x12d972a
stremio_core_web.wasm.<futures_util::future::try_future::MapOk<Fut,F> as core::future::future::Future>::poll::heb4e43c975f096be@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[48356]:0x12e4f90
stremio_core_web.wasm.<F as futures_core::future::TryFuture>::try_poll::h64ca1af9ba779a32@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[51995]:0x13207a4
stremio_core_web.wasm.<futures_util::future::try_future::try_flatten::TryFlatten<Fut,<Fut as futures_core::future::TryFuture>::Ok> as core::future::future::Future>::poll::h23b2fe1cc9a78ea8@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[2601]:0x80b408
stremio_core_web.wasm.<futures_util::future::try_future::TryFlatten<Fut1,Fut2> as core::future::future::Future>::poll::h2535b7219a260ac2@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[48514]:0x12e7beb
stremio_core_web.wasm.<futures_util::future::try_future::AndThen<Fut1,Fut2,F> as core::future::future::Future>::poll::h2e1749e705e6091d@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[48644]:0x12ea00f
stremio_core_web.wasm.<core::pin::Pin<P> as core::future::future::Future>::poll::hc21e751b34c699df@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[27612]:0x10e4b40
stremio_core_web.wasm.<futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll::hd7bc3e4214b8ded7@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[5523]:0xa900fb
stremio_core_web.wasm.<futures_util::future::future::Map<Fut,F> as core::future::future::Future>::poll::h4547f1718ef2f55a@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[47700]:0x12d988d
stremio_core_web.wasm.<core::pin::Pin<P> as core::future::future::Future>::poll::ha2dd031109f1a334@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[27609]:0x10e496f
stremio_core_web.wasm.<futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll::hbbc91a76e2017651@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[5147]:0xa50696
stremio_core_web.wasm.<futures_util::future::future::Map<Fut,F> as core::future::future::Future>::poll::h7cbc330481d84249@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[47778]:0x12dae2f
stremio_core_web.wasm.<futures_util::future::future::flatten::Flatten<Fut,<Fut as core::future::future::Future>::Output> as core::future::future::Future>::poll::h933aaa09a0f6c0d6@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[5338]:0xa71d4f
stremio_core_web.wasm.<futures_util::future::future::Then<Fut1,Fut2,F> as core::future::future::Future>::poll::h98ac72b783ea800c@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[41037]:0x12594a0
stremio_core_web.wasm.wasm_bindgen_futures::task::singlethread::Task::run::hb1d3b581e08f3256@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[7076]:0xb77a08
stremio_core_web.wasm.wasm_bindgen_futures::queue::QueueState::run_all::h72be003fc91b2266@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[9294]:0xc7f22c
stremio_core_web.wasm.wasm_bindgen_futures::queue::Queue::new::{{closure}}::h3c0c26eb2bc3d76e@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[31390]:0x11656f1
stremio_core_web.wasm.<dyn core::ops::function::FnMut<(A,)>+Output = R as wasm_bindgen::closure::WasmClosure>::describe::invoke::ha6db83d11de330c8@https://localhost:8080/4c86c4844082b4c73bc61dde0c63813315a22a9c/binaries/stremio_core_web_bg.wasm:wasm-function[20725]:0xfb7d03
__wbg_adapter_31@webpack-internal:///../stremio-core-web/stremio_core_web.js:298:8
real@webpack-internal:///../stremio-core-web/stremio_core_web.js:250:16


stremio_core_web.js:578:27

I suspect we just have a bad deallocation on our hands; this wouldn't be the first time (see rustwasm/wasm-bindgen#3801).

The reason is that dlmalloc without assertions enabled tolerates deallocations/reallocations with the wrong size (it stores the size of each allocation itself and ignores the layout specified by Rust). Such calls violate the safety contract specified by GlobalAlloc, and the assertions were implemented to catch this issue.
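To illustrate the contract in general terms (this is not the wasm-bindgen code, and the helper names below are made up for the example): a Vec<u8> is freed with a layout derived from its capacity, not its length, so a buffer that crosses an FFI boundary has to come back with the same capacity it left with. A minimal sketch:

```rust
use std::alloc::Layout;
use std::mem::ManuallyDrop;

/// Layout that dropping `v` passes to the allocator (when capacity > 0).
/// It is derived from the capacity, not the length.
pub fn dealloc_layout(v: &Vec<u8>) -> Layout {
    Layout::array::<u8>(v.capacity()).expect("capacity fits in a Layout")
}

/// Hypothetical helper: split a Vec into raw parts without freeing it,
/// e.g. before handing the buffer to the other side of an FFI boundary.
pub fn into_parts(v: Vec<u8>) -> (*mut u8, usize, usize) {
    let mut v = ManuallyDrop::new(v);
    (v.as_mut_ptr(), v.len(), v.capacity())
}

/// Hypothetical helper: rebuild the Vec so that it is later freed with
/// the same layout it was allocated with.
///
/// # Safety
/// `ptr`, `len` and `cap` must come unchanged from `into_parts`.
/// Passing a different `cap` (for example `len`) makes the eventual
/// deallocation use the wrong size, which breaks the GlobalAlloc contract.
pub unsafe fn from_parts(ptr: *mut u8, len: usize, cap: usize) -> Vec<u8> {
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}
```

An allocator that ignores the size passed on deallocation hides this class of bug; one that checks it turns the bug into a panic.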

I'll poke around a bit more to try to narrow down the cause. (I suspect wasm-bindgen given its nature but we shall see.)

alexcrichton added a commit that referenced this issue May 8, 2024
Inspired by #41 but hasn't actually found any issues.
@alexcrichton
Owner

Thanks for taking a look at this @SFBdragon! Your conclusion, a buggy alloc/dealloc in wasm-bindgen, sounds most likely to me at this point as well. Given the prevalence of this issue that's the main common denominator once dlmalloc itself is ruled out.

@SFBdragon
Contributor

SFBdragon commented May 8, 2024

Update 2:

For others also hunting down the bug, we are indeed looking for a badly sized deallocation.

I logged all the allocations/deallocations/reallocations:

ALLOC ptr 2364552 size 387646

... much later in the logs ...

DEALLOC ptr 2364552 size 137198

That size is very consistent.

(Note that this likely isn't the same exact allocation dlmalloc is crashing on.)
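For anyone who wants to run a similar check, here is a rough sketch of a wrapper that records the size of each allocation and compares it on deallocation. It is not the code used to produce the logs above. SizeChecked, SLOTS, note_alloc and note_dealloc are hypothetical names, and the fixed-size table is an assumption made so that the bookkeeping itself never allocates.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of live allocations the table can track (arbitrary for this sketch).
const SLOTS: usize = 256;

/// Hypothetical debugging wrapper around the system allocator.
/// (pointer, size) pairs live in a fixed-size table, so tracking never allocates.
pub struct SizeChecked {
    ptrs: [AtomicUsize; SLOTS],
    sizes: [AtomicUsize; SLOTS],
    mismatches: AtomicUsize,
}

impl SizeChecked {
    pub const fn new() -> Self {
        const ZERO: AtomicUsize = AtomicUsize::new(0);
        Self { ptrs: [ZERO; SLOTS], sizes: [ZERO; SLOTS], mismatches: ZERO }
    }

    /// Number of deallocations whose size differed from the recorded one.
    pub fn mismatches(&self) -> usize {
        self.mismatches.load(Ordering::Relaxed)
    }

    /// Remember that `ptr` was allocated with `size` bytes.
    pub fn note_alloc(&self, ptr: usize, size: usize) {
        for (p, s) in self.ptrs.iter().zip(&self.sizes) {
            // Claim the first free slot (0 marks "empty").
            if p.compare_exchange(0, ptr, Ordering::AcqRel, Ordering::Relaxed).is_ok() {
                s.store(size, Ordering::Release);
                return;
            }
        }
        // Table full: this allocation simply goes untracked.
    }

    /// Forget `ptr` and report whether `size` matches what was recorded.
    /// Untracked pointers are treated as matching.
    pub fn note_dealloc(&self, ptr: usize, size: usize) -> bool {
        for (p, s) in self.ptrs.iter().zip(&self.sizes) {
            if p.load(Ordering::Acquire) == ptr {
                let ok = s.load(Ordering::Acquire) == size;
                if !ok {
                    self.mismatches.fetch_add(1, Ordering::Relaxed);
                }
                p.store(0, Ordering::Release);
                return ok;
            }
        }
        true
    }
}

unsafe impl GlobalAlloc for SizeChecked {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            self.note_alloc(ptr as usize, layout.size());
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Record a mismatch if the caller's size differs from the one seen at alloc time.
        self.note_dealloc(ptr as usize, layout.size());
        unsafe { System.dealloc(ptr, layout) };
    }
}
```

Such a type could be installed with #[global_allocator] in a debug build and the mismatch counter inspected (or turned into a panic) at a convenient point. The sketch is only meant for single-threaded use such as a typical wasm module, since the slot bookkeeping is not fully race-free.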

@elpiel
Author

elpiel commented May 8, 2024

Thank you very much for spending time on this bug!

I will follow up on the other issue in wasm-pack then.

@elpiel elpiel closed this as completed May 8, 2024
@elpiel
Author

elpiel commented May 9, 2024

For anyone landing on this issue, it seems that this has been fixed in the latest revision on main.
We were using an older version of wasm-bindgen (the last time we tried to upgrade, it broke our app again :D ). In the wasm-bindgen changelog I see this bugfix related to UB in String deallocation:

Fixed UB when freeing strings received from JS if not using the default allocator. rustwasm/wasm-bindgen#3808

https://github.com/rustwasm/wasm-bindgen/blob/0.2.92/CHANGELOG.md#fixed-1

OR

Take alignment into consideration during (de/re)allocation. rustwasm/wasm-bindgen#3463

https://github.com/rustwasm/wasm-bindgen/blob/0.2.92/CHANGELOG.md#0287


irh added a commit to koto-lang/koto.dev that referenced this issue May 15, 2024
rszyma added a commit to rszyma/vscode-kanata that referenced this issue Jun 1, 2024
The default allocator (dlmalloc-rs) caused panic in workspaces with very
large number of files and bumping wasm-bindgen to latest didn't help.
See the issue:
alexcrichton/dlmalloc-rs#41