Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dangling pointer in JIT Engine #1568

Closed
sebpuetz opened this issue Aug 29, 2020 · 11 comments
Closed

Dangling pointer in JIT Engine #1568

sebpuetz opened this issue Aug 29, 2020 · 11 comments
Labels
1.0 Wasmer at 1.0 bug Something isn't working 🔈soundness Bugs causing an unsound API

Comments

@sebpuetz
Copy link

Describe the bug

Initially reported by @Hywan at PyO3/pyo3#1120


This test causes abort on linux:

    #[test]
    fn test2() {
        let store = wasmer::Store::default();
        let module = wasmer::Module::new(&store, "(module)").unwrap();
        mem::drop(store);
        mem::drop(module); // abort
    }

Stepping through the code shows that the abort happens while dropping wasmer_engine_jit::unwind::systemv::UnwindRegistry. More specifically, the last call before abort is __deregister_frame in UnwindRegistry::drop.

Artifact holds a pointer to a Vec owned by Store that gets passed to __deregister_frame. If the Store goes out first, the pointer in Artifact's UnwindRegistry becomes invalid but gets used anyways.

Allocation happens here:

let custom_sections =
inner_jit.allocate_custom_sections(&serializable.compilation.custom_sections)?;

Subverting lifetimes happens here:

self.registrations.push(ptr as usize);

Assuming dead-pointers are still alive happens here:

for fde in self.registrations.iter().rev() {
__deregister_frame(*fde as *const _);
}

A fix is switching the order of the struct members in wasmer::Module since dropping happens in reverse order. It'd probably be cleaner to keep the Vec alive independently of member order.

pub struct Module {
    artifact: Arc<dyn Artifact>,
    store: Store,
}

Steps to reproduce

Create project with test and run cargo test.

Expected behavior

Exit succesfully

Actual behavior

(signal: 6, SIGABRT: process abort signal)

Additional context

@sebpuetz sebpuetz added the bug Something isn't working label Aug 29, 2020
@sebpuetz
Copy link
Author

sebpuetz commented Aug 30, 2020

Adding

impl Drop for CodeMemory {
    fn drop(&mut self) {
        for section in self.read_sections.iter() {
            println!("dropping section at: {:p} {:?}", section.as_ptr(), section);
        }
    }
}

and this print demonstrates that the Vec is dropped before the pointer to it gets passed to __deregister_frame:

impl Drop for UnwindRegistry {
    fn drop(&mut self) {
        if self.published {
            unsafe {
                // libgcc stores the frame entries as a linked list in decreasing sort order
                // based on the PC value of the registered entry.
                //
                // As we store the registrations in increasing order, it would be O(N^2) to
                // deregister in that order.
                //
                // To ensure that we just pop off the first element in the list upon every
                // deregistration, walk our list of registrations backwards.
                for fde in self.registrations.iter().rev() {
                    println!("deregistering fde at {:p}, {:?}", *fde as *const u8, &*slice_from_raw_parts(*fde as *const u8, 4));
                    __deregister_frame(*fde as *const _);
                }
            }
        }
    }
}

This prints then:

dropping section at: 0x55739b7a6430 [0, 0, 0, 0]
deregistering fde at 0x55739b7a6430, [0, 0, 0, 0]

The print in the drop impl for CodeMemory prevents the abort, without that print, the contents of the slice randomly change.

@syrusakbary
Copy link
Member

Thanks for creating the issue @sebpuetz, we just created the PR #1581 to fix it :)

@Hywan
Copy link
Contributor

Hywan commented Aug 31, 2020

The problem is twofold. The one you found is correct! But you have also #1581 that fixes the root of the problem.

@Hywan
Copy link
Contributor

Hywan commented Aug 31, 2020

Thank you for your help by the help! I wasn't expected that much involvement!

@sebpuetz
Copy link
Author

I'd expect the issue to persist tho, maybe it's harder to trigger. There is still no mechanism to ensure that the CodeMemory outlives the frames in the UnwindRegistry, no?

If CodeMemory gets dropped and a Module still holds an Artifact with an UnwindRegistry, all frames registered in that UnwindRegistry have been dropped and might get overwritten at random. If I'm understanding #1581 correctly, this only prevents creating frames for empty functions. So this would not address the use after free?

Thank you for your help by the help! I wasn't expected that much involvement!

No worries, that was a fun exercise ;)

@Hywan
Copy link
Contributor

Hywan commented Aug 31, 2020

Correct! I've created a new PR to fix that!

@sebpuetz
Copy link
Author

Sorry to keep bugging you, but that's also just another band-aid, there's still unsoundness in safe code I'm aware that this particular case passes with the empty-frame fix, but here you have no control over drop-order. There needs to be some association between the CodeMemory and the UnwindRegistry. Otherwise you can not ensure that the pointers in UnwindRegistry are still valid.

use std::mem;
use wasmer_engine_jit::CodeMemory;
use wasmer_compiler::SectionBody;

fn main() {
    let store = wasmer::Store::default();
    let bytes = wat::parse_bytes(b"(module)").unwrap();
    let artifact = store.engine().compile(&bytes, store.tunables()).unwrap();
    mem::drop(store);
}

@sebpuetz
Copy link
Author

Changing the drop impl a bit on UnwindRegistry also makes demonstrating the issue easier with ASAN.

impl Drop for UnwindRegistry {
    fn drop(&mut self) {
        if self.published {
            unsafe {
                // libgcc stores the frame entries as a linked list in decreasing sort order
                // based on the PC value of the registered entry.
                //
                // As we store the registrations in increasing order, it would be O(N^2) to
                // deregister in that order.
                //
                // To ensure that we just pop off the first element in the list upon every
                // deregistration, walk our list of registrations backwards.
                for fde in self.registrations.iter().rev() {
                    let s = &mut *slice_from_raw_parts_mut(*fde as *mut u8, 4);
                    s[0] = s[0];
                    __deregister_frame(*fde as *const _);
                }
            }
        }
    }
}

Running the following code with RUSTFLAGS="-Z sanitizer=address" cargo run --target x86_64-unknown-linux-gnu:

use std::mem;
use wasmer_engine_jit::CodeMemory;
use wasmer_compiler::SectionBody;

fn main() {
    let store = wasmer::Store::default();
    let bytes = wat::parse_bytes(b"(module
  (func (export \"add\") (param $x i64) (param $y i64) (result i64) (i64.add (local.get $x) (local.get $y)))
)").unwrap();
    let artifact = store.engine().compile(&bytes, store.tunables()).unwrap();
    mem::drop(store);
}
$ RUSTFLAGS="-Z sanitizer=address" cargo run --target x86_64-unknown-linux-gnu
dropping section at: 0x607000000d40 [20, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 120, 16, 12, 7, 8, 144, 1, 0, 0, 0, 0, 0, 0, 36, 0, 0, 0, 28, 0, 0, 0, 0, 144, 19, 64, 99, 127, 0, 0, 14, 0, 0, 0, 0, 0, 0, 0, 66, 14, 16, 134, 2, 67, 13, 6, 72, 12, 7, 8, 0, 0, 0, 0, 0, 0, 0, 0] 68 68
deregistering 0x607000000d40
=================================================================
==25181==ERROR: AddressSanitizer: heap-use-after-free on address 0x607000000d40 at pc 0x555c1ceccd28 bp 0x7ffdebe87c50 sp 0x7ffdebe87c48
READ of size 1 at 0x607000000d40 thread T0
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.

@MarkMcCaskey MarkMcCaskey added 1.0 Wasmer at 1.0 🔈soundness Bugs causing an unsound API labels Sep 23, 2020
@nlewycky
Copy link
Contributor

I wasn't able to reproduce this on master. I also tried reverting #1628 that might have fixed this and even locally reverting that change I still didn't see the failure.

I added examples/issue1568.rs with:

use std::mem;
use wasmer::{wat2wasm, Store};
use wasmer_engine_jit::CodeMemory;
use wasmer_compiler::SectionBody;
use wasmer_compiler_cranelift::Cranelift;
use wasmer_engine_jit::JIT;

fn main() {
    let mut compiler_config = Cranelift::default();
    let engine = JIT::new(&mut compiler_config).engine();
    let store = Store::new(&engine);

    let bytes = wat2wasm(r#"(module
  (func (export "add") (param $x i64) (param $y i64) (result i64) (i64.add (local.get $x) (local.get $y)))
)
"#.as_bytes()).unwrap();
    let artifact = store.engine().compile(&bytes, store.tunables()).unwrap();
    mem::drop(store);
}

and added the example to Cargo.toml and ran RUSTFLAGS="-Z sanitizer=address" cargo run --target x86_64-unknown-linux-gnu --example issue1568 --features="cranelift,jit" and didn't find an error.

Is this bug still present?

@Hywan
Copy link
Contributor

Hywan commented Dec 10, 2020

No. #1628 fixes this issue. I'm closing it.

@Hywan Hywan closed this as completed Dec 10, 2020
@gorgos
Copy link

gorgos commented Dec 25, 2023

We are still seeing this issue in 4.2.2. Stack trace example here: https://gist.github.com/gorgos/b8a485d2c31430728bdad7b04d985eca.

Only with Docker + Alpine Linux + muslc though.

Edit: Issue now being tracked here: #4488

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.0 Wasmer at 1.0 bug Something isn't working 🔈soundness Bugs causing an unsound API
Projects
None yet
Development

No branches or pull requests

6 participants