STATUS_ACCESS_VIOLATION on exit #5637

Closed
daniel5gh opened this issue Apr 30, 2024 · 8 comments · Fixed by #5640
Labels
type: bug Something isn't working

Comments

daniel5gh commented Apr 30, 2024

Description
Since updating wgpu from 0.19 to 0.20, I am facing an issue similar to #1377.

When my program exits after the first use of the device (the call surface.configure(&device, &config)), it exits with STATUS_ACCESS_VIOLATION. This happens both when exiting through an anyhow error bubbled up to pollster and on a regular exit with an Ok(()) result.

When I use the panic!() macro, it seems fine. See the various exit scenarios commented in the snippet below.

Repro steps

A minimal reproduction extracted from my main project, which uses pollster, anyhow and winit:

use anyhow::{Context, Result};
use std::sync::Arc;
use wgpu::Features;
use winit::event_loop::EventLoop;
use winit::window::WindowBuilder;


#[pollster::main]
async fn main() -> Result<()> {
    let event_loop = EventLoop::new()
        .context("Failed to create event loop")?;

    let window = Arc::new(WindowBuilder::new()
        .with_title("Panic! at the disco")
        .build(&event_loop)
        .context("Failed to create window")?);

    let instance = wgpu::Instance::new(wgpu::InstanceDescriptor {
        backends: wgpu::Backends::all(),
        ..Default::default()
    });

    let surface = instance.create_surface(window.clone())
        .context("Failed to create surface")?;

    let adapter = instance.request_adapter(
        &wgpu::RequestAdapterOptions {
            power_preference: wgpu::PowerPreference::HighPerformance,
            force_fallback_adapter: false,
            compatible_surface: Some(&surface),
        })
        .await
        .context("Failed to find an appropriate adapter")?;

    let (device, _queue) = adapter
        .request_device(
            &wgpu::DeviceDescriptor {
                label: None,
                required_features: Features::default(),
                required_limits: Default::default(),
            },
            None,
        )
        .await
        .context("Failed to create device / queue")?;

    let surface_caps = surface.get_capabilities(&adapter);
    let surface_format = surface_caps.formats.iter()
        .find(|format| format.is_srgb())
        .unwrap_or(surface_caps.formats.first().context("No surface formats found")?);

    let config = wgpu::SurfaceConfiguration {
        usage: wgpu::TextureUsages::RENDER_ATTACHMENT,
        format: *surface_format,
        width: window.inner_size().width,
        height: window.inner_size().height,
        present_mode: surface_caps.present_modes[0],
        alpha_mode: surface_caps.alpha_modes[0],
        view_formats: vec![],
        desired_maximum_frame_latency: 2,
    };

    // error before call to surface.configure, will exit as expected
    // return Err(anyhow::anyhow!("Disco"));

    surface.configure(&device, &config);

    // panic exits with code 101, as expected
    // panic!();

    // this will exit with exit code: 0xc0000005, STATUS_ACCESS_VIOLATION, unexpected
    return Err(anyhow::anyhow!("Disco"));

    // when forgetting the device first, will exit as expected
    // std::mem::forget(device);
    // return Err(anyhow::anyhow!("Disco"));

    // also without a panic, the program will exit with exit code: 0xc0000005, STATUS_ACCESS_VIOLATION, unexpected
    Ok(())
}

Expected vs observed behavior
Running the above code results in the program exiting with STATUS_ACCESS_VIOLATION, but I expect it to exit with code 1 and a message printed:

Error: Disco

Stack backtrace:
<followed by backtrace>

Pinning wgpu to 0.19.4 in the dependencies gives the expected behavior, whereas 0.20.0 produces the access violation.

With 0.20.0, the expected behavior is also seen when specifying wgpu::Backends::GL instead of VULKAN, DX12 or all().

Platform

[dependencies]
anyhow = "1.0.82"
pollster = { version = "0.3.0", features = ["macro"] }
wgpu = { version = "0.20.0" }
winit = { version = "0.29.15", features = ["rwh_06"] }

reproduced on:

  • windows 11 [Version 10.0.22631.3527]: rustc 1.76.0 (07dca489a 2024-02-04)
  • windows 11 [Version 10.0.22631.3527]: rustc 1.78.0-nightly (2d24fe591 2024-03-09)
  • in CI using docker.io/rustlang/rust:nightly: rustc 1.79.0-nightly (aed2187d5 2024-04-27)
    • cross compiled using cargo build --target=x86_64-pc-windows-gnu and ran on the same windows 11

Adapter, driver and other info, taken from the env_logger output of my main project:

[2024-04-30T00:47:47Z WARN  wgpu_hal::vulkan::instance] InstanceFlags::VALIDATION requested, but unable to find layer: VK_LAYER_KHRONOS_validation
[2024-04-30T00:47:47Z INFO  wgpu_hal::vulkan::instance] Debug utils not enabled: debug_utils_user_data not passed to Instance::from_raw
[2024-04-30T00:47:47Z INFO  wgpu_core::instance] Adapter Vulkan AdapterInfo { name: "NVIDIA GeForce RTX 4080 SUPER", vendor: 4318, device: 9986, device_type: DiscreteGpu, driver: "NVIDIA", driver_info: "551.61", backend: Vulkan }
[2024-04-30T00:47:47Z INFO  wgpu_core::instance] Adapter Dx12 AdapterInfo { name: "NVIDIA GeForce RTX 4080 SUPER", vendor: 4318, device: 9986, device_type: DiscreteGpu, driver: "", driver_info: "", backend: Dx12 }
[2024-04-30T00:47:47Z INFO  wgpu_core::instance] Adapter Dx12 AdapterInfo { name: "Microsoft Basic Render Driver", vendor: 5140, device: 140, device_type: Cpu, driver: "", driver_info: "", backend: Dx12 }
[2024-04-30T00:47:47Z INFO  wgpu_core::instance] Adapter Gl AdapterInfo { name: "NVIDIA GeForce RTX 4080 SUPER/PCIe/SSE2", vendor: 4318, device: 0, device_type: Other, driver: "", driver_info: "", backend: Gl }
[2024-04-30T00:47:47Z INFO  wgpu_core::instance] Adapter Vulkan AdapterInfo { name: "NVIDIA GeForce RTX 4080 SUPER", vendor: 4318, device: 9986, device_type: DiscreteGpu, driver: "NVIDIA", driver_info: "551.61", backend: Vulkan }

@cwfitzgerald
Member

Could you run this in a debugger and show the backtrace of the segfault?

@daniel5gh
Author

daniel5gh commented Apr 30, 2024

Certainly!

alloc::sync::impl$33::drop<wgpu_hal::vulkan::InstanceShared,alloc::alloc::Global>(*mut alloc::sync::Arc<wgpu_hal::vulkan::InstanceShared,alloc::alloc::Global>) 0x00007ff616a516aa
core::ptr::drop_in_place<alloc::sync::Arc<wgpu_hal::vulkan::InstanceShared,alloc::alloc::Global> >(*mut alloc::sync::Arc<wgpu_hal::vulkan::InstanceShared,alloc::alloc::Global>) 0x00007ff61693dd8e
core::ptr::drop_in_place<wgpu_hal::vulkan::Surface>(*mut wgpu_hal::vulkan::Surface) 0x00007ff616939493
alloc::sync::Arc<wgpu_hal::vulkan::Surface,alloc::alloc::Global>::drop_slow<wgpu_hal::vulkan::Surface,alloc::alloc::Global>() 0x00007ff616308e1f
alloc::sync::impl$33::drop<wgpu_hal::vulkan::Surface,alloc::alloc::Global>(*mut alloc::sync::Arc<wgpu_hal::vulkan::Surface,alloc::alloc::Global>) 0x00007ff61630f7ec
core::ptr::drop_in_place<alloc::sync::Arc<wgpu_hal::vulkan::Surface,alloc::alloc::Global> >(*mut alloc::sync::Arc<wgpu_hal::vulkan::Surface,alloc::alloc::Global>) 0x00007ff6162ebd8e
wgpu_core::device::any_device::impl$0::new::drop_glue<wgpu_hal::vulkan::Api>(*mut tuple$<>) any_device.rs:37
wgpu_core::device::any_device::impl$1::drop(*mut wgpu_core::device::any_device::AnyDevice) any_device.rs:89
core::ptr::drop_in_place<wgpu_core::device::any_device::AnyDevice>(*mut wgpu_core::device::any_device::AnyDevice) 0x00007ff61671e49e
core::ptr::drop_in_place<wgpu_core::present::Presentation>(*mut wgpu_core::present::Presentation) 0x00007ff61671d72f
wgpu_core::global::Global::surface_drop(wgpu_core::id::Id<enum2$<wgpu_core::id::markers::Surface> >) instance.rs:723
wgpu::backend::wgpu_core::impl$7::surface_drop(*mut wgpu::backend::wgpu_core::ContextWgpuCore,*mut wgpu_core::id::Id<enum2$<wgpu_core::id::markers::Surface> >,*mut wgpu::backend::wgpu_core::Surface) wgpu_core.rs:1583
wgpu::context::impl$5::surface_drop<wgpu::backend::wgpu_core::ContextWgpuCore>(*mut wgpu::backend::wgpu_core::ContextWgpuCore,*mut wgpu::context::ObjectId,ref$<dyn$<core::any::Any,core::marker::Send,core::marker::Sync> >) context.rs:2487
wgpu::impl$4::drop(*mut wgpu::Surface) lib.rs:586
core::ptr::drop_in_place<wgpu::Surface>(*mut wgpu::Surface) 0x00007ff616218caf
wgpu_panic::main::async_block$0(core::pin::Pin<ref_mut$<enum2$<wgpu_panic::main::async_block_env$0> > >,*mut core::task::wake::Context) main.rs:81
pollster::block_on<enum2$<wgpu_panic::main::async_block_env$0> >(enum2$<wgpu_panic::main::async_block_env$0>) lib.rs:128
wgpu_panic::main() main.rs:8
core::ops::function::FnOnce::call_once<enum2$<core::result::Result<tuple$<>,anyhow::Error> > (*)(),tuple$<> >(*mut ) function.rs:250
std::sys_common::backtrace::__rust_begin_short_backtrace<enum2$<core::result::Result<tuple$<>,anyhow::Error> > (*)(),enum2$<core::result::Result<tuple$<>,anyhow::Error> > >(*mut ) 0x00007ff6162211ce
std::rt::lang_start::closure$0<enum2$<core::result::Result<tuple$<>,anyhow::Error> > >(*mut std::rt::lang_start::closure_env$0<enum2$<core::result::Result<tuple$<>,anyhow::Error> > >) rt.rs:166
[Inlined] std::rt::lang_start_internal::closure$2() rt.rs:148
[Inlined] std::panicking::try::do_call() 0x00007ff616ff1387
[Inlined] std::panicking::try() 0x00007ff616ff1387
[Inlined] std::panic::catch_unwind() 0x00007ff616ff1387
std::rt::lang_start_internal() rt.rs:148
std::rt::lang_start<enum2$<core::result::Result<tuple$<>,anyhow::Error> > >(*mut ,i64,*mut *mut u8,u8) rt.rs:165
main 0x00007ff616222fb9
[Inlined] invoke_main() 0x00007ff617018890
__scrt_common_main_seh() 0x00007ff61701886e
<unknown> 0x00007ff8e218257d
<unknown> 0x00007ff8e388aa48

The crash seems to originate in the following function call:

  • wgpu_core::device::any_device::impl$0::new::drop_glue<wgpu_hal::vulkan::Api>(*mut tuple$<>) in any_device.rs:37

https://github.com/gfx-rs/wgpu/blob/4521502da69bcf4f92c8350042c268573ef216d4/wgpu-core/src/device/any_device.rs#L33C1-L39C10

which seems to be introduced in this commit f45d500

As my understanding of these implementation details is limited, any insights or guidance would be greatly appreciated. Thank you!

Wumpf added the "type: bug" label Apr 30, 2024
@sagudev
Contributor

sagudev commented Apr 30, 2024

The problem is the order in which the surface and device are dropped. Here is an even more minimal example, without anyhow:

use std::sync::Arc;
use wgpu::Features;
use winit::event_loop::EventLoop;
use winit::window::WindowBuilder;

#[pollster::main]
async fn main() {
    let event_loop = EventLoop::new().unwrap();

    let window = Arc::new(
        WindowBuilder::new()
            .with_title("Panic! at the disco")
            .build(&event_loop)
            .unwrap(),
    );

    let instance = wgpu::Instance::new(wgpu::InstanceDescriptor {
        backends: wgpu::Backends::all(),
        ..Default::default()
    });

    let surface = instance.create_surface(window.clone()).unwrap();

    let adapter = instance
        .request_adapter(&wgpu::RequestAdapterOptions {
            power_preference: wgpu::PowerPreference::HighPerformance,
            force_fallback_adapter: false,
            compatible_surface: Some(&surface),
        })
        .await
        .unwrap();

    let (device, _queue) = adapter
        .request_device(
            &wgpu::DeviceDescriptor {
                label: None,
                required_features: Features::default(),
                required_limits: Default::default(),
            },
            None,
        )
        .await
        .unwrap();

    let surface_caps = surface.get_capabilities(&adapter);
    let surface_format = surface_caps
        .formats
        .iter()
        .find(|format| format.is_srgb())
        .unwrap_or(surface_caps.formats.first().unwrap());

    let config = wgpu::SurfaceConfiguration {
        usage: wgpu::TextureUsages::RENDER_ATTACHMENT,
        format: *surface_format,
        width: window.inner_size().width,
        height: window.inner_size().height,
        present_mode: surface_caps.present_modes[0],
        alpha_mode: surface_caps.alpha_modes[0],
        view_formats: vec![],
        desired_maximum_frame_latency: 2,
    };

    surface.configure(&device, &config);

    drop(_queue);
    drop(device);
    drop(surface); // segfaults if both queue and device are dropped before surface
}
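
For reference, here is a minimal std-only sketch of why the original reproduction hits this order without any explicit drop calls: Rust drops local variables in reverse declaration order, so a device declared after the surface is dropped before it. The Tracked type and the two function names are hypothetical stand-ins for illustration, not wgpu types.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log that records the order in which values are dropped.
type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical stand-in for a resource such as a surface or a device.
struct Tracked {
    name: &'static str,
    log: Log,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Locals are dropped in reverse declaration order: device first, then surface.
fn implicit_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let _surface = Tracked { name: "surface", log: Rc::clone(&log) };
        let _device = Tracked { name: "device", log: Rc::clone(&log) };
    }
    // Bind before returning so the RefCell borrow ends before `log` is dropped.
    let order = log.borrow().clone();
    order
}

// An explicit drop(surface) moves the surface ahead of the device.
fn explicit_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let surface = Tracked { name: "surface", log: Rc::clone(&log) };
        let _device = Tracked { name: "device", log: Rc::clone(&log) };
        drop(surface);
    }
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("implicit: {:?}", implicit_order());
    println!("explicit: {:?}", explicit_order());
}
```

Going by the comment in the example above, dropping the surface explicitly while the device is still alive would presumably sidestep the crash on 0.20.0, though that is an inference from this thread and not a confirmed workaround.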

sagudev added a commit to sagudev/wgpu that referenced this issue Apr 30, 2024
Arc has device stored inside not surface.

fixes gfx-rs#5637
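
To illustrate the class of bug the commit message describes, here is a rough std-only sketch of a type-erased Arc holder with a monomorphized drop function. It assumes the pattern is "store a raw pointer from Arc::into_raw plus a function that knows how to drop it"; AnyHandle, drop_arc and Device are hypothetical names, and this is not wgpu's actual code. The key point is that the drop function must rebuild the Arc with exactly the type that was erased. Rebuilding it as a different type (for example a surface type when a device was stored) is undefined behavior, which can surface as an access violation.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for a device; counts how often it is dropped.
struct Device {
    drops: Arc<AtomicUsize>,
}

impl Drop for Device {
    fn drop(&mut self) {
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

// Drop function instantiated per erased type T.
// SAFETY contract: `ptr` must come from Arc::<T>::into_raw with the same T.
unsafe fn drop_arc<T>(ptr: *const ()) {
    // Casting to any type other than the stored T here would be undefined behavior.
    drop(unsafe { Arc::from_raw(ptr.cast::<T>()) });
}

// Type-erased owner of one strong reference to some Arc<T>.
struct AnyHandle {
    data: *const (),
    drop_glue: unsafe fn(*const ()),
}

impl AnyHandle {
    fn new<T>(value: Arc<T>) -> Self {
        AnyHandle {
            data: Arc::into_raw(value).cast::<()>(),
            // The same T is used for erasing and for dropping.
            drop_glue: drop_arc::<T>,
        }
    }
}

impl Drop for AnyHandle {
    fn drop(&mut self) {
        // SAFETY: `data` was produced in `new` with the T that `drop_glue` expects.
        unsafe { (self.drop_glue)(self.data) }
    }
}

// Returns (strong count while the handle is held, strong count after the
// handle is dropped, number of Device drops once everything is released).
fn demo() -> (usize, usize, usize) {
    let drops = Arc::new(AtomicUsize::new(0));
    let device = Arc::new(Device { drops: Arc::clone(&drops) });
    let handle = AnyHandle::new(Arc::clone(&device));
    let while_held = Arc::strong_count(&device);
    drop(handle);
    let after_handle = Arc::strong_count(&device);
    drop(device);
    (while_held, after_handle, drops.load(Ordering::SeqCst))
}

fn main() {
    println!("{:?}", demo());
}
```
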
@daniel5gh
Author

Thanks. A straightforward fix for what seems to be a copy-paste oopsie.

I was looking at that code too, but I was so focused on trying to understand how it works that I overlooked this inconsistency.

@cwfitzgerald
Member

Going to push a patch with this tomorrow.

@EliiasG

EliiasG commented May 9, 2024

I am still getting this in 0.30.0, though I am pretty sure I am dropping device and queue before surface.
Not even sure if winit is causing it (might be wgpu?)
Debugger simply sends me here: [image]

@cwfitzgerald
Member

No, we're doing a really stupid cast in wgpu that is fixed, but needs to be released

@EliiasG

EliiasG commented May 9, 2024

Thank you, I will ignore the error for now.
