Panic/hang in VulkanSwapChain::acquire during window resize #8185
Comments
I've continued trying to debug this, and I have now confirmed that something is being leaked when the window is resized. The tests below were done on my NVIDIA 2080 Ti. I ran similar tests on my Intel Iris Xe, and it crashes within 1 second of resizing. It is the same issue, but the integrated chipset runs out of whatever resource is leaked much more quickly.
In this screenshot, I show the Windows Task Manager GPU memory graph. There are two regions where I am resizing the window wildly with the mouse, resulting in hundreds of calls to mPlatform->recreate() inside VulkanSwapChain::acquire. In both regions you can see the GPU memory use grow extremely rapidly. Once it hits peak memory usage, window resizing gets erratic and glitchy and will eventually crash.
In this second screenshot, I show the GPU memory graph again, but this time I added a call to Engine::createSwapChain() and Engine::createRenderer() (as well as corresponding calls to Engine::destroy() for each) on every frame. This clearly shows that something is being leaked whenever the swap chain is destroyed and recreated.
I think this is about as far as I can debug. Hopefully this is helpful. Right now the Vulkan implementation on Windows is quite unstable due to this particular issue.
One last note: for a SUPER easy repro, even with a good graphics card, you can just write a for loop that resizes the window:
I did this and it crashes in less than a second. I guess for the demos it might be:
Ah. It looks like the bug is that vkDestroySwapchainKHR() is not being called after vkCreateSwapchainKHR() is used to recreate the swap chain. So the old swap chain is being leaked. Note that setting VkSwapchainCreateInfoKHR::oldSwapchain in the creation does not cause it to be freed -- the calling code still has to destroy it once it is no longer in use, or it is leaked.
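To make the ownership rule concrete, here is a minimal sketch that models it without a GPU. Everything in it is hypothetical: FakeDevice, FakeSwapchain, and simulateResizes are stand-ins invented for illustration, not Vulkan or Filament APIs. createSwapchain plays the role of vkCreateSwapchainKHR with oldSwapchain set, and destroySwapchain plays the role of vkDestroySwapchainKHR. The sketch only counts live handles, so it shows the leak pattern, not the actual Filament code path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical handle type standing in for VkSwapchainKHR.
using FakeSwapchain = std::uint64_t;
constexpr FakeSwapchain kNullSwapchain = 0;

// Hypothetical device that only tracks which swap chain handles are alive.
class FakeDevice {
public:
    // Models vkCreateSwapchainKHR. Passing `old` mirrors setting
    // VkSwapchainCreateInfoKHR::oldSwapchain: the old handle is retired,
    // but it is NOT destroyed and still counts as a live object.
    FakeSwapchain createSwapchain(FakeSwapchain old) {
        (void) old;  // retirement frees nothing in this model
        FakeSwapchain handle = ++mNext;
        mLive.insert(handle);
        return handle;
    }

    // Models vkDestroySwapchainKHR: the only call that releases a handle.
    void destroySwapchain(FakeSwapchain handle) {
        mLive.erase(handle);
    }

    std::size_t liveCount() const { return mLive.size(); }

private:
    FakeSwapchain mNext = kNullSwapchain;
    std::unordered_set<FakeSwapchain> mLive;
};

// Recreates the swap chain `resizes` times and returns how many swap
// chains are still alive afterwards.
std::size_t simulateResizes(int resizes, bool destroyOld) {
    assert(resizes >= 0);
    FakeDevice device;
    FakeSwapchain current = device.createSwapchain(kNullSwapchain);
    for (int i = 0; i < resizes; ++i) {
        FakeSwapchain old = current;
        current = device.createSwapchain(old);
        if (destroyOld) {
            // Real code must first make sure no in-flight work still
            // uses `old`; this model skips that synchronization.
            device.destroySwapchain(old);
        }
    }
    return device.liveCount();
}
```

In this model, simulateResizes(100, true) leaves a single live swap chain, while simulateResizes(100, false) leaves 101, which matches the steadily growing GPU memory seen during resizing. In real code, the destroy call also has to wait until the retired swap chain is no longer in use.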
See also KhronosGroup/Vulkan-Docs#1678 for info on how to destroy the swap chain safely. Yuck.
Thanks for bringing this bug to our attention and doing the research. It's a bit tricky to address correctly. I'll try to resolve it soon.
Describe the bug
Rapid window resizes (e.g. due to the user dragging an edge of the window) eventually cause Filament's Vulkan backend to crash. I initially noticed this in my app using Filament, but I was able to reproduce it with the gltf_viewer.exe demo as well. However, it is more difficult to reproduce with gltf_viewer.exe, because it doesn't resize the render window until the user is done resizing it. My app resizes the render window constantly as the user is dragging it, which makes the bug much more likely to appear.
I initially found out about this because a user on an Intel Iris Xe integrated graphics chipset used my app. I happen to have an Intel Iris Xe as well, and it is VERY easy to reproduce on this chip. It doesn't crash, but hangs, and it is totally deterministic within about half a second of resizing the window.
I thought it was a bug specific to Intel Iris Xe, but eventually I reproduced it on my NVIDIA 2080 Ti. On the NVIDIA chip it is just a lot harder to get the issue to show up, but it's there. The fact that it's harder to repro on NVIDIA makes me think that it's an issue with running out of resources, and the NVIDIA chip has more.
Also, this page seems relevant:
https://docs.vulkan.org/samples/latest/samples/api/swapchain_recreation/README.html
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Not crashing.
Screenshots
N/A
Logs
I1008 15:47:53.8515384 9276.0 model_renderer.cc:66] [Filament I]: vkCreateSwapchain: 1809x1528, 44, 0, swapchain-size=3, identity-transform=true, depth=126
I1008 15:47:53.9954227 9276.0 model_renderer.cc:66] [Filament I]: vkCreateSwapchain: 483x765, 44, 0, swapchain-size=3, identity-transform=true, depth=126
I1008 15:47:54.0431041 9276.0 model_renderer.cc:66] [Filament I]: vkCreateSwapchain: 483x980, 44, 0, swapchain-size=3, identity-transform=true, depth=126
I1008 15:47:54.1264958 9276.0 model_renderer.cc:66] [Filament I]: vkCreateSwapchain: 483x1352, 44, 0, swapchain-size=3, identity-transform=true, depth=126
E1008 15:47:54.2013704 9276.0 model_renderer.cc:56] [Filament E]: Postcondition
in acquire:130
reason: Cannot acquire in swapchain.
E1008 15:47:54.2013983 9276.0 model_renderer.cc:56] [Filament E]:
E1008 15:47:55.0510578 9276.0 logging.cc:43] *** SIGABRT received at time=1728402475 ***
E1008 15:47:55.0513225 9276.0 logging.cc:43] @ 00007FF70A7943DC (unknown) abort
E1008 15:47:55.0514756 9276.0 logging.cc:43] @ 00007FF70A55A73E (unknown) utils::TPanic<utils::PostconditionPanic>::panic
E1008 15:47:55.0515894 9276.0 logging.cc:43] @ 00007FF70A55A668 (unknown) utils::TPanic<utils::PostconditionPanic>::panic
E1008 15:47:55.0517386 9276.0 logging.cc:43] @ 00007FF70A512568 (unknown) filament::backend::VulkanSwapChain::acquire
E1008 15:47:55.0518643 9276.0 logging.cc:43] @ 00007FF70A4CC0F3 (unknown) filament::backend::VulkanDriver::makeCurrent
E1008 15:47:55.0520594 9276.0 logging.cc:43] @ 00007FF70A49246D (unknown) filament::backend::CommandStream::CommandStream
E1008 15:47:55.0522953 9276.0 logging.cc:43] @ 00007FF70A492550 (unknown) filament::backend::CommandStream::execute
E1008 15:47:55.0524554 9276.0 logging.cc:43] @ 00007FF70A3CCE44 (unknown) filament::FEngine::execute
E1008 15:47:55.0526161 9276.0 logging.cc:43] @ 00007FF70A3CEC2E (unknown) filament::FEngine::loop
E1008 15:47:55.0528188 9276.0 logging.cc:43] @ 00007FF70A3C580F (unknown) std::thread::_Invoke<std::tuple<int (__cdecl filament::FEngine::)(void) __ptr64,filament::FEngine * __ptr64>,0,1>
E1008 15:47:55.0529430 9276.0 logging.cc:43] @ 00007FF70A794496 (unknown) thread_start<unsigned int (__cdecl)(void *),1>
E1008 15:47:55.0530671 9276.0 logging.cc:43] @ 00007FFA3212257D (unknown) BaseThreadInitThunk
E1008 15:47:55.0531682 9276.0 logging.cc:43] @ 00007FFA341EAF28 (unknown) RtlUserThreadStart
Additional context
N/A