Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random crash in wayfire_foreign_toplevel::init_request_handlers() (race condition with xdg-popups?) #2445

Open
dkondor opened this issue Aug 24, 2024 · 7 comments
Labels
Milestone

Comments

@dkondor
Copy link
Contributor

dkondor commented Aug 24, 2024

Describe the bug
This happened once randomly while I was experimenting in follow up to #2422 -- Wayfire crashed while clicking on an icon with a subdock in Cairo-Dock. My hunch is that this is a race condition with xdg-popups similar to #772 but in this case Wayfire trying to use an "inert" popup (more detailed reasoning below).

To Reproduce
So far I cannot actually reproduce, which is not surprising, since it is likely extremely sensitive to timing, and I've already been using the equivalent setup for ~2-3 years without issues. I do have an idea how to write a reproducible example based on my previous experience with similar race conditions e.g. this and this. Once I have a bit more time, I can experiment with this.

What I did the time it happened:

  1. Start Cairo-Dock (beta version) and also this plugin (note: based on my previous experience, any other interaction that triggers a close of a subdock at the wrong time should also work).
  2. Open at least two instances of an app (so they are grouped into a subdock)
  3. Click the corresponding icon.

Expected behavior
Scale is started showing only the views of the selected app. The corresponding subdock might appear as you move the mouse over the icon, and it might be closed once scale starts.

Screenshots or stacktrace
This is from the syslog and unfortunately I don't have debug info enabled:

Aug 24 12:30:25 thinkpad-x230 wayfire[2351]: II 24-08-24 12:30:25.053 - [src/view/xdg-shell/xdg-toplevel-view.cpp:29] new xdg_shell_stable surface:  app-id: sakura
Aug 24 12:30:25 thinkpad-x230 systemd[2109]: Started vte-spawn-6d04869b-f1bd-42b2-979d-8bd17ba5b16e.scope - VTE child process 3331 launched by sakura process 3323.
Aug 24 12:30:26 thinkpad-x230 wayfire[2351]: II 24-08-24 12:30:26.118 - [src/view/xdg-shell/xdg-toplevel-view.cpp:29] new xdg_shell_stable surface:  app-id: sakura
Aug 24 12:30:26 thinkpad-x230 systemd[2109]: Started vte-spawn-98c67524-bf95-4b5e-b2d5-ea8498e0414e.scope - VTE child process 3353 launched by sakura process 3345.
Aug 24 12:30:26 thinkpad-x230 wayfire[2351]: II 24-08-24 12:30:26.141 - [src/view/xdg-shell.cpp:124] New xdg popup
Aug 24 12:30:26 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:26.141 - [types/xdg_shell/wlr_xdg_surface.c:169] A configure is scheduled for an uninitialized xdg_surface 0x5a47d57ede60
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: II 24-08-24 12:30:30.016 - [src/view/xdg-shell.cpp:124] New xdg popup
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.016 - [types/xdg_shell/wlr_xdg_surface.c:169] A configure is scheduled for an uninitialized xdg_surface 0x5a47d5827870
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.030 - [src/main.cpp:141] Fatal error: Segmentation fault
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.068 - #1  signal_handler(int) main.cpp:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.652 - #2  __restore_rt libc_sigaction.c:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.666 - #3  wayfire_foreign_toplevel::init_request_handlers()::{lambda(void*)#3}::operator()(void*) const ??:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.684 - #4  wf::wl_listener_wrapper::emit(void*) ??:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.700 - #5  wl_signal_emit_mutable ??:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.717 - #6  foreign_toplevel_handle_set_rectangle wlr_foreign_toplevel_management_v1.c:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.730 - #7  ffi_prep_go_closure ??:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.745 - #8  ?? ??:0
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.756 - #9  ffi_call ??:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.770 - #10 ?? ??:0
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.784 - #11 wl_array_add ??:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.798 - #12 wl_event_loop_dispatch ??:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.814 - #13 wl_display_run ??:?
Aug 24 12:30:30 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:30.835 - #14 main ??:?
Aug 24 12:30:31 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:31.020 - #15 __libc_start_call_main ../sysdeps/x86/libc-start.c:74
Aug 24 12:30:31 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:31.207 - #16 call_init ../csu/libc-start.c:128
Aug 24 12:30:31 thinkpad-x230 wayfire[2351]: EE 24-08-24 12:30:31.229 - #17 _start ??:?

Wayfire version
git (44e1fa9)

What I think happens

  1. With the mouse pointing to an icon that represents multiple instances (multiple open views) of an app, CD will open a subdock, which is an xdg-popup (BTW I'm aware this is not a very good design choice, but changing it will take some work; however, it should not trigger a crash in the compositor)
  2. CD will send zwlr_foreign_toplevel_handle_v1::set_rectangle to update the minimize position of each view to be within the newly opened subdock
  3. Clicking on the icon, CD will send an IPC request to start scale
  4. When scale starts, Wayfire will close the xdg-popup
  5. The request from step #2 is processed by Wayfire, triggering the crash seen in the stacktrace above as it now refers to an inert popup's surface
  6. CD receives an xdg_popup::popup_done event, but it is too late (since it already sent a related request in step #2)

Note: likely step #2 actually happens later, but still before step #6. Another possibility is that there is actually a bug in CD and it does send a set_rectangle request on an already-closed popup, and I'll look into this separately, but IMO this should lead to a protocol error and not a crash.

@dkondor dkondor added the bug label Aug 24, 2024
@ammen99
Copy link
Member

ammen99 commented Aug 24, 2024

If your hunch is correct, maybe https://0x0.st/Xy3U.diff will help. But it is hard to say for sure without at least address sanitizer so that we get line numbers.

@dkondor
Copy link
Contributor Author

dkondor commented Aug 24, 2024

So I'm able to reproduce this reliably with this code: https://github.com/dkondor/xdg_popup_crasher

Full stacktrace:

EE 24-08-24 15:05:53.039 - #1  signal_handler(int) ../src/main.cpp:143
EE 24-08-24 15:05:53.621 - #2  __restore_rt libc_sigaction.c:?
EE 24-08-24 15:05:53.659 - #3  wayfire_foreign_toplevel::handle_minimize_hint(wf::toplevel_view_interface_t*, wf::view_interface_t*, wlr_box) ../plugins/protocols/foreign-toplevel.cpp:235
EE 24-08-24 15:05:53.699 - #4  wayfire_foreign_toplevel::init_request_handlers()::{lambda(void*)#3}::operator()(void*) const ../plugins/protocols/foreign-toplevel.cpp:221
EE 24-08-24 15:05:53.739 - #5  void std::__invoke_impl<void, wayfire_foreign_toplevel::init_request_handlers()::{lambda(void*)#3}&, void*>(std::__invoke_other, wayfire_foreign_toplevel::init_request_handlers()::{lambda(void*)#3}&, void*&&) /usr/include/c++/13/bits/invoke.h:61
EE 24-08-24 15:05:53.777 - #6  std::enable_if<is_invocable_r_v<void, wayfire_foreign_toplevel::init_request_handlers()::{lambda(void*)#3}&, void*>, void>::type std::__invoke_r<void, wayfire_foreign_toplevel::init_request_handlers()::{lambda(void*)#3}&, void*>(wayfire_foreign_toplevel::init_request_handlers()::{lambda(void*)#3}&, void*&&) /usr/include/c++/13/bits/invoke.h:117
EE 24-08-24 15:05:53.816 - #7  std::_Function_handler<void (void*), wayfire_foreign_toplevel::init_request_handlers()::{lambda(void*)#3}>::_M_invoke(std::_Any_data const&, void*&&) /usr/include/c++/13/bits/std_function.h:291
EE 24-08-24 15:05:53.907 - #8  std::function<void (void*)>::operator()(void*) const ../src/util.cpp:591
EE 24-08-24 15:05:54.000 - #9  wf::wl_listener_wrapper::emit(void*) ../src/wl-listener-wrapper.tpp:59
EE 24-08-24 15:05:54.089 - #10 wf::handle_wrapped_listener(wl_listener*, void*) ../src/wl-listener-wrapper.tpp:11
EE 24-08-24 15:05:54.099 - #11 wl_signal_emit_mutable ??:?
EE 24-08-24 15:05:54.118 - #12 foreign_toplevel_handle_set_rectangle wlr_foreign_toplevel_management_v1.c:?
EE 24-08-24 15:05:54.129 - #13 ffi_prep_go_closure ??:?
EE 24-08-24 15:05:54.141 - #14 ?? ??:0
EE 24-08-24 15:05:54.150 - #15 ffi_call ??:?
EE 24-08-24 15:05:54.162 - #16 ?? ??:0
EE 24-08-24 15:05:54.173 - #17 wl_array_add ??:?
EE 24-08-24 15:05:54.184 - #18 wl_event_loop_dispatch ??:?
EE 24-08-24 15:05:54.194 - #19 wl_display_run ??:?
EE 24-08-24 15:05:54.279 - #20 main ../src/main.cpp:449
EE 24-08-24 15:05:54.404 - #21 __libc_start_call_main ../sysdeps/x86/libc-start.c:74
EE 24-08-24 15:05:54.531 - #22 call_init ../csu/libc-start.c:128
EE 24-08-24 15:05:56.382 - #23 _start ??:?

@ammen99 ammen99 added this to the 0.10 milestone Aug 24, 2024
@ammen99
Copy link
Member

ammen99 commented Aug 24, 2024

Nice, thanks for the good reproducer app! Did you check whether my patch helps at all?

@ammen99
Copy link
Member

ammen99 commented Aug 24, 2024

Actually not the previous patch, it had a small mistake, see https://0x0.st/XyY3.diff

@dkondor
Copy link
Contributor Author

dkondor commented Aug 24, 2024

With your patch I seem to get a crash whenever closing a popup the normal way:

II 24-08-24 15:23:10.565 - [src/view/xdg-shell.cpp:124] New xdg popup
EE 24-08-24 15:23:10.565 - [types/xdg_shell/wlr_xdg_surface.c:169] A configure is scheduled for an uninitialized xdg_surface 0x63ce97ed03b0
EE 24-08-24 15:23:11.696 - [src/main.cpp:141] Fatal error: Segmentation fault
EE 24-08-24 15:23:11.783 - #1  signal_handler(int) ../src/main.cpp:143
EE 24-08-24 15:23:12.372 - #2  __restore_rt libc_sigaction.c:?
EE 24-08-24 15:23:13.471 - #3  wayfire_xdg_popup::destroy() ../src/view/xdg-shell.cpp:304
EE 24-08-24 15:23:14.586 - #4  auto xdg_popup_controller_t::xdg_popup_controller_t(wlr_xdg_popup*)::{lambda(auto:1)#1}::operator()<void*>(void*) const ../src/view/xdg-shell.cpp:410
EE 24-08-24 15:23:15.695 - #5  void std::__invoke_impl<void, xdg_popup_controller_t::xdg_popup_controller_t(wlr_xdg_popup*)::{lambda(auto:1)#1}&, void*>(std::__invoke_other, xdg_popup_controller_t::xdg_popup_controller_t(wlr_xdg_popup*)::{lambda(auto:1)#1}&, void*&&) /usr/include/c++/13/bits/invoke.h:61
EE 24-08-24 15:23:16.795 - #6  std::enable_if<is_invocable_r_v<void, xdg_popup_controller_t::xdg_popup_controller_t(wlr_xdg_popup*)::{lambda(auto:1)#1}&, void*>, void>::type std::__invoke_r<void, xdg_popup_controller_t::xdg_popup_controller_t(wlr_xdg_popup*)::{lambda(auto:1)#1}&, void*>(xdg_popup_controller_t::xdg_popup_controller_t(wlr_xdg_popup*)::{lambda(auto:1)#1}&, void*&&) /usr/include/c++/13/bits/invoke.h:117
EE 24-08-24 15:23:17.915 - #7  std::_Function_handler<void (void*), xdg_popup_controller_t::xdg_popup_controller_t(wlr_xdg_popup*)::{lambda(auto:1)#1}>::_M_invoke(std::_Any_data const&, void*&&) /usr/include/c++/13/bits/std_function.h:291
EE 24-08-24 15:23:18.006 - #8  std::function<void (void*)>::operator()(void*) const ../src/util.cpp:591
EE 24-08-24 15:23:18.092 - #9  wf::wl_listener_wrapper::emit(void*) ../src/wl-listener-wrapper.tpp:59
EE 24-08-24 15:23:18.179 - #10 wf::handle_wrapped_listener(wl_listener*, void*) ../src/wl-listener-wrapper.tpp:11
EE 24-08-24 15:23:18.191 - #11 wl_signal_emit_mutable ??:?
EE 24-08-24 15:23:18.205 - #12 destroy_xdg_popup :?
EE 24-08-24 15:23:18.220 - #13 xdg_surface_handle_role_resource_destroy wlr_xdg_surface.c:?
EE 24-08-24 15:23:18.232 - #14 wl_resource_queue_event ??:?
EE 24-08-24 15:23:18.244 - #15 wl_resource_destroy ??:?
EE 24-08-24 15:23:18.256 - #16 ffi_prep_go_closure ??:?
EE 24-08-24 15:23:18.267 - #17 ?? ??:0
EE 24-08-24 15:23:18.278 - #18 ffi_call ??:?
EE 24-08-24 15:23:18.289 - #19 ?? ??:0
EE 24-08-24 15:23:18.301 - #20 wl_array_add ??:?
EE 24-08-24 15:23:18.312 - #21 wl_event_loop_dispatch ??:?
EE 24-08-24 15:23:18.323 - #22 wl_display_run ??:?
EE 24-08-24 15:23:18.399 - #23 main ../src/main.cpp:449
EE 24-08-24 15:23:18.524 - #24 __libc_start_call_main ../sysdeps/x86/libc-start.c:74
EE 24-08-24 15:23:18.647 - #25 call_init ../csu/libc-start.c:128
EE 24-08-24 15:23:20.477 - #26 _start ??:?

@ammen99
Copy link
Member

ammen99 commented Aug 24, 2024

My bad again, the line I wnat to add should be put 1 line earlier .. https://0x0.st/XyYg.diff

@dkondor
Copy link
Contributor Author

dkondor commented Aug 24, 2024

Yes, the latest one works (it writes the error message about unknown surface which is expected in this case). Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants