Websocket segfault when quickly opening and closing connections #60

Open
AlejoAsd opened this issue Sep 9, 2020 · 3 comments

@AlejoAsd
Contributor

AlejoAsd commented Sep 9, 2020

The Websocket server segfaults if a websocket connection is opened and closed in quick succession.

Steps to reproduce

  1. Start a simulation with the websocket server enabled. (My specific tests were performed using Cloudsim.)
  2. Quickly connect and disconnect from the simulation.
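The reproduction steps above could be scripted roughly as follows. This is only a sketch: the host, the port (9002 is the port shown in a later log in this thread and may differ on Cloudsim), the `/` path, and the helper names `build_handshake` and `cycle_connections` are assumptions for illustration, and it is not confirmed that this exact pattern triggers the segfault.

```python
import base64
import os
import socket


def build_handshake(host: str, port: int, path: str = "/") -> bytes:
    """Build a minimal RFC 6455 client upgrade request (hypothetical helper)."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}:{port}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def cycle_connections(host: str, port: int, count: int = 100) -> int:
    """Open and immediately drop `count` connections.

    Returns how many connections were established before the first failure.
    """
    connected = 0
    for _ in range(count):
        try:
            with socket.create_connection((host, port), timeout=2.0) as sock:
                sock.sendall(build_handshake(host, port))
                connected += 1
                # No read and no close frame: leaving the `with` block
                # drops the socket right after the upgrade request.
        except OSError:
            # Refused, reset, or timed out; the server may already be down.
            break
    return connected
```

Calling `cycle_connections("localhost", 9002)` against a running server and getting back fewer than `count` would suggest the server stopped accepting connections partway through.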

Stack trace

Stack trace (most recent call last) in thread 65:
#13   Object "", at 0xffffffffffffffff, in 
#12   Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7f326a152a3e, in clone
#11   Object "/lib/x86_64-linux-gnu/libpthread.so.0", at 0x7f3269e196da, in start_thread
#10   Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7f32670b96de, in std::error_code::default_error_condition() const
#9    Object "/usr/lib/x86_64-linux-gnu/ign-launch-1/plugins/libignition-launch-websocket-server.so", at 0x7f3265faac87, in ignition::launch::WebsocketServer::Run()
#8    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d8b55a, in lws_SHA1
#7    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d80015, in lws_service_fd_tsi
#6    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d7cd41, in lws_read
#5    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d8ea29, in lws_serve_http_file
#4    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d83d9e, in lws_frame_is_binary
#3    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d7e7e6, in lws_close_reason
#2    Object "/usr/lib/x86_64-linux-gnu/ign-launch-1/plugins/libignition-launch-websocket-server.so", at 0x7f3265faefcc, in rootCallback(lws*, lws_callback_reasons, void*, void*, unsigned long)
#1    Object "/usr/lib/x86_64-linux-gnu/ign-launch-1/plugins/libignition-launch-websocket-server.so", at 0x7f3265fae6e1, in ignition::launch::WebsocketServer::OnMessage(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#0    Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7f326a1bbfde, in __nss_passwd_lookup
Segmentation fault (Address not mapped to object [0xa6d3000])
./run_sim.bash: line 16:    54 Segmentation fault      (core dumped) ign launch -v 4 $@

chapulina added the "bug" and "help wanted" labels on Nov 15, 2021

@ruffsl

ruffsl commented May 24, 2023

I'm also seeing a similar segfault when attempting to open a connection at all, without even quickly cycling the connection:

$ gz launch --versions
6.0.0

/usr/share/gz/gz-launch6/configs$ gz launch websocket.gzlaunch -v 4
[Dbg] [Manager.cc:1164] Loading plugin. Name[gz::launch::WebsocketServer] File[gz-launch-websocket-server]
[Dbg] [WebsocketServer.cc:414] Using port[9002]
[Dbg] [WebsocketServer.cc:429] Using maximum connection count of -1
[Wrn] [WebsocketServer.cc:559] Partial SSL configuration specified. Please specify: 	<ssl>
	  <cert_file>PATH_TO_CERT_FILE</cert_file>
	  <private_key_file>PATH_TO_KEY_FILE</private_key_file>
	</ssl>.
Continuing without SSL.
[Dbg] [WebsocketServer.cc:246] LWS_CALLBACK_ESTABLISHED
[Dbg] [WebsocketServer.cc:301] LWS_CALLBACK_RECEIVE
[Dbg] [WebsocketServer.cc:729] Protos request received
[Dbg] [WebsocketServer.cc:301] LWS_CALLBACK_RECEIVE
[Dbg] [WebsocketServer.cc:784] Topic and message type list request received
[Dbg] [WebsocketServer.cc:301] LWS_CALLBACK_RECEIVE
[Dbg] [WebsocketServer.cc:814] World info request received
[Dbg] [WebsocketServer.cc:301] LWS_CALLBACK_RECEIVE
Stack trace (most recent call last) in thread 50526:
#16   Object "", at 0xffffffffffffffff, in 
#15   Source "./misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S", line 81, in __clone3 [0x7fc91c3269ff]
#14   Source "./nptl/pthread_create.c", line 442, in start_thread [0x7fc91c294b42]
#13   Object "/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7fc91c6dc2b2, in std::error_code::default_error_condition() const
#12   Object "/usr/lib/x86_64-linux-gnu/gz-launch-6/plugins/libgz-launch-websocket-server.so", at 0x7fc91c9def4c, in gz::launch::WebsocketServer::Run()
#11   Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8cbfb6, in lws_service
#10   Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8ec979, in _lws_plat_file_open
#9    Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8ec6ea, in _lws_plat_file_open
#8    Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8c9808, in lws_service_fd_tsi
#7    Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8d66bd, in lws_hdr_custom_copy
#6    Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8d5b3c, in lws_hdr_custom_copy
#5    Object "/usr/lib/x86_64-linux-gnu/gz-launch-6/plugins/libgz-launch-websocket-server.so", at 0x7fc91c9ea44a, in rootCallback(lws*, lws_callback_reasons, void*, void*, unsigned long)
#4    Object "/usr/lib/x86_64-linux-gnu/gz-launch-6/plugins/libgz-launch-websocket-server.so", at 0x7fc91c9e763d, in gz::launch::WebsocketServer::OnMessage(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
#3    Object "/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7fc91c73cb34, in std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
#2    Object "/lib/x86_64-linux-gnu/libgz-common5.so.5", at 0x7fc91c970955, in gz::common::Logger::Buffer::xsputn(char const*, long)
#1    Object "/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7fc91c74a72d, in std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, long)
#0    Source "./string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S", line 317, in __memcpy_avx_unaligned_erms [0x7fc91c3a094d]
Segmentation fault (Address not mapped to object [(nil)])
Segmentation fault (core dumped)

This happens simply when using the visualization app hosted on the gazebosim site.

Core dump with crash file:

_usr_lib_x86_64-linux-gnu_gz_launch6_gz-launch.1000.zip

System info:

Ubuntu 22.04

$ apt info libgz-launch6-dev
Package: libgz-launch6-dev
Version: 6.0.0-1~jammy
Priority: optional
Section: libdevel
Source: gz-launch6
Maintainer: Jose Luis Rivero <jrivero@osrfoundation.org>
Installed-Size: 97.3 kB
Depends: libgz-cmake3-dev, libgz-common5-dev, libgz-sim7-dev, libgz-gui7-dev, libgz-msgs9-dev, libgz-plugin2-dev, libgz-tools2-dev, libgz-transport12-dev, libsdformat13-dev, libtinyxml2-dev, libwebsockets-dev, qtquickcontrols2-5-dev, libqt5core5a, libgz-launch6 (= 6.0.0-1~jammy)
Breaks: libignition-launch6-dev (<< 5.999.999+nightly+git20220630+2rcec9c00a42bbd412815a3c9d64a3ce9b7dfd186d-2)
Replaces: libignition-launch6-dev (<< 5.999.999+nightly+git20220630+2rcec9c00a42bbd412815a3c9d64a3ce9b7dfd186d-2)
Homepage: https://github.com/gazebosim/gz-launch
Download-Size: 16.2 kB
APT-Manual-Installed: no
APT-Sources: http://packages.osrfoundation.org/gazebo/ubuntu-stable jammy/main amd64 Packages
Description: Gazebo Launch Library - Development files
 Gazebo Launch, a component of Gazebo, provides a command line
 interface to run and manager application and plugins.
 .
 Package contains the Gazebo launch development files and cli client

@usedhondacivic

usedhondacivic commented Feb 27, 2024

Found something interesting regarding this bug. I am able to reproduce it using docker with this Dockerfile:

ARG ROS_VERSION=humble

FROM ros:$ROS_VERSION

RUN apt-get update && apt-get install -y --no-install-recommends wget curl

ARG GAZEBO_VERSION=garden

RUN wget https://packages.osrfoundation.org/gazebo.gpg -O /usr/share/keyrings/pkgs-osrf-archive-keyring.gpg && \
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/pkgs-osrf-archive-keyring.gpg] http://packages.osrfoundation.org/gazebo/ubuntu-stable $(lsb_release -cs) main" | tee /etc/apt/sources.list.d/gazebo-stable.list > /dev/null && \
apt-get update && \
apt-get install -y --no-install-recommends gz-$GAZEBO_VERSION ros-$ROS_DISTRO-ros-gz$GAZEBO_VERSION

RUN curl -O https://raw.githubusercontent.com/gazebosim/gz-launch/main/examples/websocket.gzlaunch

CMD bash -c "gz sim -s -v 4 shapes.sdf & gz launch -v 4 websocket.gzlaunch"

And running this command:
docker build -t gz_launch_bug . && docker run -it --network host gz_launch_bug

However, if I run with
docker build -t gz_launch_bug . && docker run -it -p9002:9002 gz_launch_bug
(i.e., I publish the port instead of using --network host), I can connect just fine from the Gazebo website visualizer.
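To compare the two networking modes without a browser, a small probe like the one below could report whether the server answers a WebSocket upgrade at all. It is a sketch under assumptions: the function name `upgrade_status` is hypothetical, the `/` path and default timeout are guesses, and a healthy server is only expected (not confirmed here) to answer with an `HTTP/1.1 101` status line.

```python
import base64
import os
import socket


def upgrade_status(host: str, port: int, timeout: float = 2.0) -> str:
    """Send a WebSocket upgrade request and return the HTTP status line.

    Returns an empty string if the connection fails or closes before a
    full status line arrives.
    """
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        "GET / HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )
    data = b""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request.encode("ascii"))
            # Read until the end of the first response line.
            while b"\r\n" not in data:
                chunk = sock.recv(1024)
                if not chunk:
                    break
                data += chunk
    except OSError:
        return ""
    if b"\r\n" not in data:
        return ""
    return data.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
```

Running `upgrade_status("localhost", 9002)` once per `docker run` variant gives a quick side-by-side: an empty string means no usable response, while a status line shows what the server replied.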

I'm curious if @ruffsl was also using Docker / network host when he encountered the bug.

This is not a root cause of course, but could point someone more knowledgeable in the right direction.

@ruffsl

ruffsl commented Mar 7, 2024

> I'm curious if @ruffsl was also using Docker / network host when he encountered the bug.

@usedhondacivic, I think I probably was using the host network interface, as I was mainly using dev containers for experimentation and semi-isolation for this project.

Perhaps this has something to do with unusual differences in process namespace isolation in containers vs. matching host names with host network interfaces throwing off ZeroMQ, similar to what I've experienced with DDS and shared memory transport?
