Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken http parser #1782

Merged
merged 3 commits into from
May 31, 2019
Merged

Fix broken http parser #1782

merged 3 commits into from
May 31, 2019

Conversation

pmconrad
Copy link
Contributor

... by replacing http_server with another websocketpp instance and decprecating cli_wallet -H

@pmconrad pmconrad added this to the 3.2.0 - Feature Release milestone May 28, 2019
@pmconrad pmconrad self-assigned this May 28, 2019
@abitmore
Copy link
Member

  1. cli_wallet which started with -H only crashes quite often on quit, but I was unable to reproduce with -r only. There might be something wrong. I think it's better to avoid duplicate code or different logic if the 2 options would serve same functionality.
  2. I'd like to keep -H for backward compatibility, because there are old tutorials and clients only use it but not -r, e.g. https://dev.bitshares.works/en/master/bts_guide/tutorials/exchange_single_node.html and the exchanges. I think it's better to reword the description e.g. "-H is an alternative to -r".
  3. I guess we can remove network/http/server.hpp and the cpp file from FC?

@pmconrad
Copy link
Contributor Author

I can't reproduce crashes, neither with -r or -H nor both. Did you update submodules before you tested?

Will deprecate -r instead of -H. Keeping both would confuse new users IMO.

Planning to remove the now unused http_server from FC in the context of #1584.

@abitmore
Copy link
Member

abitmore commented May 29, 2019

There are other clients using -r, E.G. the ruby faucet (https://github.com/bitshares/faucet).

I can reproduce the crash by starting cli_wallet with -H then press Ctrl+D after a few seconds, although it doesn't crash on every quit. Ubuntu 16.04, boost 1.58, openssl 1.0.2g.

Stack back trace:

Thread 1 "cli_wallet" received signal SIGSEGV, Segmentation fault.
0x0000000000d93f9c in fc::http::detail::websocket_server_impl::websocket_server_impl()::{lambda(std::weak_ptr<void>)#6}::operator()(std::weak_ptr<void>) const::{lambda()#1}::operator()() const ()

(gdb) bt
#0  0x0000000000d93f9c in fc::http::detail::websocket_server_impl::websocket_server_impl()::{lambda(std::weak_ptr<void>)#6}::operator()(std::weak_ptr<void>) const::{lambda()#1}::operator()() const ()
#1  0x0000000000d94259 in fc::detail::void_functor_run<fc::http::detail::websocket_server_impl::websocket_server_impl()::{lambda(std::weak_ptr<void>)#6}::operator()(std::weak_ptr<void>) const::{lambda()#1}>::run(void*, fc::http::detail::websocket_server_impl::websocket_server_impl()::{lambda(std::weak_ptr<void>)#6}::operator()(std::weak_ptr<void>) const::{lambda()#1}) ()
#2  0x0000000000cd6cd2 in fc::task_base::run_impl() ()
#3  0x0000000000cd4a8a in fc::thread_d::process_tasks() ()
#4  0x0000000000cd50e4 in fc::thread_d::start_process_tasks(long) ()
#5  0x0000000000f10ce1 in make_fcontext ()
#6  0x0000000000000000 in ?? ()

No interesting info found in other threads.

Still crashes with a clean build.

@abitmore
Copy link
Member

Still crashes with a clean build. It's strange that I haven't reproduced same issue on -r, perhaps just because it's initialized earlier? Or we can't have two instances of same class (probably an issue inside websocketpp)?

auto _websocket_server = std::make_shared<fc::http::websocket_server>();

auto _http_ws_server = std::make_shared<fc::http::websocket_server>();

@pmconrad
Copy link
Contributor Author

After many attempts I still haven't seen a crash (Boost-1.69, RelWithDebInfo build).

Your stacktrace indicates that the crash happens in the fail_handler. In fact a breakpoint in that handler is triggered after CTRL-D, even though there are no open connections. From the code I see that the websocket_server_impl destructor falls through after calling server.stop_listening() if there are no open connections, which means that the fail handler might be called after websocket_server_impl has been destroyed. I believe this is unrelated to this PR though.

I have attempted to rule out the effect from multiple websocker_server instances in the lastest push, by only creating an instance if the matching option is present. Please check if this improves things for you.

@abitmore
Copy link
Member

abitmore commented May 30, 2019

Same crash, in addition, now I can reproduce the crash on -r :D

I think I got what's happening. You're correct. ~websocket_server_impl() calls _server.stop_listening() which triggers the callback registered by set_fail_handler(), which may call _server_thread.async().wait() which won't block current thread so the websocket_server_impl object could be destructed first. When the code inside async() executes, _connection and _close no longer exist, thus the crash.

There is a race condition on the if( _server.is_listening() ) check, sometimes the _server_thread.async() won't be called, in this case it won't crash.

Actually, there is another race condition, the entire fail handler may execute after the server object is destructed, which may even lead to a crash on if( _server.is_listening() ), so IMHO we need to wait for the callback in the destructor.

I'll try to fix it when refactoring websocket_server_impl after bitshares/bitshares-fc#134 is merged.

@pmconrad pmconrad merged commit 4bc8339 into develop May 31, 2019
@pmconrad pmconrad deleted the 1772_http_server branch May 31, 2019 07:51
@pmconrad pmconrad mentioned this pull request May 31, 2019
17 tasks
@abitmore abitmore mentioned this pull request Jun 4, 2019
17 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants