Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Ship threaded send #10513

Merged
merged 7 commits into from
Jul 19, 2021
Merged

Conversation

huangminghuang
Copy link
Contributor

Change Description

This PR

  • creates additional threads to send data to ship clients.
  • fixes a bug where state history plugin would send blocks with block_num smaller than the client requests when the state history block head is behind the block head of the client.
  • stops the application when it fails to listen on the configured state history port.

Select ONE:

  • Documentation
  • Stability bug fix
  • Other
  • Other - special case

Testing Changes

Select ANY that apply:

  • New Tests
  • Existing Tests
  • Test Framework
  • CI System
  • Other

Consensus Changes

  • Consensus Changes

API Changes

  • API Changes

Documentation Additions

  • Documentation Additions

@huangminghuang huangminghuang requested a review from heifner July 14, 2021 16:45
@huangminghuang huangminghuang force-pushed the huangminghuang/ship-threaded-send branch from 8bb6a99 to 42dd93a Compare July 14, 2021 22:56
std::thread thr;
std::atomic<bool> send_thread_has_exception = false;
std::exception_ptr eptr;
boost::asio::io_context ctx;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

io_context per connection seems a bit heavy. Since most/all uses of SHiP are likely to have only a few connections it is likely ok. Could have one io_context and share it across all connections. Just FYI, I'm ok with leaving it as is.

return;
}
self->socket_stream->binary(true);
self->start_read();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this is not thread safe. You are not allowed to access a socket from more than one thread at a time. Here you have the main thread doing an async_read and the internal thread doing an write.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not what I understand about socket programming. According to https://stackoverflow.com/questions/7418093/use-one-socket-in-two-threads, it should be safe for TCP. AFAIK, websocket is full-duplex communication channels over a single TCP connection, it should also be safe for read/write from different threads.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you really want to do it, you can dup() the fd and create two boost sockets referring to the same underlying TCP connection. Then use one boost socket for read, one for write, on two separate threads. Not that i'm recommending doing that here, but that's how it can be done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case anyone finds this: my brain was thinking asio socket. The semantics of dup()ing a socket backed by a websocket::stream are almost certainly no good!

plugins/state_history_plugin/state_history_plugin.cpp Outdated Show resolved Hide resolved
});
});
void send(T obj, ::std::optional<::fc::zipkin_span>&& span) {
if (!send_thread_has_exception) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we should at least log that we are ignoring this send.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception is reported via post at the end of thread function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems racy. send_thread_has_exception won't be set true until some period of time after the io_context has stopped running. So it seems possible for send() to queue up another post() that will never be dispatched. If that's not a problem (and it might not be, I'm not sure on cursory look), then I don't see why this check needs to be here in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop in the thread will break when the first task throw an exception, all the subsequent tasks would not be able to be executed. The check here is not strictly needed, just intended to be save some cycles.

plugins/state_history_plugin/state_history_plugin.cpp Outdated Show resolved Hide resolved
plugins/state_history_plugin/state_history_plugin.cpp Outdated Show resolved Hide resolved
plugins/state_history_plugin/state_history_plugin.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@spoonincode spoonincode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to review this further so I'm just putting this in for now to block merging. In particular I'm concerned about the change to a sync write()

@spoonincode
Copy link
Contributor

The sync write was already a significant shut down concern, but I get the impression that the mutex introduced to fix the threading violation now makes the sync write even more dangerous. If the sync write is blocked, but data is received, the main thread then becomes blocked because the async_read cannot start until the blocked sync write completes.

Would like to understand more why this sync write is being used instead of async. I've heard performance several times but tests I've performed show async write being able to send north of 6GB/sec over a websocket. A single thread with a strand per connection would seem more than capable of handling the demand.

@spoonincode spoonincode dismissed their stale review July 18, 2021 23:06

dismissing review

Copy link
Contributor

@heifner heifner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would still like to understand why async_write performs so poorly in this code where it seems perfectly fine in a simple test program. Hopefully a follow on task can be worked to understand.

@huangminghuang huangminghuang merged commit fc40ffb into develop-boxed Jul 19, 2021
@huangminghuang huangminghuang deleted the huangminghuang/ship-threaded-send branch October 8, 2021 17:36
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants