Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve trx_generator http client #1699

Merged
merged 5 commits into from
Oct 3, 2023
Merged

Improve trx_generator http client #1699

merged 5 commits into from
Oct 3, 2023

Conversation

heifner
Copy link
Member

@heifner heifner commented Sep 29, 2023

When error conditions (eof) have the http_plugin shutdown_both on socket. Also call close() for good measure. I actually tried this explicit shutdown in the destructor but it seemed to make no difference.

Modify the trx_generator to better handle errors from client and also specify no keep alive.

This improved performance as can be seen by comparing:
This branch: https://github.com/AntelopeIO/leap/actions/runs/6356928532 15001 TPS
With main: https://github.com/AntelopeIO/leap/actions/runs/6356929999 14001 TPS

@heifner heifner added the OCI Work exclusive to OCI team label Oct 2, 2023
Comment on lines 76 to 78
req_.set(http::field::connection, "close");
req_.body() = std::move(request_body);
req_.keep_alive(false);
Copy link
Contributor

@greg7mdp greg7mdp Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these changes? I would think that keep_alive is better?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was an attempt to get nodeos to not hold on to the connection as long.
Note that nodeos has this: https://github.com/AntelopeIO/leap/blob/main/plugins/http_plugin/include/eosio/http_plugin/beast_http_session.hpp#L114

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, wouldn't the transaction generator send a bunch of requests to the node, and benefit from the connection not dropping after each request?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would if the trx_generator reused a socket. It creates a new connection for each request. We really need to fix that because on my machine I run out of ports.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. For posterity, maybe add a comment that we disable keep_alive because the trx_generator doesn't reuse sockets, but this needs to be changed, and then keep_alive should be re-enabled.

@@ -503,7 +503,8 @@ class beast_http_session : public detail::abstract_conn,
try {
// Send a shutdown signal
beast::error_code ec;
socket_.shutdown(Socket::shutdown_send, ec);
socket_.shutdown(Socket::shutdown_both, ec);
socket_.close(ec);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this change a little more. I understand what it's doing but I don't follow if the PR is suggesting this is what improves performance, and if so, why.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change less about performance and more about leaving a socket in timed wait on client abandoning a socket. The eof is used for error conditions, so makes sense to shutdown both.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change got rid of TIME_WAIT states?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the keep_alive change helped with the TIME_WAIT. This was for when it was already in a bad state.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand what the purpose of this entire do_eof() function is. Letting the socket's shared_ptr fall out of scope is going to close it anyways.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try experimenting with just removing it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 4f4f5d3 removed do_eof, resulted in read-only-trx-parallel-test failing do to Failure when receiving data from the peer which looks like it is http error 56. This test does a lot of http requests to nodeos.

It does look like the do_eof is needed. The example boost code also has the do_eof.
@spoonincode I will revert 4f4f5d3 unless you can find something wrong in my removal of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, with the removal of do_eof, this branch reported a 15501 TPS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we might have a problem somewhere; I'm still not understanding why it's needed. But yeah might as well put it back for now.

@heifner heifner merged commit f239a63 into main Oct 3, 2023
22 checks passed
@heifner heifner deleted the GH-1662-close branch October 3, 2023 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCI Work exclusive to OCI team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants