Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

commit_ret_elems_ performance improvement and enclose basic types with namespace #142

Closed
wants to merge 4 commits into from

Conversation

sheepgrass
Copy link
Contributor

@sheepgrass sheepgrass commented Oct 15, 2020

I've made the following changes:

  • improve performance by reducing commit_ret_elems_lock_ scope
  • improve performance by using find() instead of loop for getting entry from commit_ret_elems_
  • enclose typedefs with namespace in basic_type.hxx to avoid name collision with other projects
  • add skip_initial_election_timeout_ option to raft_params
  • support setting raft callback in raft_launcher::init()

Copy link
Contributor

@greensky00 greensky00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @sheepgrass, thanks for your contributions.

I see that your current PR contains multiple changes together, and we recommend one (functional) change per PR. I think it can be split into three PRs:

  1. Adding namespace nuraft around custom type definitions.
  2. Passing skip_initial_election_timeout_ and callback functions to Raft launcher.
  3. commit_ret_elems_ things.

Could you please submit separate PRs for 1 and 2? And please refer to my comments regarding 3. Thanks!

@@ -34,7 +34,7 @@ if (BOOST_INCLUDE_PATH AND BOOST_LIBRARY_PATH)

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DUSE_BOOST_ASIO")

set(ASIO_INCLUDE_DIR ${BOOST_INCLUDE_PATH})
set(ASIO_INCLUDE_DIR ${BOOST_INCLUDE_PATH}/boost)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is unnecessary, you can set the variable as

-DBOOST_INCLUDE_PATH=<your library include path>/boost

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said this is my first PR, actually this cmake file change is for my own use only and I actually have no intention to make PR for this. Sorry for bother you.

Comment on lines +63 to +66
if (BOOST_INCLUDE_PATH)
message(STATUS "Boost include path: " ${BOOST_INCLUDE_PATH})
include_directories(BEFORE ${BOOST_INCLUDE_PATH})
endif ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Boost include path is already printed out at line 32 above. And also include_directories of boost include path is not required, as what we need is ASIO_INCLUDE_DIR only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said this is my first PR, actually this cmake file change is for my own use only and I actually have no intention to make PR for this. Sorry for bother you.

@@ -23,6 +23,8 @@ limitations under the License.

#include <cstdint>

namespace nuraft {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please make this change as a separate PR? Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Comment on lines +476 to +486

/**
* If `true`, the election timer will not be initiated
* automatically, so that this node will never trigger
* leader election until it gets the first heartbeat
* from any valid leader.
*
* Purpose: to avoid becoming leader when there is only one
* node in the cluster.
*/
bool skip_initial_election_timeout_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raft_params includes parameters repeatedly used throughout the lifetime of a Raft server, while skip_initial_election_timeout_ is just a one-time parameter for the initialization only. If the reason why you added it here is merely for passing it to launcher, please add raft_server::init_options as a parameter of the constructor of launcher.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understood and noted

@@ -45,7 +46,8 @@ public:
ptr<logger> lg,
int port_number,
const asio_service::options& asio_options,
const raft_params& params);
const raft_params& params,
cb_func::func_type raft_callback = nullptr);
Copy link
Contributor

@greensky00 greensky00 Oct 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in the other comment, can you please replace this line with raft_server::init_options, and move cb_func::func_type raft_callback into raft_server::init_options? Please make this change as a separate commit too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understood and noted

auto entry = commit_ret_elems_.begin();
while (entry != commit_ret_elems_.end()) {
auto entry = commit_ret_elems_.find(sm_idx);
if (entry != commit_ret_elems_.end()) {
ptr<commit_ret_elem> elem = entry->second;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes regarding commit_ret_elems_lock_ will not be acceptable because of the following situation. Let's say T1 is the user thread (calling append_entries) and T2 is the commit thread.

  1. T1: appends log and calls pre_commit, but thread scheduling happens before dealing with commit_ret_elem.
  2. Replication is done.
  3. T2: calls commit_app_log, grabs the lock, but cannot find corresponding commit_ret_elem, and releases the lock.
  4. T1: grabs the lock, adds commit_ret_elem (let's say elem1) as there is no corresponding commit_ret_elem, and releases the lock.
  5. T2: grabs the lock, adds commit_ret_elem (let's say elem2), and invokes elem2.
  6. T1: sleeps using elem1.

Due to the racing between user and commit threads, commit_ret_elems_lock_ should protect the entire process of 1) finding commit_ret_elem and 2) inserting commit_ret_elem if not exists. We can't release the lock in the middle.

Please note that, in most cases, there won't be notable performance degradation caused by commit_ret_elems_lock_, as Raft commit rate is around a few thousand operations per second due to network latency and disk I/O (by log store and state machine), while the commit_ret_elem logic within the lock is a pure in-memory operation without I/O: faster than a few hundred thousand executions per second I believe. The impact will be marginal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the changes regarding commit_ret_elems_lock_ have been made a few months ago so I have already forgotten some details here. But my main purpose is to avoid locking unnecessary scope to reduce lock contention.

And for commit_app_log(), I want to avoid elem->awaiter_.invoke() inside commit_ret_elems_lock_ as it would wait other thread to gain lock inside.

Besides the lock, for commit_app_log() function, I have removed the while loop for linear search and instead I use the commit_ret_elems_.find() for getting the required entry. Please review if it's ok.

@sheepgrass
Copy link
Contributor Author

Hi @sheepgrass, thanks for your contributions.

I see that your current PR contains multiple changes together, and we recommend one (functional) change per PR. I think it can be split into three PRs:

  1. Adding namespace nuraft around custom type definitions.
  2. Passing skip_initial_election_timeout_ and callback functions to Raft launcher.
  3. commit_ret_elems_ things.

Could you please submit separate PRs for 1 and 2? And please refer to my comments regarding 3. Thanks!

Sorry for mixing the issues, since this is my first PR so I don't know the common practice for that. I will create separate PRs.

@sheepgrass sheepgrass marked this pull request as draft October 20, 2020 09:02
@sheepgrass sheepgrass closed this Oct 22, 2020
@sheepgrass sheepgrass deleted the master branch October 22, 2020 09:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants