This repository has been archived by the owner on Jun 23, 2022. It is now read-only.

feat(split): replica server handle pause and cancel status #681

Merged: 3 commits, Jan 19, 2021


Contributor

@hycdong hycdong commented Dec 4, 2020

When the meta server pauses or cancels a split, the partition's split_status becomes pausing or canceling (#679), and this split_status is passed to the replica server through on_config_sync (#653). This PR implements how the replica server handles those two split_status values.
When the primary parent partition receives a pausing or canceling split_status from the meta server, it sets its split_status to not_split and broadcasts it through group_check. Each secondary parent partition also sets its split_status to not_split and sets is_split_stopped = true in its group_check_response. The primary parent partition then checks whether all partitions in its group have paused or canceled the split; once all of them have stopped, it sends a notification to meta, which will be implemented in the next pull request.
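
The flow described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `parent_replica_sketch`, `group_check_response_sketch`, `handle_stop_request`, `secondary_on_group_check` and `has_is_split_stopped` are hypothetical names, and only the `split_status` values and `is_split_stopped` come from the description.

```cpp
#include <cassert>

// Simplified stand-in for the real enum; the value names follow the PR.
enum class split_status { NOT_SPLIT, SPLITTING, PAUSING, CANCELING };

// Hypothetical, hand-written stand-in for the group_check response.
struct group_check_response_sketch
{
    bool has_is_split_stopped = false; // assumption: models __isset.is_split_stopped
    bool is_split_stopped = false;
};

// Hypothetical parent partition (primary or secondary).
struct parent_replica_sketch
{
    split_status status = split_status::SPLITTING;

    // Returns true if meta_status was a pause/cancel request and was handled.
    // The parent never stores PAUSING or CANCELING; it goes straight to NOT_SPLIT.
    bool handle_stop_request(split_status meta_status)
    {
        if (meta_status != split_status::PAUSING &&
            meta_status != split_status::CANCELING) {
            return false;
        }
        status = split_status::NOT_SPLIT;
        return true;
    }
};

// Secondary side: apply the status carried by group_check and acknowledge it.
inline group_check_response_sketch
secondary_on_group_check(parent_replica_sketch &secondary, split_status meta_status)
{
    group_check_response_sketch resp;
    if (secondary.handle_stop_request(meta_status)) {
        resp.has_is_split_stopped = true;
        resp.is_split_stopped = true;
    }
    return resp;
}
```

In this sketch a normal group_check (one that carries no pause or cancel status) leaves `has_is_split_stopped` unset, so the primary can tell it apart from a stop acknowledgement.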

@hycdong hycdong marked this pull request as ready for review January 13, 2021 05:24
"wrong partition_status({})",
enum_to_string(status()));
dassert_replica(_split_status == split_status::SPLITTING ||
_split_status == split_status::NOT_SPLIT,
Contributor

Why allow stop under NOT_SPLIT?

Contributor

@foreverneverer foreverneverer Jan 19, 2021

Also, if we send cancel or pause by mistake, the dassert will cause a crash. Is that what you expect?

Contributor Author

When a learn happens, the parent partition may stop the split and set its _split_status to NOT_SPLIT without the meta server knowing. So when a pause or cancel split request is synced to the parent partition, its split_status may already be NOT_SPLIT.

Contributor Author

@hycdong hycdong Jan 19, 2021

Also, if we send cancel or pause by mistake, the dassert will cause a crash. Is that what you expect?

Pause and cancel split requests are sent to the meta server, which should check whether the table is splitting; see #679 for reference. Besides, the parent partition never sets its split_status to PAUSING or CANCELING: when it receives a pause or cancel request, it sets it to NOT_SPLIT. I'm not sure I explained this clearly, so feel free to comment if you have any questions.
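
To make the NOT_SPLIT case concrete, here is a minimal sketch of the scenario discussed in this thread. It is an illustration under assumptions, not the PR's code: `parent_partition_sketch`, `on_learn_interrupts_split`, `can_handle_stop` and `stop_split` are hypothetical names, and the plain `assert` stands in for `dassert_replica`.

```cpp
#include <cassert>

// Simplified stand-in for the real enum; the value names follow the PR.
enum class split_status { NOT_SPLIT, SPLITTING, PAUSING, CANCELING };

// Hypothetical parent partition that tracks only its local split status.
struct parent_partition_sketch
{
    split_status local_status = split_status::SPLITTING;

    // Assumption: a learn makes the parent abandon the split locally,
    // and the meta server is not informed.
    void on_learn_interrupts_split() { local_status = split_status::NOT_SPLIT; }

    // Mirrors the precondition quoted in the diff: SPLITTING or NOT_SPLIT.
    bool can_handle_stop() const
    {
        return local_status == split_status::SPLITTING ||
               local_status == split_status::NOT_SPLIT;
    }

    // Handling a pause/cancel request always ends in NOT_SPLIT, so applying it
    // to a partition that already stopped is a harmless no-op.
    void stop_split()
    {
        assert(can_handle_stop()); // stand-in for dassert_replica
        local_status = split_status::NOT_SPLIT;
    }
};
```

Because the parent only ever holds SPLITTING or NOT_SPLIT in this sketch, the precondition cannot fail for a stop request that arrives late.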

return;
}

if (!resp->__isset.is_split_stopped || !resp->is_split_stopped) {
Contributor

What is the difference between __isset.is_split_stopped and is_split_stopped?

Contributor Author

__isset.is_split_stopped = false means this group_check_response carries no information about pausing or canceling a split, which is the case for a normal group_check or a splitting group_check. When it is true, this group_check includes pause or cancel split information. resp->is_split_stopped = true means the secondary parent partition paused or canceled the split successfully.
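
The three cases can be written out as a small sketch. Thrift-generated C++ structs track optional fields with an `__isset` member; the struct below is a hand-written stand-in that uses `isset` instead (to avoid a reserved identifier), and `stop_ack`, `classify` and `should_ignore` are hypothetical names for illustration only.

```cpp
#include <cassert>

// Hand-written stand-in for the Thrift-generated response struct.
struct group_check_response_sketch
{
    bool is_split_stopped = false;
    struct isset_flags
    {
        bool is_split_stopped = false; // models __isset.is_split_stopped
    } isset;
};

// The three situations described in the reply above.
enum class stop_ack
{
    no_stop_info,    // field not set: normal or splitting group_check
    not_stopped_yet, // field set, value false
    stopped          // field set, value true: secondary stopped the split
};

inline stop_ack classify(const group_check_response_sketch &resp)
{
    if (!resp.isset.is_split_stopped) {
        return stop_ack::no_stop_info;
    }
    return resp.is_split_stopped ? stop_ack::stopped : stop_ack::not_stopped_yet;
}

// Same shape as the condition in the diff: skip unless the field is set AND true.
inline bool should_ignore(const group_check_response_sketch &resp)
{
    return !resp.isset.is_split_stopped || !resp.is_split_stopped;
}
```

Checking the set flag first matters because an unset optional field still holds a default value, which by itself cannot distinguish "no information" from "not stopped".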


_replica->_primary_states.split_stopped_secondary.insert(req->node);
auto count = 0;
for (auto &iter : _replica->_primary_states.statuses) {
Contributor

Could count be stored as a global variable rather than recomputed on every check?

Contributor Author

It's safer to check it each time, because _primary_states.statuses may change during learn. This case does not happen often, so recomputing it is fine.
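
A minimal sketch of why recounting is the safer choice. It assumes the primary compares the number of acknowledged secondaries against an expected count; `primary_states_sketch`, `partition_status_sketch`, `on_secondary_stopped` and `expected_secondaries` are hypothetical, and only `statuses` and `split_stopped_secondary` are names taken from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Simplified stand-in for the real partition status enum.
enum class partition_status_sketch { PRIMARY, SECONDARY, POTENTIAL_SECONDARY, INACTIVE };

// Hypothetical subset of the primary's bookkeeping.
struct primary_states_sketch
{
    std::map<std::string, partition_status_sketch> statuses; // node -> current status
    std::set<std::string> split_stopped_secondary;           // nodes that acknowledged
};

// Records the acknowledgement from `node`, then recounts from scratch.
// Returns true once every expected secondary is both a current secondary
// and has acknowledged that it stopped the split.
inline bool on_secondary_stopped(primary_states_sketch &states,
                                 const std::string &node,
                                 std::size_t expected_secondaries)
{
    states.split_stopped_secondary.insert(node);
    std::size_t count = 0;
    for (const auto &kv : states.statuses) {
        if (kv.second == partition_status_sketch::SECONDARY &&
            states.split_stopped_secondary.count(kv.first) > 0) {
            ++count;
        }
    }
    return count == expected_secondaries;
}
```

A cached counter would keep counting a node that was a secondary when it acknowledged but has since been downgraded by a learn; recounting against the current `statuses` avoids that stale result.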
