This repository has been archived by the owner on Jun 23, 2022. It is now read-only.

feat(update_replication_factor#3): support to set the replication factor for all partitions of each table #1077

Merged

Conversation

empiredan (Contributor)


auto app = get_app_and_check_exist(app_name, response);
if (app == nullptr) {
response.hint_message +=
Contributor

No need for += here; a plain assignment is enough.


const auto app_id = app->app_id;

dassert_f(app->is_stateful,
Contributor


If this is not main business logic, such as the read/write path, I'd suggest not executing dassert, which will cause unnecessary downtime of the cluster. You just need to return an error to disable this operation.

Contributor Author


If this is not main business logic, such as the read/write path, I'd suggest not executing dassert, which will cause unnecessary downtime of the cluster. You just need to return an error to disable this operation.

Actually, dassert(app->is_stateful) is put here following existing code, such as L1860 in server_state::on_update_configuration. I think this assertion is just for stateful tables, isn't it? On the other hand, it seems that for now there's no stateless table.

Contributor


Yeah, there's no stateless table. But server_state::on_update_configuration is main logic; it must make sure the current state is correct, so a dassert there is acceptable. This added feature doesn't need to either succeed or coredump; just returning an error message is OK.

src/meta/server_state.cpp: 4 review threads (outdated, resolved)
foreverneverer (Contributor) commented Mar 17, 2022

I find you use many dasserts to validate the update action; I have commented on some of them. My suggestion: updating the replica count is not a key read-write path, and its failure will not have a real-time impact on the business. You just need to return an error telling us it has not succeeded, rather than bring down the machine.

hycdong (Contributor) commented Mar 17, 2022

I think it is strange to define two functions, update_partition_max_replica_count and update_next_partition_max_replica_count.
You can update each partition's max replica count in a loop, such as:

for(auto i = 0; i < partition_count; i++){
    update_partition_max_replica_count(i);
}

Besides, I notice that #1072 supports updating the app's max_replica_count; in this PR, if updating a partition's max_replica_count fails, the app's value should be reverted. I suggest you update the code like:

void update_app_max_replica_count(int new_replica_count) {
    // 1. mark that the replica count is being updated, to avoid repeated
    //    requests (this flag should be stored on ZooKeeper; you can add a
    //    new field in envs)
    // 2. update each partition's replica count
    // 3. after all partitions have been updated successfully, update the
    //    app-level replica count and clear the updating flag
    // functions server_state::do_app_drop and
    // server_state::process_one_partition may be helpful
}

@empiredan empiredan marked this pull request as draft March 22, 2022 11:49
@empiredan empiredan marked this pull request as ready for review March 23, 2022 09:06
src/meta/server_state.cpp: 3 review threads (resolved; 2 outdated)
@acelyc111 acelyc111 merged commit ec3d397 into XiaoMi:max_replica_count_dev Mar 25, 2022
acelyc111 pushed a commit that referenced this pull request Jun 1, 2022
acelyc111 pushed a commit that referenced this pull request Jun 6, 2022
4 participants