
Force change cluster's elected master node #17493

Closed · redserpent7 opened this issue Apr 3, 2016 · 17 comments

Labels: discuss, :Distributed/Distributed

Comments

@redserpent7

Hi,

Today I've been trying to increase the storage of my Elasticsearch nodes. The nodes are hosted on AWS EC2, each with an attached 10 GB EBS volume. I was trying to increase the EBS size to 20 GB for each node, and it all went fine until I restarted the cluster master.

It took about 30 seconds for the cluster to elect a new master, and during that time all requests failed and all the other nodes gave me a 503 error when I tried to check their status.

I am wondering if there is a way to change the cluster master to a specific node instantly without having to wait for the nodes to elect a new master.

For example, let's say my cluster has three nodes:

Node1 (Cluster Master)
Node2
Node3

What I would like to do is change the size of the drive for, let's say, Node2; then, once that node rejoins the cluster and all shards are reallocated, I would force-elect it as the cluster master, so I can safely change the configuration for the other two nodes.

Is this possible in ES? If not, how can I go about reducing the time it takes for the nodes to elect a new master?
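For context, here is a rough sketch of the elasticsearch.yml discovery settings I believe govern how quickly master loss is detected and a new master is elected in 1.x (the values are illustrative rather than my exact config, and the noted defaults are as far as I know):

```yaml
# elasticsearch.yml -- zen discovery settings relevant to master election (ES 1.x)
discovery.zen.ping.timeout: 3s         # how long an election round waits for ping responses (default 3s)
discovery.zen.fd.ping_interval: 1s     # how often nodes ping the elected master (default 1s)
discovery.zen.fd.ping_timeout: 30s     # how long to wait for each fault-detection ping (default 30s)
discovery.zen.fd.ping_retries: 3       # failed pings before the master is considered gone (default 3)
discovery.zen.minimum_master_nodes: 2  # quorum for a 3-node cluster
```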

@clintongormley

You don't mention what version of Elasticsearch you're using. @bleskes why would it take 30s to elect a master?

@redserpent7
Author

@clintongormley I am running 1.7.2. I'm not sure why it took 30 seconds the first time; when I tried it again later, the election did not take that long.

BTW, my ping timeout is set to 5s and the ping retries are left at the default (not specified in the yml).

@bleskes
Contributor

bleskes commented Apr 5, 2016

It should take 3s for a clean restart. It might take longer if the network is slow or the nodes are so overloaded that they fail to process the master loss quickly.

The only way to remove a master from its position is to restart it. In theory it is possible to implement a clean mastership transfer, but it's very tricky and there are things we should do first. For now, I will close the issue.

@redserpent7 - if you keep running into a 30s master election, please open up an issue with the relevant details (logs, timing, and the output of _cat/master on all the nodes; this is a handy program for that).
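A quick sketch of collecting that _cat/master output from every node (the hostnames are placeholders for your own):

```sh
# Ask each node which node it currently believes is the elected master
for host in node1 node2 node3; do
  echo "== $host =="
  curl -s "http://$host:9200/_cat/master?v"
done
```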

bleskes closed this as completed on Apr 5, 2016
@redserpent7
Author

@bleskes I did try restarting the master node several more times and it did not take long for the other nodes to elect a new master; it probably had something to do with AWS EC2 at the time of my initial restart.

It's a non-issue for me really, as the cluster in question is my testing environment, while the production nodes have large enough drives attached that such increases will be very infrequent.

I would like it, though, if you could consider implementing a clean mastership transfer in a future version.

@bleskes
Contributor

bleskes commented Apr 5, 2016

Thank you for letting us know.

I would like it, though, if you could consider implementing a clean mastership transfer in a future version.

I agree. We just have bigger fish to fry first.

@munnerz

munnerz commented Mar 22, 2017

Hey @bleskes - I'm currently working on some automation for running Elasticsearch in a clustered fashion on top of Kubernetes, and would love to be able to manually trigger a master re-election (or, alternatively, disallow the current master from being master, similar to setting cluster.routing.allocation.exclude). Right now, on a scale-down event involving the master node, the cluster can turn red for up to 30s (thus serving no requests).

Are there any plans to implement this? Is it something you'd still consider adding?

(FYI, I am using ES 5.2.2 here)
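For comparison, this is roughly how a data node is drained today via the cluster settings API; what I'm asking for is an analogous knob for master eligibility, which as far as I can tell doesn't exist in 5.2.2 (the node name below is a placeholder):

```sh
# Existing mechanism: exclude a node from shard allocation so it can be removed safely
curl -s -XPUT "http://localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.exclude._name": "node-2"
  }
}'
```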

@bleskes
Contributor

bleskes commented Mar 22, 2017

@munnerz can you open a topic on discuss.elastic.co and link it here? We can continue talking there. I have a feeling this will become a discussion ;)

@munnerz

munnerz commented Mar 22, 2017

clintongormley added the :Distributed/Distributed label and removed the :Cluster label on Feb 13, 2018
@devoncrouse

Now that the issue is closed and the discussion is archived, I'm curious whether this is being tracked anywhere. Running on ephemeral infrastructure in AWS/OCI, I can completely reprovision data and client nodes without users noticing, but when I get to the last (active) master, I must still endure a stressful ~30 seconds of cluster unavailability. Just as with shard allocation settings, I'd like to be able to exclude one or more masters, have the cluster complete any queued operations against the active one, and then gracefully elect a new master without rejecting operations outright. Any thoughts? Seems like a big fish.

@bleskes
Contributor

bleskes commented Feb 15, 2018

@devoncrouse during master re-election the cluster remains available for search, and indexing waits until a new master is elected. This should take 3 seconds plus a little overhead. If it takes 30 seconds, something else is not going right.

@DaveCTurner
Contributor

@devoncrouse how do you shut down the elected master? Do you terminate the Elasticsearch process with a signal, or do you simply pull the plug on the machine?

I ask this because if you simply pull the plug then the established connections to the master are not actively dropped, so it looks like a networking blip, and Elasticsearch waits for the network to be restored for a while before starting a new election. If you terminate Elasticsearch first then the connections are actively dropped and a master should be elected more quickly.
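In practice that means making sure the Elasticsearch process receives SIGTERM before the instance disappears, along these lines (the unit name and pid-file path are typical defaults and may differ on your setup):

```sh
# Stop Elasticsearch gracefully so the other nodes see the connections drop immediately,
# instead of treating the silence as a network blip and waiting it out
sudo systemctl stop elasticsearch

# or, for a tarball install, send SIGTERM to the process directly
kill -SIGTERM "$(cat /var/run/elasticsearch/elasticsearch.pid)"  # pid file location is setup-specific
```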

@devoncrouse

Aha, that's probably my issue; I assumed it would have been getting signaled on machine shutdown, but I see now what's happening. Thanks for the reply.

@michalm86

Hi, I am running a 3-node cluster using docker-compose and I experience ~30s (or even 45s) of unavailability during re-election (to trigger a re-election I run 'docker stop master_node').
I reported this here: https://discuss.elastic.co/t/timed-out-waiting-for-all-nodes-to-process-published-state-and-cluster-unavailability/138590

Could anyone please take a look?
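For reference, my understanding is that 'docker stop' sends SIGTERM first and only SIGKILLs the container after the grace period (10s by default), so I'd expect a clean shutdown; this is roughly the relevant fragment of my compose file, with the grace period raised just in case (the service name matches my setup):

```yaml
# docker-compose.yml fragment -- give Elasticsearch more time to shut down after SIGTERM
services:
  master_node:
    stop_grace_period: 1m   # default is 10s; time allowed between SIGTERM and SIGKILL
```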

@michalm86

Thank you @DaveCTurner

@luyuncheng
Contributor

If a master leaves but the cluster settings have not changed, what about changing the cluster block level to METADATA_WRITE?
I wonder whether a data node could keep writing when only the leader changes and the metadata does not.

@DaveCTurner
Contributor

@luyuncheng this issue was closed over 3 years ago, so this isn't a good place to ask a question like yours. I see you've asked the same question on another closed issue too. I recommend not doing this. If you would like to discuss your question, please open a thread on the discussion forum instead.

@luyuncheng
Contributor

@DaveCTurner Sorry about this, I opened a new thread on the discussion forum: Link
