Skip to content
This repository has been archived by the owner on Nov 24, 2023. It is now read-only.

dm-master: leader election #367

Merged
merged 20 commits into from
Nov 26, 2019

Conversation

csuzhangxc
Copy link
Member

@csuzhangxc csuzhangxc commented Nov 20, 2019

What problem does this PR solve?

add leader election supported for DM-master based on embed etcd

What is changed and how it works?

do leader election based on embed etcd

some code learn from https://github.com/pingcap/tidb/blob/v3.0.5/owner/manager.go.

Check List

Tests

  • Unit test
  • Integration test

Code changes

  • Has exported function/method change
  • Has exported variable/fields change
  • Has interface methods change
  • Has persistent data change

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

  • Need to be included in the release note

@csuzhangxc csuzhangxc added priority/important Major change, requires approval from ≥2 primary reviewers status/WIP This PR is still work in progress type/feature New feature needs-update-release-note This PR should be added into release notes. Remove this label once the release notes are updated labels Nov 20, 2019
@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

@codecov
Copy link

codecov bot commented Nov 20, 2019

Codecov Report

Merging #367 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master       #367   +/-   ##
===========================================
  Coverage   57.6674%   57.6674%           
===========================================
  Files           162        162           
  Lines         16368      16368           
===========================================
  Hits           9439       9439           
  Misses         6006       6006           
  Partials        923        923

@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

1 similar comment
@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

@csuzhangxc csuzhangxc added status/PTAL This PR is ready for review. Add this label back after committing new changes and removed status/WIP This PR is still work in progress labels Nov 20, 2019
@csuzhangxc
Copy link
Member Author

@WangXiangUSTC , @GregoryIan or @amyangfei PTAL

@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

@csuzhangxc
Copy link
Member Author

I update the methods for Election after added PTAL label. It's ready for review again now.

@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

Comment on lines 1 to 7
all_mode
sequence_sharding
sequence_safe_mode
relay_interrupt
start_task
initial_unit
http_apis
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the CI to run these cases separately


for _, ev := range resp.Events {
if ev.Type == mvccpb.DELETE {
e.l.Info("fail to watch, the leader is deleted", zap.ByteString("key", ev.Kv.Key))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the type is DELETE, dose it mean the electionTTL is overdue?

Copy link
Member Author

@csuzhangxc csuzhangxc Nov 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user may force to trigger a new campaign by deleting the key with etcdctl (or dmctl in the future); in the normal case, before electionTTL overdue, Session will keep alive for the lifetime of a client, so this will not get DELETE event.

see:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about adding a comment in which case will get delete event

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added in b99ae46.

"github.com/pingcap/dm/pkg/log"
)

func (s *Server) electionNotify(ctx context.Context) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why need this function, seems toBeLeader and retireLeader already print the log

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toBeLeader and retireLeader only log the leadership of the current member.
this function is used to show the usage of Election (and also log the leader name even if it's not the current member), do you think should we need to remove it?

Copy link
Member Author

@csuzhangxc csuzhangxc Nov 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more thing, pkg/election is a pkg, but this function is in the application. Maybe we need to add some metrics for the leadership later, and do that in this function is better.

so, how about removing the log in pkg/election?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if master needs to do something(for example: meet error in the election, master exit) according to the election's result I think putting here is better.
it is up to you

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose to remove the log in pkg/election in b99ae46.

pkg/election/election.go Show resolved Hide resolved
s.closed.Set(false) // the server started now.

// start leader election
s.election = election.NewElection(ctx, s.etcdClient, electionTTL, electionKey, s.cfg.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to handle election's error from election.ErrorNotify()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I update Election to let it more like which one in TiDB's owner pkg in 80ae3d2 and 5c4c820.

  1. Detect any potential error (for unrecoverable error) in NewElection, and do not start Server if an error occurred.
  2. Retry for errors (for recoverable error) until the context is done after entering the campaign loop.
  3. receive errors from election.ErrorNotify() in server/election.go, but do no meaningful things now.

@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

@csuzhangxc
Copy link
Member Author

@WangXiangUSTC PTAL again

Copy link
Contributor

@WangXiangUSTC WangXiangUSTC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest LGTM

select {
case <-time.After(newSessionRetryInterval):
case <-ctx.Done():
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you means break the for loop, this break only break the select
https://stackoverflow.com/questions/11104085/in-go-does-a-break-statement-break-from-a-switch-select

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!, fixed in b99ae46.

@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

@WangXiangUSTC
Copy link
Contributor

LGTM

@WangXiangUSTC WangXiangUSTC added status/LGT1 One reviewer already commented LGTM and removed status/PTAL This PR is ready for review. Add this label back after committing new changes labels Nov 21, 2019
pkg/election/election.go Outdated Show resolved Hide resolved
dm/master/election.go Outdated Show resolved Hide resolved
@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

1 similar comment
@csuzhangxc
Copy link
Member Author

/run-all-tests tidb=release-3.0

Copy link
Contributor

@amyangfei amyangfei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@amyangfei amyangfei added status/LGT2 Two reviewers already commented LGTM, ready for merge and removed status/LGT1 One reviewer already commented LGTM labels Nov 26, 2019
@csuzhangxc csuzhangxc merged commit 18f66f2 into pingcap:master Nov 26, 2019
@csuzhangxc csuzhangxc deleted the master-leader-election branch November 26, 2019 15:00
@csuzhangxc csuzhangxc mentioned this pull request Dec 6, 2019
23 tasks
lichunzhu pushed a commit to lichunzhu/dm that referenced this pull request Apr 6, 2020
@csuzhangxc csuzhangxc removed the needs-update-release-note This PR should be added into release notes. Remove this label once the release notes are updated label Jun 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
priority/important Major change, requires approval from ≥2 primary reviewers status/LGT2 Two reviewers already commented LGTM, ready for merge type/feature New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants