dm-master: leader election #367
/run-all-tests tidb=release-3.0
Codecov Report
@@             Coverage Diff             @@
##             master      #367    +/-   ##
===========================================
  Coverage   57.6674%   57.6674%
===========================================
  Files           162        162
  Lines         16368      16368
===========================================
  Hits           9439       9439
  Misses         6006       6006
  Partials        923        923
/run-all-tests tidb=release-3.0
/run-all-tests tidb=release-3.0
/run-all-tests tidb=release-3.0
@WangXiangUSTC, @GregoryIan or @amyangfei PTAL
/run-all-tests tidb=release-3.0
/run-all-tests tidb=release-3.0
I update the methods for
/run-all-tests tidb=release-3.0
tests/others_integration.txt
Outdated
all_mode
sequence_sharding
sequence_safe_mode
relay_interrupt
start_task
initial_unit
http_apis
I have updated the CI to run these cases separately
for _, ev := range resp.Events {
	if ev.Type == mvccpb.DELETE {
		e.l.Info("fail to watch, the leader is deleted", zap.ByteString("key", ev.Kv.Key))
If the type is DELETE, does it mean the electionTTL has expired?
A user may force a new campaign by deleting the key with etcdctl (or dmctl in the future). In the normal case, before electionTTL expires, the Session will keep alive for the lifetime of a client, so no DELETE event is received.
see:
How about adding a comment that explains in which cases a DELETE event is received?
added in b99ae46.
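The case discussed in this thread can be sketched without an etcd dependency. This is only an illustration, not the code in this PR: `EventType` and `Event` are simplified local stand-ins for etcd's mvccpb event types, and `leaderDeleted` and the key `/election/leader` are hypothetical names.

```go
package main

import "fmt"

// EventType and Event are simplified local stand-ins for etcd's
// mvccpb event types; they are NOT the real etcd types.
type EventType int

const (
	Put EventType = iota
	Delete
)

type Event struct {
	Type EventType
	Key  string
}

// leaderDeleted (hypothetical helper) reports whether any event in the
// batch removed the leader key. In this sketch a DELETE is treated as
// "the leader is gone", whether the key was removed by hand (for example
// with etcdctl) or disappeared for another reason, so the caller should
// start a new campaign.
func leaderDeleted(events []Event, leaderKey string) bool {
	for _, ev := range events {
		if ev.Type == Delete && ev.Key == leaderKey {
			return true
		}
	}
	return false
}

func main() {
	// "/election/leader" is an assumed key used only for this example.
	events := []Event{
		{Type: Put, Key: "/election/leader"},
		{Type: Delete, Key: "/election/leader"},
	}
	if leaderDeleted(events, "/election/leader") {
		fmt.Println("leader key deleted, start a new campaign")
	}
}
```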
	"github.com/pingcap/dm/pkg/log"
)

func (s *Server) electionNotify(ctx context.Context) {
Why is this function needed? It seems toBeLeader and retireLeader already print the log.
toBeLeader and retireLeader only log the leadership of the current member.
This function shows the usage of Election (and also logs the leader name even if it's not the current member). Do you think we should remove it?
One more thing: pkg/election is a package, but this function is in the application. We may need to add some metrics for the leadership later, and doing that in this function is better.
So, how about removing the log in pkg/election?
If master needs to do something according to the election's result (for example, exit when the election meets an error), I think putting it here is better.
It is up to you.
I chose to remove the log in pkg/election in b99ae46.
dm/master/server.go
Outdated
	s.closed.Set(false) // the server started now.

	// start leader election
	s.election = election.NewElection(ctx, s.etcdClient, electionTTL, electionKey, s.cfg.Name)
Do we need to handle the election's errors from election.ErrorNotify()?
I updated Election to make it more like the one in TiDB's owner pkg, in 80ae3d2 and 5c4c820.
- Detect any potential unrecoverable error in NewElection, and do not start Server if an error occurred.
- Retry on recoverable errors until the context is done after entering the campaign loop.
- Receive errors from election.ErrorNotify() in server/election.go, but do nothing meaningful with them for now.
/run-all-tests tidb=release-3.0
/run-all-tests tidb=release-3.0
@WangXiangUSTC PTAL again
rest LGTM
pkg/election/election.go
Outdated
	select {
	case <-time.After(newSessionRetryInterval):
	case <-ctx.Done():
		break
Do you mean to break out of the for loop? This break only breaks out of the select.
https://stackoverflow.com/questions/11104085/in-go-does-a-break-statement-break-from-a-switch-select
Good catch! Fixed in b99ae46.
/run-all-tests tidb=release-3.0
LGTM
…er-election # Conflicts: # tests/others_integration.txt
/run-all-tests tidb=release-3.0
/run-all-tests tidb=release-3.0
LGTM
What problem does this PR solve?
Add leader election support for DM-master, based on the embedded etcd.
What is changed and how it works?
Leader election is done on top of the embedded etcd.
Some code is adapted from https://github.com/pingcap/tidb/blob/v3.0.5/owner/manager.go.
Check List
Tests
Code changes
Side effects
Related changes