Skip to content
This repository has been archived by the owner on Nov 24, 2023. It is now read-only.

master: use etcd as operate queue #363

Closed
wants to merge 39 commits into from

Conversation

WangXiangUSTC
Copy link
Contributor

@WangXiangUSTC WangXiangUSTC commented Nov 18, 2019

What problem does this PR solve?

use etcd as operate queue:

What is changed and how it works?

  1. when dm-master receive operate request, save it to etcd, and watch the key for response
  2. dm-master watch the path /dm-operate. when getting events, will handle the request if the dm-master is leader.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test

Side effects

  • Breaking backward compatibility

Related changes

  • Need to be included in the release note

@WangXiangUSTC WangXiangUSTC added the status/WIP This PR is still work in progress label Nov 18, 2019
@WangXiangUSTC
Copy link
Contributor Author

/run-all-tests

@WangXiangUSTC
Copy link
Contributor Author

/run-all-tests tidb=release-3.0

@codecov
Copy link

codecov bot commented Nov 20, 2019

Codecov Report

Merging #363 into master will decrease coverage by 0.4939%.
The diff coverage is 37.6257%.

@@               Coverage Diff               @@
##             master       #363       +/-   ##
===============================================
- Coverage   57.5617%   57.0677%   -0.494%     
===============================================
  Files           163        164        +1     
  Lines         16471      16936      +465     
===============================================
+ Hits           9481       9665      +184     
- Misses         6065       6290      +225     
- Partials        925        981       +56

@WangXiangUSTC WangXiangUSTC added priority/important Major change, requires approval from ≥2 primary reviewers needs-update-release-note This PR should be added into release notes. Remove this label once the release notes are updated status/PTAL This PR is ready for review. Add this label back after committing new changes and removed status/WIP This PR is still work in progress labels Nov 20, 2019
@WangXiangUSTC
Copy link
Contributor Author

/run-all-tests tidb=release-3.0

@WangXiangUSTC
Copy link
Contributor Author

/run-all-tests tidb=release-3.0

@csuzhangxc csuzhangxc added the type/compatibility Backward compatibility will be broken after this PR merged label Nov 21, 2019

defaultOperatePath = "/dm-operate"

defaultEtcdTimeout = time.Duration(10 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also defined this in pkg/etcdutil/etcdutil.go. and and a CreateClient in it in #367 . we can merge then later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, if your pr merged first I will use your function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WangXiangUSTC It's time to update this defaultEtcdTimeout with the one in pkg/etcdutil/etcdutil.go now?

dm/master/etcd.go Outdated Show resolved Hide resolved
pkg/terror/error_list.go Show resolved Hide resolved
dm/proto/operate.proto Outdated Show resolved Hide resolved
)

func TestMaster(t *testing.T) {
etcdMockCluster := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

dm/master/server.go Outdated Show resolved Hide resolved
@WangXiangUSTC
Copy link
Contributor Author

/run-all-tests tidb=release-3.0

@WangXiangUSTC
Copy link
Contributor Author

/run-all-tests tidb=release-3.0

@WangXiangUSTC
Copy link
Contributor Author

/run-all-tests tidb=release-3.0

@WangXiangUSTC
Copy link
Contributor Author

/run-all-tests tidb=release-3.0

@WangXiangUSTC
Copy link
Contributor Author

/run-all-tests tidb=release-3.0

"github.com/pingcap/dm/dm/pb"
)

// NewShowLeaderCmd creates a CheckTask command
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update the comment (not CheckTask)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

function show_master_leader_success() {
run_dm_ctl $WORK_DIR "127.0.0.1:$MASTER_PORT" \
"show-master-leader" \
"\"result\": true" 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about checking any successful msg?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add in 20524e7

message ShowMasterLeaderRequest {
}

// If have leader, result is true, and put leader information in msg.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about defining the leader information as another message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add in 20524e7

}

enum OperateStage {
UnknownStage = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need another stage like New to differ with the default Unknown?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add in 20524e7

server.etcdClient = etcdClient
ctx, cancel := context.WithCancel(context.Background())
server.election = &election.Election{}
server.election.SetIsLeader(true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a comment to highlight the purpose here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated 0201e85

@csuzhangxc csuzhangxc added status/PTAL This PR is ready for review. Add this label back after committing new changes and removed status/LGT1 One reviewer already commented LGTM labels Dec 3, 2019
@@ -253,6 +267,21 @@ func errorCommonWorkerResponse(msg string, worker string) *pb.CommonWorkerRespon

// StartTask implements MasterServer.StartTask
func (s *Server) StartTask(ctx context.Context, req *pb.StartTaskRequest) (*pb.StartTaskResponse, error) {
log.L().Info("receive request and will save it to etcd", zap.Stringer("payload", req), zap.String("request", "StartTask"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove will save it to etcd, because we must to save it into etcd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed f0931fb

}, nil
}

_, leaderID, err := s.election.LeaderInfo(ctx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what‘s leaderID in DM?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it's better to add a function getDMLeaderMasterInfo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update in 20524e7

for {
select {
case <-ctx.Done():
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add some log to indicate that the goroutine exited, here or in defer function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add in b091a46


// only master leader need handle request
if s.election.IsLeader() {
go s.handleRequest(string(ev.Kv.Key), operate)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we manage these goroutines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is unnecessary, just abandon these requests when leader is exit. dmctl will get error, and user can execute command again later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I care about is correctness. If the leader migrates and the transaction processing has not ended, will there be unexpected consequences?

@WangXiangUSTC
Copy link
Contributor Author

/run-all-tests tidb=release-3.0

@csuzhangxc csuzhangxc mentioned this pull request Dec 6, 2019
23 tasks
@WangXiangUSTC WangXiangUSTC deleted the xiang/etcd_operate_queue branch January 9, 2020 06:58
@csuzhangxc csuzhangxc removed the needs-update-release-note This PR should be added into release notes. Remove this label once the release notes are updated label Jun 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
priority/important Major change, requires approval from ≥2 primary reviewers status/PTAL This PR is ready for review. Add this label back after committing new changes type/compatibility Backward compatibility will be broken after this PR merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants