Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

async rpc in put message to cluster #24

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

DoraALin
Copy link
Collaborator

@DoraALin DoraALin commented May 9, 2017

Signed-off-by: 元守 lulin@youzan.com
change rpc call in async way in PutMessageToCluster & PutMessagesToCluster
async in put messages to cluster

@@ -222,9 +224,10 @@ func (self *NsqdCoordinator) PutMessagesToCluster(topic *nsqd.Topic,
}
return false
}
clusterErr := self.doSyncOpToCluster(true, coord, doLocalWrite, doLocalExit, doLocalCommit, doLocalRollback,
clusterErr := self.doSyncOpToClusterAsync(true, coord, doLocalWrite, doLocalExit, doLocalCommit, doLocalRollback,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No async for cluster write, async is only for slave replication.

@@ -116,8 +117,8 @@ func (self *NsqdCoordinator) PutMessageToCluster(topic *nsqd.Topic,
return false
}

clusterErr := self.doSyncOpToCluster(true, coord, doLocalWrite, doLocalExit, doLocalCommit, doLocalRollback,
doRefresh, doSlaveSync, handleSyncResult)
clusterErr := self.doSyncOpToClusterAsync(true, coord, doLocalWrite, doLocalExit, doLocalCommit, doLocalRollback,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is No async for cluster write, async is only for slave replication.

if rpcErr == nil {
wg.Add(1)
id := nodeID
go func(){
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think no goroutinues should be used here, just wait on the slave result channel is enough

syncStart = time.Now()
}
select {
case <-time.After(c.c.RequestTimeout):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use only one timeout for all slaves. Once timeout cancel all the slave replication and retry or just fail.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need onr timeout for all? My thought is that if all async rpc calls need wait, it only degrade to sync rpc call in time eclapse.

@DoraALin DoraALin force-pushed the async-in-cluster-rpc-call branch from 2839dfc to b51d1f1 Compare May 15, 2017 10:19
@DoraALin
Copy link
Collaborator Author

@absolute8511 for your review again.

@DoraALin DoraALin force-pushed the async-in-cluster-rpc-call branch from b51d1f1 to 84a6757 Compare May 16, 2017 03:38
@DoraALin
Copy link
Collaborator Author

@absolute8511 for your review

Signed-off-by: 元守 <lulin@youzan.com>

async in put messages to cluster

Signed-off-by: 元守 <lulin@youzan.com>

remove debug log

Signed-off-by: 元守 <lulin@youzan.com>

one doSyncOpToCluster

Signed-off-by: 元守 <lulin@youzan.com>
@DoraALin DoraALin force-pushed the async-in-cluster-rpc-call branch from 84a6757 to 32a31e0 Compare May 17, 2017 03:10
@DoraALin
Copy link
Collaborator Author

@absolute8511 for your review. async rpc response wait incorporated in SlaveWriteResult.GetResult()

@@ -318,15 +320,12 @@ retrysync:
}
}
success = 0
failedNodes = make(map[string]struct{})
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removing this? While we retry we need reset the failed info for all nodes

var start time.Time
if checkCost {
start = time.Now()
}
// should retry if failed, and the slave should keep the last success write to avoid the duplicated
rpcErr = doSlaveSync(c, nodeID, tcData)
retSlave := doSlaveSync(c, nodeID, tcData)
rpcResps = append(rpcResps, RpcResp{nodeID, retSlave})
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rpcResps and totalTimeout should be reset while retry.

@absolute8511
Copy link
Owner

@DoraALin Could you please compare the detail benchmark before and after this changes. (both for benchmark test and the real server test.)

@DoraALin
Copy link
Collaborator Author

@absolute8511 benchmark shows async rpc call does NOT fatser than replica synchronization. I will try with server test.

Sync RPC call
10000 211161 ns/op 0.61 MB/s
--- BENCH: BenchmarkNsqdCoordPub1Replicator128-4
10000 279250 ns/op 0.46 MB/s
--- BENCH: BenchmarkNsqdCoordPub2Replicator128-4
10000 297141 ns/op 0.43 MB/s
--- BENCH: BenchmarkNsqdCoordPub3Replicator128-4
5000 251618 ns/op 4.07 MB/s
--- BENCH: BenchmarkNsqdCoordPub1Replicator1024-4
10000 331672 ns/op 3.09 MB/s
--- BENCH: BenchmarkNsqdCoordPub2Replicator1024-4
5000 307006 ns/op 3.34 MB/s
--- BENCH: BenchmarkNsqdCoordPub3Replicator1024-4
3000 380150 ns/op 0.34 MB/s
--- BENCH: BenchmarkNsqdCoordPub5Replicator128-4
5000 427857 ns/op 2.39 MB/s
--- BENCH: BenchmarkNsqdCoordPub5Replicator1024-4
5000 325689 ns/op 0.39 MB/s
--- BENCH: BenchmarkNsqdCoordPub4Replicator128-4
5000 344935 ns/op 2.97 MB/s
--- BENCH: BenchmarkNsqdCoordPub4Replicator1024-4

Async RPC call
PASS
10000 223244 ns/op 0.57 MB/s
--- BENCH: BenchmarkNsqdCoordPub1Replicator128-4
5000 344346 ns/op 0.37 MB/s
--- BENCH: BenchmarkNsqdCoordPub2Replicator128-4
5000 350341 ns/op 0.37 MB/s
--- BENCH: BenchmarkNsqdCoordPub3Replicator128-4
5000 205962 ns/op 4.97 MB/s
--- BENCH: BenchmarkNsqdCoordPub1Replicator1024-4
5000 322390 ns/op 3.18 MB/s
--- BENCH: BenchmarkNsqdCoordPub2Replicator1024-4
5000 457972 ns/op 2.24 MB/s
--- BENCH: BenchmarkNsqdCoordPub3Replicator1024-4
3000 441307 ns/op 0.29 MB/s
--- BENCH: BenchmarkNsqdCoordPub5Replicator128-4
3000 392285 ns/op 2.61 MB/s
--- BENCH: BenchmarkNsqdCoordPub5Replicator1024-4
5000 361425 ns/op 0.35 MB/s
--- BENCH: BenchmarkNsqdCoordPub4Replicator128-4
3000 377754 ns/op 2.71 MB/s
--- BENCH: BenchmarkNsqdCoordPub4Replicator1024-4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants