Dynamic replication batch size #4301
@@ -52,6 +53,8 @@ var (
errUnknownQueueTask = errors.New("unknown task type")
errUnknownReplicationTask = errors.New("unknown replication task")
defaultHistoryPageSize = 1000
minReadTaskSize = 20
maxReplicationLatency = int64(40)
I suggest adding a time unit suffix (I guess Seconds) or, even better, using time.Duration instead. As of now it is not clear what 40 means.
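A minimal sketch of what this suggestion could look like. The helper `exceedsMaxLatency` is hypothetical and only illustrates how a typed constant reads at a call site; it is not taken from the PR.

```go
package main

import (
	"fmt"
	"time"
)

// maxReplicationLatency carries its unit in the type, so a reader cannot
// mistake it for milliseconds or a plain count.
const maxReplicationLatency = 40 * time.Second

// exceedsMaxLatency is a hypothetical helper: it reports whether the
// observed latency is strictly above the limit.
func exceedsMaxLatency(latency time.Duration) bool {
	return latency > maxReplicationLatency
}

func main() {
	fmt.Println(maxReplicationLatency)               // prints "40s"
	fmt.Println(exceedsMaxLatency(45 * time.Second)) // prints "true"
}
```

With a `time.Duration`, the unit is visible both in the declaration and when the value is logged.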
t.taskLock.Lock()
taskLatency := int64(time.Now().Sub(t.lastTaskCreationTime) / time.Second)
t.taskLock.Unlock()
Trying to understand how useful this lock is: it feels to me like having the lock here and not having it are the same. Could you explain a bit what it is achieving?
Are time assignments in Go not atomic, so that reading the value while it is being assigned somewhere else could cause a crash?
Most likely it will not be used. I added it to guard against concurrent calls to the method. I have replaced it with atomic.Value.
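For context on the question above: `time.Time` is a multi-word struct, so an unsynchronized read that overlaps a write is a data race in Go. The following is a rough sketch of the `atomic.Value` approach. The type `taskTracker` and its methods are hypothetical stand-ins, not the code in this PR.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// epoch is sample data used only by this sketch.
var epoch = time.Date(2021, 1, 1, 0, 0, 0, 0, time.UTC)

// taskTracker is a hypothetical stand-in for the struct touched by the PR.
type taskTracker struct {
	lastTaskCreationTime atomic.Value // holds a time.Time once set
}

// recordTask atomically publishes the creation time of the latest task.
func (t *taskTracker) recordTask(created time.Time) {
	t.lastTaskCreationTime.Store(created)
}

// latency returns how long ago the last task was created, or false if no
// task has been recorded yet. The value is loaded once so that the nil
// check and the type assertion see the same snapshot.
func (t *taskTracker) latency(now time.Time) (time.Duration, bool) {
	v := t.lastTaskCreationTime.Load()
	if v == nil {
		return 0, false
	}
	return now.Sub(v.(time.Time)), true
}

func main() {
	var t taskTracker
	_, ok := t.latency(epoch)
	fmt.Println(ok) // prints "false"

	t.recordTask(epoch)
	d, ok := t.latency(epoch.Add(5 * time.Second))
	fmt.Println(d, ok) // prints "5s true"
}
```

An `atomic.Value` must always be stored with the same concrete type, which is why the sketch only ever stores `time.Time`.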
…into dynamic-batch-size
…into dynamic-batch-size
LGTM. Feel free to land after addressing the comments.
common/dynamicconfig/constants.go
Outdated
// ReplicatorUpperLatency indicates the max allowed replication latency between clusters
// KeyName: history.replicatorUpperLatencyInSeconds
// Value type: Duration
// Default value: 40
nit: 40 * time.Second
rateLimiter: rateLimiter,
retryPolicy: retryPolicy,
lastTaskCreationTime: atomic.Value{},
maxAllowedLatencyFn: config.ReplicatorUpperLatencyInSeconds,
config.ReplicatorUpperLatencyInSeconds is no longer defined, I think.
if t.lastTaskCreationTime.Load() == nil {
return defaultBatchSize
}
taskLatency := now.Sub(t.lastTaskCreationTime.Load().(time.Time)) / time.Second
We don't need the / time.Second here, I think.
Yes. Updated
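To illustrate why the division matters once the limit is a `time.Duration`: dividing a `Duration` by `time.Second` still yields a `Duration`, now holding a tiny nanosecond count. The sketch below uses hypothetical helper names and sample values chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Sample values for this sketch only.
const (
	sampleLatency = 90 * time.Second
	sampleLimit   = 40 * time.Second
)

// exceedsScaled mirrors the pattern from the diff: the latency is divided
// by time.Second before being compared against a Duration limit.
func exceedsScaled(latency, limit time.Duration) bool {
	return latency/time.Second > limit
}

// exceeds compares the two durations directly.
func exceeds(latency, limit time.Duration) bool {
	return latency > limit
}

func main() {
	fmt.Println(sampleLatency / time.Second)               // prints "90ns"
	fmt.Println(exceedsScaled(sampleLatency, sampleLimit)) // prints "false"
	fmt.Println(exceeds(sampleLatency, sampleLimit))       // prints "true"
}
```

The scaled comparison does not fail loudly; it simply never fires for realistic latencies, which makes it easy to miss.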
What changed?
Dynamic replication batch size
Why?
The replication batch size should be calculated based on the backlog in a particular shard. This could help with hot shards
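One way such a calculation could work, as a rough sketch: use the replication latency as a proxy for the backlog and scale the batch size linearly between a minimum and a maximum. The function `batchSize`, the linear scaling, and the sample bounds are assumptions made for illustration; the actual formula used by this PR is not shown in the thread.

```go
package main

import (
	"fmt"
	"time"
)

// maxAllowedLatency is a sample limit used only by this sketch.
const maxAllowedLatency = 40 * time.Second

// batchSize is a hypothetical helper. It returns minSize when there is no
// measurable latency, maxSize when the latency reaches or exceeds the
// limit, and a linearly interpolated value in between.
func batchSize(latency, limit time.Duration, minSize, maxSize int) int {
	if limit <= 0 || latency >= limit {
		return maxSize
	}
	if latency <= 0 {
		return minSize
	}
	ratio := float64(latency) / float64(limit)
	return minSize + int(ratio*float64(maxSize-minSize))
}

func main() {
	fmt.Println(batchSize(0, maxAllowedLatency, 20, 1000))              // prints "20"
	fmt.Println(batchSize(20*time.Second, maxAllowedLatency, 20, 1000)) // prints "510"
	fmt.Println(batchSize(60*time.Second, maxAllowedLatency, 20, 1000)) // prints "1000"
}
```

Under this scheme a shard that is keeping up reads small batches, while a hot shard that has fallen behind reads progressively larger ones.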
How did you test it?
Unit tests
TODO: bench tests
Potential risks
Release notes
Documentation Changes