-
Notifications
You must be signed in to change notification settings - Fork 345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Share buffer pool across all partitions #310
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, the change LGTM+1, just only some concerns whether the use of sync.pool here will cause GC pressure because sync.Pool
cannot specify a size, which is only subject to the GC threshold.
@@ -62,6 +63,8 @@ func newProducerCommand() *cobra.Command { | |||
"Publish rate. Set to 0 to go unthrottled") | |||
flags.IntVarP(&produceArgs.BatchingTimeMillis, "batching-time", "b", 1, | |||
"Batching grouping time in millis") | |||
flags.IntVarP(&produceArgs.BatchingMaxSize, "batching-max-size", "", 128, | |||
"Max size of a batch (in KB)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In pulsar-perf produce
about --batch-max-messages
is abbreviated and the default value is as follows, can we consider keeping the two consistent?
-bm, --batch-max-messages
Maximum number of messages per batch
Default: 1000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This are 2 different settings. One is the number of messages in 1 batch and the other is the max size of the batch
Using the pool will not causing GC pressure itself: it's actually there to avoid the GC pressure. The pool having no max size is not a big problem. The max amount of memory is still determined by the "pending" messages whose payloads are buffered until they get acknowledged by the broker. This change is to use a single pool to avoid that each pool will have a few buffers that cannot be immediately reused. After this change, the memory usage is the same as without the pooling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1
Motivation
When a producer is publishing on many partitions, there can be significant memory overhead in maintaining a per-partition pool. Instead, there's not significant perf impact in using a single shared buffer pool.