[improve][client][PIP-389] Add a producer config to improve compression performance #23525
Please add the PIP number to the PR title, as we usually do.
pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ProducerConfigurationData.java
@liangyepianzhou Regarding performance optimizations for compression in Pulsar, there is other work that should be done as well. See pulsar/pulsar-common/src/main/java/org/apache/pulsar/common/compression/CompressionCodecZLib.java, lines 60 to 85 in 82237d3.
Another detail is that the current implementation isn't using the "zero copy" approaches that are available. For example, in Snappy: lines 34 to 64 in 82237d3.
In BookKeeper, I added zero-copy for calculating checksums in apache/bookkeeper#4196. The ByteBufVisitor approach could be used to avoid copying source buffers to an extra nio buffer. Calling Netty's io.netty.buffer.CompositeByteBuf#nioBuffer will allocate a new nio ByteBuffer in the heap and copy the content there. That's not great from a performance perspective, especially when we want to reduce allocations and garbage. With the ByteBufVisitor approach, it's possible to read the source direct byte buffers without extra copies. Have you considered addressing this performance issue in the Pulsar message compression solution?
Sounds good. Maybe I can try optimizing it in other PRs.
+1. In the Pulsar code base, we have a special module called
Thanks for the reminder.
@lhotari I tried to optimize it yesterday, but I found that Pulsar did not use
@liangyepianzhou I created #23586 to clarify the possible optimization.
Please use Pulsar code style. IDE instructions: https://pulsar.apache.org/contribute/setup-ide/#configure-code-style
PIP: #23526
Motivation
The motivation of this PIP is to provide a way to improve the compression performance by skipping the compression of small messages.
We want to add a new configuration compressMinMsgBodySize to the producer configuration.
This configuration will allow the user to set the minimum size of the message body that will be compressed.
If the message body size is less than the compressMinMsgBodySize, the message will not be compressed.
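The rule described above can be sketched roughly as follows. This is only an illustration of the intended behavior, not the code in this PR: the class `CompressMinSizeSketch`, the `Result` holder, and the `maybeCompress` method are hypothetical names, and `java.util.zip.Deflater` stands in for whichever Pulsar compression codec the producer is configured with. Only the name compressMinMsgBodySize comes from the PIP.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

public class CompressMinSizeSketch {

    /** Hypothetical holder: the bytes to send and whether they were compressed. */
    static final class Result {
        final byte[] payload;
        final boolean compressed;

        Result(byte[] payload, boolean compressed) {
            this.payload = payload;
            this.compressed = compressed;
        }
    }

    /**
     * Skips compression when the body is smaller than compressMinMsgBodySize;
     * otherwise compresses with Deflater (an assumption, used only for illustration).
     */
    static Result maybeCompress(byte[] body, int compressMinMsgBodySize) {
        if (body.length < compressMinMsgBodySize) {
            // Small message: send as-is and mark it as uncompressed.
            return new Result(body, false);
        }
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(body);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream(body.length);
            byte[] chunk = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk);
                out.write(chunk, 0, n);
            }
            return new Result(out.toByteArray(), true);
        } finally {
            // Release the native resources held by the deflater.
            deflater.end();
        }
    }

    public static void main(String[] args) {
        int threshold = 1024; // hypothetical value for compressMinMsgBodySize
        Result small = maybeCompress(new byte[16], threshold);
        Result large = maybeCompress(new byte[64 * 1024], threshold);
        System.out.println("small compressed: " + small.compressed);
        System.out.println("large compressed: " + large.compressed);
    }
}
```

In a sketch like this, the receiving side needs to know per message whether compression was applied, which is why the result carries an explicit `compressed` flag rather than relying on the producer-level compression setting alone.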
Verifying this change
(Please pick either of the following options)
This change is a trivial rework / code cleanup without any test coverage.
(or)
This change is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: