Consume performance drops when EnableBatchIndexAcknowledgment = true #949
I think the root cause is the lack of the ACK grouping tracker feature in the Go client. Here is a similar issue and fix for the C++ client: apache/pulsar#6534
It might take some time; I will start the feature catch-up next week. Assign this issue to me first. If someone else is interested, please ping me in this issue.
Ok, thanks. Looking forward to the new PR.
@BewareMyPower Hi, is there any progress?
It's almost complete except for a few failing tests: BewareMyPower#1. I also found a problem: the Go client only supports synchronous ACK APIs, so if you enable ACK with response, the ACK grouping tracker won't work.
OK, we have disabled the AckWithResponse option now, so that's no problem.
**Support grouping ACK requests by time and size** (Fixes #949)

### Motivation

Currently the Go client does not support grouping ACK requests, so each time `Ack` (or a similar API) is called, an ACK request is sent, which can degrade performance. We need to support configuring the time and size for caching `MessageID`s before sending ACK requests.

### Modifications

- Add an `AckGroupingOptions` field to `ConsumerOptions`; when it's nil, use 100 ms as the max time and 1000 as the max size.
- Add an `ackGroupingTracker` interface to support grouping ACK requests.
- When `AckWithResponse` is false, add the `MessageID` instance to the tracker instead of sending the request to `eventsCh`.

### Verifying this change

- [ ] Make sure that the change passes the CI checks.

This change added tests and can be verified as follows:

- Added `ack_grouping_tracker_test.go` to verify `ackGroupingTracker` in various cases.
- The consumer-side change is covered by existing tests because the default `AckGroupingOptions` config is `{ MaxSize: 1000, MaxTime: 100*time.Millisecond }`.

Follow-up commits:

- Fix flushAndClean race
- Use unbuffered channel for flush operations
- Apply different AckGroupingOptions and expose this config
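Based on the fields described in the PR above, wiring this up on the consumer side would look roughly like the following sketch (the broker URL, topic, and subscription names are placeholders, and exact field names may vary by client version):

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/apache/pulsar-client-go/pulsar"
)

func main() {
	client, err := pulsar.NewClient(pulsar.ClientOptions{URL: "pulsar://localhost:6650"})
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	consumer, err := client.Subscribe(pulsar.ConsumerOptions{
		Topic:                          "my-topic",
		SubscriptionName:               "my-sub",
		EnableBatchIndexAcknowledgment: true,
		// Grouping only applies when ACK-with-response is disabled,
		// as noted in the discussion above.
		AckWithResponse: false,
		// When nil, the defaults described above are used:
		// MaxSize: 1000, MaxTime: 100ms.
		AckGroupingOptions: &pulsar.AckGroupingOptions{
			MaxSize: 1000,
			MaxTime: 100 * time.Millisecond,
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	defer consumer.Close()

	msg, err := consumer.Receive(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	// With grouping enabled, this ACK is cached by the tracker and
	// flushed together with others instead of being sent immediately.
	consumer.Ack(msg)
}
```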
Let's reopen this issue for further discussion under #957. I tested acknowledging by list in my local branch; it does not make a significant difference (about 48 Mbps to 54 Mbps). I'm going to investigate more this week.
OK.
Updated: in my latest code, I changed the grouping config:

```bash
./perf consume --profile \
    --enable-batch-index-ack \
    --ack-group-max-ms 100 \
    --ack-group-max-size 10000000 \
    my-topic
```

And now the consumer can catch up with the producer in my local env, though the produce rate is only about 20 MB/s.
I won't push the PR at the moment because I think there is something wrong with the ACK grouping tracker implementation. Ideally, we should not have to configure such a large value for `--ack-group-max-size`.
The new PR is merged; I will run a test.
@BewareMyPower I ran a test today.
A lot of effort has gone into improving the enable_batch_ack scenario; could you also take a look at the disable_batch_ack scenario? Usually, I think …
The performance test results are close between these two cases in my local env. BTW, this issue has been resolved, so you can open a new issue for that. Please describe how you tested.
OK, I will do more tests in my local env.
According to PR #938, when using the master version (v0.9.1-0.20230117072740-d9b18d0690c1) to consume messages with `EnableBatchIndexAcknowledgment` set to true, consume performance drops to 2/3 of what it was before.
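For reference, that exact master commit can be pinned as a module dependency with its pseudo-version (taken from the report above): `go get github.com/apache/pulsar-client-go@v0.9.1-0.20230117072740-d9b18d0690c1`.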
The test setup is as follows: the topic has 5 partitions, and the producer's rate is 20 MB/s (300,000 rows/s). Consumer-side consumption results: [screenshots]
Analyzing the problem with pprof, we found that `internal.(*connection).internalSendRequest` and `pulsar.(*partitionConsumer).internalAck` are much more resource-intensive when `EnableBatchIndexAcknowledgment` is set to true.
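For anyone reproducing this analysis, a common way to expose profiles from a running Go consumer is the standard `net/http/pprof` endpoint (a minimal sketch; the port is an arbitrary choice):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
	// Expose the profiling endpoints alongside the consumer under test.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()
	// ... run the consume loop here ...
	select {} // block forever in this sketch
}
```

A 30-second CPU profile can then be collected with `go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30`.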
Reviewing the code, the problem may be that `partitionConsumer` sends an ACK request to the Pulsar server for every `MessageID`, without waiting for all messages of a batch to be acknowledged by the `ackTracker`. This produces far more ACK requests than when batch index ACK is disabled, and performance drops because of the extra request processing. Meanwhile, the backlog keeps growing and cannot catch up with the production rate.
So `EnableBatchIndexAcknowledgment` should either follow the previous processing method, or another approach is needed (a grouping tracker, as sketched below).
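For illustration, here is a minimal sketch of the grouping idea (hypothetical types and names, not the client's actual `ackGroupingTracker`): cache message IDs and flush them as a single request when either a size threshold or a time threshold is hit.

```go
package main

import (
	"fmt"
	"time"
)

// messageID is a stand-in for the client's MessageID type.
type messageID struct {
	ledgerID, entryID int64
	batchIdx          int32
}

// groupingTracker caches IDs and flushes them as one ACK request when
// either the size threshold or the time threshold is reached.
type groupingTracker struct {
	maxSize int
	maxTime time.Duration
	ids     chan messageID
	flush   func(batch []messageID) // stands in for sending one ACK request
}

func (t *groupingTracker) run() {
	ticker := time.NewTicker(t.maxTime)
	defer ticker.Stop()
	batch := make([]messageID, 0, t.maxSize)
	doFlush := func() {
		if len(batch) > 0 {
			t.flush(batch)
			batch = make([]messageID, 0, t.maxSize)
		}
	}
	for {
		select {
		case id, ok := <-t.ids:
			if !ok { // channel closed: flush the remainder and stop
				doFlush()
				return
			}
			batch = append(batch, id)
			if len(batch) >= t.maxSize {
				doFlush()
			}
		case <-ticker.C:
			doFlush()
		}
	}
}

func main() {
	t := &groupingTracker{
		maxSize: 1000,
		maxTime: 100 * time.Millisecond,
		ids:     make(chan messageID, 1024),
		flush: func(batch []messageID) {
			fmt.Printf("one ACK request for %d message IDs\n", len(batch))
		},
	}
	go t.run()
	for i := 0; i < 2500; i++ {
		t.ids <- messageID{entryID: int64(i)}
	}
	close(t.ids)
	time.Sleep(200 * time.Millisecond) // give the final flush time to run
}
```

With a max size of 1000 and 2500 IDs pushed, this sketch flushes twice on the size threshold and once more on shutdown; the ticker covers the low-traffic case so no ACK waits longer than the max time.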
@BewareMyPower Thank you for developing this feature. Could you take a look at this problem? Thanks a lot!