Pubsub: Consider a blocking publisher to prevent OOMing #16
Comments
@evanj, thanks for raising the issue, and the clear explanation. @plamut can you please check this out? The only thing we can likely tackle in the near term is the extra copies. We'd need to reevaluate the current non-blocking approach, which is the default across all clients. It would be worthwhile to explore a blocking option in the long term. @kamalaboulhosn we'll need you to make the call about this feature.
@sduskis you have unassigned @plamut. @kamalaboulhosn: What was the determination about creating a back-pressure mechanism?
I think it's something we'll still need to discuss. We had considered it at some point. We also considered exposing the stats for the outstanding requests so that the user could make the decision about what they wanted to do when there were a lot of outstanding sends.
FWIW, I believe the official Go client will return an error when the number of buffered bytes exceeds a threshold, by default 10 * maxMessageLength: https://github.com/googleapis/google-cloud-go/blob/master/pubsub/topic.go#L103
@evanj I have an update on this: #96 will add the desired functionality to the publisher client. It will allow configuring the limits for in-flight messages being published, and the action to take if these limits are exceeded. The default action is IGNORE, but one will also be able to specify BLOCK or ERROR.
Amazing! Thank you!
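For reference, this capability is exposed in recent clients as publisher flow-control settings. A minimal sketch, assuming a google-cloud-pubsub 2.x client (option and enum names are from that release line and may differ in other versions):

```python
# Sketch of configuring publisher flow control, assuming google-cloud-pubsub 2.x.
from google.cloud import pubsub_v1
from google.cloud.pubsub_v1.types import (
    LimitExceededBehavior,
    PublisherOptions,
    PublishFlowControl,
)

flow_control = PublishFlowControl(
    message_limit=1000,                                    # max in-flight messages
    byte_limit=100 * 1024 * 1024,                          # max in-flight bytes
    limit_exceeded_behavior=LimitExceededBehavior.BLOCK,   # or IGNORE / ERROR
)
publisher = pubsub_v1.PublisherClient(
    publisher_options=PublisherOptions(flow_control=flow_control)
)
```

With BLOCK, publish() waits until in-flight messages drop below the limits; with ERROR, it raises instead of buffering further.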
Publishing medium-sized (32 kiB) messages in a loop causes the process to quickly run out of memory. I set a 2 GiB memory limit, and the process exceeds it after publishing about 20000 messages. I believe there are two issues:
As far as I can see, there is no good way to use the current client library to publish large batches of messages without huge memory consumption. I believe this can be fixed in two ways:
Reduce memory of the current library:
- Release messages when the batch is done: In `thread.Batch._commit`, set `self._messages` to `None` after it is published. This seems to make memory consumption better, but I didn't test it super carefully.
- Eliminate an extra copy: Change `thread.Batch` to build a `PubsubRequest` and pass that into the `gapic.PublisherClient`, rather than a list of messages that then need to be copied. This removes one copy of the messages, which seems to improve memory consumption (and possibly make things faster? Again, not tested carefully).
Back Pressure
Change the publisher to keep a queue of batches, and block after there are more than N in flight at once.
I implemented a client that takes the blocking queue approach. It can publish this workload without ever exceeding 300 MiB of memory. Only 3 in-flight batches were needed for the process to be CPU-bound when running in Google Cloud, so the queue doesn't even need to be very large.
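The following is a minimal sketch of the back-pressure idea, not the implementation referenced above. It bounds in-flight messages (rather than batches) with a semaphore around the standard publisher; the class name and limit are illustrative:

```python
# Hypothetical wrapper that blocks publish() once max_in_flight messages
# are outstanding, releasing a permit when each publish future completes.
import threading

from google.cloud import pubsub_v1


class BlockingPublisher:
    def __init__(self, topic_path, max_in_flight=3):
        self._publisher = pubsub_v1.PublisherClient()
        self._topic_path = topic_path
        # Permits bound the number of unresolved publish futures.
        self._permits = threading.BoundedSemaphore(max_in_flight)

    def publish(self, data, **attrs):
        self._permits.acquire()  # block while too many publishes are outstanding
        try:
            future = self._publisher.publish(self._topic_path, data, **attrs)
        except Exception:
            self._permits.release()
            raise
        future.add_done_callback(lambda _: self._permits.release())
        return future
```

The same pattern could instead count batches or buffered bytes; the key point is that the producer blocks instead of buffering without limit.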
Environment details
OS: Linux, ContainerOS (GKE), Container is Debian9 (using distroless)
Python: 3.5.3
API: google-cloud-python 0.41.0
Steps to reproduce
Code example
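The original snippet was not preserved in this copy of the issue; the following is an illustrative reconstruction based on the description above (32 kiB messages published in a loop with no back-pressure), with placeholder project and topic IDs:

```python
# Illustrative reproduction: publish 32 kiB messages in a tight loop without
# waiting on the returned futures, so buffered messages accumulate in memory.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "my-topic")  # placeholder IDs
payload = b"x" * (32 * 1024)  # 32 kiB message body

for _ in range(100000):
    # Nothing applies back-pressure here, so memory grows until the process
    # hits its limit (roughly 20000 messages under a 2 GiB cap).
    publisher.publish(topic_path, payload)
```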