Support for compressed batches of messages #63
I'm not 100% sure of the level of maintenance at this point. Your feature sounds very interesting though. If you end up at a dead end and end up forking, please let me know. My vote would be a single PR so others could pull it into their forks if the maintenance here isn't where we need it.
@benwh @GLStephen Thank you for your comments. I'm sorry for my late reply. I have looked at the pull requests presented. Is it possible to split the pull requests so that the changes are easy to understand?
I added a compression/decompression feature in v1.5.0.
@mia-0032 Thanks for your work on this! Apologies that I haven't had the time to split up the proposed features and PR them. For our use case, having decompression enabled 'globally' at the subscription level isn't feasible, as we have some producers which can compress and others that cannot, hence we rely on an attribute in the Pub/Sub message to mark the message as compressed. Additionally, we've found that batching messages provides much of the benefit when compressing. If I were to rebase and implement the features listed above, would you be willing to merge these in? The only potential downside is that it adds a lot of complexity compared to the current implementation.
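For illustration, a minimal Ruby sketch of the attribute-based decompression described above might look like the following; the `compressed` attribute key, the newline-delimited JSON framing, and the `unpack` helper are hypothetical, not taken from the PR:

```ruby
require "json"
require "zlib"

# A minimal sketch of per-message, attribute-driven decompression on the
# consumer side. The "compressed" attribute key is an assumption for
# illustration only. Messages without the attribute are passed through
# unchanged, which is what lets compressing and non-compressing producers
# share a single subscription.
def unpack(data, attributes)
  if attributes["compressed"] == "true"
    # Inflate the batch and split it back into individual records.
    Zlib::Inflate.inflate(data).split("\n").map { |line| JSON.parse(line) }
  else
    [JSON.parse(data)]
  end
end
```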
@benwh Thank you for your reply. I understand that global compression and decompression is inconvenient in some use cases. I also think the current compression feature can't support migrating from no compression to compression, so I feel it is a good solution to use an attribute to determine whether a message was compressed. I don't have an environment that publishes much data, so I can't verify the benefits of batch publishing.
@mia-0032 This is something I've taken over from @benwh for the time being. For about 800 million records per day, we spend on average $300/day, which adds up to a decent sum over the course of a month/year. I can't give you access to our Grafana dashboards showing this, but based on that, I'd like to rebase our changes on top of yours and try to integrate them upstream.
Hi @mia-0032
Firstly, thanks for your work on this plugin - we've been using it at GoCardless with great success for some time now.
We've recently developed a new feature for this plugin that batches as many messages up as it can within the `flush_interval`, then compresses these with Zlib and sends them via a single Pub/Sub message, with an attribute set on it to indicate that it contains a compressed batch of messages. The implementation for this can be found in this PR.
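For illustration, a minimal Ruby sketch of the batching and compression step described above might look like the following; the `compressed` attribute key, the newline-delimited JSON framing, and the `build_compressed_batch` helper are hypothetical, not the code in the PR itself:

```ruby
require "json"
require "zlib"

# A minimal sketch of the publisher side: a set of already-formatted records
# is joined into one newline-delimited payload, deflated with Zlib, and
# tagged with an attribute marking it as a compressed batch. The attribute
# key and framing are assumptions for illustration only.
def build_compressed_batch(records)
  payload = records.map { |r| JSON.generate(r) }.join("\n")
  {
    data: Zlib::Deflate.deflate(payload),
    attributes: { "compressed" => "true" }
  }
end

# Example: three log records become a single Pub/Sub message body.
batch = build_compressed_batch([
  { "host" => "web-1", "msg" => "GET /" },
  { "host" => "web-2", "msg" => "POST /payments" },
  { "host" => "web-3", "msg" => "GET /healthz" }
])
puts "compressed payload: #{batch[:data].bytesize} bytes"
```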
Given that we can easily achieve a 90% compression ratio, and that we're sending upwards of 10TiB of logs over Pub/Sub each month, this results in significant cost savings while still maintaining decoupling between the sender and receiver. We've been running this code for a couple of weeks now without any issues.
We'd like to contribute this piece of work upstream to your repository. But before we do that, I wanted to check with you whether this is something that you'd accept, and what's the easiest way of us doing this for you? The PR consists of a few main pieces:
- The `compress_batches` feature (this also adds some more Prometheus metrics, related to compression)

For us, it would be ideal if we could get all of these changes merged back into your upstream, to make any future features easy to develop without having a large divergence of the fork.
Which of these commits would you be willing for us to submit as a PR, and in what format: one PR per commit, or one PR with all of the commits? Also, would there be any changes you'd like up-front before we open a PR?
Thanks!