#38 - Support zlib compression #39
Hi @rahul007ashok, This is very interesting. I'm checking it out.
Hi @riywo, I was wondering if you have any questions or suggestions. Thanks
Hi @rahul007ashok, Could you confirm this contribution is released under Apache 2.0? Best
Hi @riywo
Thank you for confirmation!
@@ -220,7 +222,11 @@ def get_key(name, record)
       end

       def build_data_to_put(data)
-        Hash[data.map{|k, v| [k.to_sym, v] }]
+        if @gzip_compression
+          Hash[data.map{|k, v| [k.to_sym, k=="data" ? Zlib::Deflate.deflate(v) : v] }]
In the write method, this plugin checks whether the size of a record exceeds the 1 MB limit. Deflating after that check will therefore cause trouble.
Hi @rahul007ashok, I've checked it and added a comment. Could you modify it to deflate before checking the size of records? Also, could you add the description for Best,
Hi @riywo, Added documentation to the README and also modified the write method to deflate before checking the size of records. Thanks,
@rahul007ashok Thanks for the update. Let me check... Could you rebase your changes into a single commit, please?
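The ordering requested in the review (deflate first, then check the size) can be sketched roughly as below. This is a minimal illustration, not the plugin's actual write method: the helper name `build_checked_record`, the constant `MAX_RECORD_BYTES`, and the choice to measure only the "data" field are assumptions made for the example.

```ruby
require "zlib"

# Assumed limit: the 1 MB per-record cap mentioned in the review comment.
MAX_RECORD_BYTES = 1024 * 1024

# Hypothetical helper (not part of the plugin): compresses the "data" field
# first, then applies the size check to the bytes that would actually be sent.
# Returns the record with symbol keys, or nil if it is still too large.
def build_checked_record(data, gzip_compression)
  payload = data["data"]
  payload = Zlib::Deflate.deflate(payload) if gzip_compression
  return nil if payload.bytesize > MAX_RECORD_BYTES

  Hash[data.map { |k, v| [k.to_sym, k == "data" ? payload : v] }]
end
```

With this ordering, a highly compressible record that is larger than the limit in raw form can still pass the check once compression is enabled, whereas checking first would reject it.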
6048d7c to a9577f4
@riywo
a9577f4 to 8cfdc25
8cfdc25 to 0f9cb69
@riywo
Thank you for the PR! I will release the new version later.
Don't really know anything about fluentd plugins, but why isn't this done in the format method instead? Then we save space buffering on disk as well (and waste less memory during processing).
The format method returns a MessagePack serialized array. MessagePack is not compatible with zlib/gzip (or other compression methods). So it's not possible for the format method to do the compression (although this would be better).
But the format method can also return a raw string, AFAIK. Would make it somewhat harder to do, but not impossible? E.g. the first 4 bytes of each record hold the length of the record. Or you can get fancy and use some encoding like protobuf (i.e. anything which allows concatenation of binary outputs to make a valid sequence).
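The length-prefix idea from the comment above could look something like the following sketch. It only illustrates the framing scheme (a 4-byte big-endian length followed by the deflated bytes) using Ruby's standard library; the helper names `frame` and `each_frame` are hypothetical, and nothing here is taken from the plugin or from Fluentd's buffer API.

```ruby
require "zlib"

# Hypothetical helper: deflate one payload and prefix it with its
# compressed length as a 4-byte big-endian unsigned integer.
def frame(payload)
  compressed = Zlib::Deflate.deflate(payload)
  [compressed.bytesize].pack("N") + compressed
end

# Hypothetical helper: walk a buffer of concatenated frames and yield
# each original payload after inflating it.
def each_frame(buffer)
  return enum_for(:each_frame, buffer) unless block_given?

  offset = 0
  while offset < buffer.bytesize
    length = buffer.byteslice(offset, 4).unpack1("N")
    offset += 4
    yield Zlib::Inflate.inflate(buffer.byteslice(offset, length))
    offset += length
  end
end

# Frames can simply be concatenated, which is the property the comment asks for.
chunk = frame("first record") + frame("second record")
each_frame(chunk) { |record| puts record }
```

Because every frame carries its own length, concatenating the output of several calls yields a buffer that can still be split back into records without any outer container.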
Hmm, also, why doesn't MessagePack 'support zlib/gzip'? Can't it just store the zipped field in the binary format? (ok, I should try this myself)
v0.4.0 has also been released on RubyGems. Thanks!
Added Gzip support for Fluent plugin kinesis