-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
out_kinesis_firehose: compression extraction and firehose integration -> master #4371
Conversation
src/aws/compression/arrow/compress.c
Outdated
@@ -0,0 +1,147 @@ | |||
/* | |||
* This converts S3 plugin's request buffer into Apache Arrow format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is moved from S3.
@@ -0,0 +1,7 @@ | |||
set(src | |||
compress.c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modified from S3
@@ -0,0 +1,30 @@ | |||
add_subdirectory(compression) | |||
|
|||
set(src |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extract AWS specific source code to aws target
|
||
/* | ||
* Compression options configuration lifted from plugins into aws core | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, to add a new compression option, just add on a struct to this list referencing the id, keyword, and compression function. No changes (to s3 or firehose) are required.
@@ -427,6 +438,14 @@ static struct flb_config_map config_map[] = { | |||
"Custom endpoint for the STS API." | |||
}, | |||
|
|||
{ | |||
FLB_CONFIG_MAP_STR, "compression", NULL, | |||
0, FLB_FALSE, 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please feel free to let me know if this text should be more or less descriptive. Mostly just copied over from S3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking about if we need to update the description. 'gzip' is not the only supported value if we enable arrow at compile time. Maybe we can make it more accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is more accurate? How about:
"Compression type for Firehose records. Each log record is individually compressed "
"and sent to Firehose. 'gzip' and 'arrow' are the supported values. "
"'arrow' is only an available if Apache Arrow was enabled at compile time. "
"Defaults to no compression."
2a90504
to
b0a0f31
Compare
note the leaks at exit:
|
This exists on 1.8 branch as well.
Are "in use at exit" leaks? I thought these were acceptable, and the directly, indirectly, and possibly lost metrics were the ones I need to pay attention to. Running again with:
|
@@ -427,6 +438,14 @@ static struct flb_config_map config_map[] = { | |||
"Custom endpoint for the STS API." | |||
}, | |||
|
|||
{ | |||
FLB_CONFIG_MAP_STR, "compression", NULL, | |||
0, FLB_FALSE, 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking about if we need to update the description. 'gzip' is not the only supported value if we enable arrow at compile time. Maybe we can make it more accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. BTW, I saw [aws][compress]
a few times. not sure if we could improve here. Or maybe at least make it [aws_compress]
? Checked other files, this might be more consist.
"Compression type for S3 objects. 'gzip' and 'arrow' are the supported values. " | ||
"'arrow' is only an available if Apache Arrow was enabled at compile time. " | ||
"Defaults to no compression. " | ||
"If 'gzip' is selected, the Content-Encoding HTTP Header will be set to 'gzip'." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let me know if this wording needs adjustment too.
tests/internal/aws_compress.c
Outdated
*pos++ = '\n'; | ||
line_len = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little different than our mbedtls implementation since it adds newlines. Our tests have short outputs where this newline is not utilized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove this or add a comment for the benefit of future maintainers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the newline.
@fujimotos, If tested we may be able to include arrow support into fluent 1.8.x I would really appreciate your help. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, thanks for the extensive unit test cases 👏
truncated_in_len = in_len; | ||
truncated_in_buf = in_data; | ||
is_truncated = FLB_FALSE; | ||
b64_compressed_len = SIZE_MAX; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where does SIZE_MAZ come from??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SIZE_MAX comes from stdint.h providing a Limit of size_t type
/* Limit of `size_t' type. */
# if __WORDSIZE == 64
# define SIZE_MAX (18446744073709551615UL)
# else
# if __WORDSIZE32_SIZE_ULONG
# define SIZE_MAX (4294967295UL)
# else
# define SIZE_MAX (4294967295U)
# endif
# endif
Please see: https://en.cppreference.com/w/c/types/size_t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do ( (size_t) -1 ) if SIZE_MAX is not standard enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do ( (size_t) -1 ) if SIZE_MAX is not standard enough
tests/internal/aws_compress.c
Outdated
*pos++ = '\n'; | ||
line_len = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove this or add a comment for the benefit of future maintainers?
1fb0013
to
2c58255
Compare
2c58255
to
fcf90f5
Compare
Note: the CI is failing some checks. |
fcf90f5
to
7ca40b2
Compare
I think there are make file issues causing the unit test failure on on of the builds... Please don't merge. I'm looking into this. |
7ca40b2
to
dbda644
Compare
0265f67
to
8a34c73
Compare
8a34c73
to
2f39d95
Compare
restructure aws cmake to make maintaining nested directories easier Signed-off-by: Matthew Fala <falamatt@amazon.com>
Signed-off-by: Matthew Fala <falamatt@amazon.com>
Signed-off-by: Matthew Fala <falamatt@amazon.com>
2f39d95
to
53cc8aa
Compare
Was able to test Arrow Compression. It works. |
Goal
Add Firehose compression option, and increase compression option maintainability across aws plugins
Changes
Configuration S3 Compression Migration
Datajet configuration
Without compression
With compression
Configuration for Firehose testing
Datajet configuration
Without compression
With compression
Unit Test Results
Unit Tests Valgrind
Enter
[N/A]
in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
Documentation
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.