Add S3Uploader to Flow Aggregator #4143

Merged · 2 commits merged into antrea-io:main on Sep 1, 2022
Conversation

@heanlan (Contributor) commented Aug 22, 2022

This PR adds S3Uploader as a new exporter for the Flow Aggregator.
It periodically exports expired flow records from the Flow Aggregator
to an AWS S3 storage bucket.

Signed-off-by: heanlan hanlan@vmware.com

@codecov codecov bot commented Aug 22, 2022

Codecov Report

Merging #4143 (58ee55f) into main (222c2b6) will decrease coverage by 6.24%.
The diff coverage is 67.41%.

❗ Current head 58ee55f differs from pull request most recent head 8a85ca8. Consider uploading reports for the commit 8a85ca8 to get more accurate results


@@            Coverage Diff             @@
##             main    #4143      +/-   ##
==========================================
- Coverage   66.31%   60.06%   -6.25%     
==========================================
  Files         304      383      +79     
  Lines       46613    54256    +7643     
==========================================
+ Hits        30911    32589    +1678     
- Misses      13296    19222    +5926     
- Partials     2406     2445      +39     
Flag Coverage Δ
integration-tests 35.62% <ø> (+0.72%) ⬆️
kind-e2e-tests 48.66% <28.55%> (-0.59%) ⬇️
unit-tests 41.29% <67.41%> (-3.96%) ⬇️
Impacted Files Coverage Δ
pkg/flowaggregator/exporter/s3.go 0.00% <0.00%> (ø)
pkg/flowaggregator/options/options.go 31.25% <11.11%> (-5.49%) ⬇️
pkg/flowaggregator/flowaggregator.go 67.56% <42.50%> (-2.49%) ⬇️
pkg/flowaggregator/s3uploader/s3uploader.go 72.18% <72.18%> (ø)
pkg/flowaggregator/flowrecord/record.go 75.00% <75.00%> (ø)
pkg/config/flowaggregator/default.go 100.00% <100.00%> (+56.25%) ⬆️
...lowaggregator/clickhouseclient/clickhouseclient.go 73.59% <100.00%> (-9.13%) ⬇️
pkg/controller/ipam/antrea_ipam_controller.go 75.25% <0.00%> (-3.02%) ⬇️
pkg/agent/util/ipset/ipset.go 69.23% <0.00%> (-2.57%) ⬇️
pkg/controller/networkpolicy/store/addressgroup.go 88.37% <0.00%> (-2.33%) ⬇️
... and 145 more

@heanlan force-pushed the s3-uploader branch 2 times, most recently from 5c03a61 to 80a6123, on August 22, 2022 at 20:39
[Review threads on pkg/flowaggregator/exporter/s3.go, pkg/flowaggregator/flowrecord/record.go, and pkg/flowaggregator/s3uploader/s3uploader.go — resolved]
@heanlan (Contributor, Author) commented Aug 24, 2022

[screenshot: Archive Utility error when opening the generated .gz file]

I found an issue, and I haven't worked out a solution. The .gz file generated by the latest commit CANNOT be opened by Mac's Archive Utility app ↑, but CAN be opened with the command gunzip xxx.gz, and CAN also be opened with Mac's Console app. The previous commit works fine with Archive Utility. It's probably the same as a known issue: https://apple.stackexchange.com/questions/388759/archive-utility-cant-open-some-gzipped-text-files-based-on-their-contents , but I don't know how to verify that. Do you think it will cause any issue for consuming the file? @antoninbas @dreamtalen

Comparing the latest commit (records are written to the buffer one by one) with the previous commit (all records from the queue are written to the buffer at once), I think the difference is in how we use the gzipWriter: previously we created it before a batch write and closed it afterwards. Now I keep it in a struct field; before each upload I close it, and after each upload I reset it.

@antoninbas (Contributor) commented Aug 24, 2022

@heanlan I didn't take an in-depth look since you still need to make changes to the code, but maybe you need to call Flush on the gzip writer if you are going to use Reset and re-use it.

Note that I think it's simpler just to allocate a new Gzip writer each time.

@heanlan (Contributor, Author) commented Aug 26, 2022

I didn't take an in-depth look since you still need to make changes to the code, but maybe you need to call Flush on the gzip writer if you are going to use Reset and re-use it.

Note that I think it's simpler just to allocate a new Gzip writer each time.

Thanks. I called Close, which includes Flush. And Reset does the same thing as NewWriter. I also tried with NewWriter; the issue still exists.
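
A minimal, self-contained Go sketch of the two approaches discussed in this thread (this is not the PR's code; compressBatch, writerPool, and finishFile are made-up names): allocating a fresh gzip.Writer per file versus keeping one writer around and reusing it with Close followed by Reset. In both cases Close finalizes the stream, so the output is a valid gzip file either way, which matches the observation above that gunzip opens the file fine.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// Approach 1: a new writer per batch. The writer is created, used, and closed
// within a single function, so the gzip stream is finalized in one shot.
func compressBatch(records [][]byte) (*bytes.Buffer, error) {
	buf := &bytes.Buffer{}
	gw := gzip.NewWriter(buf)
	for _, r := range records {
		if _, err := gw.Write(r); err != nil {
			return nil, err
		}
	}
	// Close flushes any remaining data and writes the gzip footer.
	if err := gw.Close(); err != nil {
		return nil, err
	}
	return buf, nil
}

// Approach 2: a long-lived writer stored in a struct and reused across files.
type writerPool struct {
	buf *bytes.Buffer
	gw  *gzip.Writer
}

func newWriterPool() *writerPool {
	buf := &bytes.Buffer{}
	return &writerPool{buf: buf, gw: gzip.NewWriter(buf)}
}

// finishFile finalizes the current gzip stream and prepares the writer for the next one.
func (p *writerPool) finishFile() ([]byte, error) {
	if err := p.gw.Close(); err != nil { // flush + gzip footer
		return nil, err
	}
	data := append([]byte(nil), p.buf.Bytes()...) // copy out the finished file
	p.buf.Reset()
	p.gw.Reset(p.buf) // Reset makes the writer equivalent to a freshly allocated one
	return data, nil
}

func main() {
	buf, _ := compressBatch([][]byte{[]byte("a,b,c\n")})
	fmt.Println("per-batch writer produced", buf.Len(), "bytes")

	p := newWriterPool()
	p.gw.Write([]byte("a,b,c\n"))
	data, _ := p.finishFile()
	fmt.Println("reused writer produced", len(data), "bytes")
}
```

The second approach matches the exchange above: Close flushes and finalizes the current file, and Reset restores the writer to the state it would have after NewWriter.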

[Review threads on pkg/flowaggregator/s3uploader/s3uploader.go — resolved]
@antoninbas (Contributor) commented:

@heanlan
I was thinking about it more, and I think that with a single queue it will not be possible to achieve all of these goals:

  • avoid copies
  • avoid holding a lock during IO
  • avoid blocking in CacheRecord
  • handle upload failures in a reasonable way

we probably need 2 queues:

  • one for buffers which are complete but for which we haven't attempted an upload yet (no need to enforce a max length; it will typically hold 0 or 1 buffers). Call it Q1.
  • one for buffers for which an upload has failed (max length 5). Call it Q2.

we need a single lock to protect currentBuffer and Q1.

then in pseudo-code:

func CacheRecord(r) {
    lock()
    defer unlock()
    writeToCurrentBuffer(r)
    if currentBufferIsFull() {
        addCurrentBufferToQ1()
        resetCurrentBuffer()
    }
}

func Upload() {
    lock()
    newBuffers = getAllBuffersFromQ1()
    unlock()
    sendAllBuffers() // newBuffers + Q2 buffers
    foreach failedBuffer {
        addBufferToQ2(failedBuffer) // enforce max length for Q2
    }
}

func TimerFunction() {
    lock()
    addCurrentBufferToQ1()
    resetCurrentBuffer()
    unlock()
    Upload()
}

let me know what you think. Sorry for some of my earlier comments that were misleading.

When reading from the buffers during upload, we need to wrap the *bytes.Buffer with a bytes.Reader. Otherwise, data may be truncated, as you pointed out before.

reader := bytes.NewReader(buffer.Bytes())

This doesn't make any copy.
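
For readers following along, here is a hedged Go sketch of the two-queue design from the pseudo-code above. It is illustrative only: the type and member names (bufferQueue, uploadOne, maxSize, maxFailedBuffers) are invented, and the actual S3 call is hidden behind the uploadOne callback. The lock protects only currentBuffer and Q1; Q2 and the I/O are touched only by the single upload goroutine.

```go
package s3sketch

import (
	"bytes"
	"sync"
)

const maxFailedBuffers = 5 // max length of Q2, per the discussion above

type bufferQueue struct {
	mu            sync.Mutex
	currentBuffer *bytes.Buffer
	q1            []*bytes.Buffer // complete buffers awaiting a first upload attempt
	q2            []*bytes.Buffer // failed buffers; only touched by the upload goroutine
	uploadOne     func(*bytes.Buffer) error
	maxSize       int
}

// CacheRecord appends a record to the current buffer and rotates it into Q1
// when it is full; no I/O happens while the lock is held.
func (b *bufferQueue) CacheRecord(record []byte) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.currentBuffer.Write(record)
	if b.currentBuffer.Len() >= b.maxSize {
		b.q1 = append(b.q1, b.currentBuffer)
		b.currentBuffer = &bytes.Buffer{}
	}
}

// upload drains Q1 under the lock, then attempts to send both new and
// previously failed buffers without holding the lock.
func (b *bufferQueue) upload() {
	b.mu.Lock()
	newBuffers := b.q1
	b.q1 = nil
	b.mu.Unlock()

	pending := append(b.q2, newBuffers...)
	b.q2 = nil
	for _, buf := range pending {
		// Failures are re-queued in Q2, which is bounded.
		if err := b.uploadOne(buf); err != nil {
			if len(b.q2) < maxFailedBuffers {
				b.q2 = append(b.q2, buf)
			}
		}
	}
}

// onTimer flushes the in-progress buffer into Q1 and triggers an upload round.
func (b *bufferQueue) onTimer() {
	b.mu.Lock()
	if b.currentBuffer.Len() > 0 {
		b.q1 = append(b.q1, b.currentBuffer)
		b.currentBuffer = &bytes.Buffer{}
	}
	b.mu.Unlock()
	b.upload()
}
```

With this shape, CacheRecord never blocks on I/O, and an upload failure only affects Q2, which is bounded at 5 buffers.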

@heanlan (Contributor, Author) commented Aug 26, 2022

Thanks @antoninbas. I have two questions:

  1. one for buffers which are complete but for which we haven't attempted an upload yet (no need to enforce a max length; it will typically hold 0 or 1 buffers). Call it Q1.

Why will it typically have <=1 buffers? Because we set maxRecordPerFile large enough?

  1. I'm trying to understand why we need Q2. Is it only to avoid holding the lock while handling upload failures ⬇️?
    foreach failedBuffer {
        addBufferToQ2(failedBuffer) // enforce max length for Q2
    }

In this design, we are making actual copies of the CSV data for newBuffers, right? When an upload fails, could we also grab the lock and append the failedBuffer back to Q1?

@antoninbas (Contributor) commented Aug 26, 2022

  1. Why will it typically have <=1 buffers? Because we set maxRecordPerFile large enough?

Yes, but we can also consider waking up the upload goroutine when we add a buffer to Q1, without waiting for the next timer to fire. This can be done with a signal channel for example.

  1. I'm trying to understand why we need Q2. Is it only to avoid holding the lock while handling upload failures

Yes, it is only there to handle upload failures. If we didn't need to handle upload failures, we would be fine with a single queue.
We would: 1) pop the buffer from the queue (with the lock), 2) upload to S3 (without the lock), 3) ignore failures.

In this design, we are making actual copies of the CSV data for newBuffers

No, we don't make any copy. We remove the *bytes.Buffer pointers from Q1 (these are just pointers to the data) and add them to the list of buffers we need to send. One option is to add them to Q2 directly, dropping as needed if the max length is exceeded, and then attempt to upload all buffers in Q2. If an upload fails, the buffer will stay in Q2. Can you clarify why you think we need a copy?
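
A small illustrative snippet of the point above (variable names are hypothetical): moving *bytes.Buffer values between queues copies only pointers, and bytes.NewReader wraps the buffer's existing backing array without duplicating the CSV data.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

func main() {
	q1 := []*bytes.Buffer{bytes.NewBufferString("src,dst,bytes\n10.0.0.1,10.0.0.2,42\n")}

	// "Pop" the buffers from Q1: this moves pointers, not the underlying data.
	toSend := q1
	q1 = nil
	_ = q1

	for _, buf := range toSend {
		// Wrap the buffer's bytes in a Reader for the upload call; no copy is made.
		r := bytes.NewReader(buf.Bytes())
		n, _ := io.Copy(io.Discard, r)
		fmt.Printf("would upload %d bytes without copying\n", n)
	}
}
```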

@heanlan (Contributor, Author) commented Aug 26, 2022

Can you clarify why you think we need a copy?

I didn't realize before that we are removing buffers from Q1. It looks good then. Thanks.

Signed-off-by: Antonin Bas <abas@vmware.com>
@heanlan (Contributor, Author) commented Aug 29, 2022

Hi @antoninbas, thanks for the suggestion. I've implemented it. Please take a look.

Yes, but we can also consider waking up the upload goroutine when we add a buffer to Q1, without waiting for the next timer to fire. This can be done with a signal channel for example.

I didn't add this for now. I think it could introduce concurrent reads and writes to Q2, and the second half of batchUploadAll is not protected by the lock.

I didn't add logging for the number of flow records that have been uploaded either, for simplicity.

Please let me know if you think we need either of these.

@antoninbas (Contributor) left a comment:

Do you still have the same issue with the generated gzip files? Because I don't remember having this issue when I prototyped that code in the past.

[Review threads on pkg/flowaggregator/s3uploader/s3uploader.go — resolved]
@antoninbas (Contributor) commented:

I didn't add this for now. I think it could introduce concurrent reads and writes to Q2, and the second half of batchUploadAll is not protected by the lock.

I don't think so. All the accesses to Q2 would still happen in flowRecordPeriodicCommit which runs in a single goroutine. But there would be one more select case, so that the goroutine can be notified when a buffer is full and has been moved to Q1. That being said, I agree that it's not necessary to implement this now.
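
A hedged sketch of the optional wake-up mechanism described here: the goroutine that owns Q2 selects on both the periodic ticker and a signal channel, so an early upload can be triggered when a buffer fills up without introducing concurrent access to Q2. The uploadCh name follows the discussion; the 60-second interval and the commit callback are assumptions, not the PR's actual values.

```go
package main

import (
	"fmt"
	"time"
)

// flowRecordPeriodicCommit is the single goroutine that owns Q2: it wakes up on
// the periodic ticker, and optionally early when uploadCh signals that a full
// buffer has been moved to Q1.
func flowRecordPeriodicCommit(stopCh, uploadCh <-chan struct{}, commit func()) {
	ticker := time.NewTicker(60 * time.Second) // upload interval is an assumption
	defer ticker.Stop()
	for {
		select {
		case <-stopCh:
			return
		case <-ticker.C:
			commit() // flush currentBuffer into Q1, then upload Q1 + Q2
		case <-uploadCh:
			commit() // a buffer just became full; upload without waiting for the timer
		}
	}
}

func main() {
	stopCh := make(chan struct{})
	uploadCh := make(chan struct{}, 1) // buffered so the producer never blocks on the signal
	go flowRecordPeriodicCommit(stopCh, uploadCh, func() { fmt.Println("upload round") })

	// Simulate a buffer becoming full: a non-blocking send wakes the goroutine early.
	select {
	case uploadCh <- struct{}{}:
	default:
	}
	time.Sleep(100 * time.Millisecond)
	close(stopCh)
}
```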

@heanlan (Contributor, Author) commented Aug 29, 2022

Do you still have the same issue with the generated gzip files? Because I don't remember having this issue when I prototyped that code in the past.

Yes, I do. Your POC code doesn't have this issue, and neither does my first commit. Both of them create the gzipWriter, write the file in one shot, and close it. The current version writes to the file every time it caches a record. I suspect it is the known issue with Archive Utility: https://apple.stackexchange.com/questions/388759/archive-utility-cant-open-some-gzipped-text-files-based-on-their-contents

@heanlan (Contributor, Author) commented Aug 29, 2022

I don't think so. All the accesses to Q2 would still happen in flowRecordPeriodicCommit which runs in a single goroutine. But there would be one more select case, so that the goroutine can be notified when a buffer is full and has been moved to Q1. That being said, I agree that it's not necessary to implement this now.

Yes, you are right; I was thinking about it wrong. I had implemented an uploadCh for receiving the "buffer-full" signal in the initial commit, but without a Q1 to temporarily hold the full currentBuffer and reset it so it could keep caching new records. So new records weren't cached until currentBuffer had been reset.

Yes, it should work well with the current Q1&Q2 approach. We can add it later if we need it.

@antoninbas (Contributor) commented:

https://apple.stackexchange.com/questions/388759/archive-utility-cant-open-some-gzipped-text-files-based-on-their-contents

Could you check whether you get "Missing type keyword in mtree specification" when you run Archive Utility from the terminal?

@heanlan (Contributor, Author) commented Aug 29, 2022

Could you check whether you get "Missing type keyword in mtree specification" when you run Archive Utility from the terminal?

I wanted to check that, but I couldn't find a way to run the app from the terminal, and I don't know how they did it. That's why I said I don't know how to verify that this is the cause.

I've tried the following; they still pop up the UI window.

➜ Applications open /Users/hanlan/Downloads/records-qfm1xxnkxyf1.csv.gz -b com.apple.archiveutility
➜ Archive Utility.app open /Users/hanlan/Downloads/records-qfm1xxnkxyf1.csv.gz
➜ Downloads open records-qfm1xxnkxyf1.csv.gz

[Review threads on build/charts/flow-aggregator/README.md, build/images/flow-aggregator/Dockerfile, pkg/flowaggregator/options/options.go, pkg/flowaggregator/s3uploader/s3uploader.go, and pkg/flowaggregator/s3uploader/s3uploader_test.go — resolved]
@antoninbas (Contributor) previously approved these changes Aug 31, 2022, with the comment:

LGTM

This PR adds S3Uploader as a new exporter for the Flow Aggregator.
It periodically exports expired flow records from the Flow Aggregator
to an AWS S3 storage bucket.

Signed-off-by: heanlan <hanlan@vmware.com>
@heanlan (Contributor, Author) commented Aug 31, 2022

/test-all

@antoninbas (Contributor) left a comment:

Approving again after squash

@antoninbas (Contributor) commented:

@dreamtalen any more comments on this?

@dreamtalen (Contributor) commented:

@dreamtalen any more comments on this?

No, LGTM

@antoninbas antoninbas merged commit dfc4b7d into antrea-io:main Sep 1, 2022
@heanlan heanlan deleted the s3-uploader branch September 1, 2022 17:32