
S3 cost optimisation: Remember the last index value that was used #160

Open
tomfitzhenry opened this issue Jul 29, 2016 · 17 comments

tomfitzhenry commented Jul 29, 2016

The problem

Suppose you use the defaults:

  • s3_object_key_format of %{path}%{time_slice}_%{index}.%{file_extension}
  • time_slice_format of %Y%m%d%H

And suppose you flush every 30 seconds. So 120 files per hour.

The first file will check whether foo-2016010109_1.gz exists via an S3 HEAD request, see it doesn't exist, and then upload to that filename.

The next file will first check whether foo-2016010109_1.gz exists via an S3 HEAD request, see that it exists, increment the index to foo-2016010109_2.gz, check whether that exists via another S3 HEAD request, see that it doesn't exist, and then upload to that filename.

This will continue. When we get to the final file of the hour (the 120th file), we'll first do 119 HEAD requests!

That's 1+2+...+119 = 7140 S3 requests over the hour. And that's per input log file, per instance.

S3 HEAD requests are "$0.004 per 10,000 requests". So the monthly cost of the above, for 5 log files on each of 100 instances, amounts to 7140 × 5 × 100 × 24 × 30 × $0.004 / 10,000 ≈ $1,028.

More generally, 1+2+...+n is O(n^2) and we can reduce this to O(n).

Solutions

(a) The user can modify time_slice_format to include %M. Or the default could include %M.
(b) fluent-plugin-s3 could remember the last index it uploaded to, and so not have to check whether the n-1 earlier files already exist: fluentd would know they do.

If either solution were implemented, we'd have reduced the number of HEAD requests from O(n^2) to O(n). (Technically (a) doesn't reduce the complexity below O(n^2); it just makes our n tiny.)

So rather than 7140 S3 requests per hour per log file per instance, we'd only do 120.

This reduces the monthly cost from $1028 to $17.
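
For illustration, solution (a) amounts to a one-line change in the s3 output config (a sketch only; the match pattern and bucket name are made up, and credentials/region are omitted):

    <match logs.**>
      @type s3
      s3_bucket my-log-bucket        # hypothetical bucket, for illustration only
      path foo-
      # Default-style key format; %{index} is what drives the HEAD probing:
      s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
      # Solution (a): include %M so only one or two chunks share a time slice,
      # keeping the per-slice index (and the number of HEAD probes) tiny.
      time_slice_format %Y%m%d%H%M
    </match>

The trade-off is many more, smaller objects per hour.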

@cnorthwood

The issue with (b) is that if you have multiple instances writing to the same path, you'd still need the collision check.

@tomfitzhenry
Author

Oops. I updated (b) to clarify that the HEAD request would still occur for the index we're trying, but it wouldn't need to make the n-1 earlier requests.

If we're up to index n, we needn't check 1, 2, ..., n-1. But fluent-plugin-s3 seems to: see https://github.com/fluent/fluent-plugin-s3/blob/master/lib/fluent/plugin/out_s3.rb#L190 and line 219.

@tomfitzhenry
Author

I should also say it was @cnorthwood who spotted this issue. :)

@tomfitzhenry
Author

To be less sensationalist: a more common case of 20 instances uploading 5 files every 5 minutes costs about $2.20 per month with the O(n^2) behaviour and would cost about $0.34 with O(n).

@repeatedly
Member

The hard part of (b) is that the s3 plugin has num_threads.
If num_threads is 2 or larger, the "last index" is ambiguous across the flushing threads.

Hmm...

@repeatedly
Member

(a) The user can modify time_slice_format to include %M. Or the default could include %M.

If the default included %M, the flush cycle would be shorter. Most users probably don't expect that, because it puts lots of small files on S3.
That is inefficient for several processing engines, e.g. Hadoop, or for importing files into a DWH.

Some users add %{hex_random} to s3_object_key_format.
This reduces requests and improves performance.

One way to reduce the HEAD cost is to add a skip_existence_check-like option, used together with %{hex_random} / %{uuid_flush}.
But I'm not sure whether this is a good idea or not...
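
For reference, the %{hex_random} variant mentioned above is only a key-format change (a sketch; bucket and match pattern are made up). The existence check still runs, but because each chunk gets its own random prefix, the first index probed is almost always free, so each upload costs one HEAD request rather than n:

    <match logs.**>
      @type s3
      s3_bucket my-log-bucket        # hypothetical bucket, for illustration only
      path foo-
      s3_object_key_format %{path}%{time_slice}_%{hex_random}_%{index}.%{file_extension}
      time_slice_format %Y%m%d%H
    </match>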

@ozonni

ozonni commented Jun 17, 2019

We also encountered the same problem. The main reason for us to use the s3 plugin was to reduce the cost of storing huge log files; we use fluentd+s3 next to a regular ELK-like solution. But now we are actually paying more for S3 than we would pay to grow our main log storage.

caleb15 added a commit to caleb15/fluent-plugin-s3 that referenced this issue Oct 2, 2020
@caleb15
Contributor

caleb15 commented Oct 2, 2020

We ran into this issue as well. There have been months where we received literally billions of HEAD requests. It's hard to say exactly how much it has cost us, but based on our historical usage I'd guess around 5,000 dollars†. That's right, Jeff Bezos has gotten so much money off this bug he could buy a used 2005 Mazda Mazda3:

[image: a used 2005 Mazda3]

(Just kidding, but the point is we should probably warn newcomers or fix the bug unless we want to buy Bezos another car :P )

Hard point of (b) is s3 plugin has num_threads.

You could create a variable to store the last log name and lock it behind a mutex so threads can set & access it safely.
See #3 in https://blog.arkency.com/3-ways-to-make-your-ruby-object-thread-safe/

I'm tempted to try it out but I don't know ruby... hmm.

There's also a slight problem with the approach in that at the start of the program you would still have to go through all the prefixed s3 items to find the last index. If td-agent is restarted frequently or if there is a crazy amount of logs in the bucket this could also lead to high costs.

I think it would be better to make %{hex_random} part of the default s3_object_key_format and keep check_object true. That would require no complicated code changes and would avoid the problem entirely. The downside is that it becomes slightly harder for a user to find the latest log in s3: they would have to look through the S3 last-modified timestamps of all logs under the latest hour/day/month prefix. For most users this is probably not a big deal. The other downside is that this would require a new major release version. I still think it's worth it, assuming no other downsides. Any objections to this approach?

As a temporary workaround we have simplified our td-agent config and upgraded to td-agent v4. I'm not even sure how we ran into this problem: the config is set to flush only once every 24 hours, but somehow we flushed about once a minute. The really odd thing is that the rapid flushing happened in a periodic cycle, where it would continue for weeks and then pause for a long time before starting back up again. I'll be monitoring the # of requests to see if our flushing problem (and thus this problem) is solved for us.

† This is a very rough guess, I won't know for sure until I see how much costs come down by.

@repeatedly
Member

You could create a variable to store the last log name and lock it behind a mutex so threads can set & access it safely.

I think this has a slow-upload issue with multiple threads. If the previous upload takes 20 seconds, the other threads wait 20 seconds or more. That easily causes buffer overflow.

The older s3 plugin doesn't have the check_object parameter, so some old comments above are outdated.
I think a documentation update is needed for this case: add an "S3 cost optimization" section and mention that check_object false with a random placeholder is better for high-volume environments.

worr added a commit to worr/logging-operator that referenced this issue Oct 22, 2020
The s3 plugin uses a default object key that is problematic in a few ways.

1. It makes HEAD requests for each chunk it uploads, starting from 1 each time.
If you have uploaded 2000 log files within the same time slice, it will make
2001 HEAD requests to figure out if it exists.

fluent/fluent-plugin-s3#160

2. The above check is not thread-safe, and two threads can race and decide to
use the same `%{index}` value, with the loser of the race overwriting the chunk
from the winner.

fluent/fluent-plugin-s3#326

This is planned to change for v2, but there's no clear path to v2 right now.
The plugin does warn already if you use multiple threads and don't use either
`%{chunk_id}` or `%{uuid_hash}` in the object key.
worr added a commit to worr/fluent-plugin-s3 that referenced this issue Oct 23, 2020
As mentioned in a warning, as well as fluent#326
and fluent#160, the process of
determining the index added to the default object key is not thread-safe. This
adds some thread-safety until version 2.x is out where chunk_id is used
instead of an index value.

This is not a perfect implementation, since there can still be races between
different workers if workers are enabled in fluentd, or if there are multiple
fluentd instances uploading to the same bucket. This commit is just to resolve
this problem short-term in a way that's backwards compatible.
caleb15 added a commit to caleb15/fluent-plugin-s3 that referenced this issue Oct 29, 2020
see fluent#160

Signed-off-by: caleb15 <caleb@15five.com>
@formanojhr

formanojhr commented Nov 11, 2020

@caleb15 Apart from the warning that was added, is there a recommended solution to avoid the index checking, which in turn would avoid the HEAD requests? We are using the s3_object_key_format %{path}%{hostname}-%{time_slice}_%{index}.%{file_extension} and are seeing a high number of HEAD requests. Does #326 locally cache the last index for the file between threads to avoid the issue? Thanks!

@repeatedly
Member

To avoid the HEAD request calls, set check_object false.
In addition, you need to change s3_object_key_format to use %{uuid_flush} or %{chunk_id} for a unique path.
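
For example (a sketch; bucket, path and match pattern are placeholders), %{index} is dropped because %{uuid_flush} already makes every key unique:

    <match logs.**>
      @type s3
      s3_bucket my-log-bucket        # hypothetical bucket, for illustration only
      path logs/
      # No HEAD probing at all; uniqueness comes from %{uuid_flush} instead of %{index}.
      check_object false
      s3_object_key_format %{path}%{hostname}-%{time_slice}_%{uuid_flush}.%{file_extension}
      time_slice_format %Y%m%d%H
    </match>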

@formanojhr

formanojhr commented Nov 11, 2020

Thanks @repeatedly. What is the implication of avoiding the HEAD request call? Would it overwrite an existing file, or know from a cache which index it needs to write to? I'm guessing %{uuid_flush} is the id that is then picked up from the cache.

@caleb15
Contributor

caleb15 commented Nov 13, 2020

@formanojhr uuid = universally unique identifier. Because the id is effectively unique, overwriting becomes virtually impossible.

https://stackoverflow.com/questions/1155008/how-unique-is-uuid

@formanojhr

@repeatedly Since you are the dev on this, this might be something you can answer. Once I set check_object false and change s3_object_key_format to %{path}%{hostname}-%{uuid_flush}-%{time_slice}_%{index}.%{file_extension}, what we notice is that the HEAD requests drop to zero, but the PUT requests also go down to zero, even though we do see the logs being written. Any idea if a different protocol is being used to write?

@github-actions

github-actions bot commented Jul 6, 2021

This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days

@github-actions github-actions bot added the stale label Jul 6, 2021
@caleb15
Contributor

caleb15 commented Jul 6, 2021

@cosmo0920 can you remove the stale label and add a bug label please? This is a rare but serious issue.

@kenhys kenhys added bug and removed stale labels Jul 7, 2021
@sabdalla80

sabdalla80 commented Jul 19, 2022

I agree this seems to be a bug. I have also experienced this default behavior, and we only found out about it after a $20K bill at the end of the month due to billions of HEAD requests. I switched to check_object false and used a combination of a UUID and other variables to ensure uniqueness and avoid conflicts; of course the downside is the difficulty of looking up objects sequentially. I do believe this needs to be treated with higher urgency, or at least the default behavior should be switched so that it is not making billions of HEAD calls. Some may argue the suffix could be exposed as a configurable option, with the pros and cons of each choice documented.
