
Fluentd memory leak in Kubernetes with "buffer_type file" #2236

Closed

qingling128 opened this issue Dec 14, 2018 · 48 comments · Fixed by #2524

@qingling128

[Fluentd version]
v1.2.5

[Environment]
Kubernetes

[Configuration]
https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/fluentd-gcp/fluentd-gcp-configmap.yaml

[Problems]
We are seeing Fluentd memory leaks with Kubernetes versions >= v1.10. After some investigation, we realized the log rotation mechanism changed between Kubernetes v1.9 and v1.10, which might trigger the memory leak. Our guess is that the logs are rotated in a way that Fluentd does not support or handle well.

An experiment with exactly the same Fluentd version and output plugin version, but different Kubernetes versions, is shown below:
image

Is there any known limitation of the log rotation mechanism that might trigger this? We reported a similar issue before, but I've verified that we are using a Fluentd version that includes that fix.

[The log rotation manager]
In case it helps, the log rotation mechanism is https://github.com/kubernetes/kubernetes/blob/a3ccea9d8743f2ff82e41b6c2af6dc2c41dc7b10/pkg/kubelet/logs/container_log_manager.go#L150.
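
For context on why the rotation style matters, here is a minimal Ruby sketch (illustrative only, not the kubelet's actual Go code) contrasting the two rotation approaches discussed in this thread: copytruncate keeps the same inode and shrinks the file in place, while rename-and-recreate leaves a new inode at the original path, so a tailer has to reopen the file.

require "fileutils"

# copytruncate-style rotation: copy the contents aside, then truncate the
# live file in place. The live file keeps its inode; a tailer only sees the
# size shrink back to zero.
def rotate_copytruncate(path)
  FileUtils.cp(path, "#{path}.1")
  File.truncate(path, 0)
end

# rename-and-recreate rotation: move the live file aside and create a fresh
# one at the original path. The path now points at a new inode, so a tailer
# must notice the change and reopen the file.
def rotate_rename(path)
  File.rename(path, "#{path}.1")
  FileUtils.touch(path)
end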

@repeatedly
Member

The problem is the klog implementation, right?
I want to know the differences between the older logger and the newer logger to debug the problem.
Can we reproduce the problem with https://github.com/kubernetes/klog ?
Or does it need some k8s- or GKE-specific setting?

@qingling128
Author

I haven't managed to reproduce it outside the k8s world.

Does Fluentd support copytruncate? I noticed some of the logs are rotated with copytruncate (https://github.com/kubernetes/kubernetes/blob/8ab955f82dddd3725b543a4bb01c367a4c4a2286/cluster/gce/gci/configure-helper.sh#L365).

I just set up a regular VM and some logs with copytruncate. Will report back when I get some memory data.

@repeatedly
Member

in_tail itself considers the truncated case:

if inode == @pe.read_inode # truncated

But I will re-check object management for the truncated case.

I just set up a regular VM and some logs with copytruncate. Will report back when I get some memory data.

Good. I want to know which part causes this problem: configuration, klog, k8s, or GKE.
I checked the klog diff but it doesn't contain any changes to log generation.
So I assume some k8s setting or something similar causes this problem: golang/glog@master...kubernetes:master
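
For reference, the check boils down to comparing the recorded position entry against the file's current inode and size. A minimal sketch (not the actual in_tail code), assuming a position entry that stores the inode and byte offset last read:

# pe is a hypothetical position entry, e.g. { inode: 123456, pos: 4096 }
def check_watched_file(path, pe)
  stat = File.stat(path)
  if stat.ino != pe[:inode]
    :rotated      # rename/recreate rotation: a new inode sits at the same path
  elsif stat.size < pe[:pos]
    :truncated    # copytruncate: same inode, but the file shrank below our offset
  else
    :unchanged    # keep reading from pe[:pos]
  end
end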

@qingling128
Author

Still waiting for the memory trend to show up. I haven't seen an obvious leaking pattern yet. But I realized I forgot to specify read_from_head true, so I just turned that on. In GKE, we are using the Fluentd read_from_head true config together with copytruncate. Wondering if that somehow messes with pos_file and leaves objects hanging around.

@jkohen
Contributor

jkohen commented Dec 26, 2018

Hi, I work with @qingling128. We had a customer report high CPU usage with fluentd, running outside Kubernetes, and it had in common with this issue that they were using read_from_head true together with copytruncate.

I reached the same conclusion as you, that in_tail should handle this correctly, but I was able to confirm CPU spikes in the Ruby interpreter (so not in logrotate or the user application) where no spikes were expected based on the configuration, which had fluentd watch a single file (not a * glob).

@qingling128
Author

@repeatedly

For reference, the related logrotate config and in_tail config that exhibit the log rotation CPU issue are shown below:

Log Rotation Setting

/etc/logrotate.d $ cat my-app
/var/log/my-app/*.log
{
  prerotate
    rm -f /var/log/my-app/access.*.log
    rm -f /var/log/my-app/access_log.*.log
  endscript
  rotate 0
  maxsize 10M
  missingok
  notifempty
  nocompress
  copytruncate
  postrotate
    rm -f /var/log/my-app/access.*.log
    rm -f /var/log/my-app/access_log.*.log
  endscript
}

In Tail Config

$ cat my-app-fluentd.conf
<source>
  @type tail
  format none
  path /var/log/my-app/access.log
  pos_file /var/lib/pos/service.access.pos
  read_from_head true
  tag service_log
</source>

<source>
  @type tail
  format none
  path /var/log/my-app/service.log
  pos_file /var/lib/pos/service.output.pos
  read_from_head true
  tag service_output_log
</source>

<source>
  @type tail
  format none
  path /var/log/my-app/gc.log
  pos_file /var/lib/pos/service.gc.pos
  read_from_head true
  tag service_gc_log
</source>

@repeatedly
Member

So there is no spike without copytruncate, right?
I will test with your logrotate script.

@qingling128
Author

@repeatedly Correct.

@qingling128
Author

@repeatedly Any luck with reproducing the memory leak or CPU spike?

@repeatedly
Member

Sorry for the delay; I took new year holidays. I will start working on it tomorrow.

@qingling128
Author

@repeatedly Cool cool. Thanks for the heads up!

@repeatedly
Member

I tested a simple truncate case for several hours but saw no memory leak.

# 5 writer threads: each appends one line per second for 10 seconds,
# then truncates its own file in place and starts over.
5.times.map { |i|
  Thread.new {
    loop {
      f = File.open("./output_#{i}.log", "a")
      10.times {
        f.puts "hello"
        f.flush
        sleep 1
      }
      f.truncate(0)
      f.close
    }
  }
}.each { |t| t.join }

I will test a more complicated truncate case.

@repeatedly
Member

Now testing on my Ubuntu 18.04 box with fluentd v1.3.2 and ruby 2.4.5.
No memory leak so far. Continuing to run the script...

@repeatedly
Member

I tested for several hours but saw no memory leak or CPU spike.
Do you have any other information about the issue?

Env:

OS: Ubuntu 18.04
fluentd: v1.3.3
ruby: 2.4.5

conf:

<source>
  @type tail
  format none
  path /home/repeatedly/tail/log/output1.log
  pos_file /home/repeatedly/tail/application-log_1.pos
  read_from_head true
  tag test
</source>

<source>
  @type tail
  format none
  path /home/repeatedly/tail/log/output2.log
  pos_file /home/repeatedly/tail/application-log_2.pos
  read_from_head true
  tag test
</source>

<source>
  @type tail
  format none
  path /home/repeatedly/tail/log/output3.log
  pos_file /home/repeatedly/tail/application-log_3.pos
  read_from_head true
  tag test
</source>

<match test>
  @type null
</match>

logrotate:

/home/repeatedly/tail/log/*.log
{
  prerotate
    rm -f /home/repeatedly/tail/log/output1.*.log
    rm -f /home/repeatedly/tail/log/output1_log.*.log
  endscript
  rotate 0
  # smaller size for testing
  maxsize 2M
  missingok
  notifempty
  nocompress
  copytruncate
  postrotate
    rm -f /home/repeatedly/tail/log/output1.*.log
    rm -f /home/repeatedly/tail/log/output1_log.*.log
  endscript
}

Call logrotate every 10 seconds:

$ while sleep 10; do logrotate -f /etc/logrotate.conf ; done

Generate logs with the following script:

# One writer thread per log file, each appending roughly 100 lines per second.
paths = 3.times.map { |i| "./log/output#{i + 1}.log" }
ths = paths.map { |path|
  Thread.new {
    f = File.open(path, "a")
    loop {
      f.puts "hello fluuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuend world! " * 10
      f.flush
      sleep 0.01
    }
    f.close
  }
}

ths.each { |t| t.join }

@repeatedly
Member

I'm not sure about the filter/output plugins. Does using the null output plugin also show the memory leak?

@mikejuliettetango

Our client was experiencing this issue with older versions of Ubuntu (14.04.1)

@repeatedly
Member

Okay. Will check on Ubuntu 14.04

@qingling128
Author

I'll try the null plugin.

@repeatedly
Member

I didn't reproduce the issue on Ubuntu 14.04. I need more detailed information.

@mikejuliettetango

Our client was using fluentd version 1.2.5 and Stackdriver agent 1.6.0-1 on Ubuntu 14.04.1

@repeatedly
Member

How about the null plugin case on your side?

And I want to know whether my test case is good or not: #2236 (comment)
The logrotate script is copied from yours, so it is the same as your environment. It should not be the problem.
But the log-gen script is a Ruby script, not klog. If klog is important for reproducing the issue, I will test with klog.

@qingling128
Author

Sorry for the delay. I got pulled away by some urgent matters. I just set up two clusters, one with https://github.com/GoogleCloudPlatform/fluent-plugin-google-cloud and one with the null plugin. I will check the memory growth after a day.

@qingling128
Author

Hmm... I tried the fake output plugin, and there seems to be no memory leak.
image

I'm not exactly sure what caused the leak, but it might be something with our output plugin. I'll close this ticket for now and investigate more. If it turns out to be something Fluentd-core related, I'll reopen this ticket.

BTW, if there are any recommended tools for Ruby memory debugging/profiling, please feel free to advise. I was using https://github.com/ruby-prof/ruby-prof before. It did not help much with the memory leak though (or maybe I used it the wrong way).

@repeatedly
Member

@nurse Do you have any ideas for debugging the memory leak? I assume a C extension or something has a problem.

@qingling128
Author

We once had a memory leak (that one was fixed) in the C core code of the gRPC gem. Let me give it a quick try to turn gRPC off and use REST.

@qingling128
Author

After some investigation, it turned out the measurement in #2236 (comment) did not have exactly the same buffer settings between the out_google_cloud plugin and the fake plugin (https://github.com/qingling128/fluent-plugin-buffer-output-stub/blob/master/lib/fluent/plugin/out_buffer_output_stub.rb).

As I dug further, I found the memory leak exists for the fake gem as well.
(This was tested under a load of 1000 log entries per second.
A sample log entry looks like:
vWz5PnzTwYyFmTB6YvrHCIjoLnOw3eLESMVlWTwmhGXFFj2EUSbLk5O35iUuVFWGH9RsLnhJ7FBbeqsx4OJ3KaV4NvrcoeUpY8rv).
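
For reference, this load shape can be reproduced inside Fluentd itself with the dummy input plugin; a minimal sketch of such a source:

    <source>
      @type dummy
      rate 1000
      size 1
      dummy {"log":"vWz5PnzTwYyFmTB6YvrHCIjoLnOw3eLESMVlWTwmhGXFFj2EUSbLk5O35iUuVFWGH9RsLnhJ7FBbeqsx4OJ3KaV4NvrcoeUpY8rv"}
      tag test
    </source>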

When the buffer type is memory, there does not seem to be a leak.
When the buffer type is file, there is a leak, even though the trend is not obvious until 2 days later.
image

[lingshi-memory-buffer-fake-gem]

    <match **>
      @type buffer_output_stub
    </match>

[lingshi-file-buffer-fake-gem]

    <match **>
      @type buffer_output_stub
      buffer_type file
      buffer_path /var/log/k8s-fluentd-buffers/kubernetes.system.buffer
    </match>

However, when I changed the buffer configuration to specify certain limits, the memory growth was quite obvious.
image
[lingshi-file-fake-gem-buffer-params]

    <match **>
      buffer_type file
      buffer_path /var/log/k8s-fluentd-buffers/kubernetes.stdoutstderr.buffer
      buffer_queue_full_action block
      buffer_chunk_limit 512k
      buffer_queue_limit 6
    </match>

I’m not sure if buffer_queue_full_action block would make a difference here. BTW, I checked the fluentd log and found that there are no errors about the buffer queue being full.

I also tried the following two configurations. They both showed a growing pattern. But it might be too early (only 3 hours) to draw a conclusion.
image
[lingshi-file-fake-gem-queue]

    <match **>
      buffer_type file
      buffer_path /var/log/k8s-fluentd-buffers/kubernetes.stdoutstderr.buffer
      buffer_chunk_limit 512k
      buffer_queue_limit 6
    </match>

[lingshi-file-fake-gem-block]

    <match **>
      buffer_type file
      buffer_path /var/log/k8s-fluentd-buffers/kubernetes.stdoutstderr.buffer
      buffer_queue_full_action block
    </match>

Might there be some leak with the buffer file handler?

BTW, not sure if this is related at all. But this reminds me of one observation we had with buffer files before:

// Watch for 10 seconds.
// In total we ingest 8000 log entries for these 10 seconds.
// There were 23990 (roughly 3X) I/O accesses.
$  inotifywatch -t 10 /var/log/google-fluentd/fluentd.buffer 
Establishing watches...
Finished establishing watches, now collecting statistics.
total  access  modify  close_write  close_nowrite  open  moved_from  moved_to  create  delete  filename
23990  155     23787   6            6              12    6           6         6       6       /var/log/google-fluentd/fluentd.buffer/

@qingling128 qingling128 reopened this Feb 4, 2019
@qingling128 qingling128 changed the title Log rotation in Kubernetes triggered Fluentd memory leak Fluentd memory leak in Kubernetes Feb 4, 2019
@qingling128
Author

I forgot to attach a comparison using our real gem (fluent-plugin-google-cloud) with the memory buffer and the file buffer. There is no leak with the memory buffer. Yet we could not simply switch to the memory buffer, because we want to preserve the logs in case the agent crashes.

image

@nurse
Contributor

nurse commented Feb 6, 2019

Ah, I missed this notification.
First, the problem should be isolated to either Ruby or C. You can use sigdump for Ruby and valgrind for C, but valgrind may output a huge amount of logs...

@qingling128
Author

Good to know. Thanks!

@qingling128
Author

The memory (RSS) comparison among the 3 configurations after a few days:
image
The memory increases at different speeds.

[lingshi-file-fake-gem-buffer-params]

    <match **>
      buffer_type file
      buffer_path /var/log/k8s-fluentd-buffers/kubernetes.stdoutstderr.buffer
      buffer_queue_full_action block
      buffer_chunk_limit 512k
      buffer_queue_limit 6
    </match>

[lingshi-file-fake-gem-queue]

    <match **>
      buffer_type file
      buffer_path /var/log/k8s-fluentd-buffers/kubernetes.stdoutstderr.buffer
      buffer_chunk_limit 512k
      buffer_queue_limit 6
    </match>

[lingshi-file-fake-gem-block]

    <match **>
      buffer_type file
      buffer_path /var/log/k8s-fluentd-buffers/kubernetes.stdoutstderr.buffer
      buffer_queue_full_action block
    </match>

For reference: buffer_chunk_limit defaults to 8 MB, and buffer_queue_limit defaults to 256.
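
(With the explicit settings above, the file buffer is capped at roughly buffer_chunk_limit × buffer_queue_limit = 512 KB × 6 = 3 MB of queued chunks, versus 8 MB × 256 = 2 GB with the defaults.)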

@qingling128
Author

@repeatedly Any luck with reproducing this issue?

@repeatedly
Member

So far, I have tested your configurations on our Mac but with no luck.
I'm now setting up Debian for more tests.

Here is test conf:

<source>
  @type dummy
  rate 1000
  size 1
  dummy {"log":"vWz5PnzTwYyFmTB6YvrHCIjoLnOw3eLESMVlWTwmhGXFFj2EUSbLk5O35iUuVFWGH9RsLnhJ7FBbeqsx4OJ3KaV4NvrcoeUpY8rv"}
  tag test
</source>

<match test>
  @type buffer_output_stub
  buffer_type file
  buffer_path ./buf/kubernetes.stdoutstderr.buffer
  buffer_queue_full_action block
  buffer_chunk_limit 512k
  buffer_queue_limit 6
</match>

@qingling128
Author

Not sure if this is related, but someone mentioned some CPU usage benefit with threading when using wildcards in the buffer_path: Stackdriver/kubernetes-configs#16 (comment).
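
For reference, with the v0.12-style buffer options used in this thread, that would look something like the sketch below; my understanding is that the file buffer expands the * itself (e.g. into per-chunk file names), and the exact path here is just a placeholder:

    <match **>
      @type buffer_output_stub
      buffer_type file
      buffer_path /var/log/k8s-fluentd-buffers/kubernetes.system.*.buffer
    </match>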

@qingling128
Author

On the other hand, I tried setting up a Kubernetes cluster with the dummy input plugin:

    <source>
      @type dummy
      rate 1000
      size 1
      dummy {"log":"thisislingtestingsomesillymemoryleakthatdriveshercrazy"}
      tag test
    </source>

The memory leak still exists.
image

That further confirmed that the leak is probably in the buffer output plugin.

@repeatedly
Member

Not sure if this is related, but someone mentioned some CPU usage benefit with threading when using wildcards in the buffer_path:

This is interesting. We will check it.

On the other side, I tried setting up a Kubernetes cluster with dummy input plugin:

Other metrics are healthy, right? e.g. low CPU usage.

@repeatedly
Member

Is the dentry cache not increasing?

@qingling128
Author

Wow. The CPU usage decreased from 70% to 30% of one CPU core after I added the wildcard to buffer_path. I will test that a bit in our formal pipeline to see if the other performance criteria are met.

image

The memory leak seems a bit lower.
image

It seems to still be going up, though. I will update this ticket with more info after it soaks for a bit.

@qingling128
Author

Does this imply the dentry cache keeps increasing?

lingshi@gke-lingshi-memory-leak ~ $ sudo slabtop | grep dentry
923002 895093  96%    0.21K  24946       37    199568K dentry

lingshi@gke-lingshi-memory-leak ~ $ sudo slabtop | grep dentry
923039 895130  96%    0.21K  24947       37    199576K dentry

lingshi@gke-lingshi-memory-leak ~ $ sudo slabtop | grep dentry
924223 897027  97%    0.21K  24979       37    199832K dentry

lingshi@gke-lingshi-memory-leak ~ $ sudo slabtop | grep dentry
924815 897753  97%    0.21K  24995       37    199960K dentry
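
(Across these four samples the dentry count grows from 923002 to 924815 objects and the cache size from 199568K to 199960K, i.e. roughly 1,800 more entries and about 400 KB, so it does appear to be creeping up slowly.)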

@repeatedly
Member

The CPU usage decreased from 70% to 30% of one CPU core after I added the wildcard to buffer_path.

This means 1000 msg/sec with the in_dummy and fake output combo consumes 70% CPU, right?

I will test that a bit in our formal pipeline to see if other performance criteria is met.

If your pipeline uses record_transformer with enable_ruby, replacing it with record_modifier also helps CPU usage.
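
For example, that kind of swap might look roughly like the sketch below (the field and expression are placeholders; check the record_modifier README for its exact placeholder syntax):

# before: core record_transformer evaluating Ruby per record
<filter **>
  @type record_transformer
  enable_ruby
  <record>
    log_length ${record["log"].length}
  </record>
</filter>

# after: record_modifier from the fluent-plugin-record-modifier gem
<filter **>
  @type record_modifier
  <record>
    log_length ${record["log"].length}
  </record>
</filter>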

@qingling128
Author

Oops, missed the previous email.

This means 1000 msg/sec by in_dummy and fake output combo consumes 70% CPU, right?

Yes

If your pipeline uses record_transformer with enable_ruby, replacing it with record_modifier also helps CPU usage.

Good to know. Will give that a try.

@StevenYCChou

StevenYCChou commented Apr 3, 2019

We are seeing Fluentd memory leaks with Kubernetes versions >= v1.10. After some investigation, we realized the log rotation mechanism changed between Kubernetes v1.9 and v1.10, which might trigger the memory leak. Our guess is that the logs are rotated in a way that Fluentd does not support or handle well.
...
Is there any known limitation of the log rotation mechanism that might trigger this? We reported a similar issue before, but I've verified that we are using a Fluentd version that includes that fix.

I observed that Kubernetes v1.10.0 includes kubernetes/kubernetes#58899, which suggests that the log file will be reopened after it is rotated.

@Sytten

Sytten commented Jul 22, 2019

Any progress on that issue?

@qingling128
Author

I was just looking at this issue again recently. The latest hypothesis is that the Fluentd buffer files have the suffix .log and Kubernetes might rotate those files. Somehow the file handle might not be freed correctly, causing a memory leak. I'm trying to verify this hypothesis.

[Sample Fluentd buffer file directory]

lingshi@gke-lingshi-master-default-pool-44cd09e6-9jn1 /var/log/fluentd-buffers/kubernetes.containers.buffer $ ls -lh
total 712K
-rw-r--r-- 1 root root 708K Jul 24 00:04 buffer.b58e620cb8bf96fc5b412a83f11d2b1b6.log
-rw-r--r-- 1 root root   66 Jul 24 00:04 buffer.b58e620cb8bf96fc5b412a83f11d2b1b6.log.meta

@repeatedly Do you think this might be a possible cause?

@repeatedly
Member

repeatedly commented Jul 24, 2019

the Fluentd buffer files have the postfix .log and Kubernetes might rotate those files

Ah, really? It means k8s automatically scans directories and rotates files with a .log suffix?
If so, we can change the buffer file suffix, but we want more information.

@qingling128
Author

@repeatedly - Seems like that is the cause:

image

k8s scans files under /var/log with the .log suffix and rotates them (it does not generate files like .log.gz or .log.archive, so we never realized that). I tried to move the buffer file folder to /tmp and that stopped the memory leak. It's a bit weird to put things under /tmp though (the Fluentd doc seems to recommend /var/log: https://docs.fluentd.org/buffer/file#path).

Can we change the buffer file suffix? e.g. .buf?

@qingling128
Author

We will move the buffer file path out of /var/log for now in GKE. Changing the buffer file suffix might still help prevent similar cases for other systems (e.g. Kubernetes users) though.
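
For anyone else hitting this, the two mitigations look roughly like the sketches below (the output plugin and paths are placeholders; path_suffix is the parameter added by the fix referenced below, so the second form only applies once that ships):

    # Option 1: keep buffer files outside the /var/log tree that kubelet rotates
    <match **>
      @type google_cloud
      buffer_type file
      buffer_path /var/fluentd-buffers/kubernetes.containers.buffer
    </match>

    # Option 2 (with the path_suffix fix): give buffer files a non-.log suffix
    <match **>
      @type google_cloud
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.containers.buffer
        path_suffix .buf
      </buffer>
    </match>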

@qingling128
Author

@repeatedly - Thanks for the quick fix. This is very helpful!

repeatedly added a commit that referenced this issue Jul 26, 2019
buf_file: Add path_suffix parameter. ref #2236
repeatedly added a commit that referenced this issue Jul 29, 2019
buf_file: Add path_suffix parameter. ref #2236
Signed-off-by: Masahiro Nakagawa <repeatedly@gmail.com>
ganmacs pushed a commit to ganmacs/fluentd that referenced this issue Jul 30, 2019
Signed-off-by: Masahiro Nakagawa <repeatedly@gmail.com>