File descriptor leak detected #672

Open
patagona-anas opened this issue Jan 29, 2020 · 6 comments
@patagona-anas

Describe the bug
The number of open files increases constantly for the process dkron/dkron-executor-http. The file descriptors may eventually be released, but it takes a long time (see Additional context for details).

To Reproduce
Steps to reproduce the behavior:

  1. Create 100 jobs that execute every 6 minutes (executor = HTTP).
  2. Check the output of lsof | wc -l before and after each execution.
  3. The open-file count increases with every execution.
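
Note that lsof | wc -l counts rows for every process on the host, including entries that are not descriptors (working directories, memory-mapped files). A narrower per-process count can be taken from /proc/<pid>/fd on Linux. The following is a minimal sketch of that idea, not part of dkron; the helper name countOpenFDs is hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// countOpenFDs returns the number of entries in a Linux fd directory such as
// /proc/self/fd or /proc/1234/fd. Each entry is one open file descriptor.
// When a process inspects itself, the descriptor used to read the directory
// is included in the count.
func countOpenFDs(fdDir string) (int, error) {
	entries, err := os.ReadDir(fdDir)
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	// Default to this process. Pass a PID as the first argument to inspect
	// another process (for example the dkron-executor-http process); this
	// needs permission to read that process's /proc entry.
	fdDir := "/proc/self/fd"
	if len(os.Args) > 1 {
		fdDir = "/proc/" + os.Args[1] + "/fd"
	}

	n, err := countOpenFDs(fdDir)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read fd directory:", err)
		os.Exit(1)
	}
	fmt.Printf("%s: %d open descriptors\n", fdDir, n)
}
```

Running this before and after a batch of job executions, with the executor's PID as the argument, gives the per-process figure rather than the host-wide one.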

Expected behavior
File descriptors should be released promptly after each execution.

Screenshots

  • Sample 1: no jobs are executed
  • Samples 2-4: taken at 6-minute intervals

Screenshot_2020-01-29  PM-5643  Performance test for scheduler - JIRA

Specifications:

  • OS: Amazon Linux
  • Docker image: dkron/dkron:v2.0.0

Additional context

  • Soft/hard open-file limit inside the Docker container: 1024/4096
  • The resources may be released, but it takes too long. As a result, jobs failed with the error "socket: too many open files".
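
To check which open-file limit a process actually runs with, the RLIMIT_NOFILE values can be read from inside the process. The sketch below uses Go's standard syscall package; the helper names nofileLimit and raiseNofileLimit are hypothetical, and nothing here is taken from dkron's code.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// nofileLimit returns the soft and hard limits on open file descriptors
// (RLIMIT_NOFILE) for the current process.
func nofileLimit() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	return uint64(rl.Cur), uint64(rl.Max), nil
}

// raiseNofileLimit raises the soft limit to the hard limit. An unprivileged
// process may do this, but it cannot raise the hard limit itself.
func raiseNofileLimit() error {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return err
	}
	rl.Cur = rl.Max
	return syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rl)
}

func main() {
	soft, hard, err := nofileLimit()
	if err != nil {
		fmt.Fprintln(os.Stderr, "getrlimit failed:", err)
		os.Exit(1)
	}
	fmt.Printf("open-file limit: soft=%d hard=%d\n", soft, hard)

	if err := raiseNofileLimit(); err != nil {
		fmt.Fprintln(os.Stderr, "setrlimit failed:", err)
	}
}
```

Recent Go releases (1.19 and later) already raise the soft limit to the hard limit at startup for programs that import os, so the two values may print as equal. For a container, one way to set both values is Docker's --ulimit option, for example --ulimit nofile=<soft>:<hard>; the thread does not say which method was used here.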

Selection_017

@patagona-anas
Author

Bottom line

  • There is no bug: the number of open file descriptors does not increase constantly; descriptors are freed over time.
  • The configured limit was too low and has been increased.

Selection_019 (chart; X-axis: datetime, Y-axis: open file descriptors)

@yvanoers
Collaborator

@patagona-anas
Even so, it is still remarkable that the FDs remain open longer than we'd expect. If I get the time, I'll see if I can pinpoint what those descriptors are used for: I was looking at the code, and as far as I've seen the httpExecutor closes all its handles. Maybe there's something going on with the module system.
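
One possibility, which nothing in this thread confirms, is HTTP keep-alive: in Go's net/http, closing a response body returns the connection to the transport's idle pool instead of closing the socket, so its descriptor stays open until the idle timeout expires or CloseIdleConnections is called. The sketch below only demonstrates that general behaviour against a local test server. It is not dkron's executor code, and the helpers getOnce and demoReuse are hypothetical names.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"time"
)

// getOnce issues one GET request, drains and closes the body, and reports
// whether the request ran on a connection taken from the idle pool.
func getOnce(client *http.Client, url string) (reused bool, err error) {
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return reused, err
}

// demoReuse sends two sequential requests to a local test server and reports
// whether each one reused an existing connection.
func demoReuse() (first, second bool, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	// IdleConnTimeout bounds how long an unused connection (and its
	// descriptor) is kept; the value here is an arbitrary example.
	transport := &http.Transport{IdleConnTimeout: 5 * time.Second}
	defer transport.CloseIdleConnections() // releases pooled sockets now
	client := &http.Client{Transport: transport}

	if first, err = getOnce(client, srv.URL); err != nil {
		return false, false, err
	}
	second, err = getOnce(client, srv.URL)
	return first, second, err
}

func main() {
	first, second, err := demoReuse()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Printf("first reused=%v, second reused=%v\n", first, second)
}
```

The second request reusing the first connection shows that the socket outlived the closed response body. Whether this accounts for the descriptors seen in this issue would need to be checked against the executor's actual client configuration.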

@vcastellm
Member

Thanks for raising this @patagona-anas, great investigation here

@patagona-anas
Author

@Victorcoder: thanks. I am re-opening it then.

@patagona-anas patagona-anas reopened this Feb 10, 2020
@vcastellm vcastellm added the bug label Mar 29, 2020
@vcastellm
Member

FTR, my tests behave the same: the count grows up to a certain point and then remains stable. It is not causing any issues at all.

@vcastellm vcastellm added question and removed bug labels May 5, 2020
@vcastellm
Member

Related to #904 and #1062
