100% CPU usage without further logs or ports opened #122
TL;DR: (I've collapsed the original content to focus on where the problem is)
Earlier information / investigation

Additional information: Ignoring any init scripts from the installed package and running the binary directly with [...]. If I do not provide the command, the [...].

I assume that means that the command does not reach this point? (Lines 518 to 533 in 6e701fa)

but the command does reach this earlier block (Lines 483 to 487 in 6e701fa).

That would mean something in between is likely where the command is stalling? (Confirmed: after the long delay, the [...]; Lines 489 to 517 in 6e701fa.)

The most likely culprit then would perhaps be Lines 500 to 506 in 6e701fa.
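For readers without the repository open, that kind of cleanup loop boils down to roughly the following (reconstructed from the shape of the fix quoted later in this thread; not verbatim postsrsd.c code):

#include <unistd.h>

int main(void)
{
    /* With _SC_OPEN_MAX reporting 1073741816 (the container value shown
     * below), this loop issues on the order of a billion close() calls,
     * which matches the observed burst of 100% CPU with no output. */
    for (long fd = 3; fd < sysconf(_SC_OPEN_MAX); fd++)
        close((int)fd);
    return 0;
}

The getconf output below shows why that bound is so large inside the container: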
# Docker container (Debian 11 Bullseye base image)
$ getconf -a | grep OPEN_MAX
OPEN_MAX 1073741816
_POSIX_OPEN_MAX 1073741816
# VM guest Fedora 36 (Docker host)
$ getconf -a | grep OPEN_MAX
OPEN_MAX 1024
_POSIX_OPEN_MAX 1024
# NOTE: `ulimit -n` and `sysctl fs.nr_open` also output the same values

So the for loop is doing on the order of a billion close() calls.

Confirmation of issue and resolution / various workarounds

UPDATE: Yes, this seems to be the problem. Others have experienced this issue before with Docker, noting that it sets this massively larger value for the container, but only for the root user (fairly common). I have confirmed that [...].

UPDATE 2 (alternative workarounds): Docker containers can use a per-container ulimit setting. There's also a Docker daemon config approach when viable, which would enforce that limit across all containers. [...] is, AFAIK, the official upstream Docker issue regarding the problems with software hitting these perf issues in Docker containers. (A program-side sketch of the same cap is included after the quoted fix below.)

Suggested Fix

Original suggestion

The issue I referenced about another user's experience with the problem also mentioned a fix that sounds reasonable. I am not familiar with the reasoning behind the logic in your code, but that user's similar code made this change (with a slightly more helpful comment about the purpose):

//close all file descriptors before exit, otherwise they can segfault
for (int i = 3; i < sysconf(_SC_OPEN_MAX); i++) {
if(i != failure[w]){
int err = close(i);
if(i > 200 && err < 0)
break;
}
}

They iterate at least the first 200 FDs unconditionally and stop at the first failed close() after that (I'm familiar with FD 200 being common with flock usage in shell scripts, as discussed below). You probably know better how problematic that is. If that's not a viable solution, perhaps adding a note to the README (and maybe [...]) would help. Other alternatives I saw: [...]
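If it helps, here is a self-contained sketch of that bounded-cleanup idea (my own illustration, not PostSRSd code; FD_CAP is a made-up name). On Linux 5.9+ the close_range(2) syscall can replace such loops entirely:

#include <stdio.h>
#include <unistd.h>

/* FD_CAP is an illustrative bound of my own choosing. */
#define FD_CAP 4096

static void close_inherited_fds(void)
{
    long max_fd = sysconf(_SC_OPEN_MAX);
    if (max_fd < 0 || max_fd > FD_CAP)
        max_fd = FD_CAP; /* clamp the 1073741816 case to something sane */
    for (long fd = 3; fd < max_fd; fd++)
        close((int)fd);  /* EBADF on never-opened descriptors is harmless */
}

int main(void)
{
    close_inherited_fds();
    printf("descriptors 3..%d closed\n", FD_CAP - 1);
    return 0;
}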
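The per-process variant of the ulimit workarounds from UPDATE 2 above can also be shown from C. This is my own sketch of lowering RLIMIT_NOFILE before any expensive cleanup loop runs, not something proposed in the thread (the 1024 is just the host value from the getconf output):

#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

int main(void)
{
    struct rlimit rl;

    /* Show the limit a close-all loop would see. */
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("before: soft=%llu sysconf=%ld\n",
           (unsigned long long)rl.rlim_cur, sysconf(_SC_OPEN_MAX));

    /* Lower only the soft limit for this process and its children;
     * fails with EINVAL if the hard limit is already lower. */
    rl.rlim_cur = 1024;
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        perror("setrlimit");

    printf("after:  sysconf=%ld\n", sysconf(_SC_OPEN_MAX));
    return 0;
}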
Thank you for that excellent investigation. The loop you found was added by #65; to be honest, I always found it a bit iffy, but I failed to realize that the file descriptor limit can be this insanely high. The file descriptors are assigned by the kernel in a somewhat ascending order, so it's unlikely to hit an FD greater than 200 unless 200 files have been opened by whatever process spawns PostSRSd. And while I was writing this, I saw you added [...]
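A quick way to see that behaviour (my own check, not from the thread): POSIX requires open() to return the lowest-numbered unused descriptor, so a fresh process with only stdin/stdout/stderr open gets 3, 4, 5, and so on.

#include <fcntl.h>
#include <stdio.h>

int main(void)
{
    /* Prints "fd 3", "fd 4", "fd 5" in a freshly started process. */
    for (int i = 0; i < 3; i++)
        printf("open() returned fd %d\n", open("/dev/null", O_RDONLY));
    return 0;
}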
I was of the understanding that you could specify an arbitrary FD number, for example:

(
    flock -s 200
    # ... commands executed under lock ...
) 200< /tmp/config-file

Is that not FD 200? I am not that knowledgeable in this area, so I could be misunderstanding.
Done with my editing 😅 Whatever makes most sense to you is fine by me 👍 I was just confused why a test we run in our CI was working fine but was having issues with [...]. Documented here for the benefit of others who stumble upon it :)
From what I've read, Docker / [...]
Awesome, thanks for the quick fix! ❤️
Sure, you can do that in the shell, and in regular programs with [...]. Also, the general rule is: you open it, you close it. So I'm just being nice by closing all the inherited FDs, and it got me a bug in the code as a reward...
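For completeness, the C counterpart of the shell's 200< redirection (my example; the API the elided text above referred to is unknown) is dup2(), which places a copy of an open descriptor at any chosen number:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0 || dup2(fd, 200) < 0) { /* pin a copy of fd at exactly 200 */
        perror("dup2");
        return 1;
    }
    close(fd);                         /* keep only the high copy */
    printf("descriptor now pinned at fd 200\n");
    return 0;
}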
Describe the bug
I have a server where postsrsd runs as part of docker-mailserver. On this instance, the main postsrsd process takes 100% of the CPU cycles and logs nothing, even when started manually on the command line (without -D). None of the ports (10001, 10002) are opened either.

Relevant log output
Nothing.
System Configuration