100% CPU usage without further logs or ports opened #122

Closed
agross opened this issue Apr 2, 2022 · 7 comments
Labels
bug Confirmed bug

Comments


agross commented Apr 2, 2022

Describe the bug

I have a server where postsrsd runs as part of docker-mailserver. On this instance, the main postsrsd process consumes 100% CPU and logs nothing, even when started manually on the command line (without -D). Neither of the ports (10001, 10002) is opened, either.

Relevant log output

Nothing.

System Configuration

  • OS: docker
  • Mailer daemon: Postfix
  • Version: 1.10

polarathene commented Aug 15, 2022

TL;DR: (I've collapsed the original content to focus on where the problem is)

  • This part of the main() function takes approximately 10 minutes to iterate through a billion close() calls, and is most likely to be encountered in Docker containers running as the root user.
  • The most convenient workaround for users in the meantime is probably the --ulimit option on a container.
  • The suggested fix for postsrsd is to iterate through /proc/self/fd instead (a rough sketch follows this list).
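
Something like the following is roughly what the /proc/self/fd approach could look like. This is only a minimal sketch for illustration (the helper name close_inherited_fds is made up), not the change that was eventually merged:

    /* Sketch: close inherited descriptors by reading /proc/self/fd instead of
     * looping up to sysconf(_SC_OPEN_MAX). Only descriptors that actually
     * exist are visited, so a huge RLIMIT_NOFILE no longer matters. */
    #include <dirent.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void close_inherited_fds(void)
    {
        DIR* d = opendir("/proc/self/fd");
        if (d == NULL)
            return; /* /proc unavailable (e.g. chroot): fall back to the old loop */
        struct dirent* entry;
        while ((entry = readdir(d)) != NULL)
        {
            int fd = atoi(entry->d_name); /* "." and ".." parse as 0 and are skipped */
            if (fd > 2 && fd != dirfd(d))
                close(fd);
        }
        closedir(d);
    }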

Earlier information / investigation

Additional information:

The docker-mailserver project runs a Debian Bullseye (11) base image docker container with the postsrsd package installed (v1.10), and I reproduced this in a VM guest running Fedora (as the referenced issue mentions).

Ignoring any init scripts from the installed package and running the binary directly with /usr/sbin/postsrsd -d example.test reproduces the failure.


If I do not provide the -d option, the command fails immediately. The same should happen when -s is absent (according to the -h output), but instead the command stalls. I don't see any reason for it to believe a secret had been provided otherwise.

I assume that means that the command does not reach this point?:

postsrsd/postsrsd.c

Lines 518 to 533 in 6e701fa

    /* Read secret. The default installation makes this root accessible only. */
    if (secret_file != NULL)
    {
        sf = fopen(secret_file, "rb");
        if (sf == NULL)
        {
            fprintf(stderr, "%s: Cannot open file with secret: %s\n", self,
                    secret_file);
            return EXIT_FAILURE;
        }
    }
    else
    {
        fprintf(stderr, "%s: You must set a secret (-s)\n", self);
        return EXIT_FAILURE;
    }

but the command does reach the earlier -d check:

postsrsd/postsrsd.c

Lines 483 to 487 in 6e701fa

    if (domain == NULL || *domain == 0)
    {
        fprintf(stderr, "%s: You must set a home domain (-d)\n", self);
        return EXIT_FAILURE;
    }

That would mean the command is likely stalling somewhere in between (confirmed: after the long delay, the -s check fails the command):

postsrsd/postsrsd.c

Lines 489 to 517 in 6e701fa

    if (separator != '=' && separator != '+' && separator != '-')
    {
        fprintf(stderr, "%s: SRS separator character must be one of '=+-'\n",
                self);
        return EXIT_FAILURE;
    }
    if (forward_service == NULL)
        forward_service = strdup("10001");
    if (reverse_service == NULL)
        reverse_service = strdup("10002");
    /* Close all file descriptors (std ones will be closed later). */
    maxfd = sysconf(_SC_OPEN_MAX);
    for (fd = 3; fd < maxfd; fd++)
        close(fd);
    /* The stuff we do first may not be possible from within chroot or without
     * privileges */
    /* Open pid file for writing (the actual process ID is filled in later) */
    if (pid_file)
    {
        pf = fopen(pid_file, "w");
        if (pf == NULL)
        {
            fprintf(stderr, "%s: Cannot write PID: %s\n\n", self, pid_file);
            return EXIT_FAILURE;
        }
    }

The most likely culprit then would perhaps be:

postsrsd/postsrsd.c

Lines 500 to 506 in 6e701fa

    /* Close all file descriptors (std ones will be closed later). */
    maxfd = sysconf(_SC_OPEN_MAX);
    for (fd = 3; fd < maxfd; fd++)
        close(fd);
    /* The stuff we do first may not be possible from within chroot or without
     * privileges */

# Docker container (Debian 11 Bullseye base image)
$ getconf -a | grep OPEN_MAX

OPEN_MAX                           1073741816
_POSIX_OPEN_MAX                    1073741816


# VM guest Fedora 36 (Docker host)
$ getconf -a | grep OPEN_MAX

OPEN_MAX                           1024
_POSIX_OPEN_MAX                    1024

# NOTE: `ulimit -n` and `sysctl fs.nr_open` also output the same value

So the for loop is doing close() 1 billion times?
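
As a back-of-envelope check: 1,073,741,816 minus 3 is roughly 1.07 billion close() calls; at somewhere around half a microsecond per failing syscall that works out to roughly 9 minutes, which lines up with the ~10 minutes observed. A throwaway micro-benchmark (hypothetical, not part of postsrsd) to confirm the per-call cost on a given machine could look like this:

    /* Hypothetical micro-benchmark: time a few million close() calls on
     * non-existent descriptors and extrapolate to OPEN_MAX = 1073741816. */
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    int main(void)
    {
        const long sample = 10 * 1000 * 1000; /* 10 million failing close() calls */
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long fd = 3; fd < sample; fd++)
            close((int) fd); /* returns EBADF, but still a full syscall round-trip */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        double full = secs / sample * 1073741816.0; /* extrapolate to OPEN_MAX */
        printf("%ld calls took %.2fs; the full loop would take ~%.0f minutes\n",
               sample, secs, full / 60.0);
        return 0;
    }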

Confirmation of issue and resolution - various workarounds

UPDATE: Yes, this seems to be the problem. Others have experienced this issue with Docker before, noting that it sets this massively larger value inside the container, but only for the root user (a fairly common setup).

I have confirmed that su docker -c '<command here>' (we have an unprivileged user named docker) works: it reports OPEN_MAX=1024 and, of course, postsrsd starts effectively instantly.

UPDATE 2 (alternative workarounds):
You can still run a command as root with a reduced FD limit via ulimit -n 2048 && <command here> (limit set to 2048), but attempting to use ulimit like that again in the container (even as root) then fails with "bash: ulimit: open files: cannot modify limit: Operation not permitted". ulimit -n and getconf -a will both show the reduced limit, while sysctl fs.nr_open remains unchanged. (This doesn't seem like good advice, but it was someone's solution for Dockerfile builds.)

Docker containers can use the --ulimit option for per-container limits: docker run --rm -it --ulimit 'nofile=1024' alpine ash -c 'ulimit -n'. This works well as a workaround in the meantime.

There's also a Docker daemon config approach, when viable, that would enforce that limit across all containers. That's the official upstream Docker issue, AFAIK, regarding software hitting these performance problems in Docker containers.
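
For the daemon-wide approach, a hypothetical /etc/docker/daemon.json excerpt could look like the following (values are illustrative only; check the Docker docs for the default-ulimits schema on your version):

# /etc/docker/daemon.json (hypothetical excerpt)
{
  "default-ulimits": {
    "nofile": { "Name": "nofile", "Soft": 1024, "Hard": 1048576 }
  }
}

# Restart the daemon afterwards, e.g.:
$ sudo systemctl restart docker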

Suggested Fix

Original suggestion: The issue I referenced about another user's experience with the problem also mentioned a fix that sounds reasonable.

I am not familiar with the reason for the logic in your code, but that user's similar code made this change (with a slightly more helpful comment about its purpose):

    //close all file descriptors before exit, otherwise they can segfault
    for (int i = 3; i < sysconf(_SC_OPEN_MAX); i++) {
      if(i != failure[w]){
        int err = close(i);
        if(i > 200 && err < 0)
          break;
      }
    }

They iterate over the first 200 FDs unconditionally (I'm familiar with FD 200 being common in flock() examples) and then continue until close() returns an error. I assume that means an open FD 202 with no FD 201 (where close() should error) would result in FD 202 not being closed.

You probably know better how problematic that is. If that's not a viable solution, perhaps add to the README (and maybe the -h output) that Docker containers running as the root user will have this problem, and link to this issue for more details. Additionally, consider an option that allows setting the maxfd limit (although as a user, I don't know what scenarios with postsrsd would lead to the failure it's trying to prevent).

Other alternatives I saw:

@roehling
Owner

Thank you for that excellent investigation. The loop you found was added by #65; to be honest, I always found it a bit iffy, but I failed to realize that the file descriptor limit can be this insanely high.

The file descriptors are assigned by the kernel in a somewhat ascending order, so it's unlikely to hit a FD greater than 200 unless 200 files have been opened by whatever process spawns PostSRSd.
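
A tiny illustration of that ascending assignment (a hypothetical standalone demo, not postsrsd code): POSIX requires open() to return the lowest-numbered descriptor not currently open in the process, so a freshly spawned program normally sees small, dense FD numbers.

    /* Demo: open() always picks the lowest free descriptor number. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int a = open("/dev/null", O_RDONLY); /* typically 3 */
        int b = open("/dev/null", O_RDONLY); /* typically 4 */
        close(a);
        int c = open("/dev/null", O_RDONLY); /* reuses the lowest free slot: 3 again */
        printf("a=%d b=%d c=%d\n", a, b, c);
        close(b);
        close(c);
        return 0;
    }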

And while I was writing this, I saw you added close_range(). I did not know about that function yet, but it seems to be the best alternative. The manual page even has close_range(3, ~0U, ...) as a use-case.


polarathene commented Aug 15, 2022

The file descriptors are assigned by the kernel in a somewhat ascending order, so it's unlikely to hit a FD greater than 200 unless 200 files have been opened by whatever process spawns PostSRSd.

I was of the understanding that you could specify an arbitrary FD number for example:

(
flock -s 200

# ... commands executed under lock ...

) 200</tmp/config-file

Is that not FD 200? I am not that knowledgeable in this area, so I could be misunderstanding.


And while I was writing this, I saw you added close_range().

Done with my editing 😅

Whatever makes most sense to you is fine by me 👍

I was just confused about why a test we run in our CI was working fine, while postsrsd was having issues when I ran the tests on our container locally. I assume GitHub configures the Docker daemon with more sane limits.

Documented here for the benefit of others who stumble upon it :)

I always found this a bit iffy, but I failed to realize that the file descriptor limit can be this insanely high.

From what I've read, Docker / containerd needs this to do its thing across many containers, but the containers themselves don't. I was surprised at the staggering difference myself 😄

@roehling
Owner

close_range() seems to be relatively new (I have it on my Debian unstable, but not my Ubuntu 20.04), but it is so nice that I decided to use it anyway and add some fallback code for older systems.
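
A rough sketch of what that could look like (illustration only, assuming a hypothetical HAVE_CLOSE_RANGE configure-time check; not the actual commit):

    /* Prefer close_range() where available (Linux >= 5.9, glibc >= 2.34) and
     * fall back to the bounded loop on older systems. */
    #define _GNU_SOURCE
    #include <unistd.h>

    static void close_fds_from(int lowfd, long maxfd)
    {
    #ifdef HAVE_CLOSE_RANGE
        /* One syscall closes every descriptor from lowfd upwards. */
        if (close_range((unsigned int) lowfd, ~0U, 0) == 0)
            return;
    #endif
        /* Fallback: the original loop, bounded by sysconf(_SC_OPEN_MAX). */
        for (long fd = lowfd; fd < maxfd; fd++)
            close((int) fd);
    }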

@polarathene

Awesome, thanks for the quick fix! ❤️


roehling commented Aug 15, 2022

I was of the understanding that you could specify an arbitrary FD number for example:

(
flock -s 200

# ... commands executed under lock ...

) 200</tmp/config-file

Is that not FD 200? I am not that knowledgeable in this area, so I could be misunderstanding.

Sure, you can do that in the shell, but in regular programs with open() calls, file descriptors typically won't be assigned arbitrarily.
Besides, no file descriptors have specific semantics apart from the first 3 (stdin, stdout, stderr); I suspect the idea with 200 was to go high so you don't conflict with existing open files, which ended up as cargo cult.

Also, the general rule is: you open it, you close it. So I'm just being nice by closing all the inherited FDs, and it got me a bug in the code as a reward...
