whoami: cannot find name for user ID 101 after ssh-ing in pod container #4503

Closed
vepatel opened this issue Oct 11, 2023 · 17 comments · Fixed by #4575

@vepatel
Contributor

vepatel commented Oct 11, 2023

Describe the bug

  • Running whoami or 'cat /etc/passwd' after exec-ing into a NIC pod (Plus images only, v3.2.0 and later):
    $ whoami
          whoami: cannot find name for user ID 101

    $ cat /etc/passwd
          root:x:0:0:root:/root:/bin/bash
          [...]
          nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
          nginx:x:999:999:nginx user:/nonexistent:/usr/sbin/nologin
  • Running whoami or 'cat /etc/passwd' after exec-ing into a NIC pod (Plus images only, before v3.2.0):
    $ whoami
          nginx

    $ cat /etc/passwd
          root:x:0:0:root:/root:/bin/bash
          [...]
          _apt:x:100:65534::/nonexistent:/usr/sbin/nologin
          nginx:x:101:101:nginx user,,,:/nonexistent:/bin/false

To Reproduce

  • Install an IC instance using a Plus image for v3.2.0+, exec into the pod container, and run the commands above.
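
The symptom can also be confirmed without whoami by looking up the running UID in the passwd file. This is a minimal sketch, assuming a POSIX shell with awk available in the container; `uid_has_name` is a hypothetical helper (not part of the NIC image), and it reads a passwd-format file directly rather than going through NSS.

```shell
# Hypothetical helper: succeeds if UID $2 has an entry in passwd-format file $1.
uid_has_name() {
    awk -F: -v uid="$2" '$3 == uid { found = 1 } END { exit !found }' "$1"
}

# Inside the container: report whether the current UID maps to a name.
if uid_has_name /etc/passwd "$(id -u)"; then
    echo "UID $(id -u) has a passwd entry"
else
    echo "UID $(id -u) has no passwd entry"
fi
```

On an affected image the process would still run as UID 101 while the nginx entry sits at 999, so the second branch would be taken.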
@vepatel vepatel added the backlog Pull requests/issues that are backlog items label Oct 11, 2023
@github-actions

Hi @vepatel thanks for reporting!

Be sure to check out the docs and the Contributing Guidelines while you wait for a human to take a look at this 🙂

Cheers!

@vepatel vepatel added the bug An issue reporting a potential bug label Oct 11, 2023
@vepatel
Contributor Author

vepatel commented Oct 18, 2023

hey @sigv, any opinion or thoughts on this one?

@sigv
Contributor

sigv commented Oct 18, 2023

Taking a quick look, there do not appear to be relevant changes in this repository comparing v3.1.1...v3.2.0.


ubi-plus and ubi-plus-nap should not be affected as the user is configured in this repository's Dockerfile (ref: L157 and L178).

OSS debian is not affected because upstream configures user 101 explicitly (ref).
OSS alpine is not affected for the same reason (ref).

$ docker run -it --rm --entrypoint=/bin/sh ghcr.io/nginxinc/kubernetes-ingress:3.2.0 -c 'whoami; grep nginx /etc/passwd'
nginx
nginx:x:101:101:nginx user:/nonexistent:/bin/false

Base images did change:

  • alpine-plus changed from alpine:3.17 to alpine:3.18
  • debian-plus changed from debian:11-slim to debian:12-slim

But the base images are not responsible for the nginx user. (Officially, the Debian package even uses the www-data user.)

I have a suspicion about the NGINX_PLUS_VERSION change from R28 to R29.
Did something change in how the Nginx Plus package is prepared, or how installation scripts are run?

Additionally, does this affect Debian only, or Alpine only, or other Plus variants?
What I am also thinking: if Nginx Plus packaging/installation did not change, then maybe the newer distro version changed how UIDs are allocated, so that the lowest available UID was previously allocated (101) but now the highest available UID is allocated (999).

@sigv
Contributor

sigv commented Oct 18, 2023

Alpine has BusyBox's adduser and the behavior there appears unchanged.
Nginx is probably explicitly specifying UID 101, and my gut feel tells me the Alpine Plus images are fine.

$ docker run -it --rm --entrypoint=/bin/sh alpine:3.17 -c 'adduser -D nginx; grep nginx /etc/passwd'
nginx:x:1000:1000:Linux User,,,:/home/nginx:/bin/ash
$ docker run -it --rm --entrypoint=/bin/sh alpine:3.18 -c 'adduser -D nginx; grep nginx /etc/passwd'
nginx:x:1000:1000:Linux User,,,:/home/nginx:/bin/ash

Debian has adduser as a higher-level tool with various defaults (man: bullseye (11) / bookworm (12)).
It also has useradd as a lower-level tool that defaults to empty values (man: bullseye (11) / bookworm (12)).

$ docker run -it --rm --entrypoint=/bin/sh debian:11-slim -c 'adduser --quiet --system nginx; grep nginx /etc/passwd'
nginx:x:101:65534::/home/nginx:/usr/sbin/nologin

$ docker run -it --rm --entrypoint=/bin/sh debian:12-slim -c 'adduser --quiet --system nginx; grep nginx /etc/passwd'
nginx:x:100:65534::/nonexistent:/usr/sbin/nologin
$ docker run -it --rm --entrypoint=/bin/sh debian:11-slim -c 'useradd --system nginx; grep nginx /etc/passwd'
nginx:x:999:999::/home/nginx:/bin/sh

$ docker run -it --rm --entrypoint=/bin/sh debian:12-slim -c 'useradd --system nginx; grep nginx /etc/passwd'
nginx:x:999:999::/home/nginx:/bin/sh

@vepatel, I think the Nginx Plus .deb package was switched from previously running adduser to now running useradd.
This would explain the UID change from 101 to 999.


Based on the manpage: adduser will choose the first available UID from the range specified by FIRST_SYSTEM_UID and LAST_SYSTEM_UID in the configuration file. This can be overridden with the --uid option. FIRST_SYSTEM_UID defaults to 100.

The Debian 11 image already had UID 100 taken by the _apt user, so UID 101 was assigned to the next new system account.
However, the Debian 12 image assigns UID 42 to _apt, so the next new system account can get UID 100.
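
The "first available UID in range" rule can be modelled in a few lines of shell. This is only a sketch of the allocation behaviour described above, not adduser's actual implementation; `first_free_uid` is a hypothetical helper, the passwd contents are reduced to the entries that matter, and the upper bound of 999 is an assumption about the LAST_SYSTEM_UID default.

```shell
# Hypothetical helper: print the lowest UID in [$2, $3] that is not present
# in passwd-format file $1. Exits non-zero if the whole range is taken.
first_free_uid() {
    awk -F: -v min="$2" -v max="$3" '
        { used[$3] = 1 }
        END {
            for (uid = min; uid <= max; uid++)
                if (!(uid in used)) { print uid; exit 0 }
            exit 1
        }' "$1"
}

# Debian 11-like passwd: _apt already holds UID 100.
deb11=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
    '_apt:x:100:65534::/nonexistent:/usr/sbin/nologin' > "$deb11"

# Debian 12-like passwd: _apt holds UID 42, so UID 100 is still free.
deb12=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
    '_apt:x:42:65534::/nonexistent:/usr/sbin/nologin' > "$deb12"

first_free_uid "$deb11" 100 999   # prints 101
first_free_uid "$deb12" 100 999   # prints 100
rm -f "$deb11" "$deb12"
```

The two results line up with the adduser runs on debian:11-slim and debian:12-slim shown earlier in this comment.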

One dirty way to mitigate this is running adduser --system --firstuid 101 --firstgid 101 nginx, but you don't want this to apply to all users (for example, if some user has system accounts in the range 1000-9999 and regular user accounts in the 10k+ range).

Therefore, I would propose to handle this by setting FIRST_SYSTEM_UID=101 in /etc/adduser.conf before installing the nginx-plus package. This way, the change affects only Nginx images (essentially mitigating the "regression"). In the latest 3.3.1, for debian-plus that would need to be before L79, and for debian-plus-nap before L104. The latter is not currently affected, but it would be annoying to regress into it later when the base image gets updated.
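
One way such a config change could look in a build step is sketched below. This is an assumption about the shape of the change, not the content of any PR; `set_first_system_uid` is a hypothetical helper, and it assumes the file uses plain KEY=VALUE lines, possibly commented out with a leading `#`. It only has an effect when the package creates the account with adduser.

```shell
# Hypothetical helper: set FIRST_SYSTEM_UID to $2 in adduser.conf-style file $1.
# Rewrites an existing (possibly commented-out) assignment, else appends one.
set_first_system_uid() {
    conf=$1
    value=$2
    pattern='^[[:space:]]*#?[[:space:]]*FIRST_SYSTEM_UID=.*'
    if grep -Eq "$pattern" "$conf"; then
        tmp=$(mktemp)
        sed -E "s/$pattern/FIRST_SYSTEM_UID=$value/" "$conf" > "$tmp"
        cat "$tmp" > "$conf"   # keep the original file's owner and mode
        rm -f "$tmp"
    else
        printf 'FIRST_SYSTEM_UID=%s\n' "$value" >> "$conf"
    fi
}

# Intended use in the image build, before the nginx-plus package is installed:
#   set_first_system_uid /etc/adduser.conf 101
```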

@sigv
Contributor

sigv commented Oct 18, 2023

I opened #4540 with my proposed change to FIRST_SYSTEM_UID.

However, please check how the Nginx Plus Debian package creates the user, as you want adduser for creating the lowest UID.

I would suggest changing the packaging of the Debian package to switch back to adduser, which would result in UID 100.
If that is observed, then it signals my PR is relevant for bumping UID 100 up to 101.

@vepatel
Contributor Author

vepatel commented Oct 19, 2023

thanks @sigv, I'll have a look at the changes.

@coolbry95
Contributor

Just debugged this because it was breaking app-protect after 3.2.1, which came down to a difference between R29 and R30. The main issue is that on 3.2.0 the nginx user is 101 while on 3.2.1 the nginx user is 999.

For reference the debian package adds the nginx user if it does not exist.

It looks like @sigv was right and it switched from adduser to useradd.

R29

root@90d58e7853d2:~# cat tmp2/DEBIAN/preinst
#! /bin/sh
# preinst script for nginx

set -e

addnginxuser() {
    # creating nginx group if he isn't already there
    if ! getent group nginx >/dev/null; then
        addgroup --system nginx >/dev/null
    fi

    # creating nginx user if he isn't already there
    if ! getent passwd nginx >/dev/null; then
        adduser \
          --system \
          --disabled-login \
          --ingroup nginx \
          --no-create-home \
          --home /nonexistent \
          --gecos "nginx user" \
          --shell /bin/false \
          nginx  >/dev/null
    fi
}

R30

root@90d58e7853d2:~/tmp# cat DEBIAN/preinst
#! /bin/sh
# preinst script for nginx

set -e

addnginxuser() {
    # creating nginx group if he isn't already there
    if ! getent group nginx >/dev/null; then
        groupadd --system nginx >/dev/null
    fi

    # creating nginx user if he isn't already there
    if ! getent passwd nginx >/dev/null; then
        useradd \
          --system \
          --gid nginx \
          --no-create-home \
          --home /nonexistent \
          --comment "nginx user" \
          --shell /usr/sbin/nologin \
          nginx  >/dev/null
    fi
}

@coolbry95
Contributor

Why do we not want to just add the user ourselves like we do for the UBI image?
https://github.com/nginxinc/kubernetes-ingress/blob/main/build/Dockerfile#L157

@sigv
Contributor

sigv commented Oct 25, 2023

> Why do we not want to just add the user ourselves like we do for the UBI image?

Generally there is a benefit in having the upstream package configure its expected environment -- you avoid repeating the same logic. But here it could make sense, to ensure the account is created with a specific UID.

If creating the account in this chart's Dockerfile is preferred, then #4540 (configuring account ranges) is redundant and can be skipped.

@coolbry95
Contributor

I think that we should explicitly set the UID for all images; otherwise we are still relying on the package to do what we expect. What if Alpine changes in the future? Then another breakage will happen.

Explicit is better than implicit.

@sigv
Contributor

sigv commented Oct 25, 2023

Explicit is not always better. It's more about alignment, compromise and accepting risks. In the Debian Plus scenario, there are no notable drawbacks to creating the user early (before installing the package) -- it's just a bit longer command for the builder.

At the same time, for Debian OSS and Alpine OSS images (built from nginxinc/docker-nginx) I do not think you want to add logic to recreate the user, and can instead accept the risk that upstream needs to keep the UID as 101 (ref: debian and alpine-slim).

@coolbry95
Contributor

This is not about the open source images. Those are not relevant here.

Sure, there are trade-offs as always. The build time is negligible in this case. Broken releases and debug time aren't.

If the user was explicitly set to begin with then we would not have upgraded to a broken version of nginx-ic-appprotect. Maybe other things broke because of this as well.

@sigv
Contributor

sigv commented Oct 25, 2023

Oh, of course. I am just highlighting that reverting the packaging change and applying #4540 has its own benefit of re-using an upstream implementation. But doing explicit user/group creation in the Dockerfile here is a way forward too! (The user creation doesn't slow anything down. I fully agree it's negligible from a size and speed point of view.)

On a different topic though, it is interesting to me how this was not picked up in test scenarios, if app-protect is broken.
I am an OSS user, so I am not aware of the failure scenario here. Just strange that CI did not pick up issues.

@brianehlert
Collaborator

I am an old systems person and not a fan of hard-coding anything. It always returns to bite you in the end.
Let's give a chance for review to happen, since we have wider impacts than previously known around this issue.
It was unknown until today that there was any break associated with this.

@lucacome
Member

I think we should stick to manually setting the uid and gid for the user, like in the OSS images and in ubi-plus:

&& groupadd --system --gid 101 nginx \
&& useradd --system --gid nginx --no-create-home --home-dir /nonexistent --comment "nginx user" --shell /bin/false --uid 101 nginx \
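
Whichever way the account ends up being created, a small check at the end of the image build could catch a UID drift before release. This is a sketch under assumptions: `assert_passwd_ids` is a hypothetical helper, not something in this repository, and it inspects a passwd-format file directly.

```shell
# Hypothetical build-time guard: succeeds only if user $2 in passwd-format
# file $1 has the expected UID $3 and GID $4.
assert_passwd_ids() {
    awk -F: -v name="$2" -v uid="$3" -v gid="$4" '
        $1 == name { seen = 1; ok = ($3 == uid && $4 == gid) }
        END { exit !(seen && ok) }' "$1"
}

# Intended use as a late build step:
#   assert_passwd_ids /etc/passwd nginx 101 101 \
#       || { echo "nginx UID/GID is not 101:101" >&2; exit 1; }
```

Run against the passwd entries from the issue description, the check would pass for the 101:101 entry and fail for the 999:999 one.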

@brianehlert
Collaborator

Sounds like a plan.

@vepatel vepatel self-assigned this Oct 26, 2023
@coolbry95
Contributor

Should the same change be done for alpine as well?

@danielnginx danielnginx added this to the v3.4.0 milestone Nov 1, 2023
6 participants