panic: failed to lookup nobody user: user: unknown user nobody #228

Closed
jdoss opened this issue Apr 4, 2023 · 4 comments · Fixed by hashicorp/nomad#16904
@jdoss
Contributor

jdoss commented Apr 4, 2023

It looks like we are hitting the same issue as hashicorp/nomad#14737: nomad-driver-podman isn't using NSS to look up users.

I am running into this issue on Fedora CoreOS 37.20230303.3.0

# /etc/nomad/plugins/nomad-driver-podman
panic: failed to lookup nobody user: user: unknown user nobody

goroutine 1 [running]:
github.com/hashicorp/nomad/helper/users.init.0()
	github.com/hashicorp/nomad@v1.5.2/helper/users/lookup_unix.go:37 +0x1b4
# id nobody
uid=99(nobody) gid=99(nobody) groups=99(nobody)
@shoenig
Member

shoenig commented Apr 4, 2023

I think this might be due to building the driver with CGO_ENABLED=0. In that mode the pure-Go implementation of the standard library os/user package only reads from /etc/passwd, and (IIUC) Fedora CoreOS externalizes the nobody user to an NSS lookup.

@jdoss
Contributor Author

jdoss commented Apr 4, 2023

Ahhhh yep. That makes sense. Here's what I am seeing on my end.

Plugin from releases.hashicorp.com

file /etc/nomad/plugins/nomad-driver-podman 
/etc/nomad/plugins/nomad-driver-podman: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=tA-nyhXEkFEKjKMCrbWV/9xjUPy06QVQ82b4TatHu/Ra2CBcZm5LKHkBrSQw8R/30CfM25FK3BlHy4UB_3X, with debug_info, not stripped

Plugin that I compiled and tested while looking into #227

$ file build/nomad-driver-podman
build/nomad-driver-podman: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=c35039ea25dc9a6adf735a8b4710fc75ee51b6f6, for GNU/Linux 3.2.0, with debug_info, not stripped

Can we build the driver with CGO_ENABLED=1 or is there a pure-Go way to look up users using NSS? https://go.dev/src/net/nss.go maybe?

@shoenig
Member

shoenig commented Apr 5, 2023

Can we build the driver with CGO_ENABLED=1

Yeah I think that's the most reasonable thing to do - it's what we do for the nomad binary itself as well. I plan on digging into Podman work in the next few weeks, I'll add this to my list of things to do.

@shoenig shoenig self-assigned this Apr 5, 2023
@shoenig
Member

shoenig commented Apr 17, 2023

There might be a way to fix this without resorting to building with CGO. The podman driver itself doesn't need any of the code that is causing the problem here. Rather, that code runs in the init block of a package that is transitively imported.

➜ go mod why github.com/hashicorp/nomad/helper/users
# github.com/hashicorp/nomad/helper/users
github.com/hashicorp/nomad-driver-podman
github.com/hashicorp/nomad/plugins/drivers
github.com/hashicorp/nomad/client/allocdir
github.com/hashicorp/nomad/helper/users

Let me see if we can play some musical chairs with these packages and eliminate that transitive dependency.

shoenig added a commit to hashicorp/nomad that referenced this issue Apr 17, 2023
This PR eliminates code specific to looking up and caching the uid/gid/user.User
object associated with the nobody user in an init block. This code existed before
adding the generic users cache and was meant to optimize the one search path we
knew would happen often. Now that we have the cache, seems reasonable to eliminate
this init block and use the cache instead like for any other user.

Also fixes a constraint on the podman (and other) drivers, where building without
CGO became problematic on some OS like Fedora IoT where the nobody user cannot
be found with the pure-Go standard library.

Fixes github.com/hashicorp/nomad-driver-podman/issues/228
shoenig added a commit to hashicorp/nomad that referenced this issue Apr 17, 2023
(manual cherry-pick of ed0dfd2)

shoenig added a commit to hashicorp/nomad that referenced this issue Apr 24, 2023
(manual cherry-pick of ed0dfd2)


Co-authored-by: Seth Hoenig <shoenig@duck.com>