Nomad panic during RPM post install script execution. #16179

Closed
apollo13 opened this issue Feb 14, 2023 · 5 comments · Fixed by #16180
Labels: stage/waiting-on-upstream, theme/crash, theme/dependencies, type/bug

@apollo13
Contributor

apollo13 commented Feb 14, 2023

I am adding Nomad to a Fedora CoreOS system of mine during the build process. I am not sure what kind of chroot or similar environment an rpm-ostree build runs in. Nevertheless, the postinstall script of the Nomad RPM fails like this:

nomad.post: panic: cannot statfs cgroup root: no such file or directory
nomad.post: 
nomad.post: goroutine 1 [running]:
nomad.post: github.com/opencontainers/runc/libcontainer/cgroups.IsCgroup2UnifiedMode.func1()
nomad.post: 	github.com/opencontainers/runc@v1.1.4/libcontainer/cgroups/utils.go:45 +0x1d9
nomad.post: sync.(*Once).doSlow(0x4b97dd?, 0x320c780?)
nomad.post: 	sync/once.go:74 +0xc2
nomad.post: sync.(*Once).Do(...)
nomad.post: 	sync/once.go:65
nomad.post: github.com/opencontainers/runc/libcontainer/cgroups.IsCgroup2UnifiedMode()
nomad.post: 	github.com/opencontainers/runc@v1.1.4/libcontainer/cgroups/utils.go:35 +0x31
nomad.post: github.com/hashicorp/nomad/client/lib/cgutil.init()
nomad.post: 	github.com/hashicorp/nomad/client/lib/cgutil/cgutil_linux.go:21 +0x17

I assume this is because the nomad postinstall script executes the following:

if [[ $(nomad version) == *+ent* ]]; then
    echo "
The following shall apply unless your organization has a separately signed Enterprise License Agreement or Evaluation Agreement governing your use of the software:
Software in this repository is subject to the license terms located in the software, copies of which are also available at https://eula.hashicorp.com/ClickThruELA-Global.pdf or https://www.hashicorp.com/terms-of-evaluation as applicable. Please read the license terms prior to using the software. Your installation and use of the software constitutes your acceptance of these terms. If you do not accept the terms, do not use the software.
"
fi

and nomad version might already panic? I think it would be nice if nomad version could run without too many checks, and if that is not possible, then maybe the postinstall script could at least be fixed to hide errors?
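
The trace above shows the panic starting in a package-level init (cgutil.init()), and Go runs package initialization before main for every subcommand, including version. A minimal sketch of the "run without too many checks" idea follows, deferring the probe until a code path actually needs it. This is not Nomad's code: probeCgroups, useCgroupsV2, run, and the version string are hypothetical, and checking for /sys/fs/cgroup/cgroup.controllers is only an assumed stand-in for the real detection.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

var (
	probeOnce  sync.Once
	unified    bool
	probeCalls int // counts probe runs, for illustration only
)

// probeCgroups is a hypothetical stand-in for a fragile environment
// check. Assumption: the presence of cgroup.controllers at the cgroup
// root indicates a cgroups v2 unified hierarchy.
func probeCgroups() bool {
	probeCalls++
	_, err := os.Stat("/sys/fs/cgroup/cgroup.controllers")
	return err == nil
}

// useCgroupsV2 runs the probe at most once, on first use, instead of
// in a package-level initializer that executes before main.
func useCgroupsV2() bool {
	probeOnce.Do(func() { unified = probeCgroups() })
	return unified
}

// run dispatches a subcommand. The "version" path never touches the
// cgroup probe, so it cannot fail because of it.
func run(args []string) string {
	if len(args) > 0 && args[0] == "version" {
		return "example v0.0.0" // hypothetical version string
	}
	return fmt.Sprintf("cgroups v2: %t", useCgroupsV2())
}

func main() {
	fmt.Println(run(os.Args[1:]))
}
```

With this shape, a version subcommand returns without the probe ever running, while other paths pay for the check once on first use.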

@apollo13 apollo13 changed the title Nomad panic during post install script. Nomad panic during ROM post install script execution. Feb 14, 2023
@tgross tgross added this to Needs Triage in Nomad - Community Issues Triage via automation Feb 14, 2023
@tgross
Member

tgross commented Feb 14, 2023

Hi @apollo13! Can you clarify for me which version of Nomad you're pulling down and whether the RPM you're installing is the official one for that version?

@tgross tgross self-assigned this Feb 14, 2023
@tgross tgross moved this from Needs Triage to Triaging in Nomad - Community Issues Triage Feb 14, 2023
@apollo13
Contributor Author

I am on Fedora 37 and pulling the RPM via the official repo:

[hashicorp]
name=Hashicorp Stable - $basearch
baseurl=https://rpm.releases.hashicorp.com/fedora/$releasever/$basearch/stable
enabled=1
gpgcheck=1
gpgkey=https://rpm.releases.hashicorp.com/gpg

The RPM in question is nomad-1.4.3-1.x86_64.rpm and the post scripts are:

postinstall scriptlet (using /bin/sh):
#!/bin/bash

mkdir -p /opt/nomad/data
chown nomad:nomad /opt/nomad/data
chown -R nomad:nomad /etc/nomad.d

if [ -d /run/systemd/system ]; then
    systemctl --system daemon-reload >/dev/null || true
fi

if [[ $(nomad version) == *+ent* ]]; then
    echo "
The following shall apply unless your organization has a separately signed Enterprise License Agreement or Evaluation Agreement governing your use of the software:
Software in this repository is subject to the license terms located in the software, copies of which are also available at https://eula.hashicorp.com/ClickThruELA-Global.pdf or https://www.hashicorp.com/terms-of-evaluation as applicable. Please read the license terms prior to using the software. Your installation and use of the software constitutes your acceptance of these terms. If you do not accept the terms, do not use the software.
"
fi

@shoenig
Member

shoenig commented Feb 14, 2023

opencontainers/runc@8290c4c

Should be fixed in libcontainer in the next release

is almost the fix we need - but we use the helper function right above that one!

I'll open a PR with upstream.

@apollo13
Contributor Author

Lovely thanks!

@tgross
Member

tgross commented Feb 14, 2023

In a sidebar discussion we noticed that the panic fix linked above is for a different panic in the same file in runc. The panic you're hitting here is a different one that should probably have been fixed in that PR but is still present on runc's main. @shoenig is going to take this issue over and try to get a patch upstream to fix that.

@tgross tgross assigned shoenig and unassigned tgross Feb 14, 2023
@tgross tgross added theme/dependencies Pull requests that update a dependency file theme/crash stage/waiting-on-upstream This issue is waiting on an upstream PR review labels Feb 14, 2023
@tgross tgross moved this from Triaging to In Progress in Nomad - Community Issues Triage Feb 14, 2023
@tgross tgross added this to the 1.5.0 milestone Feb 14, 2023
shoenig added a commit that referenced this issue Feb 14, 2023
This PR wraps the cgroups.IsCgroup2UnifiedMode() helper method from
runc in a defer/recover block because it might panic in some cases.

Upstream fix in: opencontainers/runc#3745

Closes #16179
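
The defer/recover approach described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the linked PR: detectUnified and safeDetect are hypothetical names, and a stand-in function replaces runc's cgroups.IsCgroup2UnifiedMode() so the example runs without the runc dependency.

```go
package main

import (
	"errors"
	"fmt"
)

// detectUnified is a hypothetical stand-in for runc's
// cgroups.IsCgroup2UnifiedMode(). It panics with the message seen in
// this issue's trace to simulate a missing cgroup root.
func detectUnified() bool {
	panic(errors.New("cannot statfs cgroup root: no such file or directory"))
}

// safeDetect calls probe and converts a panic into a returned error,
// so the caller can fall back to a default instead of crashing.
func safeDetect(probe func() bool) (unified bool, err error) {
	defer func() {
		if r := recover(); r != nil {
			// Overwrite the named results after the panic unwinds.
			unified = false
			err = fmt.Errorf("cgroup detection failed: %v", r)
		}
	}()
	return probe(), nil
}

func main() {
	unified, err := safeDetect(detectUnified)
	fmt.Println(unified, err)
}
```

The named return values are what let the deferred function report a result after the panic; whether a false default is the right fallback depends on the caller and is an assumption here.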
@apollo13 apollo13 changed the title Nomad panic during ROM post install script execution. Nomad panic during RPM post install script execution. Feb 14, 2023
Nomad - Community Issues Triage automation moved this from In Progress to Done Feb 14, 2023