Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extra tmpfs mount on /sys/fs/cgroup causing problems with Hashicorp Nomad cgroup version detection #1598

Closed
googol opened this issue Jan 30, 2024 · 13 comments

Comments

@googol
Copy link

googol commented Jan 30, 2024

Currently on unraid 6.12.6, the path /sys/fs/cgroup has two mounts:

# mount -l | grep cgroup
cgroup_root on /sys/fs/cgroup type tmpfs (rw,relatime,size=8192k,mode=755,inode64)
none on /sys/fs/cgroup type cgroup2 (rw,relatime)

As I've come to understand this, the tmpfs mount is something that is normally only used for cgroup v1, and for v2 it is not necessary at all (it is completely hidden by the actual cgroup2 mount). These mounts come from /etc/rc.d/rc.S on my system, and this repository seems to host the same snippet in /etc/rc.d/rc.S.cont.

This is not a problem for actual usage of cgroup v2, with version 1.7.0 Hashicorp Nomad changed the way they detect the supported cgroup version, to check the first mount type on the path /sys/fs/cgroup. Since cgroup v1 is at least typically configured with a tmpfs, Nomad 1.7.x versions are now tripping to thinking unraid's cgroup version is 1, and failing all jobs I'm trying to run.

Could the tmpfs mount be removed in the cgroup v2 branch of the init script?

Some links of previous investigations:

@googol
Copy link
Author

googol commented Feb 5, 2024

Hi @limetech, I hope you don't mind me tagging you directly. Would you have an idea about this or could you tag in someone who could help with this?

@ljm42
Copy link
Member

ljm42 commented Feb 6, 2024

Let me start by saying I personally am not the right person to help with this, but I have raised awareness of the issue. Thank you for starting in the forums, and for the link to the issue with the technical details.

To give you a little more info on this repo...

The 6.12 branch in this repo corresponds to the 6.12.6 version of Unraid on your server. This is the release where we started moving /etc/rc.d/ scripts into the repo but it does not contain everything, for instance you won't find rc.S in that branch.

The master branch corresponds to our next major release, and we have moved rc.S and others into the repo to make collaborating on them easier. There have been many changes so the master branch is quite different from what is on your server.

@ich777
Copy link
Contributor

ich777 commented Feb 7, 2024

tmpfs mount is something that is normally only used for cgroup v1, and for v2 it is not necessary at all (it is completely hidden by the actual cgroup2 mount).

That's true but it makes no difference if you mount it on a tmpfs and cgroup v2 is working correctly on Unraid as far as I'm concerned, also Docker and to that extent LXC is detecting the correct cgroup version.

This is not a problem for actual usage of cgroup v2, with version 1.7.0 Hashicorp Nomad changed the way they detect the supported cgroup version

Sorry I'm not familiar with Hashicorp Nomad.

Usually you would do something like the following in bash to detect the cgroup:
stat -f -c %T /sys/fs/cgroup, cat /proc/filesystems | grep cgroup2 or actually check the hierarchy of /sys/fs/cgroup

Since cgroup v1 is at least typically configured with a tmpfs

That's true but that doesn't mean that other Distributions maybe not also mount cgroup v2 as a tmpfs to maybe limit the size of the mount point itself <- however that shouldn't matter much because it would be very hard to reach the 8M that are set for cgroup2 tmpfs on Unraid.

Could the tmpfs mount be removed in the cgroup v2 branch of the init script?

This needs further discussion and I would also like to discuss this with @limetech
cgroup v1 and cgroup v2 have a completely different hierarchy in /sys/fs/cgroup and that should be possible to detect.

My question would be why and how did Hashicorp changed the way to detect the cgroup version? Where there some issues?
Would this problem be really solved if the tmpfs is removed?

Nomad issue tracker ticket:

BTW: I saw that you where referring to the default Slackware rc.S but default Slackware is using cgroup v1 and not cgroup v2 that's why it's different.

@googol
Copy link
Author

googol commented Feb 7, 2024

My question would be why and how did Hashicorp changed the way to detect the cgroup version? Where there some issues?

I don't have the full context, but they did a complete overhaul of their use of cgroups in 1.7.0 for better NUMA system scheduling (release, MR)

Would this problem be really solved if the tmpfs is removed?

The problem definitely goes away if the tmpfs is removed. The new logic in Nomad is to find the first mount for /sys/fs/cgroup that has either type tmpfs or cgroup2, and returns either v1 or v2 respectively. So if there's no tmpfs mount underlying the cgroup2 mount, it will be correctly detected.

I will try to argue to the Nomad team that they should change the logic anyways (to maybe ignore the tmpfs mount if there's also a cgroup2 mount), but I don't really expect to get very much priority for that change, since as far as I know I'm the only user that's hitting this issue and I'm just doing this on a homelab :D most distros use systemd and systemd creates the cgroup mount without the tmpfs underneath, so this just works for them

BTW: I saw that you where referring to the default Slackware rc.S but default Slackware is using cgroup v1 and not cgroup v2 that's why it's different.

I thought so as well :)

@ich777
Copy link
Contributor

ich777 commented Feb 7, 2024

The problem definitely goes away if the tmpfs is removed. The new logic in Nomad is to find the first mount for /sys/fs/cgroup that has either type tmpfs or cgroup2, and returns either v1 or v2 respectively. So if there's no tmpfs mount underlying the cgroup2 mount, it will be correctly detected.

I think this should do the job also fine and circumvent that even if looking at /sys/fs/cgroup (in my opinion this is not the best way to detect the cgroup version):

func scan(in io.Reader) Mode {
	scanner := bufio.NewScanner(in)
	for scanner.Scan() {
		tokens := set.From(strings.Fields(scanner.Text()))
		if tokens.Contains("/sys/fs/cgroup") {
			if tokens.Contains("cgroup2") {
				return CG2
			} else if tokens.Contains("tmpfs") {
				return CG1
			}
		}
	}
	return OFF
}

I will try to argue to the Nomad team

I already made a draft here for Unraid: #1610

I thought so as well :)

It was mainly added for LXC to be honest since it limited some functionality from LXC.

@googol
Copy link
Author

googol commented Feb 7, 2024

I don't think flipping the cgroup2 and tmpfs cases helps, since the mounted filesystems are read line-by-line, and the tmpfs line comes before the cgroup2 line, so it will still return CG1

I already made a draft here for Unraid: #1610

That's already looking good to me!

It was mainly added for LXC to be honest since it limited some functionality from LXC.

I'm really happy that it's available, since Nomad really wants to use cgroup2 functionality.

Thanks for the attention on this!

@googol
Copy link
Author

googol commented Feb 12, 2024

I see the change was merged already, thanks! Any timeline for when to expect a new release with that patch included?

@ich777
Copy link
Contributor

ich777 commented Feb 12, 2024

When the next public RC/beta from Unraid is released.

SOON™ ;)

@ljm42
Copy link
Member

ljm42 commented Feb 13, 2024

Unraid 6.12.7-rc2 is now available:
https://forums.unraid.net/bug-reports/prereleases/unraid-os-version-6127-rc2-available-r2840/

@googol
Copy link
Author

googol commented Feb 13, 2024

Cool, I'll give it a go later today!

@googol
Copy link
Author

googol commented Feb 15, 2024

With 6.12.7-rc2 the latest Nomad runs happily! Thanks to both of you for being so helpful!

@ich777
Copy link
Contributor

ich777 commented Feb 16, 2024

@googol 6.12.8 was released a few hours ago, can you test if it is working for you and close this issue when everything works as expected now?

@googol
Copy link
Author

googol commented Feb 18, 2024

I've updated to 6.12.8 and Nomad is running happily. As the fix is in a stable version now, I'm happy to close this. Thanks again!

@googol googol closed this as completed Feb 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants