Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster Intel RDT init if the feature is unavailable #3306

Merged
merged 3 commits into from
Jan 19, 2022

Conversation

kolyshkin
Copy link
Contributor

In a (quite common) case RDT is not supported by the kernel/hardware,
it does not make sense to parse /proc/cpuinfo and /proc/self/mountinfo,
and yet the current code does it (on every runc exec, for example).

Fortunately, there is a quick way to check whether RDT is available --
if so, kernel creates /sys/fs/resctrl directory. Check its existence,
and skip all the other initialization if it's not present.

(plus some cleanups)

@kolyshkin
Copy link
Contributor Author

@marquiz @xiaochenshen @Creatone PTAL (I don't even have hardware to test this)

Copy link
Contributor

@marquiz marquiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @kolyshkin, good improvement. I can verify this patch on a system with RDT support

Comment on lines 286 to 298
rootOnce.Do(func() {
f, err := os.Open("/proc/self/mountinfo")
if err != nil {
intelRdtRootErr = err
return
}
root, err := findIntelRdtMountpointDir(f)
f.Close()
if err != nil {
intelRdtRootErr = err
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to take into account that resctrl might get mounted after the first check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. If we do, we have to re-read /proc/self/mountinfo over and over (assuming it's some long-lived daemon that imports libcontainer/intelrdt, such as kubelet; for runc itself it's not a issue, since it's a short-lived binary).

I suppose that resctrl is mounted during system init, and it should be mounted by the time we start runc or e.g. kubernetes. If not, it's a configuration issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

K, fair enough

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose that resctrl is mounted during system init, and it should be mounted by the time we start runc or e.g. kubernetes. If not, it's a configuration issue.

Are there specific systemd targets to take this into account and that should be recommended for that? I know containerd has local-fs.target, does that cover this as well?

After=network.target local-fs.target

@xiaochenshen
Copy link
Contributor

@kolyshkin Thank you for working on this improvement.
Intel RDT kernel code checks hardware capabilities via CPUID, sets corresponding CPU flags in /proc/cpuinfo if features are supported by hardware, and then creates directory /sys/fs/resctrl/ as the default mount point for resctrl filesystem. So checking existence of /sys/fs/resctrl/ could fast return on platforms which don't support Intel RDT.

FYI - Actually on Intel RDT enabled platforms, we don't have to mount resctrl filesystem into directory /sys/fs/resctrl/ every time. Here is a rare case to mount to another mount point (e.g., /mnt/resctrl/):

mount -t resctrl resctrl /mnt/resctrl/
cat /proc/self/mountinfo | grep resctrl
63 45 0:48 / /mnt/resctrl rw,relatime shared:40 - resctrl /mnt/resctrl rw

In this case, we have to call findIntelRdtMountpointDir() in Root() to get the real mount point /mnt/resctrl/ instead of /sys/fs/resctrl/

So in my opinion, checking existence of /sys/fs/resctrl/ is more like a capability check rather than a root directory check. We don't have to check it inside Root(). How about moving it out of Root() to the beginning of featuresInit(), prior to parseCpuInfoFile()? We could return even earlier on platforms which don't support Intel RDT.

@kolyshkin
Copy link
Contributor Author

How about moving it out of Root() to the beginning of featuresInit(), prior to parseCpuInfoFile()?

Sorry, this was my intention indeed I switched to other work and then back, and published a WIP patch as a result.

Will fix and let you know.

@kolyshkin kolyshkin marked this pull request as draft December 3, 2021 19:23
@kolyshkin kolyshkin changed the title Faster Intel RDT init if the feature is unavailable [WIP] Faster Intel RDT init if the feature is unavailable Dec 3, 2021
In case resctrl filesystem can not be found in /proc/self/mountinfo
(which is pretty common on non-server or non-x86 hardware), subsequent
calls to Root() will result in parsing it again and again.

Use sync.Once to avoid it. Make unit tests call it so that Root() won't
actually parse mountinfo in tests.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This test was written back in the day when findIntelRdtMountpointDir
was using its own mountinfo parser. Commit f1c1fdf changed that,
and thus this test is actually testing moby/sys/mountinfo parser, which
does not make much sense.

Remove the test, and drop the io.Reader argument since we no longer need
to parse a custom file.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In a (quite common) case RDT is not supported by the kernel/hardware,
it does not make sense to parse /proc/cpuinfo and /proc/self/mountinfo,
and yet the current code does it (on every runc exec, for example).

Fortunately, there is a quick way to check whether RDT is available --
if so, kernel creates /sys/fs/resctrl directory. Check its existence,
and skip all the other initialization if it's not present.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin kolyshkin marked this pull request as ready for review December 4, 2021 00:39
@kolyshkin
Copy link
Contributor Author

OK this is how I intended it -- do /sys/fs/resctrl existence check in Root, and call Root() the first thing from initFeatures, in order to skip everything else, if directory is not there.

NOTE we could skip parsing mountinfo entirely by doing statfs on /sys/fs/resctrl and checking the filesystem type to have resctrl magic. Unfortunately this is impossible, since we also need to know mount flags, only available from mountinfo.

@kolyshkin
Copy link
Contributor Author

@marquiz @xiaochenshen @Creatone PTAL

@kolyshkin kolyshkin changed the title [WIP] Faster Intel RDT init if the feature is unavailable Faster Intel RDT init if the feature is unavailable Dec 5, 2021
Copy link
Contributor

@marquiz marquiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@xiaochenshen xiaochenshen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@Creatone Creatone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kolyshkin
Copy link
Contributor Author

As this modifies intelrdt.Root which is used by e.g. cAdvisor, and the changes are pretty straightforward, perhaps we can squeeze this into 1.1.0? WDYT @AkihiroSuda @cyphar ?

@kolyshkin
Copy link
Contributor Author

@AkihiroSuda @cyphar @thaJeztah PTAL 🙏🏻

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 286 to 298
rootOnce.Do(func() {
f, err := os.Open("/proc/self/mountinfo")
if err != nil {
intelRdtRootErr = err
return
}
root, err := findIntelRdtMountpointDir(f)
f.Close()
if err != nil {
intelRdtRootErr = err
return
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose that resctrl is mounted during system init, and it should be mounted by the time we start runc or e.g. kubernetes. If not, it's a configuration issue.

Are there specific systemd targets to take this into account and that should be recommended for that? I know containerd has local-fs.target, does that cover this as well?

After=network.target local-fs.target

intelRdtRootErr = err
return
}
root, err := findIntelRdtMountpointDir(f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess technically we wouldn't need the intermediate root and err variables (no need to change)

41 18 0:16 / /sys/fs/resctrl rw,relatime shared:24 - resctrl resctrl rw,mba_MBps
42 20 0:36 / /dev/hugepages rw,relatime shared:25 - hugetlbfs hugetlbfs rw
43 19 0:37 / /proc/sys/fs/binfmt_misc rw,relatime shared:26 - autofs systemd-1 rw,fd=32,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=35492
44 20 0:15 / /dev/mqueue rw,relatime shared:27 - mqueue mqueue rw
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any test-case here that we're not covering yet in the https://github.com/moby/sys repository?

@thaJeztah thaJeztah merged commit 62133e3 into opencontainers:master Jan 19, 2022
@kolyshkin kolyshkin added this to the 1.2.0 milestone Jan 21, 2022
kolyshkin added a commit to kolyshkin/runc that referenced this pull request Jan 21, 2022
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
cyphar added a commit that referenced this pull request Jan 21, 2022
Kir Kolyshkin (2):
  CHANGELOG: add #3306
  CHANGELOG.md: nit

LGTMs: AkihiroSuda cyphar
Closes #3344
@kolyshkin kolyshkin mentioned this pull request Aug 9, 2023
@cyphar cyphar mentioned this pull request Mar 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants