Assertion `cgroup_path_relative_to_mount[common_path_prefix_len] == '/'' failed #34287

Closed
omajid opened this issue Mar 30, 2020 · 1 comment · Fixed by #34291
Labels
area-PAL-coreclr untriaged New issue has not been triaged by the area owner


omajid commented Mar 30, 2020

I have been trying to add cgroupv2 support to coreclr and ran into this assertion. I have a test program that #includes cgroup.cpp and then calls various cgroup methods in it.

#include "cgroup.cpp"

int main(int, char *[])
{
    InitializeCGroup();
    printf("initialized cgroup\n");

    size_t physical_memory_limit = GetRestrictedPhysicalMemoryLimit();
    printf("GetRestrictedPhysicalMemoryLimit: %zu\n", physical_memory_limit);

    size_t used_memory = 0;
    bool okay_memory_used = GetPhysicalMemoryUsed(&used_memory);
    printf("GetPhysicalMemoryUsed: %d %zu\n", okay_memory_used, used_memory);

    uint32_t cpu_limit = 0;
    bool okay_cpu_limit = GetCpuLimit(&cpu_limit);
    printf("GetCpuLimit: %d %u\n", okay_cpu_limit, cpu_limit);

    CleanupCGroup();
    printf("cleaned up cgroups\n");
    return 0;
}

I am running Fedora 31, with cgroupv1 (using systemd.unified_cgroup_hierarchy=0 kernel command line) and podman. This program crashes when run inside a container:

cg: cgroup.cpp:192: static char* CGroup::FindCgroupPath(bool (*)(const char*)): Assertion `cgroup_path_relative_to_mount[common_path_prefix_len] == '/'' failed.
                                                           
Program received signal SIGABRT, Aborted.       
0x00007fb062a41625 in raise () from /lib64/libc.so.6                                                                   
Missing separate debuginfos, use: dnf debuginfo-install libgcc-9.2.1-1.fc31.x86_64 libstdc++-9.2.1-1.fc31.x86_64
(gdb) bt                                                                                                               
#0  0x00007fb062a41625 in raise () from /lib64/libc.so.6
#1  0x00007fb062a2a8d9 in abort () from /lib64/libc.so.6                                                                                                                                                                                      
#2  0x00007fb062a2a7a9 in __assert_fail_base.cold () from /lib64/libc.so.6          
#3  0x00007fb062a39a66 in __assert_fail () from /lib64/libc.so.6                    
#4  0x0000000000401a4e in CGroup::FindCgroupPath (is_subsystem=0x4018dc <CGroup::IsMemorySubsystem(char const*)>) at cgroup.cpp:192
#5  0x00000000004015b4 in CGroup::Initialize () at cgroup.cpp:50
#6  0x000000000040126f in InitializeCGroup () at cgroup.cpp:456                          
#7  0x00000000004014fc in main () at cg.cpp:7                                   
(gdb) frame 4                                                                                                          
#4  0x0000000000401a4e in CGroup::FindCgroupPath (is_subsystem=0x4018dc <CGroup::IsMemorySubsystem(char const*)>) at cgroup.cpp:192
192             assert(cgroup_path_relative_to_mount[common_path_prefix_len] == '/');     
(gdb) p common_path_prefix_len 
$1 = 92
(gdb) p hierarchy_mount 
$2 = 0x2031bf0 "/sys/fs/cgroup/memory"
(gdb) p hierarchy_root 
$3 = 0x2031e30 "/machine.slice/libpod-8ec162730bf476f53bbbb9406b232f479fee5434c5ae08a939e7dbcd1a7283aa.scope"
(gdb) p cgroup_path_relative_to_mount 
$4 = 0x2032100 "/machine.slice/libpod-8ec162730bf476f53bbbb9406b232f479fee5434c5ae08a939e7dbcd1a7283aa.scope"
(gdb) p cgroup_path_relative_to_mount[92]
$5 = 0 '\000'

This seems to have been introduced by #980

Seems to me like the code just expects a trailing slash that's not present:

common_path_prefix_len = strlen(hierarchy_root);
if ((common_path_prefix_len == 1) || strncmp(hierarchy_root, cgroup_path_relative_to_mount, common_path_prefix_len) != 0)
{
    common_path_prefix_len = 0;
}

assert(cgroup_path_relative_to_mount[common_path_prefix_len] == '/');

strcat(cgroup_path, cgroup_path_relative_to_mount + common_path_prefix_len);

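A fix might be as simple as generalizing the assert to also accept the case where cgroup_path_relative_to_mount[common_path_prefix_len] is the terminating NUL character ('\0'), which is what happens above when cgroup_path_relative_to_mount is identical to hierarchy_root.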
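
To make the suggestion concrete, here is a minimal standalone sketch of the relaxed check. JoinCgroupPath is a hypothetical helper written for illustration and is not the actual coreclr code; it uses std::string instead of the strcat-based buffer in cgroup.cpp, and the real fix may look different.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper (not the coreclr function): appends to hierarchy_mount
// the part of cgroup_path_relative_to_mount that lies below hierarchy_root.
std::string JoinCgroupPath(const char* hierarchy_mount,
                           const char* hierarchy_root,
                           const char* cgroup_path_relative_to_mount)
{
    size_t common_path_prefix_len = strlen(hierarchy_root);
    if ((common_path_prefix_len == 1) ||
        strncmp(hierarchy_root, cgroup_path_relative_to_mount, common_path_prefix_len) != 0)
    {
        // Root is "/" or is not a prefix of the cgroup path: keep the whole path.
        common_path_prefix_len = 0;
    }

    // Relaxed check: after the common prefix we are either at a path separator
    // or at the end of the string (the cgroup path equals the hierarchy root).
    char next = cgroup_path_relative_to_mount[common_path_prefix_len];
    assert(next == '/' || next == '\0');

    std::string cgroup_path(hierarchy_mount);
    cgroup_path += cgroup_path_relative_to_mount + common_path_prefix_len;
    return cgroup_path;
}
```

With the values from the gdb session (cgroup path equal to hierarchy_root), this yields just the mount point, /sys/fs/cgroup/memory, instead of aborting.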

cc @janvorli

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-PAL-coreclr untriaged New issue has not been triaged by the area owner labels Mar 30, 2020
@janvorli

@omajid I am almost ready to send out a PR for this. The assert was incorrect for unnamed cgroups. It never fired because the cgroups are initialized in the PAL before the debugging support is, so the assert was silently ignored.
The fix is exactly what you've said.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 10, 2020