systemd does not work when host has cgroup2 #12

Open
lostiniceland opened this issue Jan 8, 2021 · 24 comments

Comments

@lostiniceland

Currently any systemd operation fails on this image.
The issue is probably linked to this one:
ansible/ansible#71528

There seem to have been some changes on the systemd side that required follow-up changes in Ansible, but even after updating to 2.10.4 the error persists. I can only assume that the additional modifications needed to make systemd work inside a container have to be adjusted...

@ollie1

ollie1 commented May 11, 2021

I've just hit this issue too. @lostiniceland did you ever find a fix / workaround?

@geerlingguy
Owner

I haven't encountered this issue on my CI images in GitHub Actions, and verified things are working locally too... can you give an example to reproduce the issues you're seeing?

@ollie1

ollie1 commented May 11, 2021

Very possibly I'm doing something wrong, but if you can spot what, I'd be very grateful!

Here is a minimal example (assuming standard role structure generated with molecule init role docker-centos7-ansible --driver-name docker):

molecule.yml

dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: instance
    image: geerlingguy/docker-centos7-ansible
    pre_build_image: true
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
provisioner:
  name: ansible
verifier:
  name: ansible

converge.yml

- name: Converge
  hosts: all
  tasks:
    - name: Install firewalld
      package:
        name: firewalld
        state: present
    - name: Enable and start firewalld
      service:
        name: firewalld
        state: started

Running molecule converge works fine until it gives the following error:

PLAY [Converge] ****************************************************************

TASK [Gathering Facts] *********************************************************
ok: [instance]

TASK [Install firewalld] *******************************************************
changed: [instance]

TASK [Enable and start firewalld] **********************************************
fatal: [instance]: FAILED! => {"changed": false, "msg": "Service is in unknown state", "status": {}}

PLAY RECAP *********************************************************************
instance                   : ok=2    changed=1    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

Exec'ing into the container and running systemctl status firewalld gives:

[root@instance /]# systemctl status firewalld
Failed to get D-Bus connection: Operation not permitted

@geerlingguy
Owner

@ollie1 - You're missing the command override—molecule injects a command that needs to be removed for systemd to be the first process in the container; see https://github.com/geerlingguy/ansible-role-apache/blob/master/molecule/default/molecule.yml#L9

command: ${MOLECULE_DOCKER_COMMAND:-""}

@Vakhrushev

Same issue

TASK [Enable and start firewalld] **********************************************                                                                                                               
fatal: [instance]: FAILED! => {
    "changed": false,
    "cmd": "/usr/bin/systemctl",
    "invocation": {
        "module_args": {
            "daemon_reexec": false,
            "daemon_reload": false,
            "enabled": null,
            "force": null,
            "masked": null,
            "name": "firewalld",
            "no_block": false,
            "scope": "system",
            "state": "started"
        }
    },
    "msg": "Failed to get D-Bus connection: No such file or directory",
    "rc": 1,
    "stderr": "Failed to get D-Bus connection: No such file or directory\n",
    "stderr_lines": [
        "Failed to get D-Bus connection: No such file or directory"
    ],
    "stdout": "",
    "stdout_lines": []
}

It is certainly caused by the operating system:

Linux va 5.12.2-arch1-1 #1 SMP PREEMPT Fri, 07 May 2021 15:36:06 +0000 x86_64 GNU/Linux

@geerlingguy
Owner

@Vakhrushev - Did you modify your molecule config to override the command as I mentioned above?

@Vakhrushev

Vakhrushev commented May 11, 2021

Yes. I did it from scratch:

vls@va:~/tmp
> git clone https://github.com/geerlingguy/ansible-role-apache
Cloning into 'ansible-role-apache'...
remote: Enumerating objects: 1097, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 1097 (delta 0), reused 2 (delta 0), pack-reused 1091
Receiving objects: 100% (1097/1097), 171.87 KiB | 215.00 KiB/s, done.           
Resolving deltas: 100% (566/566), done.                                                             

vls@va:~/tmp                                                                                        
> cd ansible-role-apache/         

vls@va:~/tmp/ansible-role-apache
> molecule create                                                                                                                                                                      master [5b2e65d]
INFO     default scenario test matrix: dependency, create, prepare              
INFO     Performing prerun...
INFO     Using .cache/roles/geerlingguy.apache symlink to current repository in order to enable Ansible to find the role using its expected full name.
INFO     Added ANSIBLE_ROLES_PATH=~/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles:./.cache/roles
INFO     Running default > dependency
WARNING  Skipping, missing the requirements file.                                                   
WARNING  Skipping, missing the requirements file.                                                                                                                                                       
INFO     Running default > create
INFO     Sanity checks: 'docker'  
                                                                                                    
PLAY [Create] ******************************************************************
                                                  
TASK [Log into a Docker registry] **********************************************                                                                                                                        
skipping: [localhost] => (item={'command': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']})
                                                                                                    
TASK [Check presence of custom Dockerfiles] ************************************
ok: [localhost] => (item={'command': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']})

TASK [Create Dockerfiles from image names] *************************************
skipping: [localhost] => (item={'command': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']})

TASK [Discover local Docker images] ********************************************
ok: [localhost] => (item={'changed': False, 'skipped': True, 'skip_reason': 'Conditional result was False', 'item': {'command': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']}, 'ansible_loop_var': 'item', 'i': 0, 'ansible_index_var': 'i'})

TASK [Build an Ansible compatible image (new)] *********************************
skipping: [localhost] => (item=molecule_local/geerlingguy/docker-centos7-ansible:latest) 

TASK [Create docker network(s)] ************************************************

TASK [Determine the CMD directives] ********************************************
ok: [localhost] => (item={'command': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']})

TASK [Create molecule instance(s)] *********************************************
changed: [localhost] => (item=instance)


TASK [Wait for instance(s) creation to complete] *******************************
FAILED - RETRYING: Wait for instance(s) creation to complete (300 retries left).
changed: [localhost] => (item={'started': 1, 'finished': 0, 'ansible_job_id': '584130753297.11722', 'results_file': '/home/vls/.ansible_async/584130753297.11722', 'changed': True, 'failed': False, 'item': {'command': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']}, 'ansible_loop_var': 'item'})

PLAY RECAP *********************************************************************
localhost                  : ok=5    changed=2    unreachable=0    failed=0    skipped=4    rescued=0    ignored=0

INFO     Running default > prepare
WARNING  Skipping, prepare playbook not configured.

vls@va:~/tmp/ansible-role-apache
> molecule login                                                                                                                                                                       master [5b2e65d]
INFO     Running default > login
[root@instance /]# systemctl 
Failed to get D-Bus connection: No such file or directory

> docker --version                                                                                                                                                                     master [5b2e65d]
Docker version 20.10.6, build 370c28948e

> docker-compose version                                                                                                                                                               master [5b2e65d]
docker-compose version 1.29.2, build unknown
docker-py version: 5.0.0
CPython version: 3.9.5
OpenSSL version: OpenSSL 1.1.1k  25 Mar 2021

> molecule --version                                                                                                                                                                   master [5b2e65d]
molecule 3.3.0 using python 3.9 
    ansible:2.11.0
    delegated:3.3.0 from molecule
    docker:0.2.4 from molecule_docker

@geerlingguy
Owner

geerlingguy commented May 11, 2021

@Vakhrushev - I just did the exact same thing (molecule create, molecule login, then systemctl) and got:

[root@instance /]# systemctl
  UNIT                           LOAD   ACTIVE     SUB       DESCRIPTION
  dev-vda1.device                loaded activating tentative /dev/vda1
  -.mount                        loaded active     mounted   /
  dev-mqueue.mount               loaded active     mounted   POSIX Message Queue File System
  etc-hostname.mount             loaded active     mounted   /etc/hostname
  etc-hosts.mount                loaded active     mounted   /etc/hosts
  etc-resolv.conf.mount          loaded active     mounted   /etc/resolv.conf

My stats:

$ docker --version  
Docker version 20.10.5, build 55c4c88

$ docker-compose version 
docker-compose version 1.29.0, build 07737305
docker-py version: 5.0.0
CPython version: 3.9.0
OpenSSL version: OpenSSL 1.1.1h  22 Sep 2020

$ molecule --version
molecule 3.2.0 using python 3.9 
    ansible:2.10.8
    delegated:3.2.0 from molecule
    docker:0.2.4 from molecule_docker

I'll update my version of molecule to latest and see if that makes a difference.

Edit: Works the same with Molecule 3.3.0. Are you using the latest HEAD on my apache repository?

@geerlingguy
Owner

I just noticed my docker-centos7-ansible image was 7 months old so I'm updating it now...

Heh... it's still that one:

$ docker images
REPOSITORY                           TAG       IMAGE ID       CREATED        SIZE
geerlingguy/docker-centos7-ansible   latest    a727967c4d1d   7 months ago   573MB

I just realized I haven't updated this repository to build off GitHub Actions yet. I should probably do that. The last time it was built was 7 months ago, back when Travis CI still worked.

@ollie1

ollie1 commented May 12, 2021

@geerlingguy Thank you so much - adding the command fixed it. Now it all works as expected.

@Vakhrushev

My trouble is with systemd 248:
systemd/systemd#19245

Downgrading to systemd 247 works for me.

@lostiniceland
Author

@Vakhrushev did you downgrade on the Docker host or in the image?

@Vakhrushev

@lostiniceland I downgraded systemd on the host.

@geerlingguy
Owner

Can anyone try pulling the latest version of this image (once it builds and pushes... should happen soon now that #14 has been merged)?

@jhg03a

jhg03a commented Jun 25, 2021

I've pulled the new image and used it for some PostgreSQL testing via Molecule with systemd, and have not noticed a problem. The new Ansible version shows up too.

@fleroux514

I just noticed this problem this morning on my Windows machine (Ubuntu 20.04 on WSL2), where I had no issues before.

$ docker --version
Docker version 20.10.6, build 370c289
	
$ docker-compose --version
docker-compose version 1.29.1, build c34c88b2

$ molecule --version
molecule 3.3.4 using python 3.8 
    ansible:2.11.2
    delegated:3.3.4 from molecule
    docker:0.2.4 from molecule_docker

The same configuration / code base works fine on a CentOS 7 VMware host.

$ docker --version
Docker version 20.10.7, build f0df350

$ docker-compose --version
docker-compose version 1.26.2, build eefe0d31

$ molecule --version
molecule 3.3.4 using python 3.6
    ansible:2.11.2
    delegated:3.3.4 from molecule
    docker:0.2.4 from molecule_docker

@markwort

markwort commented Aug 6, 2021

I wonder if this is a cgroup v1/v2 issue...
CentOS 7 only supports cgroup v1 and consequently you cannot properly use systemd in such containers when your container host is running cgroups v2.

Here's a relevant issue from podman, it might be a similar case with docker:
containers/podman#5153

Maybe this is something those with problems can cross-check in their environments.

If that is indeed the issue, then it seems we can't do much about it, except use a different container host that (also) supports cgroups v1.
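For anyone who wants to do that cross-check, here is a minimal Python sketch. It is a heuristic, not an official tool: it assumes cgroups are mounted at /sys/fs/cgroup, treats the presence of a cgroup.controllers file at the mount root as a sign of the unified (v2) hierarchy, and treats a directory that only contains subdirectories as a v1 or hybrid layout. The function name cgroup_version is made up for this example. Run it on the Linux machine (or VM) that actually runs the containers.

```python
from pathlib import Path


def cgroup_version(root: str = "/sys/fs/cgroup") -> str:
    """Guess the cgroup layout mounted at `root`.

    Returns "v2" for a unified hierarchy, "v1" for a legacy or hybrid
    layout, and "unknown" if `root` is missing or empty.
    """
    base = Path(root)
    # A cgroup v2 (unified) mount exposes cgroup.controllers at its root.
    if (base / "cgroup.controllers").is_file():
        return "v2"
    # Heuristic: a v1 layout is a directory with one subdirectory per
    # controller (memory, cpu, systemd, ...).
    if base.is_dir() and any(p.is_dir() for p in base.iterdir()):
        return "v1"
    return "unknown"


if __name__ == "__main__":
    print(cgroup_version())
```

Recent Docker releases also report a "Cgroup Version" line in the output of docker info, which may be the quicker check if the Docker CLI is at hand.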

@joshbenner

joshbenner commented Dec 16, 2021

I hit this and eventually found docker/for-mac#6073

Issue is cgroups v1 vs v2. At time of comment, there are experimental builds to allow choice of cgroup version.

@geerlingguy
Owner

Ah... that'd explain it. For some reason I'm still on 4.1.1.

geerlingguy changed the title from "systemd is no longer working" to "systemd is no longer working on macOS" on Dec 16, 2021

@bbaassssiiee

Here is a script to use deprecatedCgroupv1 in Docker Desktop for Mac (4.6.0 at time of writing):
docker/for-mac#6073 (comment)

This fixes the "Failed to connect to bus: No such file or directory" error in Molecule tests on macOS that were working in older versions of Docker Desktop for Mac.

@wookietreiber

@lostiniceland @geerlingguy would it be okay to change the title to systemd does not work when host has cgroup2?

lostiniceland changed the title from "systemd is no longer working on macOS" to "systemd does not work when host has cgroup2" on Jul 24, 2023

@bbaassssiiee

Change to "deprecatedCgroupv1": true in ~/Library/Group\ Containers/group.com.docker/settings.json
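If you would rather script that settings change than edit the file by hand, a minimal Python sketch could look like this. It assumes the settings file is plain JSON at the path quoted above and that Docker Desktop is not running while the file is rewritten; the function name enable_cgroup_v1 is made up for this example. Whether Docker Desktop honors the key depends on its version (this thread only reports 4.6.0), so treat it as a sketch rather than a supported procedure.

```python
import json
from pathlib import Path

# Path quoted in the comment above (Docker Desktop for Mac).
DEFAULT_SETTINGS = (
    Path.home() / "Library/Group Containers/group.com.docker/settings.json"
)


def enable_cgroup_v1(settings_path: Path = DEFAULT_SETTINGS) -> bool:
    """Set "deprecatedCgroupv1": true in the settings file.

    Returns True if the file was rewritten, False if the key was
    already true. All other keys are left as they are.
    """
    settings = json.loads(settings_path.read_text(encoding="utf-8"))
    if settings.get("deprecatedCgroupv1") is True:
        return False
    settings["deprecatedCgroupv1"] = True
    settings_path.write_text(
        json.dumps(settings, indent=2) + "\n", encoding="utf-8"
    )
    return True


if __name__ == "__main__":
    if DEFAULT_SETTINGS.exists():
        print("updated" if enable_cgroup_v1() else "already enabled")
    else:
        print(f"settings file not found: {DEFAULT_SETTINGS}")
```

Docker Desktop presumably needs a restart afterwards to pick up the change, and it is worth keeping a backup copy of settings.json before running anything like this.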

@wookietreiber

> Change to "deprecatedCgroupv1": true in ~/Library/Group\ Containers/group.com.docker/settings.json

Is there a way to add this to molecule.yml?

@bbaassssiiee

bbaassssiiee commented Aug 5, 2023

> Change to "deprecatedCgroupv1": true in ~/Library/Group\ Containers/group.com.docker/settings.json
>
> Is there a way to add this to molecule.yml?

Never tried Podman inside Docker

driver:
  name: podman
platforms:
  - name: podman-in-docker
    # ... other options
    cgroup_manager: cgroupfs
    storage_opt: overlay.mount_program=/usr/bin/fuse-overlayfs
    storage_driver: overlay
