
Use systemd as the containerd cgroup driver #717

Merged

Conversation

stevehipwell
Contributor

aws/containers-roadmap#1210

Description of changes:
This PR sets systemd as the cgroup driver when using containerd, as recommended by the Kubernetes docs. It is complementary to #593, which adds systemd cgroup driver support to Docker, but it shouldn't be blocked by the legacy eksctl constraints, since containerd support is newer than the systemd cgroup support there.
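For reference, enabling the systemd driver in containerd comes down to a `config.toml` setting like the following (an illustrative snippet of the CRI runc runtime options, not the PR's literal template):

```toml
# /etc/containerd/config.toml (illustrative): enable the systemd cgroup driver
# for the runc runtime used by containerd's CRI plugin.
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
```

The kubelet's `cgroupDriver` setting must agree with this, which is why the change touches the bootstrap script rather than just the containerd config template.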

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@stevehipwell
Contributor Author

/assign @abeer91

@stevehipwell
Contributor Author

FYI @ravisinha0506

@stevehipwell
Contributor Author

@Callisto13 related to your #593 (comment), I assume the changes here aren't likely to cause issues with the legacy path, since containerd wouldn't be in use there?

@Callisto13

@stevehipwell from what I can see that is correct. We are not exposing the containerd option in the legacy codepath, so there should be no driver conflict. I am happy to verify with a test build before you release if you have any concerns, but I think we should be fine.

@stevehipwell
Contributor Author

Thanks @Callisto13, I think we should be good; just waiting to hear from a maintainer on this.

@stevehipwell stevehipwell force-pushed the containerd-systemd-cgroup-driver branch from 7cadb3a to a96413e Compare August 11, 2021 12:40
@stevehipwell
Contributor Author

@abeer91 could you take a look at this?

@stevehipwell stevehipwell force-pushed the containerd-systemd-cgroup-driver branch from a96413e to 1d774f5 Compare August 31, 2021 08:22
@stevehipwell
Contributor Author

@abeer91 if you're not the right person to look into this could you let me know who is?

@stevehipwell
Contributor Author

Could one of the project maintainers please review or offer some feedback on this?

@rtripat
Contributor

rtripat commented Oct 8, 2021

@stevehipwell Thanks for submitting the PR. We are prioritizing testing of the switch to the systemd cgroup driver based on your changes. I don't have an ECD (estimated completion date) to share currently, but will update within 2 weeks.

@stevehipwell
Contributor Author

Thanks @rtripat, hopefully we can get this solved ASAP.

@stevehipwell
Contributor Author

@rtripat how is the testing going?

@stevehipwell
Contributor Author

@rtripat it's been almost 2 months since you said you'd get back to us within 2 weeks; how is this going?

@rtripat
Contributor

rtripat commented Nov 29, 2021

@stevehipwell We are actively working on the change. Appreciate the patience.

@stevehipwell
Contributor Author

@rtripat are you by any chance also looking at cgroup v2?

@cartermckinnon
Member

We need to address this kubelet issue by updating our runc. That change is in motion, and I'll keep you updated!

Regarding cgroup v2, we are unable to make this change until a newer version of systemd is available to us. That will most likely happen in the next major release of Amazon Linux.

@stevehipwell
Contributor Author

@cartermckinnon after looking through that issue, it seems it's only relevant for Kubernetes v1.22? If so, and since EKS v1.22 isn't due until "early 2022", couldn't this change be made for the current EKS versions?

RE cgroup v2, is it worth adding a new issue to this repo to track the progress?

@cartermckinnon cartermckinnon mentioned this pull request Dec 3, 2021
@cartermckinnon
Member

cartermckinnon commented Dec 3, 2021

My understanding is:

  • The runc library, libcontainer, which is vendored in k/k, needed to be updated to 1.0.2 to fix #104280. This fix has only been back-ported to 1.21 as of 1.21.5.
  • The runc binary needs to be updated to at least 1.0.1 to avoid a similar (but more serious) issue. This is actually the more important issue to address.

Older Kubernetes release lines are affected by the first point regardless of the runc binary's version, because of libcontainer. However, as long as the binary is at 1.0.1+, the issue is not critical (and quite rare to begin with).

I've created #824 to track cgroup v2.

@stevehipwell
Contributor Author

@rtripat could we get a status update on this?

@cartermckinnon
Member

Apologies for the delay, our packaging team has had to prioritize issues related to the log4j vulnerabilities. We will proceed as soon as the runc available to us is at 1.0.1+, and we expect that to happen soon.

Thanks for your patience! 🙏

@cartermckinnon
Member

Unfortunately, we aren't able to make this change in the near future. Our packaging team has to prioritize stability across a variety of Amazon Linux distributions, and the changelog of runc requires non-trivial evaluation and testing. I can assure you that we will make this change as soon as we are able to.

In the meantime, if anyone has experienced the sort of stability issues hinted at in the Kubernetes documentation, we would love to hear about it.

@stevehipwell
Contributor Author

@cartermckinnon I'm interested in why this isn't seen as a high priority, and what the blockers are here.

I guess this is the final nail in the coffin for running K8s workloads on AL 2 (and I assume AL 2022), and we should all move to Bottlerocket ASAP to stay in line with the community.

@reegnz
Contributor

reegnz commented Jan 28, 2022

EDIT: OK, I looked at the runc changelog; that's a bummer.

If you don't intend to update runc as a high priority though (as I understand that's the blocker for this issue), does that mean you'll also ignore security patches for runc? There have been some serious CVEs for runc in the past, and that worries me.
Maybe I'm just misunderstanding your comment.

@cartermckinnon
Member

I'm not an authority on the priorities of the Amazon Linux packaging folks, but I can say: we have many users, and I'm not aware of any case in which we've attributed instability to the presence of two cgroup managers. Again, if you have observed this type of instability, we're very interested in hearing about it.

The runc available on Amazon Linux contains (and will continue to receive) security patches.

@cartermckinnon
Member

@stevehipwell I have good news! We now have a version of runc (1.0.3-2.amzn2) that lets us move forward with this change.

Two things we need to get this merged:

I'm happy to open a separate PR if you don't have the cycles. 😄

@stevehipwell
Contributor Author

@cartermckinnon I'll update my PR.

@stevehipwell stevehipwell force-pushed the containerd-systemd-cgroup-driver branch from 1d774f5 to fa88ad7 Compare May 5, 2022 10:25
@stevehipwell
Contributor Author

@cartermckinnon this should be good now. I've copied the shell script patterns from the rest of the script rather than the ones I'd have used, but can update the PR to make them more correct if required.

@cartermckinnon
Member

Thanks, @stevehipwell, LGTM! I'll do some sanity tests and try to get this merged this week.

@cartermckinnon cartermckinnon self-requested a review May 9, 2022 20:26
@cartermckinnon cartermckinnon self-assigned this May 9, 2022
@cartermckinnon
Member

I built 1.19-1.22 AMIs from this branch, and created a nodegroup using each as follows:

---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: systemd-test-120
  region: us-west-2
  version: "1.20"
managedNodeGroups:
  - name: "119"
    ami: ami-0d4655ebca7f3a80a
    ssh:
      allow: true
    minSize: 1
    maxSize: 1
    desiredCapacity: 1
    overrideBootstrapCommand: |
      #!/bin/bash
      source /var/lib/cloud/scripts/eksctl/bootstrap.helper.sh
      /etc/eks/bootstrap.sh systemd-test-120 --container-runtime containerd --kubelet-extra-args "--node-labels=${NODE_LABELS}"
  - name: "120"
    ami: ami-0538a5514d4cfe90f
    ssh:
      allow: true
    minSize: 1
    maxSize: 1
    desiredCapacity: 1
    overrideBootstrapCommand: |
      #!/bin/bash
      source /var/lib/cloud/scripts/eksctl/bootstrap.helper.sh
      /etc/eks/bootstrap.sh systemd-test-120 --container-runtime containerd --kubelet-extra-args "--node-labels=${NODE_LABELS}"      

---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: systemd-test-122
  region: us-west-2
  version: "1.22"
managedNodeGroups:
  - name: "121"
    ami: ami-080237cf3a00f2e67
    ssh:
      allow: true
    minSize: 1
    maxSize: 1
    desiredCapacity: 1
    overrideBootstrapCommand: |
      #!/bin/bash
      source /var/lib/cloud/scripts/eksctl/bootstrap.helper.sh
      /etc/eks/bootstrap.sh systemd-test-122 --container-runtime containerd --kubelet-extra-args "--node-labels=${NODE_LABELS}"
  - name: "122"
    ami: ami-021b3b95e7c2eace1
    ssh:
      allow: true
    minSize: 1
    maxSize: 1
    desiredCapacity: 1
    overrideBootstrapCommand: |
      #!/bin/bash
      source /var/lib/cloud/scripts/eksctl/bootstrap.helper.sh
      /etc/eks/bootstrap.sh systemd-test-122 --container-runtime containerd --kubelet-extra-args "--node-labels=${NODE_LABELS}"

Pods came up, LGTM.
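For anyone reproducing this verification, a node-side sanity check might look like the following. The real files on a node would be `/etc/containerd/config.toml` and `/etc/kubernetes/kubelet/kubelet-config.json` (assumed default paths; adjust per AMI); here they are simulated in a temp dir so the snippet is self-contained:

```shell
#!/bin/bash
# Sketch: check that containerd and kubelet agree on the systemd cgroup driver.
# The file contents below simulate what you'd expect to find on a node built
# from this branch; on a real node, grep the paths mentioned above instead.
tmpdir=$(mktemp -d)
cat > "$tmpdir/config.toml" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
EOF
cat > "$tmpdir/kubelet-config.json" <<'EOF'
{ "cgroupDriver": "systemd" }
EOF

if grep -q 'SystemdCgroup = true' "$tmpdir/config.toml" \
   && grep -q '"cgroupDriver": "systemd"' "$tmpdir/kubelet-config.json"; then
  verdict="match"
  echo "cgroup drivers match: systemd"
else
  verdict="mismatch"
  echo "cgroup driver mismatch" >&2
fi
rm -rf "$tmpdir"
```

If the two settings disagree, kubelet and containerd will account for pods under different cgroup hierarchies, which is exactly the instability the Kubernetes docs warn about.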

I want to ship this change in a dedicated AMI release, so I'll wait to merge until we ship #921.

@stevehipwell would you mind addressing the conflicts that came from #921 ?

@stevehipwell stevehipwell force-pushed the containerd-systemd-cgroup-driver branch 2 times, most recently from 3a49afd to 049ae30 Compare May 23, 2022 08:39
@stevehipwell
Contributor Author

@cartermckinnon I've rebased this PR so it should be good to merge when you're ready.

@svyatoslavmo

Any updates on this? Is there an ETA for a release?
Thanks

Signed-off-by: Steve Hipwell <steve.hipwell@gmail.com>
@stevehipwell stevehipwell force-pushed the containerd-systemd-cgroup-driver branch from 049ae30 to 427e7b8 Compare July 14, 2022 13:02
@stevehipwell
Contributor Author

@cartermckinnon could we get this merged and released?

@suket22
Member

LGTM!

@mmerkes
Member

mmerkes commented Jul 25, 2022

LGTM

@cartermckinnon
Member

I'm going to merge this.

We have an AMI build currently making its way through our release process, and this change won't make it into that version; it will land in the subsequent version. I'll update here once it's available.

Thanks for all your work on this, @stevehipwell! 🙏

@kishoregv

This is creating a mismatch between the kubelet and containerd cgroupDriver settings when creating unmanaged nodegroups with eksctl.

@stevehipwell stevehipwell deleted the containerd-systemd-cgroup-driver branch September 14, 2022 16:03
@kishoregv

This is affecting clusters using eksctl v0.90.0 (or earlier).

Up to that version, eksctl generates its own kubelet config, which does not set cgroupDriver. This causes a mismatch: containerd uses SystemdCgroup, but kubelet is not configured for systemd.
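To agree with containerd's `SystemdCgroup = true`, the kubelet config needs to declare the driver explicitly. A sketch of the relevant fragment (illustrative, not eksctl's literal output):

```yaml
# KubeletConfiguration fragment (sketch): the driver must match containerd's.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
```

Without this line, kubelet falls back to its default of `cgroupfs`, producing the mismatch described above.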

@Callisto13

AFAIK older versions of eksctl are not supported, so the solution here is to not use old versions and download the latest binary.
