calico-kube-controllers CPU throttling #8056

Closed
irizzant opened this issue Oct 7, 2021 · 7 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@irizzant
Contributor

irizzant commented Oct 7, 2021

After upgrading to the latest version of Kubespray (2.17.0), I noticed that Alertmanager started raising a series of alerts reporting high CPU throttling for Calico; see below:
[Screenshot: Grafana, 2021-10-07 12:06:35]

Even though CPU usage is low relative to the limit, the Pod is being heavily throttled (constantly around 25%), which suggests that multithreaded bursts are what exhaust the CPU quota.
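To illustrate how a Pod can be throttled while its average CPU usage stays low, here is a minimal toy model in Python. It is only a sketch, not the kernel's actual CFS bandwidth algorithm. The 10 ms quota per 100 ms period (what a 100m limit would translate to) and the burst pattern are assumptions chosen for illustration, and `simulate_cfs` is a hypothetical helper, not part of any tool mentioned in this issue.

```python
def simulate_cfs(demand_ms, quota_ms=10.0, period_ms=100.0):
    """Toy model of CPU quota enforcement (not the real kernel logic).

    demand_ms[i] is the CPU time, summed over all threads, that becomes
    runnable at the start of period i. Work that does not fit into the
    quota of a period is carried over to the next one.

    Returns (fraction of periods throttled, average usage as a fraction
    of one core).
    """
    backlog = 0.0
    used_total = 0.0
    throttled_periods = 0
    for demand in demand_ms:
        backlog += demand
        used = min(backlog, quota_ms)  # cannot consume more than the quota
        backlog -= used
        used_total += used
        if backlog > 0:  # work was left waiting: this period counts as throttled
            throttled_periods += 1
    periods = len(demand_ms)
    return throttled_periods / periods, used_total / (periods * period_ms)


# Hypothetical workload: several threads wake up together and need
# 15 ms of CPU time in total, once every fourth period.
demand = [15.0, 0.0, 0.0, 0.0] * 25
throttled, usage = simulate_cfs(demand)
print(f"throttled periods: {throttled:.0%}, average usage: {usage:.2%} of one core")
# prints "throttled periods: 25%, average usage: 3.75% of one core"
```

In this made-up workload the container uses well under half of its quota on average, yet a quarter of the periods are throttled, because each burst needs more CPU time than a single period allows.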

I'd like to understand whether newer versions require an adjustment to the default CPU limits set here.

Environment:

  • Cloud provider or hardware configuration: bare metal

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):

Linux 5.4.0-88-generic x86_64
NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.2 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
  • Version of Ansible (ansible --version):
ansible 2.9.20
  config file = /home/kubespray/kubespray/ansible.cfg
  configured module search path = ['/home/kubespray/kubespray/library']
  ansible python module location = /usr/local/lib/python3.8/dist-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 3.8.10 (default, Sep 28 2021, 16:10:42) [GCC 9.3.0]
  • Version of Python (python --version):
Python 2.7.18

Kubespray version (commit) (git rev-parse --short HEAD):
2.17.0
Network plugin used:
Calico

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

Command used to invoke ansible:

Output of ansible run:

Anything else we need to know:

@irizzant irizzant added the kind/bug Categorizes issue or PR as related to a bug. label Oct 7, 2021
@floryut
Member

floryut commented Oct 7, 2021

@irizzant I would raise this on the Calico issue board. Someone from the Kubespray community may be able to help you, but you're more likely to get answers from the Calico maintainers and community 👍

@irizzant
Contributor Author

irizzant commented Oct 8, 2021

Hello @floryut,
that makes sense, but I opened the issue here because the deployment manifests and the variables that set the CPU limits are hosted in Kubespray (https://github.com/kubernetes-sigs/kubespray/blob/5fcf0471914abbad8c7fc997b0b3b0e5992dbc3b/roles/kubernetes-apps/policy_controller/calico/defaults/main.yml), so I was wondering whether this is just a default value that needs updating.

@olevitt
Contributor

olevitt commented Oct 13, 2021

We are seeing the same on our cluster, with throttling at ~20-25%. Manually increasing the CPU limit for Calico got rid of the issue.
I think we should increase the default value defined in Kubespray, as @irizzant suggested.
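
For anyone who wants to check a throttling percentage like this on their own nodes, here is a rough Python sketch. It assumes the figure is the share of throttled CFS periods, computed from the `nr_periods` and `nr_throttled` counters that the kernel exposes in a cgroup's `cpu.stat` file. The sample values are made up, and `parse_cpu_stat` and `throttled_ratio` are hypothetical helper names, not part of Kubespray or Calico.

```python
def parse_cpu_stat(text):
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(before, after):
    """Share of CFS periods that were throttled between two cpu.stat samples."""
    periods = after["nr_periods"] - before["nr_periods"]
    throttled = after["nr_throttled"] - before["nr_throttled"]
    return throttled / periods if periods > 0 else 0.0


# Made-up samples; in practice these would be two reads of the container's
# cpu.stat file taken some time apart.
before = parse_cpu_stat("nr_periods 1000\nnr_throttled 200\nthrottled_time 5000000000\n")
after = parse_cpu_stat("nr_periods 1400\nnr_throttled 300\nthrottled_time 7500000000\n")
print(f"throttled: {throttled_ratio(before, after):.0%}")  # prints "throttled: 25%"
```

Taking the difference between two samples matters because the counters are cumulative since the cgroup was created.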

@floryut
Member

floryut commented Oct 13, 2021

Well, you could open a PR and we can discuss it there 👍

@oomichi
Contributor

oomichi commented Nov 11, 2021

Hi @irizzant,
I think we can close this issue now that #8076 has been merged.
Is that correct?

@oomichi
Contributor

oomichi commented Nov 11, 2021

/cc @oomichi

@irizzant
Contributor Author

Yes @oomichi, I'm closing it.

4 participants