cadvisor interaction with CPU topology / SMT #524

Closed
dghubble opened this issue Jun 5, 2020 · 3 comments

@dghubble
Member

dghubble commented Jun 5, 2020

The newest Kubelets (v1.19.0-alpha.1 through v1.19.0-beta.1, the latest at time of writing) have trouble counting vCPUs on some OSes (including Fedora CoreOS), related to cadvisor updates that added CPU topology features. In those versions, the vCPU count is detected as 0, which is definitely wrong.

A candidate fix detects the number of cores but is unaware of disabled SMT, which I suspect is still wrong. I wanted to bubble this ongoing discussion up to Fedora CoreOS folks in case someone has ideas on correctness.

State: idle
Deployments:
● ostree://fedora:fedora/x86_64/coreos/stable
                   Version: 31.20200517.3.0 (2020-06-01T17:15:29Z)
                    Commit: 967b7b8d624e6d10ff51c2e81ef198fae966c567ac2e9b479771c693d0987949
              GPGSignature: Valid signature by 7D22D5867F2A4236474BF7B850CB390B3C3359C4

@iwankgb

iwankgb commented Jun 6, 2020

I think the root cause is that cAdvisor has always ignored online/offline CPU information. It must be taken into account now, and I believe that should solve the problem. See: google/cadvisor#2567 (comment)
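
For readers unfamiliar with the online/offline distinction, here is a rough sketch of what honoring it could look like. This is not cAdvisor's code (cAdvisor is written in Go); it only assumes the standard Linux sysfs file /sys/devices/system/cpu/online, which lists online CPUs in a range format such as 0-3,8-11. The helper names parse_cpu_list and count_online_cpus are made up for illustration.

```python
from pathlib import Path
from typing import Set

# Standard Linux sysfs file listing the CPUs that are currently online.
ONLINE_PATH = Path("/sys/devices/system/cpu/online")


def parse_cpu_list(text: str) -> Set[int]:
    """Parse a kernel CPU list such as '0-3,8-11' into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or stray comma
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))
    return cpus


def count_online_cpus(path: Path = ONLINE_PATH) -> int:
    """Count logical CPUs the kernel reports as online (hypothetical helper)."""
    return len(parse_cpu_list(path.read_text()))
```

On a host where SMT siblings have been taken offline, a count derived this way would exclude them, whereas simply enumerating cpuN directories would not.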

@dustymabe
Member

Considering the discussions going on in the upstream issues, can someone summarize and let us know if there is anything for FCOS to do in the context of this bug?

@dghubble
Member Author

dghubble commented Jun 8, 2020

Great info @iwankgb, thanks. AFAIK, this has been narrowed down to cadvisor changes that detect the effective vCPU count as 0, which seems to stem in whole or in part from not handling OSes that disable simultaneous multi-threading (SMT) (Fedora CoreOS, Google's internal distro).

cadvisor currently reports the wrong CPU count on FCOS. Upstream is discussing fixes, which would eventually get merged and then included in a Kubernetes v1.19 beta or rc. I don't think FCOS needs to make any changes, so I can close this out.
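
To make the cores-versus-threads distinction concrete, here is a small sketch (again, not the upstream fix) that separates physical cores from online hardware threads. It assumes the usual sysfs layout: per-CPU topology/physical_package_id and topology/core_id files, plus smt/control on kernels that expose it. The function names read_topology, summarize, and smt_control are hypothetical.

```python
from pathlib import Path
from typing import Dict, Iterable, Tuple

CPU_ROOT = Path("/sys/devices/system/cpu")  # standard Linux sysfs location


def read_topology(online: Iterable[int], root: Path = CPU_ROOT) -> Dict[int, Tuple[int, int]]:
    """Map each online CPU id to its (physical_package_id, core_id) pair."""
    topo: Dict[int, Tuple[int, int]] = {}
    for cpu in online:
        d = root / f"cpu{cpu}" / "topology"
        pkg = int((d / "physical_package_id").read_text())
        core = int((d / "core_id").read_text())
        topo[cpu] = (pkg, core)
    return topo


def summarize(topo: Dict[int, Tuple[int, int]]) -> Dict[str, int]:
    """Count online threads and the distinct physical cores they sit on."""
    return {"threads": len(topo), "cores": len(set(topo.values()))}


def smt_control(root: Path = CPU_ROOT) -> str:
    """Return the kernel's SMT control state, or 'unknown' if not exposed."""
    f = root / "smt" / "control"
    return f.read_text().strip() if f.exists() else "unknown"
```

With SMT on, a 2-core machine would show 4 threads over 2 cores; with SMT off, the sibling threads are offline and both counts are 2. Either way, a result of 0 cannot be right on a running system.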
