feat: support generating cpu partitioning file from infra flag #3335

Closed

eggfoobar
Contributor

@eggfoobar eggfoobar commented Sep 14, 2022

/hold
Needs PR: openshift/api#1284
EP: openshift/enhancement#1213

Signed-off-by: ehila <ehila@redhat.com>

- What I did

Add support for generating a machine config that configures kubelet for workload partitioning, based on the infrastructure flag.

- How to verify it
When the Infrastructure status resource sets cpuPartitioning: AllNodes, we generate two new MCs for workload partitioning, named 01-master-cpu-partitioning and 01-worker-cpu-partitioning. Each creates a configuration file that kubelet reads on startup.

apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  name: cluster
status:
  cpuPartitioning: AllNodes
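
To make the verification step concrete, here is a minimal Go sketch of the mapping from the status field to the expected MC names. The type, constants, and `partitioningConfigNames` helper are hypothetical stand-ins written for illustration; they are not the definitions from openshift/api or the code in this PR. The sketch also treats an unset value as disabled, which is an assumption.

```go
package main

import "fmt"

// CPUPartitioningMode is a local stand-in for the status field discussed in
// this PR. It is NOT the real openshift/api type.
type CPUPartitioningMode string

const (
	CPUPartitioningNone     CPUPartitioningMode = "None"
	CPUPartitioningAllNodes CPUPartitioningMode = "AllNodes"
)

// partitioningConfigNames returns the MC names expected for the given mode,
// one per pool. It returns nil unless the mode is AllNodes; treating an unset
// (empty) value as disabled is a choice made by this sketch.
func partitioningConfigNames(mode CPUPartitioningMode, pools []string) []string {
	if mode != CPUPartitioningAllNodes {
		return nil
	}
	names := make([]string, 0, len(pools))
	for _, pool := range pools {
		names = append(names, fmt.Sprintf("01-%s-cpu-partitioning", pool))
	}
	return names
}

func main() {
	pools := []string{"master", "worker"}
	for _, name := range partitioningConfigNames(CPUPartitioningAllNodes, pools) {
		fmt.Println(name)
	}
}
```

On a real cluster, the equivalent check would be listing the MachineConfigs and looking for those two names.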

- Description for the changelog

Add new generated MCs for workload partitioning based on the infrastructure status.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 14, 2022
@openshift-ci
Contributor

openshift-ci bot commented Sep 14, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: eggfoobar
Once this PR has been reviewed and has the lgtm label, please assign sinnykumari for approval by writing /assign @sinnykumari in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cgwalters
Member

There's prior art for the installer generating MachineConfigs based on the install config, see e.g. https://github.com/openshift/installer/blob/master/pkg/asset/machines/machineconfig/hyperthreading.go

I personally don't have a strong opinion on this, but I would say that it makes sense to have the "install time configuration to MachineConfiguration" logic live mainly in one place, whether that's github.com/openshift/installer or here.

@@ -183,6 +188,14 @@ func (b *Bootstrap) Run(destDir string) error {
		configs = append(configs, kconfigs...)
	}

	if infraConfig.Status.CPUPartitioning != apicfgv1.CPUPartitioningNone {
Contributor

Discussed on Slack, but to clarify: if this only applies to bootstrap and the MC does get generated, the install will fail because the configs generated by bootstrap and by the regular in-cluster flow won't match, causing a "can't find rendered-master-xxx" error during the install. Basically #2114.

Contributor Author

Hey @yuqi-zhang, I was looking through the code a bit, trying to understand this a little better. Does this mean that it's not possible to have an MC generated and maintained by the MCO during bootstrap?

Contributor

No, it is possible. You must have both the bootstrap and the "in-cluster" flow generate the same configs.

As an example, see #3015, which adds both for nodes.config.openshift.io objects.
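
To illustrate why both flows have to produce the same configs, here is a rough Go sketch. It assumes a simplified model in which a pool's rendered config name is a hash over its input config names; the `renderedName` helper and the base config names are hypothetical, and the MCO's actual naming scheme is not reproduced here.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// renderedName derives a rendered config name for a pool from the names of
// its input configs. Simplified stand-in for illustration only; it does not
// reproduce how the MCO really computes rendered names.
func renderedName(pool string, configNames []string) string {
	sorted := append([]string(nil), configNames...) // copy so the caller's slice is untouched
	sort.Strings(sorted)
	h := sha256.New()
	for _, n := range sorted {
		h.Write([]byte(n))
		h.Write([]byte{0}) // separator so adjacent names cannot run together
	}
	return fmt.Sprintf("rendered-%s-%x", pool, h.Sum(nil)[:8])
}

func main() {
	// Illustrative input sets: bootstrap adds the partitioning MC, while the
	// in-cluster flow does not. That is the mismatch described above.
	base := []string{"00-master", "01-master-kubelet"}
	withPartitioning := append(append([]string(nil), base...), "01-master-cpu-partitioning")

	bootstrap := renderedName("master", withPartitioning)
	inCluster := renderedName("master", base)

	fmt.Println("bootstrap: ", bootstrap)
	fmt.Println("in-cluster:", inCluster)
	fmt.Println("match:", bootstrap == inCluster)
}
```

Under this model, the names only agree once both flows are fed the same set of inputs, for example by having both call one shared generator.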

@openshift-ci
Contributor

openshift-ci bot commented Sep 14, 2022

@eggfoobar: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-op | 0bd1a77 | link | true | /test e2e-gcp-op |
| ci/prow/unit | 0bd1a77 | link | true | /test unit |
| ci/prow/e2e-aws | 0bd1a77 | link | true | /test e2e-aws |
| ci/prow/e2e-agnostic-upgrade | 0bd1a77 | link | true | /test e2e-agnostic-upgrade |
| ci/prow/bootstrap-unit | 0bd1a77 | link | false | /test bootstrap-unit |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@eggfoobar
Contributor Author

Thanks folks, appreciate the feedback, I'll close this PR for now.

@eggfoobar eggfoobar closed this Sep 14, 2022
@eggfoobar
Contributor Author

/reopen

Re-opening this PR. After some discussion with the installer team, MCO seems to be the best place to enforce the CPU partitioning configuration cluster-wide.

@openshift-ci
Contributor

openshift-ci bot commented Sep 30, 2022

@eggfoobar: Failed to re-open PR: state cannot be changed. The cpu_partitioning_bootstrap branch was force-pushed or recreated.

In response to this:

/reopen

Re-opening this PR. After some discussion with the installer team, MCO seems to be the best place to enforce the CPU partitioning configuration cluster-wide.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@eggfoobar
Contributor Author

New PR #3355
