c9s: remove / override mitigation kargs #1311

Closed
vrutkovs opened this issue Jun 9, 2023 · 10 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@vrutkovs
Member

vrutkovs commented Jun 9, 2023

When OKD SCOS is installed, it may use FCOS as the starting image (for clouds where we don't upload SCOS artifacts yet). The default FCOS kernel arguments are then retained, including mitigations=auto,nosmt, and installation fails because only half of the CPUs are in use.
On AWS, the workaround is to use the m6a.xlarge flavor, where nosmt doesn't disable CPUs.
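
To make the symptom easier to spot, here is a minimal sketch that parses a kernel command line and reports whether the mitigations= karg asks for SMT to be disabled. The function names and the sample command line are made up for illustration, and the sketch assumes that the last mitigations= occurrence on the command line is the one that takes effect.

```python
def mitigation_options(cmdline: str) -> list[str]:
    """Return the comma-separated options of the mitigations= karg, if any."""
    options: list[str] = []
    for arg in cmdline.split():
        if arg.startswith("mitigations="):
            # Assumption: a later occurrence overrides an earlier one.
            options = arg.split("=", 1)[1].split(",")
    return options


def may_disable_smt(cmdline: str) -> bool:
    """True if the mitigations= karg includes the nosmt option."""
    return "nosmt" in mitigation_options(cmdline)


# Hypothetical command line; on a real node, read /proc/cmdline instead.
sample = "ro rootflags=prjquota mitigations=auto,nosmt"
print(mitigation_options(sample), may_disable_smt(sample))
```

Whether SMT actually ends up disabled still depends on the CPU, which is consistent with the m6a.xlarge workaround above.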

In OKD on FCOS, bootstrap automatically mixes in a MachineConfig that sets mitigations=off, so all nodes boot without the default mitigations karg after the pivot. Ideally, we should solve this in the C9S manifest, and fall back to the installer script if that is not possible.
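
For readers unfamiliar with the mechanism, the following sketch builds a MachineConfig that sets mitigations=off through the kernelArguments field. It is not the manifest that OKD actually mixes in: the helper name, the object name, and the choice of role are assumptions made for this example.

```python
import json


def mitigations_off_machineconfig(role: str) -> dict:
    """Build a MachineConfig dict that adds mitigations=off for one role."""
    return {
        "apiVersion": "machineconfiguration.openshift.io/v1",
        "kind": "MachineConfig",
        "metadata": {
            # Hypothetical name; the real OKD manifest may be named differently.
            "name": f"99-{role}-mitigations-off",
            "labels": {"machineconfiguration.openshift.io/role": role},
        },
        "spec": {"kernelArguments": ["mitigations=off"]},
    }


# Print the manifest as JSON, which is also valid YAML input for `oc apply -f`.
print(json.dumps(mitigations_off_machineconfig("worker"), indent=2))
```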

@sdodson
Member

sdodson commented Jun 9, 2023

Or use any other instance type that either is not subject to nosmt disabling SMT cores or still has four cores available after SMT cores have been disabled. For example, m6i.xlarge only yields 2 cores and doesn't work, but m6i.2xlarge would yield 4.
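
The arithmetic behind that rule can be sketched as follows. The helper names are made up for this example, the vCPU counts (4 for xlarge, 8 for 2xlarge) and the two threads per core are assumptions about these instance sizes, and the minimum of four cores is taken from the comment above.

```python
MIN_CPUS = 4  # cores required for the installation, per the comment above


def cpus_after_nosmt(vcpus: int, threads_per_core: int = 2,
                     smt_disabled: bool = True) -> int:
    """CPUs left online once sibling SMT threads are turned off."""
    if smt_disabled:
        return vcpus // threads_per_core
    return vcpus


def meets_minimum(vcpus: int, smt_disabled: bool = True) -> bool:
    """True if enough CPUs remain online for the installation."""
    return cpus_after_nosmt(vcpus, smt_disabled=smt_disabled) >= MIN_CPUS


# (name, assumed vCPUs, whether nosmt actually disables SMT on this type)
for name, vcpus, smt_disabled in [
    ("m6i.xlarge", 4, True),
    ("m6i.2xlarge", 8, True),
    ("m6a.xlarge", 4, False),
]:
    print(name, cpus_after_nosmt(vcpus, smt_disabled=smt_disabled),
          meets_minimum(vcpus, smt_disabled))
```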

@LorbusChris
Member

This problem will go away once the installer switches to use SCOS bootimages, so there's no pivot that retains default kargs from FCOS.

More generally, this would be solved by coreos/rpm-ostree#3738

@travier
Member

travier commented Jun 9, 2023

We're likely not going to be able to fix that here. The end goal is to stop using the FCOS image to bootstrap OKD-SCOS, because that is a downgrade operation and we don't support those.

@cgwalters
Member

Just a side note: a bootc install --takeover flow would mean there's no "downgrade", because we entirely nuke and replace everything in the system (including the systemd journal, etc.). But things like which image metadata flags are set are still relevant here.

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci bot added the lifecycle/stale label Sep 8, 2023
@LorbusChris
Member

/remove-lifecycle stale

openshift-ci bot removed the lifecycle/stale label Sep 8, 2023
openshift-ci bot added the lifecycle/stale label Dec 7, 2023
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-ci bot added the lifecycle/rotten label and removed the lifecycle/stale label Jan 7, 2024
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci bot closed this as completed Feb 6, 2024
Contributor

openshift-ci bot commented Feb 6, 2024

@openshift-bot: Closing this issue.

