WIP bootstrap: change bootstrap host to replace itself with machine-os-content too #2559

Closed
wants to merge 1 commit into from

Member

@vrutkovs vrutkovs commented Oct 24, 2019

…pivot before proceeding

This ensures the bootstrap node uses the latest crio/kubelet from oscontainer. Bootimages may not be bumped frequently enough to reflect oscontainer updates.

TODO:

  • Needs enhancement filed?
  • Find out why it fails

Fixes #2542

@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Oct 24, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: vrutkovs
To complete the pull request process, please assign abhinavdahiya
You can assign the PR to them by writing /assign @abhinavdahiya in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


mkdir bin/
bootkube_podman_run \
--volume "$PWD/bin:/host/usr/local/bin:z" \
Member

@cgwalters cgwalters Oct 24, 2019

This will relabel the host's /usr/local/bin - dangerous. Let's instead do something like:

hostmcd=/usr/local/bin/machine-config-daemon
bootkube_podman_run --entrypoint=sh "${MACHINE_CONFIG_OPERATOR_IMAGE}" -c 'cat /usr/bin/machine-config-daemon' >"${hostmcd}"
chmod a+x "${hostmcd}"
restorecon "${hostmcd}"
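
For illustration, the suggestion above could be wrapped in a small function along these lines. This is only a sketch: `install_mcd_from_image` is a hypothetical helper name, and it assumes that `bootkube_podman_run` forwards its arguments to `podman run` and that the image ships `sh` and `cat`. The binary is streamed over stdout into a temporary file, so no host directory is bind-mounted and nothing on the host gets relabeled.

```shell
#!/usr/bin/env bash
# Sketch only: copy machine-config-daemon out of an image without
# bind-mounting (and therefore relabeling) a host directory.
# install_mcd_from_image is a hypothetical helper, not part of bootkube.sh.
install_mcd_from_image() {
    local image="$1"
    local dest="${2:-/usr/local/bin/machine-config-daemon}"
    local tmp

    # Write next to the destination so the final mv stays on one filesystem.
    tmp="$(mktemp "${dest}.XXXXXX")" || return 1

    # Assumption: bootkube_podman_run passes these arguments to `podman run`.
    if ! bootkube_podman_run --entrypoint=sh "${image}" \
            -c 'cat /usr/bin/machine-config-daemon' >"${tmp}"; then
        rm -f "${tmp}"
        return 1
    fi

    chmod a+x "${tmp}"
    mv -f "${tmp}" "${dest}"

    # Reset the SELinux label on the new file only; skipped when restorecon
    # is not installed, and a restorecon failure is deliberately ignored.
    if command -v restorecon >/dev/null 2>&1; then
        restorecon "${dest}" || true
    fi
}
```

It would be called as `install_mcd_from_image "${MACHINE_CONFIG_OPERATOR_IMAGE}"`. Writing to a temporary file first means a failed pull does not leave a truncated binary at the destination.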

@cgwalters
Member

cgwalters commented Oct 24, 2019

Commit message title is overlong, and the body could use more information and links; how about:

Change bootstrap host to replace itself with machine-os-content too

Currently every machine instance we launch uses the same "bootimage", including the bootstrap host. However, everything except bootstrap (i.e. control plane and workers) replaces its OS content with the machine-os-content from the release payload before joining the cluster.

For more information, see: https://github.com/openshift/machine-config-operator/blob/master/docs/OSUpgrades.md

This changes the bootstrap host to do the same, which will help avoid issues from "bootimage drift".

Closes: #2542
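
To make the "machine-os-content from the release payload" part concrete, here is a minimal sketch of looking up that image. `machine_os_content_image` is a hypothetical helper name and the release pullspec is supplied by the caller. It assumes `oc` is on the PATH. How the bootstrap host would then apply the image is not shown in this thread, so it is left out.

```shell
#!/usr/bin/env bash
# Sketch only: resolve the OS content image referenced by a release payload.
# machine_os_content_image is a hypothetical helper name.
machine_os_content_image() {
    local release_image="$1"
    # `oc adm release info --image-for=NAME` prints the pullspec of one
    # component image listed in the release payload.
    oc adm release info --image-for=machine-os-content "${release_image}"
}
```

Comparing this pullspec with what the bootstrap host actually booted is one way to spot the "bootimage drift" the commit message refers to.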

@abhinavdahiya
Contributor

/hold

this needs an enhancement before we think about merging it.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 24, 2019
@vrutkovs
Member Author

this needs an enhancement before we think about merging it.

I don't think this really enhances anything; it fixes the situation where the bootstrap node runs a different version of kubelet / crio until bootimages are bumped in the installer. In openshift/enhancements#78 (comment) Clayton considers this to be a bug, and so do I.

I don't mind holding this for now. However, later on RHCOS would stop producing boot images for .z releases, and that might cause various bootstrap issues during new cluster installs.

@cgwalters
Member

I agree with abhinavdahiya that this is a notable architectural change that requires an enhancement; it's something that a lot of OpenShift developers and some advanced users will end up needing to understand. I also think we already have an enhancement for this in openshift/enhancements#78 right?

@vrutkovs
Member Author

I also think we already have an enhancement for this in openshift/enhancements#78 right?

#78 is a bit different: pivoting bootstrap in the context of OKD-on-FCOS is an architectural decision to avoid maintaining a different FCOS stream. That is not the situation we have in RHCOS, so I don't think enhancements#78 covers this case.

@vrutkovs vrutkovs changed the title bootstrap: pull MCD image, copy machine-config-daemon binary and run … bootstrap: change bootstrap host to replace itself with machine-os-content too Oct 24, 2019
@vrutkovs vrutkovs changed the title bootstrap: change bootstrap host to replace itself with machine-os-content too WIP bootstrap: change bootstrap host to replace itself with machine-os-content too Oct 24, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 24, 2019
@LorbusChris LorbusChris added this to In Progress in OKD4 Oct 25, 2019
@LorbusChris LorbusChris removed this from In Progress in OKD4 Oct 31, 2019
@vrutkovs
Member Author

vrutkovs commented Nov 7, 2019

/retest

@openshift-ci-robot
Contributor

openshift-ci-robot commented Nov 7, 2019

@vrutkovs: The following tests failed, say /retest to rerun all failed tests:

Test name                      Commit   Details  Rerun command
ci/prow/e2e-openstack          a661f2e  link     /test e2e-openstack
ci/prow/e2e-aws                a661f2e  link     /test e2e-aws
ci/prow/e2e-aws-scaleup-rhel7  a661f2e  link     /test e2e-aws-scaleup-rhel7
ci/prow/e2e-libvirt            a661f2e  link     /test e2e-libvirt

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@abhinavdahiya
Contributor

Closing this for the time being; when we have an enhancement for this change, we can re-open and link to it.

/close

@openshift-ci-robot
Contributor

@abhinavdahiya: Closed this PR.

In response to this:

Closing this for the time being; when we have an enhancement for this change, we can re-open and link to it.

/close


Labels
do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

change bootstrap to pivot
4 participants