
Minimise Baremetal footprint, live-iso bootstrap #361

Closed
wants to merge 1 commit into from

Conversation


@hardys hardys commented Jun 4, 2020

Work on defining the ideas around 3-master deployment on baremetal
where we wish to avoid the install-time dependency for a 4th host
either in the rack or connected directly to it.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 4, 2020
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 4, 2020
@hardys hardys commented Jun 4, 2020

/cc @avishayt @beekhof @dhellmann @markmc @mhrivnak

@openshift-ci-robot

@hardys: GitHub didn't allow me to request PR reviews from the following users: avishayt.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @avishayt @beekhof @dhellmann @markmc @mhrivnak

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hardys hardys changed the title [WIP] Baremetal Compact Clusters Enhancement [WIP] Baremetal Compact Clusters Jun 4, 2020

hardys commented Jun 4, 2020

This is just a first pass to kick off discussion around the idea of running the bootstrap services via the installer iso, and around the supportability of deploying an initially 2-master cluster, and rebooting the bootstrap host to be the 3rd master.

@hardys hardys force-pushed the baremetal_compact_clusters branch from e09a67a to 79c83f2 Compare June 5, 2020 11:29
@hardys hardys changed the title [WIP] Baremetal Compact Clusters [WIP] Minimise Baremetal footprint Jun 5, 2020

### User Stories

As a user of OpenShift, I should be able to install a 3-node cluster (no workers)

Rather than "no workers" don't we really end up with a control plane that can also run user workloads?


Yes, the masters end up schedulable, so perhaps I should reword this. I just meant no dedicated workers, since the goal is to deploy with a footprint of no more than 3 servers.
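A minimal sketch of verifying that end state on a deployed cluster, assuming a logged-in oc session (mastersSchedulable is part of the standard Scheduler config API):

```sh
# In a compact/3-node cluster the masters accept workloads, so the
# cluster Scheduler config should report mastersSchedulable as true.
oc get schedulers.config.openshift.io cluster \
  -o jsonpath='{.spec.mastersSchedulable}'
```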

3 target nodes with an image which enables installation to proceed, and the
end state should be a full supportable 3-node cluster.



I believe Alex mentioned that it isn't good practice to run the bootstrap logic on a host that will become a worker, because the API VIP shouldn't ever be on a host that will later serve user workloads. If that's indeed the case, an extra host is needed for deployments with workers as well.

@hardys hardys Jun 8, 2020

I guess that means we'd always reboot the bootstrap host to be the 3rd master, and never to become a worker? It doesn't necessarily follow that an extra host is required though?


I mean that we limited this proposal to 3-node clusters, but it's beneficial for any size cluster. Today if you want 3 masters and 2 workers you will need 6 hosts, and with this proposal only 5.


Ah I see, thanks - I'll clarify that.

It would be interesting to hear from @crawford re the API VIP best practice though, as AFAIK that would work OK with our current keepalived solution (the VIP would just move over to one of the masters as soon as the API comes up there).
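A minimal sketch of observing that failover behaviour, assuming a placeholder API VIP of 192.0.2.10 (a documentation address); run this on each candidate host:

```sh
# Watch which host currently owns the API VIP; when the bootstrap
# services stop, keepalived should move the address to a master.
watch -n 5 "ip -brief addr show | grep 192.0.2.10"
```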


@romfreiman

Why do we focus on 3 nodes rather than a general enhancement that removes the dedicated bootstrap node? Why is there less 'pain' in having an extra node for a 6-node deployment?


hardys commented Jun 8, 2020

Why do we focus on 3 nodes rather than a general enhancement that removes the dedicated bootstrap node? Why is there less 'pain' in having an extra node for a 6-node deployment?

Yes, per the thread with @avishayt we could expand the use case to consider the more general case, but I'd been assuming you could just run the bootstrap ISO on one of the worker nodes in any environment with more than 3 nodes.


## Proposal

### User Stories

I see two separate improvements here:

  1. Using a machine running the live ISO to run the openshift-install binary and bootstrap services (replacing our current RHEL provisioning host hosting a bootstrap VM)
  2. The ability to pivot such a bootstrap machine to being a third master (allowing a 3 node cluster without the need for a temporary 4th node)

Maybe it's fair to say that (2) is the more interesting new feature, and it requires (1).

However, I think we could include a user story for (1) on its own:

  • The IPI experience we have today would be simplified through the use of an RHCOS live ISO instead of a RHEL provisioning host?

i.e. booting an RHCOS live ISO would be easier than installing RHEL on the machine? Not needing a bootstrap VM means the machine you boot with this ISO could be a VM?

Related, if you're not installing a "compact 3-node cluster", wouldn't it be better to avoid the bootstrap-pivot-to-master and instead do bootstrap-pivot-to-worker once the install has fully completed?


Agree, good points. In the past @crawford has said that pivot-to-worker may have some security risk; Alex, can you please comment on that?


So the issue according to @crawford is that this worker node might still get traffic meant for the API VIP after the pivot, for example due to a load balancer configuration not being updated. Then some malicious actor could run an MCS (Machine Config Server) on that node and load other software onto additional nodes being deployed. I don't think that this would be possible because pods would be on a different network, but I'm not sure.


Yep, that sums it up. If we can pivot to a control plane node or a compute node (and we've proven we can), it's easier for us and the customer to just avoid the potential issues and only pivot to the control plane node.


That also implies to me that we don't see any downsides to pivoting the bootstrap machine to be a control plane node - like difficulty debugging/recovering if the pivot fails. Is that the case?

(i.e. if there are downsides to this bootstrap-to-master pivot, then that would be a reason to choose to pivot to a worker, in cases where we have a choice)

In our discussions about running the assisted service and UI in standalone/disconnected cases, we kept coming back to the negative implications of bootstrap-to-master as a reason not to run that stuff on the bootstrap node. That's what got me thinking about whether bootstrap-to-master was a thing we want to use only in cases where it's our only choice, or a thing we want to use in all cases.


The main benefit of bootstrap-to-master is that it covers both use cases (with and without workers). If we enable bootstrap-to-worker, are we potentially doubling the testing matrix?

@sdodson sdodson Jun 22, 2020


Using a machine running the live ISO to run the openshift-install binary and bootstrap services (replacing our current RHEL provisioning host hosting a bootstrap VM)

I didn't arrive at that conclusion at all while reading this enhancement, but it's certainly possible. If we're intending for the live ISO to run the installer binary and start the bootstrapping services, can we detail more of that aspect somewhere? I guess perhaps that came from ambiguity around "installer/bootstrap services" in line 75?


I added a user story to capture the "improve IPI day-1 experience" and will work on adding some more detail around running the bootstrap services on the live ISO (which is clearly a key part of the implementation here, and has already been prototyped). @avishayt, can you perhaps help with some more details?

@cgwalters

See coreos/ignition#935 for a proposal for supporting any CoreOS system to run in an ephemeral mode. With that we could use it across the board, even in clouds. I wouldn't propose necessarily doing pivot-to-controlplane there to start, but we could at least force the bootstrap node to be ephemeral. (Though, bootstrap ephemeral conflicts a bit with openshift/installer#2542 but we could meet in the middle by having /etc and /var ephemeral but still do the persistent OS upgrade, or perhaps better do the pivot "live" there to just the new userspace)

@cgwalters

That said for bare metal in general it does make total sense to use the Live ISO (it's by nature default ephemeral) as a bootstrap node and then "pivot to controlplane/worker" is just "run coreos-installer".
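A minimal sketch of that pivot step, assuming placeholder values for the target disk and the rendered master Ignition config:

```sh
# On the live-ISO bootstrap host, once the other masters are up:
# write RHCOS to the local disk with the master Ignition config,
# then reboot out of the ephemeral live environment as the 3rd master.
# (/dev/sda and master.ign are placeholders.)
sudo coreos-installer install /dev/sda --ignition-file ./master.ign
sudo reboot
```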

@crawford crawford left a comment


This looks good so far.



This proposal builds on work already completed e.g etcd-operator improvements
but we need to ensure any change in deployment topology is well tested and
fully supported, to avoid these deployments being an unreliable
corner-case.

cluster-etcd-operator does not currently support installs of fewer than 3 master nodes. We would like to add support for this use case explicitly, but we struggle to understand what the clear signal from the installer defining this install type would be. As we continue to add support for these nonstandard install types, control-plane (and possibly other) components will likely need to tolerate some changes to their logic. The intention of the installer should be clear to the operator. I am wondering how we can best handle this problem?

cc @crawford @sdodson
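One candidate signal, sketched here as an assumption rather than anything this proposal defines, is the replica counts already present in install-config.yaml; a compact cluster is the case of three control-plane replicas and zero compute replicas:

```sh
# Hypothetical illustration: the install-config.yaml fragment that
# operators could key off to detect a compact/3-node topology.
cat > install-config.yaml <<'EOF'
controlPlane:
  name: master
  replicas: 3
compute:
- name: worker
  replicas: 0
EOF
```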



@hexfusion Since the cluster will only be in 2-master mode until the bootstrap host pivots, can this be considered the same way as a master replacement? I guess in an HA configuration, if a master fails there's no signal to the operator (other than a master going away), so we'd expect etcd things to just work, but in a degraded state, until the 3rd master comes up?
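A minimal sketch of observing that degraded window, assuming access to the openshift-etcd namespace (etcd-master-0 is a placeholder pod name):

```sh
# During bootstrap only two members should show as started; the table
# gains a third entry once the rebooted bootstrap host joins as master.
oc -n openshift-etcd rsh etcd-master-0 etcdctl member list -w table
```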


@hexfusion , @crawford , @sdodson , @ironcladlou , @retroflexer : Is this item about the etcd operator the only open question about this design enhancement? Are we ready to approve it?

@hardys hardys force-pushed the baremetal_compact_clusters branch from fd245ee to 2d14e65 Compare June 23, 2020 14:42
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hardys

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hardys hardys changed the title [WIP] Minimise Baremetal footprint Minimise Baremetal footprint, live-iso bootstrap Jun 23, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 23, 2020
@hardys hardys force-pushed the baremetal_compact_clusters branch from 2d14e65 to 6857611 Compare June 23, 2020 14:49

hardys commented Jun 23, 2020

OK, I tried to address all the feedback so far (thanks!) and force-pushed to remove the WIP; ready for further review.

but we need to ensure any change in deployment topology is well tested and
fully supported, to avoid these deployments being an unreliable
corner-case.


What is the impact of this change on the current integration with ACM? That expects to use Hive to run the installer in a pod in the ACM cluster, but it seems that will need to change to run something that attaches the live ISO to one of the hosts instead? I don't think we need to work out all of the details here, but we should at least point out that if we make this change in a way that isn't backwards compatible then we will break the existing ACM integration.


@dhellmann that is a good point. I suspect that means that, at least in the short/medium term, the existing bootstrap VM solution would still be required, and we have the same question to answer re any ACM integration with the assisted-install flow?

I've been wondering whether this proposal could be simplified by not considering the "run installer on the live-iso" part, and instead preparing a cluster-specific ISO the user can then boot, e.g. openshift-install create bootstrap-iso?

That would still imply two possible install paths though, the bootstrap VM case or the alternative based on creation of a bootstrap ISO without dependencies on libvirt.
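A sketch of what that alternative path might look like; the create bootstrap-iso subcommand is the hypothetical one proposed above and does not exist in openshift-install today:

```sh
# Hypothetical flow: generate a cluster-specific bootstrap ISO rather
# than running a bootstrap VM from a libvirt-capable provisioning host.
openshift-install create bootstrap-iso --dir=mycluster   # hypothetical
# Attach the resulting ISO to any spare host or VM (e.g. via BMC
# virtual media), boot it, and the bootstrap services run from there.
```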


I don't know if anyone has started thinking deeply about integrating the assisted installer with ACM, or what that means. Perhaps the integration is just running the assisted installer on the ACM cluster and linking to it from the ACM GUI? Perhaps it doesn't make sense to integrate the assisted installer with ACM at all, since the assisted installer doesn't automate things like the IPI installer does and the point of ACM is to have that automation?

ACM assumes the installer manages the entire process. If we change the installer to generate an ISO to replace the bootstrap VM, then we would have to do something somewhere to attach that ISO to the host and boot it. I think to accomplish that, we would end up moving a lot of the features of the IPI installer into some new controller in ACM, and in the process we might also end up with a different path for ACM's integration with the installer because Hive wouldn't know how to drive the tool to mount the ISO.

So as far as I can tell, we're going to have 2 paths somewhere, regardless of what we do.

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 1, 2020
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 31, 2020
@romfreiman

/remove-lifecycle rotten

@openshift-ci-robot openshift-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 1, 2021
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 1, 2021
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 1, 2021
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close


openshift-ci bot commented May 31, 2021

@openshift-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot closed this May 31, 2021
Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.