Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-19303: Changed OKD/FCOS workaround to also support Agent-based Installer #7484

Merged
merged 2 commits into from
Dec 12, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
81 changes: 64 additions & 17 deletions data/data/bootstrap/files/usr/local/bin/bootstrap-pivot.sh.template
Original file line number Diff line number Diff line change
Expand Up @@ -42,25 +42,72 @@ if [ ! -f /opt/openshift/.pivot-done ]; then
record_service_stage_start "rebase-to-okd-os-image"
{{if .IsFCOS -}}
mnt="$(podman image mount "${MACHINE_OS_IMAGE}")"
{{- if or (.BootstrapInPlace) (eq .Invoker "agent-installer") }}
# SNO setup boots into Live ISO which cannot be rebased
# https://github.com/coreos/rpm-ostree/issues/4547
mkdir /var/mnt/{upper,worker}
mount -t overlay overlay -o "lowerdir=/usr:$mnt/usr" /usr
mount -t overlay overlay -o "lowerdir=/etc:$mnt/etc,upperdir=/var/mnt/upper,workdir=/var/mnt/worker" /etc
systemctl daemon-reload

# Workaround for SELinux denials when launching crio.service from overlayfs
setenforce Permissive
# The bootstrap host during SNO installation and the rendezvous host of Agent-based Installer both boot into a Live
# ISO which cannot be rebased. Until rpm-ostree supports this live rebase [0], the following workaround will mount the
# proper OKD/FCOS Machine OS image over the existing mount at /usr and copy new config files to /etc.
# [0] https://github.com/coreos/rpm-ostree/issues/4547
if grep -q coreos.liveiso= /proc/cmdline; then
mount -t tmpfs -o size=50% none /var/mnt/
rsync -aHAXx "$mnt/" /var/mnt/
mount -t overlay overlay -o lowerdir=/usr:/var/mnt/usr /usr
rsync -rlt --ignore-existing /var/mnt/etc/ /etc/

systemctl start crio.service
# No reboot necessary because SNO setup will reboot system
{{ else }}
pushd "${mnt}/bootstrap"
# shellcheck disable=SC1091
. ./pre-pivot.sh
popd
{{ end -}}
# Agent-based Installer will launch a ephemeral control plane at the rendezvous host which will create and publish
# Ignition configs for the other master nodes. These Ignition configs must match what the in-cluster control plane
# would generate else machine config operator will fail [0]. Because the rendezvous host is booted with a FCOS Live
# ISO without any OKD/FCOS related changes, we have to copy the manifests from OKD Machine OS manually to the
# bootstrap manifests folder of the rendezvous host.
# [0] https://access.redhat.com/solutions/4970731
mkdir -p /var/opt/openshift/manifests
cp -av /var/mnt/manifests/*.* /var/opt/openshift/manifests/

# Load new systemd unit files and configuration such as crio.service after mounting the content of OKD/FCOS Machine
# OS over /usr and copying new files to /etc
systemctl daemon-reload
JM1 marked this conversation as resolved.
Show resolved Hide resolved
JM1 marked this conversation as resolved.
Show resolved Hide resolved

# Apply presets from OKD Machine OS
systemctl preset-all

# On OKD/FCOS prior to commit e859a66 [0] systemd-resolved is used by default and NetworkManager's DNS handling is
# disabled. In this case, CoreDNS fails to listen to 127.0.0.53:53 when Agent-based Installer boots its the
# rendezvous host with a Fedora CoreOS bootimage because by default FCOS' systemd-resolved already listens to this
# port. OKD/FCOS disables resolved's stub listener [1] but the resolved must be restarted for this setting to take
# effect.
# On OKD/FCOS since commit e859a66 [0] systemd-resolved is disabled by default and NetworkManager's DNS handling is
# used. However, the bootimage is vanilla FCOS and thus uses systemd-resolved by default. The latter has to be
# disabled after rebasing to OKD Machine OS and NetworkManager as well as the service to fix /etc/resolv.conf have
# to be started.
# [0] https://github.com/openshift/okd-machine-os/commit/e859a6643330596a8a282aeb4bf853763a2d219e
# [1] https://github.com/openshift/okd-machine-os/blob/28dec35d60ea07069366b22ebdcb296d429b15e9/overlay.d/99okd/etc/systemd/resolved.conf.d/okd-no-dns-stub.conf
if [ -e /etc/systemd/resolved.conf.d/okd-no-dns-stub.conf ]; then
systemctl restart systemd-resolved.service
else
systemctl disable --now systemd-resolved.service
fi
JM1 marked this conversation as resolved.
Show resolved Hide resolved

if systemctl list-unit-files -q fix-resolvconf.service >/dev/null; then
systemctl stop NetworkManager.service
systemctl start fix-resolvconf.service
systemctl start NetworkManager.service
nmcli general reload dns-full
fi

# Workaround for SELinux denials when launching crio.service from overlayfs
setenforce Permissive

# crio.service is not part of FCOS but of OKD Machine OS. It will loaded after systemctl daemon-reload above but has
# to be started manually
systemctl start crio.service

# No reboot necessary because setup will reboot the system automatically
else
pushd "${mnt}/bootstrap"
# shellcheck disable=SC1091
. ./pre-pivot.sh
popd
fi
record_service_stage_success
{{else if .IsSCOS -}}
chmod 0644 /etc/containers/registries.conf
rpm-ostree rebase --experimental "ostree-unverified-registry:${MACHINE_OS_IMAGE}"
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,13 @@
[Unit]
Description=Kubernetes Kubelet
Wants=rpc-statd.service crio.service release-image.service
{{if .IsOKD -}}
Wants=release-image-pivot.service
{{end -}}
After=crio.service release-image.service
{{if .IsOKD -}}
After=release-image-pivot.service
{{end -}}

[Service]
Type=notify
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,7 @@
Description=Pivot bootstrap to the OpenShift Release Image
Wants=release-image.service
After=release-image.service
{{- if or (.BootstrapInPlace) (eq .Invoker "agent-installer") }}
Before=bootkube.service kubelet.service
{{ else }}
Before=bootkube.service
{{ end -}}
Before=bootkube.service kubelet.service dnsmasq.service

[Service]
Type=oneshot
Expand Down