[VCDA-3330 and VCDA-3343] Install Tanzu Core packages and read versions from extra config (vmware#1329)

* added core pkg in cloud init script

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* kapp controller draft

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* removed taints, will try worker

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* moved core pkg logic to worker 0

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* working cluster creation

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* now working creation

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* kapp success check

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* updating core pkg in rde

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* updating rde with core pkg

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* removed commented cloud init code

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* removed debug output

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* addressed review comments

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* install kapp on worker 0 and metrics server on nth worker

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* bug fix

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* installing core pkgs when 0 worker nodes before resize

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* not waiting for tanzu package install

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* addressed review comments

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>

* fixed renaming

Signed-off-by: ltimothy7 <66969084+ltimothy7@users.noreply.github.com>
ltimothy7 authored Apr 1, 2022
1 parent 5a7d516 commit 47a862d
Showing 6 changed files with 364 additions and 92 deletions.
149 changes: 84 additions & 65 deletions cluster_scripts/v2_x_tkgm/cloud_init_control_plane.yaml
@@ -96,6 +96,7 @@ write_files:
csi_driver_path=/root/csi-driver.yaml
csi_controller_path=/root/csi-controller.yaml
csi_node_path=/root/csi-node.yaml
kapp_controller_path=/root/kapp-controller.yaml
vmtoolsd --cmd "info-set guestinfo.postcustomization.networkconfiguration.status in_progress"
echo 'net.ipv6.conf.all.disable_ipv6 = 1' >> /etc/sysctl.conf
@@ -140,6 +141,84 @@ write_files:
systemctl restart containerd
vmtoolsd --cmd "info-set guestinfo.postcustomization.proxy.setting.status successful"
# openbracket (all caps) will be replaced by an open bracket and closebracket (all caps)
# will be replaced by a close bracket.
# This convention is needed so that python's template format function does not view the bash
# $\openbracket/VAR/\closebracket as a format variable that will be replaced by the python format function.
antrea_version="{antrea_version}"
kapp_controller_version=""
metrics_server_version=""
metrics_server_version_valid=true
vmtoolsd --cmd "info-set guestinfo.postcustomization.tkr.get_versions.status in_progress"
tkr_bom_dir=/tmp/tkr_bom
bom_path=$tkr_bom_dir/bom
mkdir -p $bom_path
components_path=$bom_path/components.yaml
imgpkg_path=$tkr_bom_dir/imgpkg
yq_path=$tkr_bom_dir/yq
default_antrea_version="0.11.3"
xml_version_property=$(vmtoolsd --cmd "info-get guestinfo.ovfenv" | grep "oe:key=\"VERSION\"")
init_k8s_version=$(echo $xml_version_property | sed 's/.*oe:value=\"//; s/\(.*\)-.*/\1/')
k8s_version=$(echo $init_k8s_version | tr -s "+" "_")
# download imgpkg, which is needed for getting the components yaml file
wget -nv github.com/vmware-tanzu/carvel-imgpkg/releases/download/v0.24.0/imgpkg-linux-amd64 -O $imgpkg_path
chmod +x $imgpkg_path
# We need to loop through the `X` value in `tkg.X` because of an unexpected TKR design.
# We increment `X`, looking for a valid tkr bom version.
no_tkr_found=false
until $imgpkg_path pull -i projects.registry.vmware.com/tkg/tkr-bom:$OPENBRACKETk8s_versionCLOSEBRACKET -o $bom_path
do
tkg_version=$(echo $OPENBRACKETk8s_version//*.CLOSEBRACKET)
tkg_version=$((tkg_version+1))
k8s_version=$(echo $k8s_version | sed "s/.$/"$tkg_version"/")
if [[ $tkg_version -gt 10 ]]; then
no_tkr_found=true
break
fi
done
mv $bom_path/*.yaml $components_path
# download yq for yaml parsing
wget https://github.com/mikefarah/yq/releases/download/v4.2.0/yq_linux_amd64 -O $yq_path
chmod +x $yq_path
# handle getting antrea version
if [[ -z "$antrea_version" ]]; then
if [[ "$no_tkr_found" = true ]] ; then
echo "no tkr bom found, will use default component versions" &>> /var/log/cse/customization/status.log
antrea_version=$default_antrea_version
else
# will get antrea version from tkr file
antrea_version=$($yq_path e ".components.antrea[0].version" $components_path | sed 's/+.*//')
if [[ -z "$antrea_version" ]] || [[ "$antrea_version" = "null" ]] || [[ "$antrea_version" = "false" ]]; then
antrea_version=$default_antrea_version
else
antrea_version=$(echo $antrea_version | sed "s/v//") # remove leading `v`, which will be added later
fi
fi
fi
# get kapp-controller and metrics-server versions, which will be installed on the worker nodes
# These versions are retrieved here since the antrea version is already retrieved and installed
# on the control plane node, so this avoids retrieving core package versions later
kapp_controller_version=$($yq_path e ".components.kapp-controller[0].version" $components_path | sed 's/v//')
metrics_server_version=$($yq_path e ".components.metrics-server[0].version" $components_path | sed 's/v//')
if [[ -z "$metrics_server_version" ]] || [[ "$metrics_server_version" = "null" ]] || [[ "$metrics_server_version" = "false" ]]; then
metrics_server_version_valid=false
echo "metrics server version not valid" >> /var/log/cse/customization/status.log
fi
# store tkr versions in extra config
vmtoolsd --cmd "info-set guestinfo.postcustomization.tkr.get_versions.kapp_controller $kapp_controller_version"
vmtoolsd --cmd "info-set guestinfo.postcustomization.tkr.get_versions.metrics_server $metrics_server_version"
# cleanup components downloads
rm -rf $tkr_bom_dir
vmtoolsd --cmd "info-set guestinfo.postcustomization.tkr.get_versions.status successful"
vmtoolsd --cmd "info-set guestinfo.postcustomization.kubeinit.status in_progress"
# tag images
coredns_image_version=""
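The bracket placeholder convention described in the script comments can be illustrated with a minimal Python sketch. It assumes the cloud-init template is rendered with `str.format` and that the placeholders are restored afterwards by plain string replacement; `render_script` is a hypothetical helper name, not a function taken from this commit.

```python
def render_script(template: str, **values: str) -> str:
    """Fill Python format fields first, then restore literal bash braces.

    Hypothetical helper: the real rendering code is not shown in this diff.
    """
    rendered = template.format(**values)
    # The placeholders survive str.format untouched because they contain no braces.
    return rendered.replace("OPENBRACKET", "{").replace("CLOSEBRACKET", "}")


template = (
    'antrea_version="{antrea_version}"\n'
    "antrea_path=/root/antrea-$OPENBRACKETantrea_versionCLOSEBRACKET.yaml\n"
    "catch() {{\n"
)

script = render_script(template, antrea_version="0.11.3")
print(script)
```

In this sketch, `{antrea_version}` is a Python format field, `{{` is a brace escaped for `str.format`, and the all-caps placeholders become the braces of a bash `${...}` expansion only after formatting has finished.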
@@ -167,74 +246,13 @@ write_files:
vmtoolsd --cmd "info-set guestinfo.kubeconfig $(cat /etc/kubernetes/admin.conf | base64 | tr -d '\n')"
vmtoolsd --cmd "info-set guestinfo.postcustomization.kubeinit.status successful"
# open_bracket (all caps) will be replaced by the open bracket and close_bracket (all caps)
# will be replaced by a close bracket.
# This convention is needed so that python's template format function does not view the bash
# $\open_bracket/VAR/\close_bracket as a format variable that will be replaced by the python format function.
antrea_version="{antrea_version}"
vmtoolsd --cmd "info-set guestinfo.postcustomization.tkr.get_versions.status in_progress"
if [[ -z "$antrea_version" ]]; then
tkr_bom_dir=/tmp/tkr_bom
bom_path=$tkr_bom_dir/bom
mkdir -p $bom_path
components_path=$bom_path/components.yaml
imgpkg_path=$tkr_bom_dir/imgpkg
yq_path=$tkr_bom_dir/yq
default_antrea_version="0.11.3"
xml_version_property=$(vmtoolsd --cmd "info-get guestinfo.ovfenv" | grep "oe:key=\"VERSION\"")
init_k8s_version=$(echo $xml_version_property | sed 's/.*oe:value=\"//; s/\(.*\)-.*/\1/')
k8s_version=$(echo $init_k8s_version | tr -s "+" "_")
# install imgpkg, which is needed for getting the components yaml file
wget -nv github.com/vmware-tanzu/carvel-imgpkg/releases/download/v0.24.0/imgpkg-linux-amd64 -O $imgpkg_path
chmod +x $imgpkg_path
# We need to loop through the `X` value in `tkg.X` because of an unexpected TKR design.
# We increment `X`, looking for a valid tkr bom version.
no_tkr_found=false
until $imgpkg_path pull -i projects.registry.vmware.com/tkg/tkr-bom:$OPEN_BRACKETk8s_versionCLOSE_BRACKET -o $bom_path
do
tkg_version=$(echo $OPEN_BRACKETk8s_version//*.CLOSE_BRACKET)
tkg_version=$((tkg_version+1))
k8s_version=$(echo $k8s_version | sed "s/.$/"$tkg_version"/")
if [[ $tkg_version -gt 10 ]]; then
no_tkr_found=true
break
fi
done
if [[ "$no_tkr_found" = true ]] ; then
echo "no tkr bom found, will use default component versions" &>> /var/log/cse/customization/status.log
antrea_version=$default_antrea_version
fi
if [[ "$no_tkr_found" = false ]]; then
mv $bom_path/*.yaml $components_path
# install yq for yaml parsing
wget https://github.com/mikefarah/yq/releases/download/v4.2.0/yq_linux_amd64 -O $yq_path
chmod +x $yq_path
antrea_version=$($yq_path e ".components.antrea[0].version" $components_path | sed 's/+.*//')
if [[ -z "$antrea_version" ]] || [[ "$antrea_version" = "null" ]] || [[ "$antrea_version" = "false" ]]; then
antrea_version=$default_antrea_version
echo "no antrea version found in tkr bom, will use default antrea version: $OPEN_BRACKETdefault_antrea_versionCLOSE_BRACKET" &>> /var/log/cse/customization/status.log
else
antrea_version=$(echo $antrea_version | sed "s/v//") # remove leading `v`, which will be added later
fi
fi
# cleanup components downloads
rm -rf $tkr_bom_dir
fi
vmtoolsd --cmd "info-set guestinfo.postcustomization.tkr.get_versions.status successful"
vmtoolsd --cmd "info-set guestinfo.postcustomization.kubectl.cni.install.status in_progress"
antrea_path=/root/antrea-$OPEN_BRACKETantrea_versionCLOSE_BRACKET.yaml
wget -O $antrea_path https://github.com/vmware-tanzu/antrea/releases/download/v$OPEN_BRACKETantrea_versionCLOSE_BRACKET/antrea.yml
antrea_path=/root/antrea-$OPENBRACKETantrea_versionCLOSEBRACKET.yaml
wget -O $antrea_path https://github.com/vmware-tanzu/antrea/releases/download/v$OPENBRACKETantrea_versionCLOSEBRACKET/antrea.yml
# This does not need to be done from v0.12.0 onwards inclusive
sed -i "s/image: antrea\/antrea-ubuntu:v$OPEN_BRACKETantrea_versionCLOSE_BRACKET/image: projects.registry.vmware.com\/antrea\/antrea-ubuntu:v$OPEN_BRACKETantrea_versionCLOSE_BRACKET/g" $antrea_path
sed -i "s/image: antrea\/antrea-ubuntu:v$OPENBRACKETantrea_versionCLOSEBRACKET/image: projects.registry.vmware.com\/antrea\/antrea-ubuntu:v$OPENBRACKETantrea_versionCLOSEBRACKET/g" $antrea_path
kubectl apply -f $antrea_path
vmtoolsd --cmd "info-set guestinfo.postcustomization.core_packages.antrea_version $antrea_version"
vmtoolsd --cmd "info-set guestinfo.postcustomization.kubectl.cni.install.status successful"
@@ -280,6 +298,7 @@ write_files:
fi
vmtoolsd --cmd "info-set guestinfo.postcustomization.kubectl.default_storage_class.status successful"
vmtoolsd --cmd "info-set guestinfo.postcustomization.kubeadm.token.generate.status in_progress"
kubeadm_join_info=$(kubeadm token create --print-join-command --ttl 0 2> /dev/null)
vmtoolsd --cmd "info-set guestinfo.postcustomization.kubeadm.token.info $kubeadm_join_info"
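The TKR BOM lookup in this file retries the image pull with an incremented `tkg.X` suffix until a tag is found or `X` exceeds 10. A rough Python sketch of that search follows. It assumes the registry check can be abstracted as a callable; `find_tkr_bom_tag` and `tag_exists` are hypothetical names used only for illustration, and the sample tags are made up.

```python
import re
from typing import Callable, Optional


def find_tkr_bom_tag(
    initial_tag: str,
    tag_exists: Callable[[str], bool],
    max_suffix: int = 10,
) -> Optional[str]:
    """Return the first existing tag, bumping the trailing number on each miss.

    Returns None when the trailing number would exceed max_suffix, which
    corresponds to the script falling back to default component versions.
    """
    tag = initial_tag
    while not tag_exists(tag):
        match = re.search(r"(\d+)$", tag)
        if match is None:
            return None
        suffix = int(match.group(1)) + 1
        if suffix > max_suffix:
            return None
        tag = tag[: match.start()] + str(suffix)
    return tag


# The script derives the tag by replacing "+" with "_" in the template version.
initial = "v1.21.2+vmware.1-tkg.1".replace("+", "_")
published = {"v1.21.2_vmware.1-tkg.3"}  # made-up registry contents
found = find_tkr_bom_tag(initial, published.__contains__)
print(found)
```

Unlike the `sed "s/.$/.../"` substitution in the script, which rewrites only the last character, this sketch replaces the whole trailing number, so it stays correct if the suffix ever has two digits.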
108 changes: 108 additions & 0 deletions cluster_scripts/v2_x_tkgm/cloud_init_node.yaml
@@ -25,6 +25,14 @@ write_files:
#!/usr/bin/env bash
catch() {{
# save the failing command's exit status before the cleanup commands below overwrite $?
exit_code=$?
kubeconfig_path=/root/kubeconfig.yaml
if [[ -f "$kubeconfig_path" ]]; then
rm $kubeconfig_path
fi
# always set the kubeconfig extra config to null, even if this worker does not use the
# kubeconfig, to avoid leaking it
vmtoolsd --cmd "info-set guestinfo.postcustomization.control_plane.kubeconfig null"
vmtoolsd --cmd "info-set guestinfo.post_customization_script_execution_status $exit_code"
error_message="$(date) $(caller): $BASH_COMMAND"
echo "$error_message" &>> /var/log/cse/customization/error.log
@@ -93,6 +101,106 @@ write_files:
kubeadm join --config /root/kubeadm-defaults-join.conf --v=10 &> /root/kubeadm-join.out
vmtoolsd --cmd "info-set guestinfo.postcustomization.kubeadm.node.join.status successful"
# openbracket (all caps) will be replaced by an open bracket and closebracket (all caps)
# will be replaced by a close bracket.
# This convention is needed so that python's template format function does not view the bash
# $\openbracket/VAR/\closebracket as a format variable that will be replaced by the python format function.
vmtoolsd --cmd "info-set guestinfo.postcustomization.core_packages.attempted_install in_progress"
install_kapp_controller={install_kapp_controller}
kubeconfig_path=/root/kubeconfig.yaml
touch $kubeconfig_path
# Kapp-controller is installed on the first worker node
kapp_controller_version="{kapp_controller_version}"
kapp_controller_version=$(echo $kapp_controller_version | sed 's/+.*//' | sed 's/v//')
install_tanzu_cli_packages={install_tanzu_cli_packages}
if [[ "$install_kapp_controller" = true ]]; then
vmtoolsd --cmd "info-get guestinfo.postcustomization.control_plane.kubeconfig" > $kubeconfig_path
if [[ "$install_tanzu_cli_packages" = false ]]; then
# clear extra config if it won't be used again to avoid leaking it
vmtoolsd --cmd "info-set guestinfo.postcustomization.control_plane.kubeconfig null"
fi
export KUBECONFIG=$kubeconfig_path
# install kapp-controller, which is needed for tanzu-cli
kapp_controller_installed=false
if [[ ! -z "$kapp_controller_version" && $kapp_controller_version != "null" ]]; then
kubectl apply -f https://github.com/vmware-tanzu/carvel-kapp-controller/releases/download/v$OPENBRACKETkapp_controller_versionCLOSEBRACKET/release.yml
fi
fi
# Metrics server (currently the only tanzu cli installed package) is installed on the last worker node
tanzu_cli_installed=false
if [[ "$install_tanzu_cli_packages" = true ]]; then
vmtoolsd --cmd "info-get guestinfo.postcustomization.control_plane.kubeconfig" > $kubeconfig_path
# clear extra config to avoid leaking it
vmtoolsd --cmd "info-set guestinfo.postcustomization.control_plane.kubeconfig null"
export KUBECONFIG=$kubeconfig_path
metrics_server_version=""
# Wait at most 8 minutes for kapp-controller to be ready so that the tanzu cli is fully
# functional for our purposes
kapp_controller_pod=$(kubectl get pods -l=app='kapp-controller' -A -o jsonpath='OPENBRACKET.items[*].metadata.nameCLOSEBRACKET')
kapp_controller_namespace=$(kubectl get pods -l=app='kapp-controller' -A -o jsonpath='OPENBRACKET.items[*].metadata.namespaceCLOSEBRACKET')
kapp_controller_ready_path=/root/kapp_controller_ready.txt
kapp_controller_ready=false
kubectl wait --for=condition=Ready pod/$OPENBRACKETkapp_controller_podCLOSEBRACKET -n $kapp_controller_namespace --timeout=8m > $kapp_controller_ready_path
if [[ -f "$kapp_controller_ready_path" && -s $kapp_controller_ready_path ]]; then
kapp_controller_ready=true
else
kapp_controller_version=""
fi
if [[ "$kapp_controller_ready" = true ]]; then
# install tanzu cli
tanzu_path=/root/tanzu
mkdir $tanzu_path
tanzu_tar_path=$tanzu_path/tanzu_cli.tar.gz
wget https://github.com/vmware-tanzu/tanzu-framework/releases/download/v0.17.0/tanzu-cli-linux-amd64.tar.gz -O $tanzu_tar_path
tar -zxvf $tanzu_tar_path -C $tanzu_path
sudo install $OPENBRACKETtanzu_pathCLOSEBRACKET/v0.17.0/tanzu-core-linux_amd64 /usr/local/bin/tanzu
export HOME=/root
tanzu plugin install package
xml_version_property=$(vmtoolsd --cmd "info-get guestinfo.ovfenv" | grep "oe:key=\"VERSION\"")
init_k8s_version=$(echo $xml_version_property | sed 's/.*oe:value=\"//; s/\(.*\)-.*/\1/')
k8s_version=$(echo $init_k8s_version | tr -s "+" "_")
export KUBECONFIG=$kubeconfig_path
tanzu package repository add tanzu-core --namespace tkg-system --create-namespace --url projects.registry.vmware.com/tkg/packages/core/repo:$OPENBRACKETk8s_versionCLOSEBRACKET
# wait for metrics server to be available
metrics_server_info_str=$(tanzu package available list -A | grep metrics-server)
num_metrics_server_loops=0
while [[ -z "$metrics_server_info_str" ]]; do
sleep 15
((num_metrics_server_loops++))
if [[ $num_metrics_server_loops -gt 20 ]]; then # max 5 minutes
break
fi
metrics_server_info_str=$(tanzu package available list -A | grep metrics-server)
done
# install metrics server
metrics_server_version=$(echo $metrics_server_info_str | sed -n 's/^.*\([0-9]\+\.[0-9]\+\.[0-9]\++vmware.[0-9]\+-tkg.[0-9]\+\).*$/\1/p')
if [[ ! -z "$metrics_server_version" && $metrics_server_version != "null" ]]; then
# similar to other k8s packages, we are not waiting in order to avoid
# timeout issues crashing the cluster creation
tanzu package install metrics-server --namespace tkg-system --create-namespace --package-name metrics-server.tanzu.vmware.com --version $metrics_server_version --wait=false
fi
if [[ -z "$kapp_controller_version" ]]; then
kapp_controller_version="null"
fi
vmtoolsd --cmd "info-set guestinfo.postcustomization.core_packages.kapp_controller_version $kapp_controller_version"
if [[ -z "$metrics_server_version" ]]; then
metrics_server_version="null"
fi
vmtoolsd --cmd "info-set guestinfo.postcustomization.core_packages.metrics_server_version $metrics_server_version"
fi
rm $kubeconfig_path
fi
vmtoolsd --cmd "info-set guestinfo.postcustomization.core_packages.attempted_install successful"
echo "$(date) post customization script execution completed" &>> /var/log/cse/customization/status.log
exit 0
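The `sed` expression that extracts `metrics_server_version` looks for a version of the form `X.Y.Z+vmware.N-tkg.M` in the package listing. A hedged Python equivalent is sketched below. `extract_package_version` is a hypothetical helper, and `sample_line` is a made-up string for illustration, not real `tanzu` output.

```python
import re
from typing import Optional

# Same version shape as the sed expression: X.Y.Z+vmware.N-tkg.M
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+\+vmware\.\d+-tkg\.\d+")


def extract_package_version(line: str) -> Optional[str]:
    """Return the first version found in the line, or None if there is none."""
    match = VERSION_PATTERN.search(line)
    return match.group(0) if match else None


# Made-up line, only meant to exercise the pattern.
sample_line = "metrics-server.tanzu.vmware.com  0.4.0+vmware.1-tkg.1"
print(extract_package_version(sample_line))
```

Because `re.search` has no greedy `^.*` prefix, this sketch keeps a multi-digit major version intact, whereas the leading `^.*` in the `sed` pattern would consume all but the last digit of it.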
23 changes: 23 additions & 0 deletions container_service_extension/common/constants/server_constants.py
@@ -810,11 +810,34 @@ class PostCustomizationPhase(Enum):
KUBECTL_APPLY_CPI = 'guestinfo.postcustomization.kubectl.cpi.install.status' # noqa: E501
KUBECTL_APPLY_CSI = 'guestinfo.postcustomization.kubectl.csi.install.status' # noqa: E501
KUBECTL_APPLY_DEFAULT_STORAGE_CLASS = 'guestinfo.postcustomization.kubectl.default_storage_class.status' # noqa: E501
KUBECTL_APPLY_KAPP_CONTROLLER = 'guestinfo.postcustomization.kubectl.kapp_controller.install' # noqa: E501
KUBEADM_TOKEN_GENERATE = 'guestinfo.postcustomization.kubeadm.token.generate.status' # noqa: E501
KUBEADM_NODE_JOIN = 'guestinfo.postcustomization.kubeadm.node.join.status'
PROXY_SETTING = 'guestinfo.postcustomization.proxy.setting.status'
CORE_PACKAGES_ATTEMPTED_INSTALL = 'guestinfo.postcustomization.core_packages.attempted_install' # noqa: E501


# TO_INSTALL versions indicate versions that the control plane node retrieved
# for worker node(s) to install. INSTALLED_VERSION refers to the version
# that the worker node(s) were able to install.
@unique
class PostCustomizationVersions(Enum):
TKR_KAPP_CONTROLLER_VERSION_TO_INSTALL = 'guestinfo.postcustomization.tkr.get_versions.kapp_controller' # noqa: E501
TKR_METRICS_SERVER_VERSION_TO_INSTALL = 'guestinfo.postcustomization.tkr.get_versions.metrics_server' # noqa: E501
INSTALLED_VERSION_OF_KAPP_CONTROLLER = 'guestinfo.postcustomization.core_packages.kapp_controller_version' # noqa: E501
INSTALLED_VERSION_OF_METRICS_SERVER = 'guestinfo.postcustomization.core_packages.metrics_server_version' # noqa: E501
INSTALLED_VERSION_OF_ANTREA = 'guestinfo.postcustomization.core_packages.antrea_version' # noqa: E501


@unique
class CorePkgVersionKeys(Enum):
KAPP_CONTROLLER = 'kapp-controller'
METRICS_SERVER = 'metrics-server'
ANTREA = 'antrea'


PostCustomizationKubeconfig = 'guestinfo.postcustomization.control_plane.kubeconfig' # noqa: E501

KUBEADM_TOKEN_INFO = 'guestinfo.postcustomization.kubeadm.token.info'
KUBE_CONFIG = 'guestinfo.kubeconfig'
POST_CUSTOMIZATION_SCRIPT_EXECUTION_STATUS = 'guestinfo.post_customization_script_execution_status' # noqa: E501
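One way the new enums could be combined is sketched below: the installed-version extra config keys are mapped to the short package names, and entries that the worker script reported as `null` are skipped. This is an assumption about usage, not code from this commit. `collect_installed_versions` and the `extra_config` dictionary are hypothetical, and the enum members are copied from the diff above so that the block is self-contained.

```python
from enum import Enum, unique
from typing import Dict, Mapping


@unique
class PostCustomizationVersions(Enum):
    INSTALLED_VERSION_OF_KAPP_CONTROLLER = 'guestinfo.postcustomization.core_packages.kapp_controller_version'  # noqa: E501
    INSTALLED_VERSION_OF_METRICS_SERVER = 'guestinfo.postcustomization.core_packages.metrics_server_version'  # noqa: E501
    INSTALLED_VERSION_OF_ANTREA = 'guestinfo.postcustomization.core_packages.antrea_version'  # noqa: E501


@unique
class CorePkgVersionKeys(Enum):
    KAPP_CONTROLLER = 'kapp-controller'
    METRICS_SERVER = 'metrics-server'
    ANTREA = 'antrea'


# Hypothetical mapping from package name to the extra config key holding its version.
PKG_TO_EXTRA_CONFIG = {
    CorePkgVersionKeys.KAPP_CONTROLLER: PostCustomizationVersions.INSTALLED_VERSION_OF_KAPP_CONTROLLER,  # noqa: E501
    CorePkgVersionKeys.METRICS_SERVER: PostCustomizationVersions.INSTALLED_VERSION_OF_METRICS_SERVER,  # noqa: E501
    CorePkgVersionKeys.ANTREA: PostCustomizationVersions.INSTALLED_VERSION_OF_ANTREA,
}


def collect_installed_versions(extra_config: Mapping[str, str]) -> Dict[str, str]:
    """Build {package name: version}, skipping unset or "null" entries."""
    versions: Dict[str, str] = {}
    for pkg_key, config_key in PKG_TO_EXTRA_CONFIG.items():
        value = extra_config.get(config_key.value, "")
        if value and value != "null":
            versions[pkg_key.value] = value
    return versions


# Made-up extra config values for illustration.
extra_config = {
    PostCustomizationVersions.INSTALLED_VERSION_OF_ANTREA.value: "0.11.3",
    PostCustomizationVersions.INSTALLED_VERSION_OF_METRICS_SERVER.value: "null",
}
print(collect_installed_versions(extra_config))
```

Skipping `null` mirrors the worker script, which writes the literal string `null` when it has no version to report.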