Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to openshift 4.11 #241

Merged
merged 3 commits into from
Oct 10, 2022

Conversation

Mo3m3n
Copy link
Contributor

@Mo3m3n Mo3m3n commented Sep 14, 2022

Migrating Openshift 4.11 requires:

@Mo3m3n Mo3m3n force-pushed the build/openshift-4.11 branch 3 times, most recently from af24c7b to 8868e63 Compare September 14, 2022 18:35
@bnallapeta
Copy link
Contributor

@Mo3m3n Tried running it locally on a 4.11 cluster. Fails with the below error:

[bnr@bnr pipelines-service]$ ./ckcp/openshift_dev_setup.sh 
Working directory: /tmp/tmp.ZlBOATYExn
[openshift-gitops]
  - OpenShift-GitOps: OK
  - Argo CD dashboard: .............................OK
  - Argo CD URL: https://openshift-gitops-server-openshift-gitops.apps.ci-ln-6jrgsck-72292.origin-ci-int-gce.dev.rhcloud.com
  - Argo CD Login: OK
  - Register host cluster to ArgoCD as 'plnsvc': 
INFO[0000] ServiceAccount "argocd-manager" created in namespace "kube-system" 
INFO[0001] ClusterRole "argocd-manager-role" created    
INFO[0001] ClusterRoleBinding "argocd-manager-role-binding" created 
FATA[0032] Failed to wait for service account secret: timed out waiting for the condition 

@Mo3m3n
Copy link
Contributor Author

Mo3m3n commented Sep 15, 2022

@bnallapeta
I can't reproduce your error on a fresh 4.11 cluster:

➜ oc version
Client Version: 4.10.22
Server Version: 4.11.0-0.nightly-2022-09-14-233224
Kubernetes Version: v1.24.0+3882f8f
➜ ./ckcp/openshift_dev_setup.sh
Working directory: /tmp/tmp.ADFrj86aYY
[openshift-gitops]                                                  
  - OpenShift-GitOps: OK                                            
  - Argo CD dashboard: ..........................OK                 
  - Argo CD URL: https://openshift-gitops-server-openshift-gitops.apps.ci-ln-v4n4sdb-72292.origin-ci-int-gce.dev.rhcloud.com
  - Argo CD Login: OK                                               
  - Register host cluster to ArgoCD as 'plnsvc':                                                                                        
INFO[0000] ServiceAccount "argocd-manager" created in namespace "kube-system"                                                           
INFO[0000] ClusterRole "argocd-manager-role" created    
INFO[0001] ClusterRoleBinding "argocd-manager-role-binding" created                                                                     
INFO[0001] Created bearer token secret for ServiceAccount "argocd-manager"                                                              
    OK                                                                                            

I will see if I can ask for more specific debug output from openshift-gitops

@Mo3m3n
Copy link
Contributor Author

Mo3m3n commented Sep 15, 2022

Can you provide the following:
oc version
oc describe subscriptions -n openshift-operators openshift-gitops-operator | grep CSV

@Mo3m3n
Copy link
Contributor Author

Mo3m3n commented Sep 15, 2022

@bnallapeta It turns out this requires argocd CLI v2.3.7 or above in order to be able to create serviceaccount token for v1.24 clusters.
I have updated DPENDENCIES to reflect that

DEPENDENCIES.md Outdated Show resolved Hide resolved
@bnallapeta
Copy link
Contributor

@Mo3m3n As discussed, with OpenShift 4.11, it looks like scc: restricted-v2 is automatically applied to deployments which brings the capabilities mentioned below as per this blog that you had linked in the other commit.

  • dropping ALL capabilities from containers
    they still allow explicitly adding the NET_BIND_SERVICE capability
  • defaulting seccompProfile to “runtime/default”
  • requiring allowPrivilegeEscalation to be unset or set to false in security contexts

Do verify this on a new cluster post which we won't have to manually patch things.

@Mo3m3n
Copy link
Contributor Author

Mo3m3n commented Sep 16, 2022

@bnallapeta
Actually even though restricted-v2 scc is applied in Openshift 4.11, this does not prevent the Pod Security admission controller from complaining:

Warning: would violate PodSecurity "restricted:v1.24": allowPrivilegeEscalation != false (container "kcp-syncer" must set securityContex
t.allowPrivilegeEscalation=false), unrestricted capabilities (container "kcp-syncer" must set securityContext.capabilities.drop=["ALL"])
, runAsNonRoot != true (pod or container "kcp-syncer" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "kcp
-syncer" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

Users will get that for the kcp syncer deployment and you can also see the warning for the kcp pod deployment by updating the followin:

+++ b/ckcp/openshift_dev_setup.sh
@@ -275,7 +275,7 @@ patches:
   echo -n "  - kcp $kcp_version: "
   # Deploy ckcp until all resources are successfully appied to OCP cluster 
   local i=0
-  while ! error_msg=$(kubectl apply -k "$ckcp_temp_dir" 2>&1 1>/dev/null); do
+  while ! error_msg=$(kubectl apply -k "$ckcp_temp_dir" ); do

For now it is a warning and my understanding is that in the future it will be denied (unless we update the security policy

Link
In a subsequent release, we intend to move the global configuration to enforce the “restricted” pod security profile globally

We can ignore this for now but my understanding is that we need to patch the deployments sooner or later.

@Mo3m3n Mo3m3n force-pushed the build/openshift-4.11 branch from 6000a22 to 731426a Compare September 19, 2022 17:52
images/kcp-registrar/register.sh Outdated Show resolved Hide resolved
images/kcp-registrar/syncer-patch Outdated Show resolved Hide resolved
@bnallapeta
Copy link
Contributor

@Mo3m3n Please take a look at the failed ckcp test. @xinredhat Kindly assist if its a CI issue rather than the PR issue.

@Mo3m3n
Copy link
Contributor Author

Mo3m3n commented Sep 23, 2022

/retest

1 similar comment
@Mo3m3n
Copy link
Contributor Author

Mo3m3n commented Sep 23, 2022

/retest

# This is enabled by default in restricted-v2 available in openshift >= 4.11
if ! (set +o pipefail && kubectl describe scc restricted restricted-v2 2>/dev/null |
grep -sq 'runtime/default') ; then
kubectl patch scc restricted --type=merge -p '{"seccompProfiles":["runtime/default"]}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not patch SCCs installed by OpenShift. These changes will be removed by the cluster operators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I would rather create a clone of that scc, patch it and assign it to the corresponding sa.
Correct ?

Copy link
Contributor

@adambkaplan adambkaplan Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I correct in that the reason this is here is because the restricted SCC doesn't allow any value to be set for the seccomp profile?

Since this is only for ckcp, I think it makes sense to declare OpenShift 4.11 as a required prerequisite. We don't deploy ckcp in the actual service (it is a dev/testing tool).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right @adambkaplan the restricted SCC in 4.10 does not allow have this securityContext.
You may ask, why I am changing that in 4.10 while the target of the PR is 4.11? reason is that 4.11 pushes for a restricted security context(via warnings).
The change works out of the box in 4.11 but the same securitContext requires patching scc in 4.10.
For the sake of consistency I was trying to patch scc in 4.10 to work around that.
We have multiple options:

  • backport the security context change in 4.10. In this case I will need assistance from openshift support. I have tried this doc and I did not succeed.
  • Don't add securityContext in 4.10 and warn users that the securitycontext of kcp pod is not the same between 4.10 and 4.11
  • Don't add security Context in 4.10 and declare that 4.11 is a required Prerequisite.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the root issue here is that our ckcp deployment needs this additional security context. For now this will produce a warning. However, in 4.12 OpenShift will start enforcing the restricted profile by default, so we should adhere to best practices.

I personally would keep the additional security context declare 4.11 as a prerequisite for development work. @Roming22 I assume once this lands, we will be upgrading our staging cluster to 4.11 as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adambkaplan Correct, we would either upgrade the existing cluster, or request a new cluster to deploy the service on, before deprecating the 4.10 cluster.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mo3m3n I'm fine with making 4.11 a prerequisite, we don't really have the bandwidth to support multiple OCP versions at the moment. By the time we might do it, 4.12 will be out and since we will probably limit support to n and n-1 version, that would again rule out 4.10.

@Mo3m3n Mo3m3n requested review from adambkaplan and removed request for ramessesii2 September 27, 2022 09:02
@Mo3m3n Mo3m3n force-pushed the build/openshift-4.11 branch from b1d8cb0 to d2d6f6d Compare September 29, 2022 15:14
@Mo3m3n Mo3m3n force-pushed the build/openshift-4.11 branch from d2d6f6d to 8b07995 Compare September 29, 2022 15:53
@Mo3m3n Mo3m3n changed the title Support openshift 4.11 Migrate to openshift 4.11 Sep 29, 2022
@@ -6,11 +6,11 @@

| **Component** | **Version** | **Purpose** | **Comments** |
|-------------------------------|---------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| OpenShift | 4.10 | Platform | Upgrades to next versions 4.11 need to be tested and approved |
| OpenShift | 4.11 | Platform | Upgrades to next versions 4.11 need to be tested and approved |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments - Upgrades to next versions 4.12 need to be tested and approved

Starting from Openshift 4.11 (kubernetes 1.24), service account token
secrets are no longer automatically generated.
Thus service account tokens will be manually created.
Starting from openshift 4.11, the global Pod Security Admission Policy
is set to emit warnings when running pods that does not meet a
restricted security Policy:
https://cloud.redhat.com/blog/pod-security-admission-in-openshift-4.11
Also:
> In a subsequent release, we intend to move the global configuration
> to enforce the “restricted” pod security profile globally
Thus now kcp pods are deployed with a restricted security context

WARNING, this introduces a breaking change and neither syncers nor ckcp
script can be deployed in Openshift 4.10, unless a scc supporting
Seccomp profile runtime default is used in openshift for the
corresponding service accounts
@Mo3m3n Mo3m3n force-pushed the build/openshift-4.11 branch from 8b07995 to 900ea2c Compare October 10, 2022 08:17
@openshift-ci
Copy link

openshift-ci bot commented Oct 10, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Mo3m3n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Roming22 Roming22 merged commit 454593d into openshift-pipelines:main Oct 10, 2022
@Mo3m3n Mo3m3n deleted the build/openshift-4.11 branch October 31, 2022 06:40
Roming22 pushed a commit that referenced this pull request Dec 19, 2022
…e-pr

Enable all-namespaces mode for pvc-cleaner.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants