Disable dynamic creation for admission hooks and update dependencies #1450
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andreyvelich The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I named the I named the Please let me know what do you think about that naming ? |
Is the secret.yaml in the manifests still needed? https://github.com/kubeflow/katib/blob/master/manifests/v1beta1/katib-controller/secret.yaml @andreyvelich
Basically, this secret is needed only to clean up all resources after removing Katib from the cluster. I think users can also use this secret to create their own certs (we might provide this functionality as well).
ok, SGTM
@tenzen-y @imilos @zuiurs @trog-levrai I tested this change on GKE cluster with It would be great to see how it works on your platforms since you were facing problems with |
Also I switched to Thus, I had to update |
I downgraded The test seems to be working. |
/hold for reviews |
@andreyvelich https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#api-change-2 I see. You can use EKS 1.19 for tests now; we delivered it a few days ago.
Ah, it looks like I need to make changes in kubeflow/testing https://github.com/kubeflow/testing/blob/master/images/aws-scripts/create-eks-cluster.sh#L29 and build new images. Let me know if you need my help @andreyvelich
Thank you for this information @PatrickXYS!
@andreyvelich Good assumption, we definitely need to provide backward compatibility. Staying on 1.18 should be a good way to go for now.
I installed
PS: I am not an expert with K8s, so sorry if the answer is obvious. I assume I should also delete the secret I created as a workaround, to make sure the default init works properly?
@trog-levrai Thank you for the testing! First of all, you have to properly uninstall Katib from your minikube cluster. Then, try to use my branch to install the updated Katib manifests.
I tested your changes on K8s v1.19.8 created by kubeadm.
$ kubectl get pods -n kubeflow
NAME READY STATUS RESTARTS AGE
katib-controller-5b7d756869-kj44l 0/1 Init:CrashLoopBackOff 5 11m
katib-db-manager-6895b946bc-g9bwr 1/1 Running 0 11m
katib-mysql-bd448c8b5-pxvbh 1/1 Running 0 11m
$ kubectl logs -n kubeflow $(kubectl get pods -n kubeflow -l app=katib-controller -ojsonpath='{.items[*].metadata.name}') -c cert-generator
INFO: Creating certs in tmpdir /tmp/tmp.cHLBnl
Generating RSA private key, 2048 bit long modulus (2 primes)
................................................................+++++
..........................................+++++
e is 65537 (0x010001)
INFO: Creating CSR: katib-controller.kubeflow
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
NAME AGE SIGNERNAME REQUESTOR CONDITION
katib-controller.kubeflow 6m16s kubernetes.io/kube-apiserver-client system:serviceaccount:kubeflow:katib-controller Approved,Failed
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
certificatesigningrequest.certificates.k8s.io "katib-controller.kubeflow" deleted
WARN: Previous CSR was found and removed.
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
certificatesigningrequest.certificates.k8s.io/katib-controller.kubeflow created
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
NAME AGE SIGNERNAME REQUESTOR CONDITION
katib-controller.kubeflow 0s kubernetes.io/kube-apiserver-client system:serviceaccount:kubeflow:katib-controller Pending
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
certificatesigningrequest.certificates.k8s.io/katib-controller.kubeflow approved
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
Warning: certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
ERROR: After approving csr katib-controller.kubeflow, the signed certificate did not appear on the resource. Giving up after 1 minute.
$ kubectl get certificatesigningrequests.certificates.k8s.io
NAME AGE SIGNERNAME REQUESTOR CONDITION
katib-controller.kubeflow 9s kubernetes.io/kube-apiserver-client system:serviceaccount:kubeflow:katib-controller Approved,Failed
$ kubectl get certificatesigningrequests.certificates.k8s.io katib-controller.kubeflow -ojsonpath='{.status.conditions}' | jq .
[
{
"lastTransitionTime": "2021-02-28T10:54:40Z",
"lastUpdateTime": "2021-02-28T10:54:40Z",
"message": "This CSR was approved by kubectl certificate approve.",
"reason": "KubectlApprove",
"status": "True",
"type": "Approved"
},
{
"lastTransitionTime": "2021-02-28T10:54:40Z",
"lastUpdateTime": "2021-02-28T10:54:40Z",
"message": "invalid usage for client certificate: server auth",
"reason": "SignerValidationFailure",
"status": "True",
"type": "Failed"
}
]
I think that
Thanks. |
/retest |
Makes sense, thank you for the testing @tenzen-y!
I changed signerName to kubernetes.io/kubelet-serving to make sure that all usages are supported.
Could you please double-check on your side whether that CSR is working?
@andreyvelich
I checked your changes.
Then I got "subject organization is not system:nodes" errors.
$ # pods in kubeflow
$ kubectl get pods -n kubeflow
NAME READY STATUS RESTARTS AGE
katib-controller-5b7d756869-6rcbj 0/1 Init:CrashLoopBackOff 5 11m
katib-db-manager-6895b946bc-jsblq 1/1 Running 0 11m
katib-mysql-bd448c8b5-szbdt 1/1 Running 0 11m
$ # certificatesigningrequests status
$ kubectl get certificatesigningrequests.certificates.k8s.io katib-controller.kubeflow -ojsonpath='{.status.conditions}' | jq .
[
{
"lastTransitionTime": "2021-03-01T01:42:20Z",
"lastUpdateTime": "2021-03-01T01:42:20Z",
"message": "This CSR was approved by kubectl certificate approve.",
"reason": "KubectlApprove",
"status": "True",
"type": "Approved"
},
{
"lastTransitionTime": "2021-03-01T01:42:20Z",
"lastUpdateTime": "2021-03-01T01:42:20Z",
"message": "subject organization is not system:nodes",
"reason": "SignerValidationFailure",
"status": "True",
"type": "Failed"
}
]
It looks like the common name and organizations in the CSR subject are not correct. The kubernetes.io/kubelet-serving signer requires:
- Permitted subjects - organizations are exactly ["system:nodes"], common name starts with "system:node:".
It worked correctly once I fixed the subject.
$ # certificatesigningrequests
$ kubectl get certificatesigningrequests.certificates.k8s.io
NAME AGE SIGNERNAME REQUESTOR CONDITION
katib-controller.kubeflow 112s kubernetes.io/kubelet-serving system:serviceaccount:kubeflow:katib-controller Approved,Issued
$ # certificatesigningrequests status
$ kubectl get certificatesigningrequests.certificates.k8s.io katib-controller.kubeflow -ojsonpath='{.status.conditions}' | jq .
[
{
"lastTransitionTime": "2021-03-01T02:17:24Z",
"lastUpdateTime": "2021-03-01T02:17:24Z",
"message": "This CSR was approved by kubectl certificate approve.",
"reason": "KubectlApprove",
"status": "True",
"type": "Approved"
}
]
$ # deploy tpe-example.yaml
$ kubectl apply -f examples/v1beta1/tpe-example.yaml
experiment.kubeflow.org/tpe-example created
$ kubectl get pods -n kubeflow
NAME READY STATUS RESTARTS AGE
katib-controller-5b7d756869-5gsbv 1/1 Running 0 3m51s
katib-db-manager-6895b946bc-x26wm 1/1 Running 0 3m52s
katib-mysql-bd448c8b5-zhmhj 1/1 Running 0 3m51s
katib-ui-5dbbdc6596-b6bb2 1/1 Running 0 3m40s
tpe-example-5nljhpbf-6prx9 2/2 Running 0 24s
tpe-example-b5wm8n2t-bfrpr 2/2 Running 0 24s
tpe-example-d2mscqgn-fnxvt 2/2 Running 0 24s
tpe-example-tpe-77dcbf7548-msk67 1/1 Running 0 44s
Thanks.
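For readers following along, the subject rule quoted above can be sketched as a small pre-flight check. This is only an illustration, not part of the PR: the function name check_kubelet_serving_subject is hypothetical, and the second error message is illustrative wording rather than the signer's actual text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks a CSR subject against the kubelet-serving rule
# quoted above (organization exactly system:nodes, common name starting
# with system:node:). It does not talk to the cluster.
check_kubelet_serving_subject() {
  local common_name="$1"
  local organization="$2"

  # Same wording as the SignerValidationFailure message in the log above.
  if [ "$organization" != "system:nodes" ]; then
    echo "subject organization is not system:nodes"
    return 1
  fi

  # Illustrative message; the real signer may word this differently.
  case "$common_name" in
    system:node:?*) ;;
    *)
      echo "subject common name does not start with system:node:"
      return 1
      ;;
  esac

  echo "ok"
}

# A subject without the required organization is rejected.
check_kubelet_serving_subject "katib-controller.kubeflow.svc" "" || true
# A subject in the permitted form passes.
check_kubelet_serving_subject "system:node:katib-controller.kubeflow.svc" "system:nodes"
```

The common name used in the second call is an assumed example value; the thread does not show the exact subject that was used in the fix.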
It looks like it would be better to use environment variables in hack/cert-generator.sh instead of hard-coding the namespace.
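A minimal sketch of that suggestion, assuming hypothetical variable names (KATIB_NAMESPACE, KATIB_SERVICE, KATIB_WEBHOOK_SECRET); the actual script may name or structure these differently. Each value falls back to the name seen elsewhere in this thread when the variable is unset.

```shell
#!/usr/bin/env bash
# Hypothetical variable names; hack/cert-generator.sh may use different ones.
# ${VAR:-default} uses the default when VAR is unset or empty.
namespace="${KATIB_NAMESPACE:-kubeflow}"
service="${KATIB_SERVICE:-katib-controller}"
secret="${KATIB_WEBHOOK_SECRET:-katib-webhook-cert}"

# CSR name in the <service>.<namespace> form seen in the logs above.
csr_name="${service}.${namespace}"

echo "namespace=${namespace} csr=${csr_name} secret=${secret}"
```

In an initContainer, the namespace value could be supplied through the downward API (fieldRef on metadata.namespace), so the same manifest would work when Katib is installed outside the kubeflow namespace.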
/lgtm |
Thanks, everyone, for helping with this PR!
Updates Docker image to include changes from kubeflow#1450, and updates operator to latest version of operator framework.
Fixes: #1405.
This PR introduces a new mechanism to get certificates for webhooks.
I updated the YAMLs for our webhooks.
I added an initContainer to the Katib controller which executes the cert-generator.sh script. This script creates a CertificateSigningRequest and the katib-webhook-cert secret, and patches the webhook configurations with the appropriate caBundle. Since we have the katib-webhook-cert secret in the manifest, the cleanup process should delete everything. So we don't need to deploy cert-manager for Katib.
@gaocegege @johnugeorge @yanniszark @kuikuikuizzZ @knkski What do you think about this approach?
Also, I updated controller-runtime to v0.8.2 and the k8s.io deps to v0.20.4. That requires some changes:
FromVolume resume policy. For that reason, I added PersistentVolumeReclaimPolicy: Delete for the PV, so that once the PVC is garbage collected, the PV should also be deleted.
I still need to run some tests and create a new image for the cert generator.
It would be great if you could start reviewing this.
/cc @gaocegege @johnugeorge
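As a rough illustration of the first step the script performs (creating the CertificateSigningRequest), the sketch below only renders a manifest to stdout. It is not the PR's cert-generator.sh: the function name render_csr_manifest is hypothetical, it targets certificates.k8s.io/v1 (the API the deprecation warnings in the conversation point to) rather than v1beta1, and the request value is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch only: renders a CertificateSigningRequest manifest to stdout.
# In a real script the output would be piped to `kubectl apply -f -`.
render_csr_manifest() {
  local csr_name="$1"      # e.g. katib-controller.kubeflow
  local signer_name="$2"   # e.g. kubernetes.io/kubelet-serving
  local request_b64="$3"   # base64-encoded PEM CSR on a single line

  cat <<EOF
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: ${csr_name}
spec:
  signerName: ${signer_name}
  request: ${request_b64}
  usages:
    - digital signature
    - key encipherment
    - server auth
EOF
}

# Placeholder request value; a real one would come from `openssl req -new`
# followed by base64 encoding, and this placeholder would be rejected by
# the API server.
render_csr_manifest "katib-controller.kubeflow" "kubernetes.io/kubelet-serving" "PLACEHOLDER_BASE64_CSR"
```

The remaining steps (approving the CSR, storing the issued certificate in katib-webhook-cert, and patching caBundle on the webhook configurations) are left out here because the thread does not show how the script implements them.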