-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NO-JIRA: certrotation: improve error messages #1720
NO-JIRA: certrotation: improve error messages #1720
Conversation
@vrutkovs: This pull request explicitly references no jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
f2f23fb
to
85a077e
Compare
@@ -67,6 +67,7 @@ func (c RotatedSigningCASecret) EnsureSigningCertKeyPair(ctx context.Context) (* | |||
if err != nil && !apierrors.IsNotFound(err) { | |||
return nil, false, err | |||
} | |||
var signerExists = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is required to pass unit tests, which do delete+create (and don't set ResourceVersion afterwards). It would be reworked after ApplySecretDoNotUse
is removed
/cc @tkashem @p0lyn0mial |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend refactoring the code in a way that makes it easier to prepare the error messages, but that would be a long term goal.
This PR is a nice stop gap, it looks good to me except for a few minor comments
if originalCABundleConfigMap == nil { | ||
reason = "configmap doesn't exist" | ||
} else if originalCABundleConfigMap.Data == nil { | ||
reason = "configmap is empty" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a minor suggestions:
- can we show the name and namespace of the object that does not exist for reason "configmap doesn't exist" and "configmap is empty"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error would be appended to the message with namespace/name: openshift-kube-apiserver-operator node-system-admin-signer requires a new signing cert/key pair: secret doesn't exist
in SignerUpdateRequired
event
} else if !equality.Semantic.DeepEqual(originalCABundleConfigMap.Data, caBundleConfigMap.Data) { | ||
reason = fmt.Sprintf("signer update %s", signingCertKeyPairLocation) | ||
} | ||
c.EventRecorder.Eventf("CABundleUpdateRequired", "%q in %q requires a new cert: %s", c.Name, c.Namespace, reason) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: what does the c.Name and c.Namespace mean in relation to the object in the reason, or are we saying that the CA bundle configmap located in
c.Namespace/c.Name requires a new cert
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, correct, that would be object namespace / name of the target configmap/secret
@@ -153,7 +159,7 @@ func needNewSigningCertKeyPair(annotations map[string]string, refresh time.Durat | |||
validity := notAfter.Sub(notBefore) | |||
at80Percent := notAfter.Add(-validity / 5) | |||
if time.Now().After(at80Percent) { | |||
return true, fmt.Sprintf("past its latest possible time %v", at80Percent) | |||
return true, fmt.Sprintf("past refresh time (80%% of validity): %v", at80Percent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: do you think it will help if we also include the validity in the message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this would show happy path, where certs are updated at 80%, as the cert can be force refreshed in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is dead code actually, I haven't seen this message at all
func needNewTargetCertKeyPair(annotations map[string]string, signer *crypto.CA, caBundleCerts []*x509.Certificate, refresh time.Duration, refreshOnlyWhenExpired bool) string { | ||
func needNewTargetCertKeyPair(secret *corev1.Secret, signer *crypto.CA, caBundleCerts []*x509.Certificate, refresh time.Duration, refreshOnlyWhenExpired bool) string { | ||
if secret.ResourceVersion == "" { | ||
return "secret doesn't exist" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relying on the "ResourceVersion" being empty to indicate that the object does not exist may lead to bugs in the future, sometimes we set the resource version of an existing object to empty string before we send it to the apiserver for update
The more reliable way for us to know whether an object exists is when we get a NotFoundError
from the apiserver, and use the not found error to produce this error message seems more intuitive to me, thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, implemented
85a077e
to
6a9056b
Compare
* Explicitly log "secret/configmap doesn't exist" events * when secret is refreshed differenciate between refresh on 80% of validity and custom refresh time * when CA bundle is updated log which signer update caused it
6a9056b
to
9a49878
Compare
/cc @tjungblu This changes signatures for |
ACK |
@@ -41,7 +41,7 @@ type CABundleConfigMap struct { | |||
EventRecorder events.Recorder | |||
} | |||
|
|||
func (c CABundleConfigMap) EnsureConfigMapCABundle(ctx context.Context, signingCertKeyPair *crypto.CA) ([]*x509.Certificate, error) { | |||
func (c CABundleConfigMap) EnsureConfigMapCABundle(ctx context.Context, signingCertKeyPair *crypto.CA, signingCertKeyPairLocation string) ([]*x509.Certificate, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
silly question while going through the breaking change in CEO, isn't that simply ca.Namespace/ca.Name
at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be a location of the signer used to build this CA, so its secret.Namespace/secret.Name
/lgtm Thanks! |
/cc @soltysh (for approval) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: soltysh, tkashem, vrutkovs The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@vrutkovs: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Tested in this job with shortened rotation periods.
This Loki query shows new events:
SignerUpdateRequired openshift-kube-apiserver-operator node-system-admin-signer requires a new signing cert/key pair: secret doesn't exist
instead of "missing notBefore annotation"SignerUpdateRequired openshift-kube-apiserver-operator node-system-admin-signer requires a new signing cert/key pair: past refresh time (80% of validity): 2024-06-24 11:46:13.8 +0000 UTC
CABundleUpdateRequired openshift-kube-controller-manager-operator csr-controller-signer-ca requires a new cert: signer update openshift-kube-controller-manager-operator/csr-signer